The Next Evolution in AI Scaling
Tokens are the atomic units a language model reads.
Between "cat" and every other token. Higher score = more relevant context.
6 tokens × 6 attention scores = 36 computations.
Every cell is one computation. n tokens = n² computations.
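The quadratic count above can be sketched in a few lines of plain Python. This is a toy illustration, not a real model: the token embeddings are made-up 2-dimensional vectors, and the score is a raw dot product (real attention adds projections, scaling, and softmax).

```python
def attention_scores(vectors):
    """Compute the full n x n matrix of pairwise dot-product scores."""
    n = len(vectors)
    return [[sum(a * b for a, b in zip(vectors[i], vectors[j]))
             for j in range(n)]
            for i in range(n)]

# 6 toy 2-dimensional token embeddings (hypothetical values)
tokens = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0],
          [0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]

scores = attention_scores(tokens)
n = len(tokens)
print(n * n)  # 6 tokens -> 36 score computations
```

Doubling the token list to 12 entries quadruples the work to 144 computations, which is the n² blow-up in miniature.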
As context length grows, compute explodes quadratically — the fundamental wall.