@Hesamation: 3Blue1Brown’s new video explains why every LLM is actually a compression machine. everyone describes pre-training as “n…

X AI KOLs Timeline 06/08/26, 03:48 PM News

compression llm pre-training next-token-prediction reasoning understanding

Summary

3Blue1Brown's new video explains that LLMs are fundamentally compression machines, linking next-token prediction to efficient encoding of human knowledge, which leads to better abstraction and reasoning.

Original Article

Cached at: 06/09/26, 12:47 PM

3Blue1Brown’s new video explains why every LLM is actually a compression machine.

everyone describes pre-training as “next token prediction” but that’s just the surface-level objective.

in reality it is a means of building the most efficient text compressor.

prediction and compression are two sides of the same coin.

when you train the model to predict the next token you’re not just teaching it to guess the next word but how to best encode the human knowledge it sees.

better compression means better abstraction means better reasoning

at some point, compression stops looking like storage or a database (as some like to call it on X) and looks like an approximation of understanding.
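
The "two sides of the same coin" claim can be made concrete. An entropy coder such as an arithmetic coder can store a symbol that a model predicted with probability p in roughly -log2(p) bits, so the summed prediction loss over a text is, up to a small overhead, its compressed size. The sketch below is only an illustration under stated assumptions, not the method from the video: it swaps the LLM for a tiny adaptive character-bigram model, and the names `uniform_bits`, `adaptive_bigram_bits`, and the sample text are hypothetical.

```python
import math
from collections import Counter, defaultdict


def uniform_bits(text, alphabet):
    """Code length with no prediction: every symbol costs log2(|alphabet|) bits."""
    return len(text) * math.log2(len(alphabet))


def adaptive_bigram_bits(text, alphabet):
    """Ideal code length, in bits, under an adaptive bigram predictor.

    Each character costs -log2 p(char | previous char). Counts are updated
    only after a character has been "coded", so a decoder running the same
    model could stay in sync and recover the text.
    """
    counts = defaultdict(Counter)  # counts[prev][next] = times seen so far
    total = 0.0
    prev = ""  # empty context before the first character
    for ch in text:
        seen = counts[prev]
        # Add-one smoothing keeps unseen characters at nonzero probability.
        p = (seen[ch] + 1) / (sum(seen.values()) + len(alphabet))
        total += -math.log2(p)
        seen[ch] += 1
        prev = ch
    return total


# Hypothetical sample text: repetitive on purpose, so there is structure to learn.
text = "the cat sat on the mat. the cat ate the rat. " * 20
alphabet = sorted(set(text))

baseline = uniform_bits(text, alphabet)
predicted = adaptive_bigram_bits(text, alphabet)

print(f"no prediction:     {baseline / len(text):.2f} bits per character")
print(f"bigram prediction: {predicted / len(text):.2f} bits per character")
```

The better the predictor, the fewer bits per character it needs, which is the sense in which lowering next-token loss and improving compression are the same thing. A bigram model only captures which letter tends to follow which; the post's argument is that pushing the same number lower on real text requires capturing progressively more abstract structure.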
