@dwarkesh_sp: We pre-train LLMs on the whole of the internet. You might think this explains how they learn so many emergent capabilit…
Summary
Dwarkesh Patel tweets about Sergey Levine's argument that emergent capabilities in LLMs arise from compositionality, not just from training data.
Similar Articles
LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs
PropMe is a propensity-aware framework for evaluating LLM memorization, distinguishing between forced reproduction capabilities and natural propensity using SimpleTrace for deterministic attribution across open models and datasets.
@neural_avb: If you think about it, LLM training in 2026 is really a 3-step loop : - train it on some data - dogfood it/run categori…
The tweet outlines a 3-step loop for LLM training in 2026: train on data, run evals, and add synthetic data for underperforming tasks. It emphasizes that legal distillation is accessible through open-source models and cheap APIs, noting that training on reasoning traces alone can achieve high scores.
@haider1: Yann LeCun says LLMs are strongest in domains where language itself is the substrate of reasoning, like math and code T…
Yann LeCun states that LLMs are strongest in domains where language is the substrate of reasoning, like math and code, but they are not creative mathematicians, software architects, or computer scientists.
@Hesamation: 3Blue1Brown’s new video explains why every LLM is actually a compression machine. everyone describes pre-training as “n…
3Blue1Brown's new video explains that LLMs are fundamentally compression machines, linking next-token prediction to efficient encoding of human knowledge, which leads to better abstraction and reasoning.
@GaryMarcus: Am old enough to remember when @GeoffreyHinton told me I was stupid for saying that LLMs regurgitate training data. He …
Gary Marcus highlights recent DeepMind research confirming that LLMs frequently memorize and regurgitate training data, countering past criticism from Geoffrey Hinton. The post underscores ongoing debates about LLM limitations and their real-world capabilities.