@natolambert: New lecture for the book! Nominally about synthetic data, but mostly is a walk through of the distillation literature f…
Summary
Nathan Lambert announces a new lecture for his book covering synthetic data and the history of distillation, from the Hinton 2015 paper to today's multi-teacher on-policy distillation. His post-training lecture series now totals 7.4 hours of video.
Cached at: 06/23/26, 04:12 PM
New lecture for the book! Nominally about synthetic data, but mostly a walkthrough of the distillation literature, from the Hinton 2015 paper to the multi-teacher on-policy distillation of today!
At 7.4 hours of video in my post-training brain dump and counting :)
It was fun to stare at the math long enough and talk through the 3-4 core changes that needed to be made to the original formulation to have on-policy distillation be ready for the mainstream like it is today (and in RL frameworks).
Otherwise, I include a bit of a history lesson on how synthetic data gradually took over all of post-training data research (it wasn’t always the case)! Then I do some 101 review of constitutional AI, rubrics, and other popular methods.
00:00 The emergence of synthetic data
10:50 Background on teacher-student knowledge-distillation
24:47 On-policy distillation (OPD, MOPD, and OPSD)
37:11 Constitutional AI & AI Feedback
45:50 Rubrics as rewards & conclusions
Ofc, watch on YouTube etc.
Similar Articles
@natolambert: New podcast with @finbarrtimbers! We survey the latest post-training recipes, from GLM 5.1, Kimi K2.6, DeepSeek V4, Xia…
Nathan Lambert and Finbarr Timbers discuss the latest post-training recipes for large language models, including DeepSeek V4, GLM 5.1, Kimi K2.6, and the industry shift to multi-teacher on-policy distillation.
@zhaisf: These were some magical results from distillation by @geoffreyhinton that really shocked me when I first saw them, and …
The article discusses surprising robustness of model distillation with respect to training distribution, even with little overlap with target distribution, and its implications for on/off-policy distillation.
@neural_avb: If yall are interested in On Policy Distillation, check this specific repo. Somebody put together a curated collection …
A curated collection of papers and tools for On Policy Distillation, organized and annotated with a getting-started section, shared via a GitHub repo.
@NielsRogge: One of the hottest terms in AI right now is "On-policy distillation". It is a post-training technique in which a studen…
On-policy distillation is highlighted as a hot post-training technique combining distillation with online RL, now listed on PapersWithCode with 183 citing papers.
@yacinelearning: okay folks buckle up because this thursday we have @joelniklaus from @huggingface that will join us on stream to teach …
Joel Niklaus from Hugging Face will give a live stream on synthetic data's role in advancing pretraining; the team has also published a playbook on the topic.