@natolambert: New podcast with @finbarrtimbers! We survey the latest post-training recipes, from GLM 5.1, Kimi K2.6, DeepSeek V4, Xia…

X AI KOLs Timeline 06/16/26, 01:44 PM Events

podcast post-training llm model-recipes deepseek glm ai-discussion

Summary

Nathan Lambert and Finbarr Timbers discuss the latest post-training recipes for large language models, including DeepSeek V4, GLM 5.1, Kimi K2.6, and the industry shift to multi-teacher on-policy distillation.

Original Article

Cached at: 06/17/26, 01:44 AM

New podcast with @finbarrtimbers! We survey the latest post-training recipes, from GLM 5.1, Kimi K2.6, DeepSeek V4, Xiaomi MiMo V2.5, Nemotron Ultra, etc. and discuss:

- Why the industry slowly shifted to multi-teacher on-policy distillation (MOPD)
- What an Olmo-style recipe would need improvements in
- How post-training works / suits larger organizational efforts
- Career advice in the foothills of the singularity
- and other topics

I heard y’all wanted me to start doing this, so making some time when I’m in funemployment!

Chapters:

00:00 Introduction & Olmo reflections
06:28 Post-train recipes review (history)
23:00 2026’s model recipes (MiMo Flash, DeepSeek V4, GLM 5, Kimi K2.6, etc.)
39:05 Open-ended post-training discussions
48:22 Career advice in the LLM race

Links below, please follow @interconnectsai and like and subscribe and buy my book?

Similar Articles

@cjzafir: Models that I'm using daily: > Codex 5.5 high (fast) > Deepseek v4 pro via API > Kimi 2.6 via API Models that I am fine…

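@DJLougen: Proud to introduce a new 27B post-trained model After being impressed by both Fable and Kimi 2.7 Coder, I wanted to see…
Introduces a new 27B post-trained model that distills positives from Fable and Kimi 2.7 Coder, with links to download.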

Deepseek, kimi etc..

@ziv_ravid: 1/I read the Nemotron 3 Ultra report and it's interesting to compare their post-training to DeepSeek V4's. Both now do …

@tom_doerr: Trained on 13M hours of mixed audio and text data https://github.com/MoonshotAI/Kimi-Audio…
