[Full Workshop] Reinforcement Learning, Kernels, Reasoning, Quantization & Agents — Daniel Han
Summary
At the AI Engineer World's Fair, Daniel Han gave an in-depth workshop on practical experience with reinforcement learning, model fine-tuning, quantization, and agents. He reviewed the evolution of open-source models from Llama to DeepSeek R1 and analyzed the five key stages of modern model training.
Similar Articles
@danielhanchen: I’m running a 3 hour advanced workshop at AI Engineer World’s Fair! 2026 has greatly changed how one should learn lower…
Daniel Han is hosting a 3-hour advanced workshop at the AI Engineer World's Fair, covering the history of open-source large models, a breakdown of training stages (pre-training, mid-training, supervised fine-tuning, post-training, reinforcement fine-tuning), and the leap in reasoning models. He also introduces his team's open-source contributions to fine-tuning optimization.
@Michaelzsguo: This is one of the best deep discussions I've seen recently about the fundamentals of reinforcement learning and its relationship to modern AI. Eric Jang and Dwarkesh turned a seemingly retro exercise—rebuilding AlphaGo with today's tools—into a very clear masterclass: why 'search +...'
A detailed discussion on reinforcement learning and its connection to modern AI, using the reconstruction of AlphaGo with modern tools as a clear example of search and self-play. Key takeaways include neural network amortization of search, credit assignment challenges in LLMs vs AlphaGo, and implications for automated research.
@dair_ai: https://x.com/dair_ai/status/2053495521243799717
DAIR AI's weekly roundup highlights top research papers including HeavySkill, which improves model performance via internalized parallel reasoning, and Sakana AI's Conductor, which uses RL to optimize agent orchestration. It also covers Meta FAIR's work on self-improving pretraining.
@snowboat84: https://x.com/snowboat84/status/2065215177029787705
This article is the middle part of the AI Engineering Landscape series. It details core techniques such as inference optimization, model compression (quantization, distillation, pruning, MoE), and speculative decoding, and reviews the latest advances from hardware to the engineering stack.
@danintheory: Great conversation and a fun way to learn about an important open AI problem!
Sequoia Capital highlights the gap between current AI models that train once and human continuous learning, and points to EngramLab's work on AI that never stops learning with memory inside the model.