@rasbt: After 18 months of writing, coding, and experimenting, Build a Reasoning Model (From Scratch) is finally out! My first …

X AI KOLs Timeline 06/30/26, 01:16 PM Products

book-release reasoning-model from-scratch reinforcement-learning distillation inference-scaling

Summary

Sebastian Raschka announces the release of his book 'Build a Reasoning Model (From Scratch)' after 18 months of work, covering inference scaling, reinforcement learning, and distillation from scratch.

After 18 months of writing, coding, and experimenting, Build a Reasoning Model (From Scratch) is finally out! My first copies just arrived! 📚 440 full-color pages. Inference scaling, reinforcement learning, and distillation from scratch. https://t.co/647ksI7sLc

Similar Articles

@rohanpaul_ai: A Primer paper about how reasoning models improve after training Shows that better reasoning models depend less on raw …

X AI KOLs Following

This primer explores how reasoning models improve after training, arguing that better reasoning depends less on raw data size than on checkable training evidence. It categorizes reasoning data by verification method and emphasizes preserving messy agent data as a source of learning signal.

@dair_ai: Nice primer on post-training reasoning data. (bookmark it) This is one of the first primers to pull the scattered post-…

X AI KOLs Timeline

A comprehensive primer synthesizing over 150 public studies on post-training reasoning data, organizing the field around four key questions about data objects, usefulness, construction, and scaling.

@jiqizhixin: Awesome blog! State of RL for reasoning LLMs https://aweers.de/blog/2026/rl-for-llms/…

X AI KOLs Timeline

A comprehensive blog post reviewing the state of reinforcement learning for reasoning LLMs, covering methods from REINFORCE and PPO to GRPO and beyond, with connections to key models like InstructGPT and DeepSeek-R1.

Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning

arXiv cs.CL

This paper presents a full-pipeline recipe for teaching thinking models to reason with tools, achieving state-of-the-art performance on benchmarks like AIME 2025 when applied to Qwen3 models.

RASFT: Rollout-Adaptive Supervised Fine-Tuning for Reasoning

arXiv cs.LG

RASFT is a novel supervised fine-tuning framework for large language models that adapts expert supervision based on the model's own reasoning capabilities, achieving better performance on mathematical and code reasoning benchmarks compared to standard SFT and reinforcement learning methods.