@rasbt: After 18 months of writing, coding, and experimenting, Build a Reasoning Model (From Scratch) is finally out! My first …
Summary
Sebastian Raschka announces the release of his book 'Build a Reasoning Model (From Scratch)' after 18 months of work. The book covers inference scaling, reinforcement learning, and distillation from scratch.
Cached at: 06/30/26, 01:41 PM
After 18 months of writing, coding, and experimenting, Build a Reasoning Model (From Scratch) is finally out!
My first copies just arrived! 📚
440 full-color pages. Inference scaling, reinforcement learning, and distillation from scratch. https://t.co/647ksI7sLc
Similar Articles
@rohanpaul_ai: A Primer paper about how reasoning models improve after training. Shows that better reasoning models depend less on raw …
This primer paper explores how reasoning models improve after training, arguing that effective reasoning data relies more on checkable training evidence than raw data size. It categorizes reasoning data by verification methods and emphasizes preserving messy agent data for learning signals.
@dair_ai: Nice primer on post-training reasoning data. (bookmark it) This is one of the first primers to pull the scattered post-…
A comprehensive primer synthesizing over 150 public studies on post-training reasoning data, organizing the field around four key questions about data objects, usefulness, construction, and scaling.
@jiqizhixin: Awesome blog! State of RL for reasoning LLMs https://aweers.de/blog/2026/rl-for-llms/…
A comprehensive blog post reviewing the state of reinforcement learning for reasoning LLMs, covering methods from REINFORCE and PPO to GRPO and beyond, with connections to key models like InstructGPT and DeepSeek-R1.
Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning
This paper presents a full-pipeline recipe for teaching thinking models to reason with tools, achieving state-of-the-art performance on benchmarks like AIME 2025 when applied to Qwen3 models.
RASFT: Rollout-Adaptive Supervised Fine-Tuning for Reasoning
RASFT is a supervised fine-tuning framework for large language models that adapts expert supervision based on the model's own reasoning capabilities, achieving better performance on mathematical and code reasoning benchmarks than standard SFT and reinforcement learning methods.