Tag
This primer explores how reasoning models improve after training, arguing that effective reasoning data depends more on checkable training evidence than on raw data size. It categorizes reasoning data by verification method and emphasizes preserving messy agent data for the learning signals it carries.
A comprehensive primer synthesizing over 150 public studies on post-training reasoning data, organizing the field around four key questions about data objects, usefulness, construction, and scaling.
Recommends 'The Little Book of Generative AI Foundations', a book on the mathematical fundamentals of generative AI that covers core topics such as PCA, SVD, VAEs, and diffusion models, aimed at agentic engineering practitioners.