@KaiZhang_CS: Check out one of the best open-source search agents trained by @jianxie_ !! glad to see early experience methods work o…

X AI KOLs Timeline 06/17/26, 11:10 PM Models

open-source search-agent deep-research synthetic-data reinforcement-learning academic-budget

Summary

Yu Su's team trained a near-frontier Deep Research Agent on an academic budget (32 H100s) using 8K synthetic samples and a three-stage recipe (SFT, mid-training, RL), releasing fully open training infrastructure and models from 2B to 35B parameters.

Check out one of the best open-source search agents trained by @jianxie_ !! glad to see early experience methods work on frontier agents!😀

Original Article

Cached at: 06/18/26, 10:12 AM

Yu Su (@ysu_nlp): We trained a ~frontier Deep Research Agent on academic budget

> 32 H100s
> 8K synthetic samples
> fully open training infra + recipe (SFT, mid-training, RL)
> models of diff sizes (2B -> 35B) ready to use out of the box

This is yet another demonstration of how the frontier of

Similar Articles

@Ex0byt: A must bookmark.. tiny cracked team, 4 H100 nodes, open source 3 stage recipe, trained on 8k synthetic rubric tasks, fu…

X AI KOLs Timeline

A small team trained a frontier-level Deep Research Agent on an academic budget using only 32 H100s and 8K synthetic samples, releasing fully open weights, code, and paper for models from 2B to 35B that match or beat closed frontier agents on key benchmarks.

@tom_doerr: Fully open sources training data for 30B scale search agents https://github.com/PolarSeeker/OpenSeeker…

X AI KOLs Timeline

OpenSeeker fully open-sources training data and models for 30B-scale ReAct-based search agents, achieving state-of-the-art performance on multiple benchmarks including BrowseComp and Humanity's Last Exam. It is the first purely academic project to reach frontier search benchmark performance while releasing complete training data.

S1-DeepResearch: Beyond Search, Toward Real-World Long-Horizon Research Agents

arXiv cs.AI

This paper introduces S1-DeepResearch-32B, an open-source model and 15K trajectory dataset for deep research agents, achieving state-of-the-art performance across 20 benchmarks by jointly modeling information acquisition, knowledge synthesis, and planning.

QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks

Hugging Face Daily Papers

QUEST is an open family of deep research agents trained with synthetic data and reinforcement learning, achieving strong performance across diverse long-horizon search tasks, approaching frontier closed-source agents.

@VukRosic99: A DeepSeek researcher just open-sourced his AutoResearch personal project. For the first time, the AutoResearch Agent a…

X AI KOLs Timeline

A DeepSeek researcher open-sourced AutoResearch, an autonomous framework that can plan, execute, and debug RL experiments on the DeepSeek 285B model without human intervention, accompanied by a self-play survey paper.
