@KaiZhang_CS: Check out one of the best open-source search agents trained by @jianxie_ !! glad to see early experience methods work o…

Summary

Yu Su's team trained a near-frontier Deep Research Agent on an academic budget (32 H100s) using 8K synthetic samples and a recipe of SFT, mid-training, and RL. They released the full training infrastructure and recipe along with models ranging from 2B to 35B parameters.

Check out one of the best open-source search agents trained by @jianxie_ !! glad to see early experience methods work on frontier agents!😀

Cached at: 06/18/26, 10:12 AM

Yu Su (@ysu_nlp): We trained a ~frontier Deep Research Agent on academic budget

> 32 H100s
> 8K synthetic samples
> fully open training infra + recipe (SFT, mid-training, RL)
> models of diff sizes (2B -> 35B) ready to use out of the box

This is yet another demonstration of how the frontier of
