@Ex0byt: A must bookmark.. tiny cracked team, 4 H100 nodes, open source 3 stage recipe, trained on 8k synthetic rubric tasks, fu…


Summary

A small team trained a near-frontier Deep Research Agent on an academic budget, using only 32 H100s and 8K synthetic samples. They released fully open weights, code, and a paper for models from 2B to 35B; the 35B model is reported to match or beat closed frontier agents on key benchmarks.

A must-bookmark: tiny cracked team, 4 H100 nodes, open-source 3-stage recipe, trained on 8K synthetic rubric tasks, fully open weights, code, and paper; out comes a 35B model that matches or beats closed frontier deep research agents on key benchmarks. The traces this model produces are gold.

Cached at: 06/18/26, 02:17 PM

Yu Su (@ysu_nlp): We trained a ~frontier Deep Research Agent on academic budget

> 32 H100s
> 8K synthetic samples
> fully open training infra + recipe (SFT, mid-training, RL)
> models of diff sizes (2B -> 35B) ready to use out of the box

This is yet another demonstration of how the frontier of
