@Ex0byt: A must bookmark.. tiny cracked team, 4 H100 nodes, open source 3 stage recipe, trained on 8k synthetic rubric tasks, fu…

X AI KOLs Timeline 06/18/26, 01:02 AM Models

open-source deep-research agent synthetic-data training-recipe 35b-model academic-budget

Summary

A small team trained a frontier-level Deep Research Agent on an academic budget using only 32 H100s and 8K synthetic samples, releasing fully open weights, code, and paper for models from 2B to 35B that match or beat closed frontier agents on key benchmarks.

A must-bookmark: tiny cracked team, 4 H100 nodes, open-source 3-stage recipe, trained on 8K synthetic rubric tasks, fully open weights, code, and paper; out comes a 35B model that matches or beats closed frontier deep research agents on key benchmarks. The traces this model produces are gold.

Original Article

Cached at: 06/18/26, 02:17 PM

Yu Su (@ysu_nlp): We trained a ~frontier Deep Research Agent on academic budget

> 32 H100s
> 8K synthetic samples
> fully open training infra + recipe (SFT, mid-training, RL)
> models of diff sizes (2B -> 35B) ready to use out of the box

This is yet another demonstration of how the frontier of
