DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data

Hugging Face Daily Papers 04/21/26, 12:00 AM Papers

Summary

DR-Venus-4B is a 4B-parameter deep-research agent trained on only 10K open samples via agentic SFT+RL with turn-level rewards, outrunning prior sub-9B agents and rivaling 30B models on research benchmarks while staying deployable on edge devices.

Edge-scale deep research agents based on small language models are attractive for real-world deployment due to their advantages in cost, latency, and privacy. In this work, we study how to train a strong small deep research agent under limited open data by improving both data quality and data utilization. We present DR-Venus, a frontier 4B deep research agent for edge-scale deployment, built entirely on open data. Our training recipe consists of two stages. In the first stage, we use agentic supervised fine-tuning (SFT) to establish basic agentic capability, combining strict data cleaning with resampling of long-horizon trajectories to improve data quality and utilization. In the second stage, we apply agentic reinforcement learning (RL) to further improve execution reliability on long-horizon deep research tasks. To make RL effective for small agents in this setting, we build on IGPO and design turn-level rewards based on information gain and format-aware regularization, thereby enhancing supervision density and turn-level credit assignment. Trained on roughly 10K open-data samples, DR-Venus-4B significantly outperforms prior agentic models under 9B parameters on multiple deep research benchmarks, while also narrowing the gap to much larger 30B-class systems. Our further analysis shows that 4B agents already possess surprisingly strong performance potential, highlighting both the deployment promise of small models and the value of test-time scaling in this setting. We release our models, code, and key recipes to support reproducible research on edge-scale deep research agents.
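The abstract names turn-level rewards built from information gain and format-aware regularization but gives no formula. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes information gain for a turn is the change in the probability the policy assigns to the reference answer before and after that turn, and that format-aware regularization is a fixed penalty on malformed turns. The function name turn_level_rewards, its parameters, and the penalty value 0.1 are all hypothetical.

```python
from typing import List


def turn_level_rewards(
    answer_probs: List[float],
    format_ok: List[bool],
    format_penalty: float = 0.1,  # hypothetical value, not from the paper
) -> List[float]:
    """Return one reward per turn of a trajectory.

    answer_probs[0] is the probability of the reference answer before any
    turn; answer_probs[t] is that probability after turn t (t = 1..T).
    format_ok[t - 1] says whether turn t produced a well-formed action.
    """
    if len(answer_probs) != len(format_ok) + 1:
        raise ValueError("need exactly one more probability than turns")

    rewards = []
    for t, ok in enumerate(format_ok, start=1):
        # Assumed information gain: how much turn t moved the policy
        # toward the reference answer.
        gain = answer_probs[t] - answer_probs[t - 1]
        # Assumed format-aware regularization: flat penalty when the
        # turn is malformed.
        penalty = 0.0 if ok else format_penalty
        rewards.append(gain - penalty)
    return rewards


# Three turns: a helpful search, a malformed and unhelpful step, a decisive step.
rewards = turn_level_rewards([0.10, 0.25, 0.20, 0.60], [True, False, True])
```

Because every turn receives its own signal, a trajectory that ends in a wrong answer can still credit the turns that retrieved useful evidence, which is the denser supervision and turn-level credit assignment the abstract refers to.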

Cached at: 04/23/26, 03:35 AM

Source: https://huggingface.co/papers/2604.19859 Published on Apr 21

#2 Paper of the day

Abstract

DR-Venus-4B is a 4-billion-parameter deep research agent trained entirely on open data using agentic supervised fine-tuning and reinforcement learning with turn-level rewards to achieve superior performance on research benchmarks while maintaining edge-scale deployment advantages.

Edge-scale deep research agents based on small language models are attractive for real-world deployment due to their advantages in cost, latency, and privacy. In this work, we study how to train a strong small deep research agent under limited open data by improving both data quality and data utilization. We present DR-Venus, a frontier 4B deep research agent for edge-scale deployment, built entirely on open data. Our training recipe consists of two stages. In the first stage, we use agentic supervised fine-tuning (SFT) to establish basic agentic capability, combining strict data cleaning with resampling of long-horizon trajectories to improve data quality and utilization. In the second stage, we apply agentic reinforcement learning (RL) to further improve execution reliability on long-horizon deep research tasks. To make RL effective for small agents in this setting, we build on IGPO and design turn-level rewards based on information gain and format-aware regularization, thereby enhancing supervision density and turn-level credit assignment. Trained on roughly 10K open-data samples, DR-Venus-4B significantly outperforms prior agentic models under 9B parameters on multiple deep research benchmarks, while also narrowing the gap to much larger 30B-class systems. Our further analysis shows that 4B agents already possess surprisingly strong performance potential, highlighting both the deployment promise of small models and the value of test-time scaling in this setting. We release our models, code, and key recipes to support reproducible research on edge-scale deep research agents.
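The first training stage combines strict data cleaning with resampling of long-horizon trajectories. The abstract does not state the filtering criteria or the resampling rule, so the sketch below is only an illustration under assumptions: cleaning keeps trajectories that are well formed and end in a correct answer, and resampling duplicates trajectories at or above a turn-count threshold. The Trajectory fields, the threshold min_turns=20, and the factor repeats=2 are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trajectory:
    num_turns: int        # number of agent turns in the trajectory
    well_formed: bool     # every tool call and response parsed correctly
    answer_correct: bool  # final answer matched the reference


def clean(trajs: List[Trajectory]) -> List[Trajectory]:
    """Assumed strict cleaning: drop malformed or incorrect trajectories."""
    return [t for t in trajs if t.well_formed and t.answer_correct]


def resample_long_horizon(
    trajs: List[Trajectory],
    min_turns: int = 20,  # hypothetical threshold for "long-horizon"
    repeats: int = 2,     # hypothetical upsampling factor
) -> List[Trajectory]:
    """Assumed resampling: repeat long trajectories so they are seen more often."""
    out: List[Trajectory] = []
    for t in trajs:
        copies = repeats if t.num_turns >= min_turns else 1
        out.extend([t] * copies)
    return out


raw = [
    Trajectory(num_turns=5, well_formed=True, answer_correct=True),
    Trajectory(num_turns=30, well_formed=True, answer_correct=True),
    Trajectory(num_turns=40, well_formed=False, answer_correct=True),
]
sft_set = resample_long_horizon(clean(raw))
```

Cleaning first and resampling second means malformed long trajectories are never amplified, which is one plausible reading of improving both data quality and data utilization.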

Get this paper in your agent:

hf papers read 2604.19859

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

Mind DeepResearch Technical Report

@tom_doerr: Fully open sources training data for 30B scale search agents https://github.com/PolarSeeker/OpenSeeker…

Deep Research Max: a step change for autonomous research agents | New from Deepmind

DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Deep research System Card
