Scaling Agents via Continual Pre-training
Summary
Proposes Agentic Continual Pre-training to build agentic foundation models, achieving state-of-the-art results on 10 benchmarks with AgentFounder-30B, including 39.9% on BrowseComp-en and 43.3% on BrowseComp-zh.
Source: https://huggingface.co/papers/2509.13310 Published on Sep 16, 2025
#1 Paper of the day
Abstract
AgentFounder, a deep research agent model incorporating Agentic Continual Pre-training, achieves state-of-the-art performance in agentic tasks while maintaining strong tool-use ability.
Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning for complex problem-solving. However, post-training approaches building upon general-purpose foundation models consistently underperform in agentic tasks, particularly in open-source implementations. We identify the root cause: the absence of robust agentic foundation models forces models during post-training to simultaneously learn diverse agentic behaviors while aligning them to expert demonstrations, thereby creating fundamental optimization tensions. To this end, we are the first to propose incorporating Agentic Continual Pre-training (Agentic CPT) into the deep research agents training pipeline to build powerful agentic foundational models. Based on this approach, we develop a deep research agent model named AgentFounder. We evaluate our AgentFounder-30B on 10 benchmarks and achieve state-of-the-art performance while retaining strong tool-use ability, notably 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE.
Get this paper in your agent:
hf papers read 2509.13310
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
Agent-World introduces a self-evolving training framework for general agent intelligence that autonomously discovers real-world environments and tasks via the Model Context Protocol, enabling continuous learning. Agent-World-8B and 14B models outperform strong proprietary models across 23 challenging agent benchmarks.
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent
Introduces Agents-A1, a 35B Mixture-of-Experts agentic model that achieves trillion-parameter-level performance through long-horizon trajectory scaling and a three-stage training approach including SFT, domain-level teachers, and multi-teacher distillation. The model outperforms or matches much larger models on long-horizon agent benchmarks.
Position: Agentic AI System Is a Foreseeable Pathway to AGI
This paper argues that monolithic scaling of a single model is insufficient for achieving AGI and proposes Agentic AI with multi-agent collaboration as a necessary paradigm, demonstrating theoretically that agentic systems achieve exponentially superior generalization and sample efficiency.
Turning local agents into self-optimizing agents
A self-optimizing agentic pipeline that improves benchmark performance from ~30% to ~90% on TerminalBench, and can be extended to everyday chats by logging interactions, reflecting with a local model, and injecting lessons into future system prompts.
@KaiZhang_CS: Check out one of the best open-source search agents trained by @jianxie_ !! glad to see early experience methods work o…
Yu Su's team trained a frontier Deep Research Agent on an academic budget using 8K synthetic samples and RL, releasing fully open training infrastructure and models from 2B to 35B parameters.