EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale

Hugging Face Daily Papers

Summary

EvoMaster is a scalable, self-evolving agent framework for large-scale scientific discovery that enables iterative hypothesis refinement and knowledge accumulation across experimental cycles. It achieves state-of-the-art results on four benchmarks including Humanity's Last Exam (41.1%) and MLE-Bench Lite (75.8%), outperforming general-purpose baselines by up to 316%.

The convergence of large language models and agents is catalyzing a new era of scientific discovery: Agentic Science. While the scientific method is inherently iterative, existing agent frameworks are predominantly static, narrowly scoped, and lack the capacity to learn from trial and error. To bridge this gap, we present EvoMaster, a foundational evolving agent framework engineered specifically for Agentic Science at Scale. Driven by the core principle of continuous self-evolution, EvoMaster empowers agents to iteratively refine hypotheses, self-critique, and progressively accumulate knowledge across experimental cycles, faithfully mirroring human scientific inquiry. Crucially, as a domain-agnostic base harness, EvoMaster is exceptionally easy to scale up -- enabling developers to build and deploy highly capable, self-evolving scientific agents for arbitrary disciplines in approximately 100 lines of code. Built upon EvoMaster, we incubated the SciMaster ecosystem across domains such as machine learning, physics, and general science. Evaluations on four authoritative benchmarks (Humanity's Last Exam, MLE-Bench Lite, BrowseComp, and FrontierScience) demonstrate that EvoMaster achieves state-of-the-art scores of 41.1%, 75.8%, 73.3%, and 53.3%, respectively. It comprehensively outperforms the general-purpose baseline OpenClaw with relative improvements ranging from +159% to +316%, robustly validating its efficacy and generality as the premier foundational framework for the next generation of autonomous scientific discovery. EvoMaster is available at https://github.com/sjtu-sai-agents/EvoMaster.
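The self-evolution cycle described above (propose a hypothesis, run an experiment, self-critique, accumulate knowledge) can be sketched as a plain Python loop. This is an illustrative toy under stated assumptions, not EvoMaster's actual API: every name here (`KnowledgeBase`, `propose`, `run_experiment`, `critique`, `evolve`) is hypothetical, and the LLM and experiment calls are stubbed out.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    """Hypothetical store that accumulates lessons across experimental cycles."""
    lessons: list = field(default_factory=list)

    def add(self, lesson: str) -> None:
        self.lessons.append(lesson)

def propose(knowledge: KnowledgeBase, cycle: int) -> str:
    # Stub: a real agent would prompt an LLM conditioned on accumulated lessons.
    return f"hypothesis-{cycle} informed by {len(knowledge.lessons)} lessons"

def run_experiment(hypothesis: str) -> float:
    # Stub: a real agent would execute code or a simulation and score the result.
    return float(len(hypothesis) % 10)

def critique(hypothesis: str, score: float) -> str:
    # Stub self-critique: record what happened so the next cycle can improve.
    return f"'{hypothesis}' scored {score}; refine next cycle"

def evolve(n_cycles: int = 3) -> KnowledgeBase:
    """Iterate the hypothesize -> experiment -> critique -> accumulate loop."""
    knowledge = KnowledgeBase()
    for cycle in range(n_cycles):
        hypothesis = propose(knowledge, cycle)
        score = run_experiment(hypothesis)
        knowledge.add(critique(hypothesis, score))
    return knowledge
```

The point of the sketch is the shape of the loop: each cycle's critique feeds the next cycle's proposal, which is what distinguishes a self-evolving agent from a static, single-pass one.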

Cached at: 04/21/26, 07:20 AM


Source: https://huggingface.co/papers/2604.17406



Get this paper in your agent:

hf papers read 2604.17406

Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery

Papers with Code Trending

EvoScientist is an adaptive multi-agent framework for end-to-end scientific discovery that continuously improves through persistent memory modules, comprising three specialized agents for idea generation, experiment execution, and knowledge distillation. It outperforms 7 state-of-the-art systems in scientific idea generation and improves code execution success rates through multi-agent evolution.

EvoMap/evolver

GitHub Trending (daily)

Evolver is a GEP-powered self-evolution engine for AI agents that automates prompt optimization and creates auditable, reusable evolution assets. The project is transitioning from fully open source to source-available while maintaining backward compatibility with existing MIT and GPL-3.0 releases.

EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems

arXiv cs.CL

EvoTest introduces J-TTL, a benchmark for measuring agent test-time learning capabilities, and proposes an evolutionary framework where an Actor Agent plays games while an Evolver Agent iteratively improves the system's prompts, memory, and hyperparameters without fine-tuning. The method demonstrates superior performance compared to reflection and memory-based baselines on complex text-based games.
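The Actor/Evolver split described in the EvoTest summary can be sketched as a toy test-time loop. All names and the reward model below are hypothetical stand-ins, not EvoTest's implementation: an Actor plays an episode under the current prompt, and an Evolver mutates the prompt whenever reward stagnates, with no fine-tuning involved.

```python
import random

def actor_play(prompt: str, rng: random.Random) -> float:
    """Hypothetical Actor: play one episode under the current prompt and
    return a reward. Here, richer prompts score higher, plus noise."""
    return min(len(prompt), 50) + rng.random()

def evolver_step(prompt: str, reward: float, best_reward: float) -> str:
    """Hypothetical Evolver: mutate the prompt only when reward stagnates."""
    if reward <= best_reward:
        return prompt + " Remember: explore rooms before using items."
    return prompt

def test_time_learn(episodes: int = 5, seed: int = 0):
    """Alternate Actor episodes with Evolver updates across a game session."""
    rng = random.Random(seed)
    prompt = "You are a game-playing agent."
    best = float("-inf")
    history = []
    for _ in range(episodes):
        reward = actor_play(prompt, rng)
        prompt = evolver_step(prompt, reward, best)
        best = max(best, reward)
        history.append(reward)
    return prompt, history
```

The design choice being illustrated is that learning happens entirely in the agent's mutable configuration (its prompt, and in the real system also memory and hyperparameters) rather than in model weights, which is what makes the approach viable at test time.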