TACO introduces a self-evolving compression framework that automatically learns to shrink redundant terminal-interaction history, cutting token overhead by ~10% while boosting accuracy by 1-4% on TerminalBench and other code-agent benchmarks.
NVIDIA researchers present the first self-evolving logic-synthesis framework, in which multi-agent LLMs autonomously refine the codebase of the ABC EDA tool.
Agent-World introduces a self-evolving training framework for general agent intelligence that autonomously discovers real-world environments and tasks via the Model Context Protocol, enabling continuous learning. The resulting Agent-World-8B and -14B models outperform strong proprietary models across 23 challenging agent benchmarks.
EvoMaster is a scalable, self-evolving agent framework for large-scale scientific discovery that enables iterative hypothesis refinement and knowledge accumulation across experimental cycles. It achieves state-of-the-art results on four benchmarks, including Humanity's Last Exam (41.1%) and MLE-Bench Lite (75.8%), outperforming general-purpose baselines by up to 316%.
This paper introduces GenericAgent, a self-evolving LLM agent system designed to maximize the information density of its context. It addresses long-horizon limitations with hierarchical memory, reusable SOPs, and efficient compression, achieving stronger performance with fewer tokens than leading agents.
Researchers introduce the BEHEMOTH benchmark and CluE, a cluster-based prompt-optimization method, enabling LLMs to extract and retain heterogeneous memory across diverse tasks and achieving 9% gains over prior self-evolving frameworks.