Tag
Adaptive Auto-Harness is a framework for sustained self-improvement of agentic systems deployed on open-ended task streams, outperforming baselines via a stateful multi-agent evolver, harness tree, and human-steering hooks.
BenchEvolver is an evolutionary framework that automatically generates harder coding problems from existing ones, creating challenging benchmarks that maintain validity and diversity while enabling model self-improvement and enhanced training performance.
Contrastive Reflection (CORE) is a non-parametric algorithm that generates concise, interpretable insights from comparing successful and unsuccessful reasoning traces, enabling faster and more efficient self-improvement for language models with fewer samples and rollouts than existing methods.
Introduces Hermes Dreaming, a staged plugin workflow that adds reviewable and validatable self-improvement to the Hermes agent, allowing operators to inspect, validate, and approve changes before they are applied.
This project describes an automated starfish organism of up to 128 agents that iteratively self-improves and solves social issues, having already written complete constitutions on various topics.
ImProver 2 is a neurosymbolic framework for automated proof optimization in Lean 4 that uses an expert-iteration pipeline and a scaffold to train a 7B-parameter model, outperforming much larger models and demonstrating that small models can effectively restructure research-level proofs.
Shares an officially approved Codex self-improvement prompt that guides reviewing recent work and identifying repeatable manual workflows to create skills, sub-agents, or automations for improved efficiency.
An autonomous agent team's Builder agent shipped two pull requests overnight, fixing a broken Instagram posting flow and eliminating redundant API calls, demonstrating the granular nature of self-improvement in autonomous systems.
ECHO introduces a hybrid objective that combines policy-gradient loss with environment observation prediction to provide dense supervision from terminal feedback, doubling performance on TerminalBench-2.0 for Qwen3 models.
The awesome-autoresearch list was updated with 6 application cases based on Karpathy's autoresearch pattern, covering customer service agent self-evolution, shell integration, code configuration self-optimization, RAG tuning, and ASO.
ECHO is a new, simple, and free method that addresses CLI agents, continual learning, self-improvement, and world models.
Polarity is a self-improvement stack for AI agents, featured on ProductHunt.
Argues that current AI does not meet AGI standards because it lacks recursive self-improvement, and criticizes those who claim otherwise for relying on a weak definition of AGI.
The article argues that memory and skills in AI agents are not separate plugins but part of the same world model harness, and introduces Cognee's open-source approach to unifying them with self-improvement capabilities.
ASH is a system that learns embodied policies from unlabeled internet video via a self-improvement loop using inverse dynamics models, achieving strong performance on long-horizon tasks in Pokemon and Zelda games.
This paper introduces AIRA-Compose and AIRA-Design, dual frameworks using AI agents to autonomously discover neural architectures that outperform standard Transformers and scale efficiently.
An advanced Claude prompt designed to transform the AI into a comprehensive 'Life OS' dashboard for tracking productivity, habits, and personal performance.
Anthropic unveiled 'dreaming' and other updates for Claude Managed Agents, enabling AI agents to learn from past sessions and self-correct, alongside reports of 80x annualized growth.
This paper introduces SkillMaster, a training framework that enables LLM agents to autonomously create, refine, and select skills through trajectory-informed review and counterfactual utility evaluation.
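Combining Hermes Agent with AionUI can turn a personal computer into an agentic AI operating system that supports parallel multi-agent execution, long-term memory, and self-evolution, enabling fully automated local workflows spanning data analysis, file management, and code writing.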