Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning
Summary
Skill1 is a unified framework that trains a single policy to co-evolve skill selection, utilization, and distillation using a shared task-outcome objective. Experiments on ALFWorld and WebShop show it outperforms existing baselines in complex task environments.
Cached at: 05/08/26, 07:27 AM
Source: https://huggingface.co/papers/2605.06130
Abstract
Skill1 is a unified framework that trains a single policy to simultaneously evolve skill selection, utilization, and distillation capabilities using a shared task-outcome objective, demonstrating superior performance over existing baselines in complex task environments.
A persistent skill library allows language model agents to reuse successful strategies across tasks. Maintaining such a library requires three coupled capabilities. The agent selects a relevant skill, utilizes it during execution, and distills new skills from experience. Existing methods optimize these capabilities in isolation or with separate reward sources, resulting in partial and conflicting evolution. We propose Skill1, a framework that trains a single policy to co-evolve skill selection, utilization, and distillation toward a shared task-outcome objective. The policy generates a query to search the skill library, re-ranks candidates to select one, solves the task conditioned on it, and distills a new skill from the trajectory. All learning derives from a single task-outcome signal. Its low-frequency trend credits selection and its high-frequency variation credits distillation. Experiments on ALFWorld and WebShop show that Skill1 outperforms prior skill-based and reinforcement learning baselines. Training dynamics confirm the co-evolution of the three capabilities, and ablations show that removing any credit signal degrades the evolution.
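The four stages and the two credit signals described in the abstract can be illustrated with a small Python sketch. This is not the paper's implementation. The `Skill` class, the keyword-overlap retrieval, the four callables passed to `run_episode`, and the use of an exponential moving average to separate the "low-frequency trend" from the "high-frequency variation" are all assumptions made for illustration; the abstract does not say how any of them is actually defined.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    # Hypothetical container; the paper's skill format is not given in the abstract.
    name: str
    text: str


def run_episode(task, library, make_query, rerank, solve, distill):
    """One pass through the four stages named in the abstract.

    The four callables are hypothetical stand-ins for calls into the single
    policy: query generation, re-ranking, task solving, and distillation.
    """
    query = make_query(task)
    # Assumption: naive keyword overlap stands in for the library search.
    candidates = [s for s in library if any(w in s.text for w in query.split())]
    skill = rerank(task, candidates) if candidates else None
    trajectory, outcome = solve(task, skill)
    library.append(distill(task, trajectory))
    return skill, outcome


def split_outcome_signal(outcomes, alpha=0.3):
    """Split a sequence of task outcomes into a slow trend and a fast residual.

    Assumption: an exponential moving average plays the role of the
    low-frequency trend, and each outcome minus the average held before the
    update plays the role of the high-frequency variation.
    """
    trend, variation = [], []
    ema = None
    for r in outcomes:
        baseline = r if ema is None else ema
        variation.append(r - baseline)
        ema = baseline + alpha * (r - baseline)
        trend.append(ema)
    return trend, variation


if __name__ == "__main__":
    library = [Skill("open", "open the drawer")]
    skill, outcome = run_episode(
        "open drawer",
        library,
        make_query=lambda task: task,
        rerank=lambda task, cands: cands[0],
        solve=lambda task, skill: (["step"], 1.0),
        distill=lambda task, traj: Skill("new", "distilled from " + task),
    )
    print(skill.name, outcome, len(library))
    print(split_outcome_signal([1.0, 0.0, 1.0], alpha=0.5))
```

In this toy version, the trend series would be the quantity used to credit a skill's selection over many episodes, and the per-episode residual would be the quantity used to credit the skill distilled in that episode. How Skill1 actually computes and applies each signal should be taken from the paper itself.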
Get this paper in your agent:
hf papers read 2605.06130
Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning
Skill0.5 is a novel agentic reinforcement learning framework that combines general skill internalization with task-specific skill utilization via a dynamic difficulty-aware router, improving out-of-distribution generalization in complex task environments as demonstrated on ALFWorld and WebShop.
SkillOS: Learning Skill Curation for Self-Evolving Agents
This paper introduces SkillOS, a reinforcement learning framework that enables LLM agents to learn long-term skill curation policies for self-evolution, improving performance and generalization across tasks.
SkillGraph: Skill-Augmented Reinforcement Learning for Agents via Evolving Skill Graphs
SkillGraph is a framework that represents reusable skills as nodes in a directed graph to enable large language model agents to handle compositional tasks more effectively through structured skill retrieval and continuous evolution.
Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill
Skill-RM proposes a unified reward modeling framework that treats reward computation as a structured agentic task, enabling dynamic evidence aggregation and consistent evaluation across diverse applications, outperforming traditional judge baselines.
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
SkillClaw introduces a framework for collective skill evolution in multi-user LLM agent systems, enabling autonomous updates and cross-user knowledge transfer by aggregating interactions and feedback to improve performance across the ecosystem.