SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Hugging Face Daily Papers 05/22/26, 12:00 AM Papers

agent-skills text-space-optimizer skill-training validation-score rollouts transfer-experiments arxiv

Summary

SkillOpt is a systematic text-space optimizer that trains an agent skill as the external state of a frozen agent. Updates are kept stable, deployment adds no inference-time model calls, and the optimized skills are best or tied on all 52 evaluated (model, benchmark, harness) cells across six benchmarks, seven target models, and three execution harnesses.

Agent skills today are hand-crafted, generated one-shot, or evolved through loosely controlled self-revision, none of which behaves like a deep-learning optimizer for the skill, and none of which reliably improves over its starting point under feedback. We argue the skill should instead be trained as the external state of a frozen agent, with the same discipline that makes weight-space optimization reproducible. SkillOpt is, to our knowledge, the first systematic controllable text-space optimizer for agent skills: a separate optimizer model turns scored rollouts into bounded add/delete/replace edits on a single skill document, and an edit is accepted only when it strictly improves a held-out validation score. A textual learning-rate budget, rejected-edit buffer, and epoch-wise slow/meta update make skill training stable while adding zero inference-time model calls at deployment. Across six benchmarks, seven target models, and three execution harnesses (direct chat, Codex, Claude Code), SkillOpt is best or tied on all 52 evaluated (model, benchmark, harness) cells and beats every per-cell competitor among human, one-shot LLM, Trace2Skill, TextGrad, GEPA, and EvoSkill skills. On GPT-5.5 it lifts the average no-skill accuracy by +23.5 points in direct chat, by +24.8 inside the Codex agentic loop, and by +19.1 inside Claude Code. Transfer experiments further show that optimized skill artifacts retain value when moved across model scales, between Codex and Claude Code execution environments, and to a nearby math benchmark without further optimization.
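The accept-only-on-strict-improvement loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `Edit`, `apply_edit`, `train_skill`, `toy_propose`, and `toy_validate` are hypothetical names, the skill document is modeled as a list of lines, the "textual learning-rate budget" is assumed here to be a cap on edits per step, and the optimizer model and held-out validation set are replaced by toy stand-ins. The epoch-wise slow/meta update is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Edit:
    op: str         # "add", "delete", or "replace"
    index: int      # line position in the skill document
    text: str = ""  # new line for add/replace; unused for delete


def apply_edit(skill: List[str], edit: Edit) -> List[str]:
    """Return a copy of the skill with one bounded edit applied."""
    lines = list(skill)
    if edit.op == "add":
        lines.insert(min(max(edit.index, 0), len(lines)), edit.text)
    elif edit.op in ("delete", "replace"):
        if not 0 <= edit.index < len(lines):
            raise IndexError(f"edit index {edit.index} out of range")
        if edit.op == "delete":
            del lines[edit.index]
        else:
            lines[edit.index] = edit.text
    else:
        raise ValueError(f"unknown op: {edit.op}")
    return lines


def train_skill(
    skill: List[str],
    propose: Callable[[List[str], List[Edit]], List[Edit]],
    validate: Callable[[List[str]], float],
    steps: int,
    edit_budget: int,
) -> Tuple[List[str], float, List[Edit]]:
    """Keep a candidate only if it strictly improves the validation score."""
    best, best_score = list(skill), validate(skill)
    rejected: List[Edit] = []  # assumed form of the rejected-edit buffer
    for _ in range(steps):
        # Assumed reading of the learning-rate budget: cap edits per step.
        proposal = propose(best, rejected)[:edit_budget]
        if not proposal:
            continue
        candidate = best
        for edit in proposal:
            candidate = apply_edit(candidate, edit)
        score = validate(candidate)
        if score > best_score:  # strict improvement; ties are rejected
            best, best_score = candidate, score
        else:
            rejected.extend(proposal)
    return best, best_score, rejected


# Toy stand-ins (hypothetical): a real setup would score rollouts of a
# frozen agent and ask an optimizer model for edits.
REQUIRED = {"check units", "show work"}


def toy_validate(skill: List[str]) -> float:
    useful = sum(1 for line in skill if line in REQUIRED)
    clutter = sum(1 for line in skill if line not in REQUIRED)
    return useful - 0.5 * clutter


def toy_propose(skill: List[str], rejected: List[Edit]) -> List[Edit]:
    for text in ("guess quickly", "check units", "show work"):
        edit = Edit("add", 0, text)
        if edit not in rejected and text not in skill:
            return [edit]
    if "be brief" in skill:
        edit = Edit("delete", skill.index("be brief"))
        if edit not in rejected:
            return [edit]
    return []


final_skill, final_score, rejected = train_skill(
    ["be brief"], toy_propose, toy_validate, steps=5, edit_budget=1
)
print(final_skill, final_score, [e.text for e in rejected])
```

In this toy run the unhelpful line is rejected and remembered, the two useful lines are accepted one at a time, and the clutter line is deleted last. The trained artifact is plain text, so using it at deployment only means including it in the agent's context, which is consistent with the abstract's claim of zero added inference-time model calls.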

Original Article

Cached at: 05/25/26, 02:35 AM

Source: https://huggingface.co/papers/2605.23904

Similar Articles

SkillOpt-Lite: Better and Faster Agent Self-evolution via One Line of Vibe

@Yif_Yang: Introducing SkillOpt — an optimizer for agent skills. Instead of finetuning model weights, we treat a natural-language …

@AlphaSignalAI: https://x.com/AlphaSignalAI/status/2069064122218717387

@oliviscusAI: MICROSOFT JUST OPEN-SOURCED SELF-EVOLVING AGENT SKILLS. it's called skillopt. skills that improve themselves the same w…

@omarsar0: New research from Microsoft Research I see a lot of AI engineers handwriting agent skill docs and hope they generalize.…
