HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness

Papers with Code Trending 05/04/26, 12:00 AM Papers

Summary

HeavySkill is a new framework that internalizes complex reasoning as an intrinsic model skill through parallel reasoning and summarization stages, outperforming traditional orchestration methods and enabling self-evolving LLMs via reinforcement learning.

Recent advances in agentic harness with orchestration frameworks that coordinate multiple agents with memory, skills, and tool use have achieved remarkable success in complex reasoning tasks. However, the underlying mechanism that truly drives performance remains obscured behind intricate system designs. In this paper, we propose HeavySkill, a perspective that views heavy thinking not only as a minimal execution unit in orchestration harness but also as an inner skill internalized within the model's parameters that drives the orchestrator to solve complex tasks. We identify this skill as a two-stage pipeline, i.e., parallel reasoning then summarization, which can operate beneath any agentic harness. We present a systematic empirical study of HeavySkill across diverse domains. Our results show that this inner skill consistently outperforms traditional Best-of-N (BoN) strategies; notably, stronger LLMs can even approach Pass@N performance. Crucially, we demonstrate that the depth and width of heavy thinking, as a learnable skill, can be further scaled via reinforcement learning, offering a promising path toward self-evolving LLMs that internalize complex reasoning without relying on brittle orchestration layers.
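The two-stage pipeline described above (parallel reasoning, then summarization) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: `heavy_think`, `build_summary_prompt`, `toy_generate`, the prompt wording, and the `width` parameter are all hypothetical names introduced here, and `generate` stands in for whatever LLM call a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def build_summary_prompt(question: str, traces: List[str]) -> str:
    """Pack all candidate traces into one prompt for the summarization stage."""
    candidates = "\n".join(f"Candidate {i + 1}: {t}" for i, t in enumerate(traces))
    return (
        f"Question: {question}\n{candidates}\n"
        "Compare the candidates and give one final answer."
    )


def heavy_think(question: str, generate: Callable[[str], str], width: int = 4) -> str:
    """Hypothetical sketch: `width` parallel reasoning calls, then one summarization call."""
    # Stage 1: parallel reasoning. Each prompt is sent to the same model independently.
    prompts = [f"Attempt {i + 1}. Reason step by step: {question}" for i in range(width)]
    with ThreadPoolExecutor(max_workers=width) as pool:
        traces = list(pool.map(generate, prompts))  # map preserves prompt order
    # Stage 2: summarization. The same model reads every trace and answers once.
    return generate(build_summary_prompt(question, traces))


def toy_generate(prompt: str) -> str:
    """Deterministic stand-in for an LLM call (assumption, for demonstration only)."""
    if prompt.startswith("Question:"):
        return "final: 4"
    return "2 + 2 = 4"


if __name__ == "__main__":
    print(heavy_think("What is 2 + 2?", toy_generate, width=3))
```

In this sketch the width of heavy thinking corresponds to the number of parallel traces; the depth (the length of each trace) is left to the model behind `generate`.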

Original Article

Cached at: 05/08/26, 08:55 AM

Paper page - HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness

Source: https://huggingface.co/papers/2605.02396

Abstract

HeavySkill presents a framework where complex reasoning is internalized as an intrinsic model skill rather than relying on external orchestration, demonstrating superior performance through parallel reasoning and summarization stages that can be enhanced via reinforcement learning.

Recent advances in agentic harness with orchestration frameworks that coordinate multiple agents with memory, skills, and tool use have achieved remarkable success in complex reasoning tasks. However, the underlying mechanism that truly drives performance remains obscured behind intricate system designs. In this paper, we propose HeavySkill, a perspective that views heavy thinking not only as a minimal execution unit in orchestration harness but also as an inner skill internalized within the model’s parameters that drives the orchestrator to solve complex tasks. We identify this skill as a two-stage pipeline, i.e., parallel reasoning then summarization, which can operate beneath any agentic harness. We present a systematic empirical study of HeavySkill across diverse domains. Our results show that this inner skill consistently outperforms traditional Best-of-N (BoN) strategies; notably, stronger LLMs can even approach Pass@N performance. Crucially, we demonstrate that the depth and width of heavy thinking, as a learnable skill, can be further scaled via reinforcement learning, offering a promising path toward self-evolving LLMs that internalize complex reasoning without relying on brittle orchestration layers.
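For readers unfamiliar with the two baselines named in the abstract, the sketch below shows one common way to compute them from per-sample correctness labels. It is an assumption-laden illustration: the abstract does not say how the paper's BoN baseline selects its sample, so the per-sample `scores` (for example from a reward model) are hypothetical, as are the function names and the toy data.

```python
from typing import Sequence


def pass_at_n(correct: Sequence[Sequence[bool]]) -> float:
    """Fraction of problems where at least one of the N samples is correct."""
    return sum(any(row) for row in correct) / len(correct)


def best_of_n(correct: Sequence[Sequence[bool]],
              scores: Sequence[Sequence[float]]) -> float:
    """Fraction of problems where the highest-scored sample is correct.

    `scores` is a hypothetical per-sample score used only to pick one sample.
    """
    hits = 0
    for row, row_scores in zip(correct, scores):
        best = max(range(len(row)), key=lambda i: row_scores[i])
        hits += row[best]
    return hits / len(correct)


# Toy data: 3 problems, N = 3 samples each (invented for illustration).
correct = [
    [True, False, False],
    [False, False, True],
    [False, False, False],
]
scores = [
    [0.9, 0.1, 0.2],  # selector picks sample 0, which is correct
    [0.8, 0.3, 0.5],  # selector picks sample 0, which is wrong
    [0.1, 0.2, 0.3],  # no sample is correct
]

if __name__ == "__main__":
    print(pass_at_n(correct), best_of_n(correct, scores))
```

Because Best-of-N can only return one of the N samples, its accuracy can never exceed Pass@N on the same samples; the gap between the two reflects how often the selector misses a correct sample that was available.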

Get this paper in your agent:

hf papers read 2605.02396

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

SkillMaster: Toward Autonomous Skill Mastery in LLM Agents

@dair_ai: // Evolving Meta-Skill for Multi-Agent Systems // Can a multi-agent system get better at orchestration without touching…

SkillFlow: Flow-Driven Recursive Skill Evolution for Agentic Orchestration

Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning
