COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

arXiv cs.AI 06/01/26, 04:00 AM Papers
ai-skills knowledge-distillation agent-skills person-grounding llm-agents open-source expert-knowledge
Summary
This paper presents COLLEAGUE.SKILL, an open-source system for automatically distilling person-grounded AI skills from heterogeneous traces into inspectable, correctable, and portable skill packages, enabling LLM agents to carry bounded representations of human expertise and interaction style.
arXiv:2605.31264v1 Announce Type: new
Cached at: 06/01/26, 09:26 AM
# Automated AI Skill Generation via Expert Knowledge Distillation
Source: [https://arxiv.org/html/2605.31264](https://arxiv.org/html/2605.31264)
Tianyi Zhou, Dongrui Liu, Leitao Yuan, Jing Shao, Xia Hu

Shanghai Artificial Intelligence Laboratory

\{zhoutianyi, liudongrui, yuanleitao, shaojing, huxia\}@pjlab\.org\.cn

###### Abstract

LLM agents are increasingly expected not only to complete isolated tasks, but also to carry bounded representations of human expertise, judgment, and interaction style\. Building such person\-grounded agents remains difficult because actionable knowledge associated with a person or role is usually embedded in heterogeneous traces rather than written as clean instructions\. Existing memory and persona systems capture fragments of this evidence, while skill frameworks provide portable packaging formats; however, there is no end\-to\-end workflow for distilling these traces into inspectable, correctable, and agent\-usable skills\. We present COLLEAGUE\.SKILL \([https://github\.com/titanwings/colleague\-skill](https://github.com/titanwings/colleague-skill)\), an automated trace\-to\-skill distillation system for generating person\-grounded AI skills via expert knowledge distillation\. Given materials from a target person or role, COLLEAGUE\.SKILL produces a versioned skill package with two coordinated tracks: a capability track for practices, mental models, and decision heuristics, and a bounded behavior track for communication style, interaction rules, and correction history\. The package can be inspected, invoked, updated through natural\-language feedback, rolled back, installed across agent hosts, and optionally prepared for controlled distribution\. We describe the artifact contract, generation workflow, correction lifecycle, deployment surface, and domain presets implemented in the open\-source system\. At the time of writing, the public repository has approximately 18\.5k GitHub stars; the gallery lists 215 skills from 165 contributors and more than 100k cumulative stars across listed skill cards\. The system illustrates how person\-grounded skills can be represented as portable, correctable packages rather than opaque prompts or hidden memories\.
The colleague setting is our primary and most controllable case, but we instantiate the same distillation\-and\-packaging paradigm in two additional domains: celebrity/public\-figure skills, which rely on public evidence and source boundaries, and relationship skills, which require stronger consent, privacy, and local\-control assumptions\.

## 1 Introduction

The role of LLM agents is shifting from executing isolated instructions toward carrying reusable context about how work and interaction should be performed\. In practice, users often want an agent to preserve bounded parts of a person’s expertise, memory, or interpersonal style: a teammate’s review judgment, a specialist’s decision heuristics, a public thinker’s mental models, or private interpersonal interaction patterns\. Rather than treating this demand as unrestricted person simulation, we frame it as *person\-grounded trace\-to\-skill distillation*: turning traces of a person or role into a constrained artifact that makes useful knowledge, interaction style, and limits of use explicit\. This framing does not claim identity replacement, and it treats the generated object as an editable technical artifact rather than as the person\.

LLM agents increasingly rely on modular extensions\. Tools connect agents to external actions, while skills package domain knowledge, procedures, scripts, and reference materials that can be discovered and loaded on demand\. This follows a broader shift from single\-prompt assistants toward agents that reason and act through external tools, feedback loops, and configurable interaction patterns \(yaoReAct2023; schickToolformer2023; shinnReflexion2023; wuAutoGen2023\)\. The Agent Skills standard defines a skill as a folder centered on a SKILL\.md file with metadata and instructions, optionally accompanied by scripts, references, and assets \(agentskills2026\)\. Claude Code similarly treats skills as reusable capabilities that can be invoked directly or loaded when relevant \(claudeskills2026\)\. The format is therefore beginning to act as a portable capability unit for agents\.

What remains under\-specified is how such skills should be created when the relevant capability or interaction pattern is not already written as an instruction manual\. In practice, person\-grounded knowledge is often dispersed across heterogeneous traces: a departing teammate’s review standards may appear in code comments, incident notes, and chat decisions; a public figure’s reasoning style may be expressed across interviews, speeches, essays, and public decisions; and a relationship skill may depend on private interaction histories whose consent and retention boundaries matter\. For LLM agents, the challenge is therefore not only to retrieve these materials, but to distill selected evidence into reusable skill packages whose contents, provenance, correction history, and usage limits remain visible\.

COLLEAGUE\.SKILL addresses this question through an automated distillation pipeline\. The project name reflects its original colleague setting: when a teammate leaves, their local judgment, review standards, incident heuristics, and communication norms often disappear with them\. The implemented system generalizes this idea into a broader person\-grounded skill workflow\. It treats selected traces as evidence for a portable agent skill rather than as a hidden memory store or a claim to reproduce the person\. \(Project repository: [https://github\.com/titanwings/colleague\-skill](https://github.com/titanwings/colleague-skill)\. Accessed 2026\-05\-28\.\) The system accepts chat logs, work documents, email, screenshots, public research material, subtitles, and lightweight user descriptions, then generates a skill package that can be inspected and installed into agent hosts such as Claude Code, OpenClaw, Codex, and Hermes\. The colleague, celebrity/public\-figure, and relationship variants reuse this package format under different source, evidence, consent, and distribution assumptions\.

We study this problem as *person\-grounded skill artifact* construction\. The target is not an unrestricted conversational model of an individual, but a bounded package of selected capabilities, mental models, communication constraints, examples, and usage boundaries\. In a workplace case, this may be an engineer’s API review checklist, incident triage heuristics, and escalation thresholds\. In a celebrity or public\-figure case, it may be a source\-grounded reasoning style and mental\-model library\. In a relationship case, it may be a local representation of interaction patterns that should remain editable and deletable\. The output is a versioned package whose contents can be examined, corrected, rolled back, deleted, or shared under user control\.

We make four contributions:

- We formulate person\-grounded trace\-to\-skill distillation as an artifact problem with explicit portability, inspectability, correctability, composability, and governance requirements\.
- We present the COLLEAGUE\.SKILL pipeline for distilling heterogeneous human traces into a capability track, a bounded behavior track, metadata, host installers, and version state\.
- We describe the workflow for collection, skill rendering, multi\-host installation, natural\-language correction, rollback, and optional gallery distribution, with support for domain presets as extensions of the same mechanism\.
- We document the open\-source deployment, public gallery, and extension presets that turn the artifact format into an externally inspectable distribution surface\.

## 2 Problem Formulation

We use *person\-grounded skill* to describe a skill whose instructions are grounded in evidence about a person or role, while remaining bounded by explicit source, usage, and governance constraints\. The colleague setting is the primary instance studied here because work expertise gives the clearest utility target and governance boundary\. The broader object, however, is not limited to coworkers: the same artifact form can represent public mental models or private interaction patterns under different evidence and consent assumptions\.

We define *person\-grounded skill generation* as an artifact problem\. Given a lightweight profile $p$, a source scope $c$, and a set of source materials $D=\{d_1,\ldots,d_n\}$, the system produces a skill package:

$$(A, M, L),$$

where $A$ is a set of generated files, $M$ is machine\-readable metadata and installation information, and $L$ is lifecycle state such as version, update time, correction count, and rollback history\.
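One way to picture the package is as a small typed record. The sketch below is illustrative only: the class names, field names, and the `build_package` stand-in are assumptions made for this example, not the repository's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Lifecycle:
    """Lifecycle state L: version, update time, correction count, rollback history."""
    version: int = 1
    updated_at: str = ""
    correction_count: int = 0
    rollback_history: list[int] = field(default_factory=list)


@dataclass
class SkillPackage:
    """The package (A, M, L) from the formulation above (hypothetical field names)."""
    files: dict[str, str]        # A: generated files, keyed by file name
    metadata: dict[str, object]  # M: machine-readable metadata and install info
    lifecycle: Lifecycle = field(default_factory=Lifecycle)  # L


def build_package(profile: dict[str, str], scope: str, materials: list[str]) -> SkillPackage:
    """Hypothetical stand-in for distillation: maps (p, c, D) to (A, M, L).

    A real pipeline would analyze the materials; here they are only listed.
    """
    body = "\n".join(f"- {item}" for item in materials)
    files = {"SKILL.md": f"# {profile.get('alias', 'unnamed')}\n\n{body}\n"}
    metadata = {"scope": scope, "source_count": len(materials)}
    return SkillPackage(files=files, metadata=metadata)


pkg = build_package({"alias": "api-reviewer"}, "colleague", ["review notes", "incident log"])
```

Keeping lifecycle state separate from the generated files mirrors the distinction the text draws between what the package says and how it has changed over time.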

The target is not a hidden model of what a real person would say to every possible prompt\. The target is a concrete package that distills selected practices and interaction norms into five operational properties:

1. Portable: skills\-compatible agents can load the package through ordinary skill mechanisms;
2. Inspectable: users can read extracted rules, examples, limitations, and metadata before use;
3. Composable: full, work\-only, and persona\-only entrypoints can be invoked separately;
4. Correctable: new evidence or user feedback can update the package while preserving prior state;
5. Governable: metadata, source boundaries, and disclaimers support deletion, sharing decisions, and safety review\.

Different domains instantiate $D$ differently\. Colleague skills may include design documents, code\-review comments, chat decisions, incident notes, and other work traces\. Public\-figure skills should favor public first\-person evidence and long\-form interviews\. Relationship skills may contain private traces, making consent and local control part of the technical problem rather than a deployment afterthought\.

This formulation gives COLLEAGUE\.SKILL a narrower claim than behavioral cloning\. The system does not assert that a generated skill is a faithful model of a person\. It asserts that selected traces can be transformed into a skills\-compatible artifact with explicit files, metadata, entrypoints, correction records, and lifecycle operations\. This scope makes the contribution concrete: the artifact can be inspected for structure, source boundaries, update behavior, and deployment compatibility even before downstream human\-subject or task\-performance studies are available\.

## 3 COLLEAGUE\.SKILL System Overview

Figure [1](https://arxiv.org/html/2605.31264#S3.F1) shows the deployed COLLEAGUE\.SKILL architecture\. The core path begins with traces of a target person or role: work documents and review comments for a colleague, public interviews and long\-form writings for a public figure, or private interaction records for a relationship preset\. Collectors and parsers normalize this material into local knowledge directories\. Analyzers extract evidence about durable capability, mental models, and bounded interaction style; builders render structured Markdown; and a shared writer produces the generated skill package\. The resulting package can be invoked directly, installed into supported hosts, revised through correction records, or, when source rights and metadata permit, prepared for gallery distribution\.

![Refer to caption](https://arxiv.org/html/2605.31264v1/x1.png)

Figure 1: COLLEAGUE\.SKILL architecture for automated person\-grounded skill generation\. The shared distillation core renders portable agent\-skill artifacts; domain presets add source requirements, evidence checks, consent assumptions, and lifecycle or gallery metadata\.

### 3\.1 Application Presets

COLLEAGUE\.SKILL keeps colleague as the primary preset because it offers a concrete and socially useful starting point: turning a teammate’s practices, standards, and communication norms into an inspectable skill\. The implementation also makes the source domain explicit so the same artifact workflow can be reused under different evidence and consent assumptions\. The repository currently defines three presets: colleague, celebrity, and relationship\. Each preset specifies a source boundary, storage root, command aliases, prompt bundle, and optional research or safety tooling\.

![Refer to caption](https://arxiv.org/html/2605.31264v1/x2.png)

Figure 2: Application presets layered on the COLLEAGUE\.SKILL person\-grounded skill pipeline\. The shared artifact workflow branches into colleague, celebrity, and relationship presets with different evidence scopes, governance requirements, and invocation aliases\.

These presets are domain specializations of the same person\-grounded artifact workflow, not separate systems\. They avoid duplicating the pipeline when a new application setting needs different prompts, source boundaries, consent defaults, or publication rules\. Adding a future preset, such as self, author, or team, then becomes a configuration and prompt\-design change rather than a new program\.
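The "configuration rather than a new program" idea can be sketched as a small preset registry. The field names follow the list given in the text (source boundary, storage root, command aliases, prompt bundle); every concrete value below, including directory names and aliases, is a hypothetical placeholder and not taken from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    name: str
    source_boundary: str          # which evidence is admissible
    storage_root: str             # where generated packages live (hypothetical paths)
    command_aliases: tuple[str, ...]
    prompt_bundle: str            # identifier of the prompt set (hypothetical)


PRESETS: dict[str, Preset] = {}


def register(preset: Preset) -> None:
    """Add a preset; the shared pipeline itself is untouched."""
    if preset.name in PRESETS:
        raise ValueError(f"preset already registered: {preset.name}")
    PRESETS[preset.name] = preset


# The three presets named in the text, with illustrative placeholder values.
register(Preset("colleague", "private work traces", "skills/colleague",
                ("colleague",), "prompts/colleague"))
register(Preset("celebrity", "public first-person evidence", "skills/celebrity",
                ("celebrity",), "prompts/celebrity"))
register(Preset("relationship", "private traces with consent", "skills/relationship",
                ("relationship",), "prompts/relationship"))

# A future preset becomes one more registry entry, not a new pipeline.
register(Preset("team", "shared team documents", "skills/team",
                ("team",), "prompts/team"))
```

In this sketch, downstream code would look up `PRESETS[name]` and pass its fields to the shared collectors and prompts, so preset-specific behavior stays in data.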

### 3\.2 Dual Representation

Generated COLLEAGUE\.SKILL artifacts use a dual representation\. The work or capability track captures responsibilities, workflows, technical standards, review criteria, decision heuristics, and lessons from past work\. The implementation names the second track persona\.md, but its technical role is narrower: it stores bounded behavior constraints, expression preferences, interaction rules, and correction records\. The combined runtime rule is therefore not open\-ended impersonation\. The agent should select the relevant behavior constraints, apply the capability or mental\-model track, and produce a response that remains within the artifact’s stated boundaries\.

This split is important because many failures in persona systems come from conflating three different things: factual knowledge, procedural judgment, and surface tone\. COLLEAGUE\.SKILL makes these pieces inspectable and separately invocable through full, capability\-only, and persona\-only generated artifacts\. In the colleague case, this keeps the main object focused on reusable expert judgment rather than a simulated person; in celebrity/public\-figure and relationship presets, the same separation keeps source\-grounded mental models or private interaction rules from becoming the system identity\.

### 3\.3 Artifact Schema and Writer

The writer normalizes metadata into a versioned schema containing identity, preset family, source context, classification, artifact names, engine and toolchain metadata, generation provenance, lifecycle state, and compatibility fields\. The current implementation uses schema version 3\. The writer then renders:

- SKILL\.md: the combined invokable skill;
- work\.md and persona\.md: editable source documents;
- work\_skill\.md and persona\_skill\.md: independently invokable sub\-skills;
- manifest\.json and meta\.json: installation, optional gallery, and lifecycle metadata\.

This is aligned with the Agent Skills standard, where SKILL\.md is the required entrypoint and optional files can provide scripts or references \(agentskillsspec2026\)\. The design also follows progressive disclosure: agents see skill metadata first and load detailed instructions only when the skill is invoked \(agentskills2026; claudeskills2026\)\.

The combined SKILL\.md contains standard skill frontmatter, including a generated name, a description, and user\-invocable: true\. Its body embeds the capability track as Part A and the behavior track as Part B\. The split entrypoints expose the same tracks independently\. This makes the runtime behavior explicit: the artifact can be used as a full person\-grounded skill, a capability\-only skill, or a behavior\-only style reference\.
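As a rough illustration of how the combined entrypoint could be assembled, the sketch below joins frontmatter with the two tracks. The frontmatter keys (name, description, user-invocable) come from the description above; the function name, the exact heading wording for Part A and Part B, and the sample content are assumptions for this example and may differ from what the writer actually emits.

```python
def render_skill_md(name: str, description: str, work_md: str, persona_md: str) -> str:
    """Render a combined SKILL.md: frontmatter, then Part A and Part B.

    Hypothetical helper; section titles are illustrative.
    """
    frontmatter = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        "user-invocable: true",
        "---",
    ])
    return (
        f"{frontmatter}\n\n"
        f"## Part A: Capability\n\n{work_md.strip()}\n\n"
        f"## Part B: Behavior\n\n{persona_md.strip()}\n"
    )


skill_text = render_skill_md(
    name="api-reviewer",
    description="Review checklist distilled from work traces",
    work_md="- Check authentication before lower-priority issues.\n",
    persona_md="- Keep review comments short and specific.\n",
)
```

A capability-only or behavior-only entrypoint could reuse the same frontmatter and emit just one of the two parts, which is what makes the split invocation modes cheap to produce.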

Table 1: Runtime artifact contract emitted by the shared writer\.

## 4 Generation and Evolution Workflows

### 4\.1 Creation Workflow

Creation begins with the shared person\-grounded distillation path: a user provides an alias, optional profile fields, and source material for the target person or role\. Repository\-supported collectors and import paths cover sources such as Feishu, DingTalk, Slack, WeChat SQLite exports, email archives, PDFs, screenshots, Markdown, and direct paste\. Application presets then specialize this creation path\. For colleagues, the prompt emphasizes work practice and review judgment\. For celebrity/public\-figure skills, the celebrity preset adds a research pass over first\-person writings, interviews, decisions, expression style, external reception, and timeline evidence\. For relationship skills, the relationship preset changes the prompt focus and consent assumptions rather than changing the artifact contract\.

The generation prompts then run two conceptual tracks\. The capability track extracts durable work methods, expert heuristics, or source\-grounded mental models\. The behavior track extracts expression and interaction patterns under the preset’s boundaries\. Builders render structured Markdown, and the writer packages the result into the artifact contract defined above\. This separation makes capability and behavior claims inspectable at the file level rather than hiding them in a single prompt\.

### 4\.2 Correction and Update Workflow

The generated artifact is expected to be imperfect\. The correction handler recognizes natural\-language feedback such as “he would not say that” or “she would push back here\.” If the correction concerns expert work, it produces a Markdown patch to a relevant section\. Patches with matching level\-2 headings replace the corresponding section; unmatched sections are appended\. If the correction concerns expression or interaction behavior, it produces a normalized correction record:

$$\{\texttt{scene}, \texttt{wrong}, \texttt{correct}\}.$$

The writer archives the current version, applies the patch or correction, increments the lifecycle version, and regenerates all derived artifacts\. The version manager can list archived versions, back up the current artifacts, roll back to a previous version, and clean old archives\.
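The patch rule and the archive-then-increment step described above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the names `apply_patch` and `VersionedTrack` are hypothetical, rollback is modeled as creating a new version that restores archived text, and the sample content is made up. The repository's handler and version manager may behave differently, for example by working on files on disk.

```python
import re
from dataclasses import dataclass, field

HEADING = re.compile(r"^## +(.+?)\s*$")  # matches level-2 headings only


def split_sections(markdown: str) -> tuple[list[str], list[tuple[str, list[str]]]]:
    """Split Markdown into preamble lines and (level-2 title, body lines) pairs."""
    preamble: list[str] = []
    sections: list[tuple[str, list[str]]] = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((match.group(1), []))
        elif sections:
            sections[-1][1].append(line)
        else:
            preamble.append(line)
    return preamble, sections


def apply_patch(document: str, patch: str) -> str:
    """Replace sections whose level-2 heading matches; append the rest."""
    preamble, sections = split_sections(document)
    _, patch_sections = split_sections(patch)
    titles = [title for title, _ in sections]
    for title, body in patch_sections:
        if title in titles:
            sections[titles.index(title)] = (title, body)
        else:
            sections.append((title, body))
            titles.append(title)
    lines = list(preamble)
    for title, body in sections:
        lines.append(f"## {title}")
        lines.extend(body)
    return "\n".join(lines).rstrip() + "\n"


@dataclass
class VersionedTrack:
    text: str
    version: int = 1
    archive: dict[int, str] = field(default_factory=dict)
    corrections: list[dict[str, str]] = field(default_factory=list)

    def _bump(self, new_text: str) -> None:
        self.archive[self.version] = self.text  # archive before changing state
        self.text = new_text
        self.version += 1

    def patch(self, patch_md: str) -> None:
        """Capability-track correction: merge a Markdown patch."""
        self._bump(apply_patch(self.text, patch_md))

    def correct(self, scene: str, wrong: str, correct: str) -> None:
        """Behavior-track correction: store a {scene, wrong, correct} record."""
        self.corrections.append({"scene": scene, "wrong": wrong, "correct": correct})
        self._bump(self.text)

    def rollback(self, version: int) -> None:
        """Restore archived text as a new version (text only, in this sketch)."""
        self._bump(self.archive[version])


track = VersionedTrack("# Work\n\n## Review\n- check auth\n")
track.patch("## Review\n- check auth\n- check rate limits\n\n## Escalation\n- page the on-call owner\n")
patched_text = track.text
track.rollback(1)
```

Treating rollback as a forward-moving version keeps every earlier state recoverable, which is one plausible reading of "preserving rollback points".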

![Refer to caption](https://arxiv.org/html/2605.31264v1/x3.png)

Figure 3: Lifecycle loop for generated skills\. Corrections and patches create new versions while preserving rollback points\.

### 4\.3 Public\-Figure Research Extension

The celebrity preset is an extension for public\-source expert distillation\. Its prompts prioritize first\-person works, long\-form interviews, documented decisions, and clearly marked inferences over short summaries or content farms\. The tooling includes subtitle download, audio transcription, subtitle cleanup, research\-note merging, and quality checks\. The quality checker scans for mental\-model coverage, limitations, expression patterns, internal tensions, grounding URLs, and copyright\-safety signals\.
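A keyword-level sketch can convey the shape of such a checker. Everything here is an assumption made for illustration: the keyword lists, the 50-word cap on quoted lines used as a stand-in copyright-safety signal, and the function name. The repository's checker may apply quite different and stricter rules.

```python
import re

# Hypothetical keywords per coverage dimension named in the text.
REQUIRED_TOPICS = {
    "mental_models": ("mental model",),
    "limitations": ("limitation",),
    "expression_patterns": ("expression",),
    "internal_tensions": ("tension",),
}
URL = re.compile(r"https?://\S+")
MAX_QUOTE_WORDS = 50  # illustrative threshold, not from the source


def check_quality(markdown: str) -> dict[str, bool]:
    """Return a pass/fail flag per dimension for a research draft."""
    lowered = markdown.lower()
    report = {
        topic: any(keyword in lowered for keyword in keywords)
        for topic, keywords in REQUIRED_TOPICS.items()
    }
    report["grounding_urls"] = bool(URL.search(markdown))
    # Stand-in copyright signal: no single blockquote line is overly long.
    quotes = [line for line in markdown.splitlines() if line.lstrip().startswith(">")]
    report["copyright_safe"] = all(len(q.split()) <= MAX_QUOTE_WORDS for q in quotes)
    return report


draft = (
    "## Mental models\n- first principles (https://example.com/interview)\n\n"
    "## Limitations\n- evidence is thin for recent years\n\n"
    "## Expression patterns\n- short declarative sentences\n"
)
report = check_quality(draft)
missing = [topic for topic, ok in report.items() if not ok]
```

A failing dimension is meant to prompt more research or an explicit limitation note, in line with the text's point that the checker records evidence limits rather than certifying truth.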

The extension makes evidence requirements explicit and executable, but it does not certify factual truth by itself\. Instead, it records evidence limits and gives the system a way to downgrade confidence when evidence is thin rather than filling gaps with generic persona text\. The research toolchain therefore reuses the same workflow: creation produces inspectable artifacts, correction changes versioned state, and public\-facing distribution must expose the evidence limits of the artifact\.

### 4\.4 Relationship Extension

The relationship preset applies the same artifact workflow to a more sensitive private domain\. Its value is not that an agent can replace a person, but that personal interaction traces can be represented as local, editable, and deletable state rather than as an opaque prompt or hidden memory\. Compared with the colleague and public\-figure settings, this preset requires stronger assumptions about consent, retention, access control, and optional sharing\. In the paper’s framing, relationship skills stress the governance surface of the package format: they make deletion, correction, local ownership, and non\-public defaults first\-order artifact requirements\.

## 5 Deployment and Community Ecosystem

COLLEAGUE\.SKILL is deployed as an open\-source repository with a public site and gallery\. \(Project site: [https://titanwings\.github\.io/colleague\-skill\-site/](https://titanwings.github.io/colleague-skill-site/)\. Accessed 2026\-05\-28\.\) The site documents the person\-grounded skill workflow, installation options, supported sources, and example outputs\. The gallery is a downstream sharing layer: generated skills can remain local, be installed into an agent host, or be submitted as shareable packages when the user has rights to publish them\. On 2026\-05\-28, we observed public counters listing 215 skills, 55 meta\-skills, and 165 contributors on the gallery, along with repository activity counters\. The gallery metadata also records a star count for each skill card; because these counts are synchronized asynchronously and may lag current GitHub state, we report the aggregate at the order\-of\-magnitude level as more than 100k cumulative gallery stars\. We use this statistic only as evidence of public distribution surface, not as adoption quality or task impact\.

![Refer to caption](https://arxiv.org/html/2605.31264v1/x4.png)

Figure 4: Observed public deployment counters on 2026\-05\-28\. Counts summarize repository activity, gallery scale, and cumulative public signals; they indicate deployment and distribution surface rather than task performance, behavioral fidelity, or adoption\-quality metrics\.

The deployment changes the role of COLLEAGUE\.SKILL from a single\-prompt construction method to an artifact pipeline for person\-grounded skills\. Distribution matters because such skills may need to move across hosts, be corrected after use, or be withheld from public sharing\. The public site and gallery therefore function as part of the artifact story: they show how generated skills can move from local use to controlled installation and, when appropriate, community sharing beyond the creator’s local workspace\.

## 6 Application Cases

The following cases show how the shared trace\-to\-skill workflow appears in different domains\. They are design\-oriented examples of the artifact workflow, not claims of behavioral equivalence\.

Colleague skill\. A workplace skill is the most concrete COLLEAGUE\.SKILL instance\. It uses private or enterprise material such as design documents, chat decisions, review comments, and incident notes to distill reusable work practice\. Its useful behavior is not surface style by itself, but applying review criteria: e\.g\., checking authentication, input validation, rate limiting, response schema, and sensitive\-data exposure before lower\-priority issues\. The artifact separates these criteria into work rules and behavior constraints, allowing work\-only invocation when style transfer is inappropriate\.

Celebrity skill\. A public\-figure skill is an extension built from public evidence\. The celebrity family uses a six\-dimensional research pass and quality checks to emphasize mental models, citations, and explicit boundaries\. A generated skill should indicate where evidence is thin, should not present itself as the actual person, and should remain distinguishable from the workplace case where enterprise traces, access control, and organizational consent dominate the artifact boundary\.

Relationship skill\. Relationship skills demonstrate the same workflow in a sensitive interpersonal domain\. They can represent interaction patterns as local, editable state, but they also expose risks: emotional overattachment, non\-consensual simulation, and misuse of private chats\. For this family, deployment should prioritize local ownership, deletion, clear disclaimers, and opt\-in sharing\. We include it as an extension capability, not as an endorsement of unconstrained use\.

## 7 Related Work

Agent skills and reusable capabilities\. Recent agent systems increasingly externalize capability rather than relying only on monolithic prompts or model weights\. ReAct interleaves reasoning traces with actions, Toolformer learns when and how to call APIs, Reflexion and Self\-Refine use feedback to revise future behavior, and AutoGen exposes configurable multi\-agent conversation patterns \(yaoReAct2023; schickToolformer2023; madaanSelfRefine2023; wuAutoGen2023\)\. AgentBench further shows that agent ability must be evaluated in interactive environments rather than only in static question\-answering settings \(liuAgentBench2024\)\. The Agent Skills specification defines a skill as a directory centered on SKILL\.md, with optional scripts, references, and assets loaded through progressive disclosure \(agentskills2026\)\. A recent analysis of public Claude skills argues that skills are becoming an infrastructure layer for agents, while also surfacing redundancy, marketplace skew, and safety risks around state\-changing actions \(lingAgentSkills2026\)\. COLLEAGUE\.SKILL adopts this emerging package format, but its question is not how to define a skill abstraction\. It asks how a person’s review standards, mental models, communication constraints, or relationship\-specific interaction patterns can be distilled into skills that remain inspectable, editable, portable, and accountable across host environments rather than treated as temporary prompt text or hidden memory\.

Skill libraries and skill synthesis\. LLM agents have used skill libraries to accumulate reusable behavior\. Voyager stores executable code skills in an expanding library and retrieves them to solve new embodied tasks \(wangVoyager2023\)\. More recent systems construct or refine skill knowledge bases from execution trajectories\. SkillX distills raw agent trajectories into hierarchical strategic, functional, and atomic skills and refines them through execution feedback \(skillx2026\)\. SkillGen synthesizes auditable skills from successful and failed trajectories and evaluates skills as interventions that can both repair failures and introduce regressions \(maSkillGen2026\)\. AutoSkill abstracts reusable skills from dialogue and interaction traces to support lifelong personalized agents \(yangAutoSkill2026\)\. COLLEAGUE\.SKILL instead distills human traces into person\-grounded skills that deliberately separate capability from bounded behavior, expose correction and rollback state, and target installation across multiple agent hosts and sharing surfaces\.

Memory, personalization, and role\-playing agents\. Memory and personalization provide continuity, but they usually keep the representation inside retrieval stores, context managers, or model behavior\. Retrieval\-augmented generation connects parametric generation with non\-parametric memory \(lewisRAG2020\)\. Personalization benchmarks and agent frameworks such as LaMP and PersonaAgent study how models adapt to user histories, preferences, and personalized action spaces \(salemiLaMP2024; zhangPersonaAgent2025\)\. Character\-LLM trains role\-playing agents from profiles and experiences, RoleLLM constructs role profiles and benchmarks character\-level role\-playing ability, and SOTOPIA evaluates social intelligence in interactive role\-play scenarios \(shaoCharacterLLM2023; wangRoleLLM2024; zhouSotopia2024\)\. COLLEAGUE\.SKILL is deliberately narrower than both traditions: it constructs explicit, reviewable person\-grounded skill artifacts that encode selected rules, communication constraints, mental models, limitations, and correction history\.

## 8 Discussion

Grounded traces, not identity replacement\. COLLEAGUE\.SKILL’s core insight is that parts of a person’s knowledge, judgment, and interaction style can be distilled into an inspectable AI skill without claiming to reproduce the person\. The useful target is a bounded person\-grounded artifact: how a person or role weighs evidence, detects risk, explains trade\-offs, refuses bad requests, adapts communication to context, or follows documented interaction rules\. In the workplace setting, this may be review checklists and incident heuristics; in celebrity/public\-figure settings, mental models and cited reasoning patterns; in relationship settings, private interaction constraints under local control\. Surface style can make the skill easier to use, but the primary contribution is distilling selected human traces into files that can be inspected, corrected, versioned, installed, and bounded against identity replacement\.

Why the workflow matters\. A single prompt can mimic surface behavior, but it rarely makes the extracted person\-grounded knowledge accountable\. COLLEAGUE\.SKILL treats trace\-to\-skill distillation as a workflow over files: creation, inspection, invocation, correction, rollback, deletion, host installation, and optional distribution\. These operations are not auxiliary engineering details\. They are the conditions under which a generated person\-grounded skill can be audited, repaired, withheld, or shared\. They also make the research object sharper\. Extraction quality can be inspected at the level of work\.md and persona\.md; installation and sharing can operate through manifests rather than ad hoc instructions; and governance can operate on explicit metadata rather than hidden prompt state\.

The colleague, celebrity, relationship, and gallery work should therefore be read as instances of the same person\-grounded skill thesis\. Colleague skills test practical expert\-knowledge transfer\. Celebrity/public\-figure skills test whether public mental\-model evidence can be packaged with source boundaries rather than becoming generic impersonation\. Relationship skills test whether the same artifact controls can protect sensitive private traces\. The gallery tests whether generated skills can become a governed distribution layer rather than a private prompt collection\. Across these instances, the broader implication is an ecosystem of reusable person\-grounded artifacts rather than a gallery of unbounded person simulations: security\-review skills, product\-decision skills, research\-mentor skills, public\-thinker mental\-model skills, or private interaction skills whose evidence and limits remain visible across installation contexts\.

Behavioral fidelity frontier\. The claims in this paper are artifact\-level claims: COLLEAGUE\.SKILL defines a package format, implements a generation and update workflow, exposes correction and rollback state, supports multiple agent hosts, and demonstrates that the same mechanism can cover colleague, celebrity/public\-figure, relationship, and gallery distribution settings\. It does not claim that generated skills faithfully reproduce a person or improve downstream work\. Those questions require human and task\-based studies: whether colleague skills catch the same review issues as the source expert, whether capability\-only variants preserve utility without persona risk, whether relationship skills encourage overattachment, whether corrections improve behavior without regressions, and whether public\-figure extensions cite evidence rather than hallucinating motives\. A useful evaluation protocol should compare full, capability\-only, and behavior\-only artifacts under matched source evidence, since each variant exposes a different risk\-utility trade\-off\.
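
One way to set up the matched comparison is to derive all three variants from a single package, so that they share the same source evidence. The sketch below assumes, for illustration only, that `work.md` holds the capability track and `persona.md` holds the behavior track; the function name and variant keys are hypothetical.

```python
# Assumed mapping from tracks to files; the real package layout may differ.
CAPABILITY_FILES = {"work.md"}
BEHAVIOR_FILES = {"persona.md"}


def make_variants(files: dict[str, str]) -> dict[str, dict[str, str]]:
    """Derive full, capability-only, and behavior-only variants from one package."""

    def subset(names: set[str]) -> dict[str, str]:
        return {name: text for name, text in files.items() if name in names}

    return {
        "full": subset(CAPABILITY_FILES | BEHAVIOR_FILES),
        "capability_only": subset(CAPABILITY_FILES),
        "behavior_only": subset(BEHAVIOR_FILES),
    }


variants = make_variants({"work.md": "Incident heuristics", "persona.md": "Interaction rules"})
```

Because each variant is a filtered copy of the same files, differences observed in a study can be attributed to the included track rather than to differences in source material.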

Productization as a research constraint\. The product surface is part of the research contribution because person\-grounded skills become consequential only when they can move across tools, teams, and sharing contexts\. Installers, manifests, gallery metadata, rollback state, and deletion paths make the artifact legible to users and hosts rather than leaving it as a private prompt\. They also create concrete handles for future study: researchers can compare source scopes, correction records, invocation modes, and publication labels without reverse\-engineering hidden memory\. In this sense, productization is not a cosmetic layer on top of distillation\. It is what turns person\-grounded distillation into an inspectable software object whose ownership, provenance, versioning, deployment boundaries, and evaluation handles can be compared, audited, and contested in concrete deployment settings rather than inferred from hidden model behavior\.

## 9 Limitations and Responsible Deployment

COLLEAGUE\.SKILL treats person\-grounded skills as editable artifacts, not faithful simulations, identity substitutes, or consent proxies\. This paper documents the artifact format, workflow, implementation, and public deployment surface, while leaving source matching, task performance, emotional safety, and user trust calibration open\. Real deployments will depend on source quality, extraction quality, model behavior, and human review\. Corrections can improve an artifact over time, but they can also encode editor bias or make contested traces appear more settled than they are\.

Responsible deployment therefore requires explicit participation, scoped source collection, access controls, retention limits, and non\-mandatory use\. The local\-first, inspectable, and versioned design provides useful governance affordances, while lawful source use, consent, and full redaction require separate review\. Gallery publication should remain opt\-in, with submitter attestation, review, takedown, source\-boundary labels, and visible disclaimers for celebrity/public\-figure or relationship extensions\.
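
The publication conditions listed above could be checked mechanically before a card is listed. The following is a sketch under stated assumptions, not the gallery's actual schema: the `GalleryCard` fields, category names, and blocker labels are all hypothetical, and takedown is a process that this check does not model.

```python
from dataclasses import dataclass

# Hypothetical category names for extensions that need a visible disclaimer.
DISCLAIMER_CATEGORIES = {"celebrity", "relationship"}


@dataclass
class GalleryCard:
    """Hypothetical publication metadata for one skill card."""

    opt_in: bool
    submitter_attested: bool
    reviewed: bool
    source_boundary_label: str
    category: str
    disclaimer: str = ""


def publication_blockers(card: GalleryCard) -> list[str]:
    """Return the unmet publication conditions; an empty list means none were found."""
    blockers = []
    if not card.opt_in:
        blockers.append("opt_in")
    if not card.submitter_attested:
        blockers.append("submitter_attestation")
    if not card.reviewed:
        blockers.append("review")
    if not card.source_boundary_label.strip():
        blockers.append("source_boundary_label")
    if card.category in DISCLAIMER_CATEGORIES and not card.disclaimer.strip():
        blockers.append("disclaimer")
    return blockers
```

A check like this covers only the explicit metadata; lawful source use, consent, and redaction still need the separate review described above.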

## 10 Conclusion

This paper presented COLLEAGUE\.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation\. The central claim is not that agents should recreate people, but that selected human traces can be distilled into portable, inspectable skills that encode capabilities, mental models, behavior constraints, and correction history\. The colleague setting remains the most concrete starting point, while celebrity, relationship, and gallery components show how the same artifact model extends to broader person\-grounded distillation scenarios\. The practical message is that digital distillation should produce artifacts that users can read, revise, install, withhold, and delete, rather than opaque prompts that merely sound like a target person\. This keeps the paper’s ambition grounded: the system does not solve behavioral fidelity, but it makes person\-grounded distillation visible enough to be governed, improved, and evaluated\. More broadly, COLLEAGUE\.SKILL points toward a product\-oriented research path for digital doubles: bounded packages with explicit evidence, rights, correction semantics, and distribution choices\. That framing also clarifies what future benchmarks should test: not open\-ended impersonation, but whether a bounded package preserves useful judgment while making provenance, consent, and failure modes visible to users\. Future work should measure useful judgment and interaction quality without obscuring source quality, consent, provenance, and safety boundaries across concrete deployment settings, artifact variants, and application domains\.

## Acknowledgements

We especially thank the community members who contributed skills, submitted feedback, and supported the public gallery\. Their participation helped turn COLLEAGUE\.SKILL from a colleague\-knowledge experiment into a broader publicly deployed person\-grounded skill ecosystem\. We are grateful for their willingness to engage with, extend, and encourage the idea\. The GitHub handles below are listed alphabetically, with numeric handles first\.

0xAlexWu 123pyLeo 1544501967 1sh1ro 2559063619 Aar0nPB AdeleZhu Adrin agenmod AicbLab aka556 alchaincyf AnonBug arould001 awecsfgvs baibai2013 bankeluilian baojiachen0214 Bayson\-create BeamusWayne binggandata BiscuitCoder bombers26 Bughouse1024 ByteRax c0dedance Canding3021 cantian\-ai ccjincc ceetity Charpup ChrisWu11 ClarkYoung\-xhs CommitHu502Craft cyber\-immortal cyberk1895 Cyh29hao dadwadw233 daiyanpgg\-wq DanZai233 davecat Dclef derrickgong87 dglijin\-oss dull\-bird DysonSWang EastZsRoad FANzR\-arch Fhui Formangarden524 gufenglees guilings Hchen1218 heywanrong hotcoffeeshake huaqiang\-huang HughYau islanddddddd Jack3582\-eng Janlaywss jiangziyan\-693 JikunR JimmyJiang67 jinchenma94 kangarooking KingOfLitangDz KKKKhazix KKunkuner leezythu leilei926524\-tech liangfeiiiii LijiayuDeng linzzzzzz lipG\-waver lisi LittleLittleCloud liuyishou\-skill lkysyzxz ly\-xxx Lyricus233 melonlee miaomiao\-offical mickey996icu MIMIFY Ming\-H Minksgo minruixu miunasu moismin moyvch NatalieCao323 Neko\-Suwako nicepkg notdog1998 nowork\-studio nullurl onism11 op563296 open\-source\-zjq OpenDemon otter1101 Palind\-Rome perkfly prog\-le Pronting Ratnachem realteamprinz Ricardo\-Vv riceshowerX rottenpen SamadhiFire Schlaflied sherjy smallnest snowyowlmia SonicBotMan TammyTan516 TerryTian\-tech therealXiaomanChu thtang titanwings tmstack To\-Carpe\-Diem Tomsawyerhu TOPDzZzz Trailblazer\-Aha Trust\-000 UniversePeak VeniVeci vogtsw voidforall Walshyu walter201230 wangwu wdl339 weixr18 whu125 wildbyteai will2025btc WilliamX1019 wkbin xiaoheizi8 xiaoshiyilangzhao1996\-droid xr843 yangdongchen66\-boop yanghaoraneve yaofeino1 ybq22 yeasy1003 yhz\-2134 YIKUAIBANZI Yinmu YixiaJack Ylsssq926 YourongZhou YuzeHao2023 z969081067\-commits zesion21 zgjq zhangeven686\-dot zhanghaichao520 zhangsan ZhangZangQian Zhrq\-vis ZouR\-Ma
