skill-evolution

#skill-evolution

SkillAudit: Ground-Truth-Free Skill Evolution via Paired Trajectory Auditing

arXiv cs.AI · 5d ago

SkillAudit introduces a framework for evolving LLM agent skills without ground-truth feedback by using paired trajectory auditing and contrastive evaluation. It achieves 73.9% average task reward across 89 tasks, outperforming baseline methods.

VisualClaw: A Real-Time, Personalized Agent for the Physical World

Hugging Face Daily Papers · 5d ago

VisualClaw is a self-evolving multimodal agent that reduces deployment costs through hybrid encoding and skill evolution, while improving video-QA accuracy across multiple benchmarks.

SkillCAT: Contrastive Assessment and Topology-Aware Skill Self-Evolution for LLM Agents

arXiv cs.CL · 2026-06-12

SkillCAT is a training-free framework for LLM agent skill self-evolution. It targets three limitations of prior approaches (single-trace bias, unverified merging, and full corpus loading) with three corresponding stages: Contrastive Causal Extraction, Assessment-Augmented Evolution, and Topology-Aware Task Execution. It reports improvements of up to 40.40% on benchmarks.

SkillChain: Closing the Loop on Skill Evolution for Image-Based E-Commerce AI Assistants

arXiv cs.CL · 2026-06-12

SkillChain automates the lifecycle of per-intent skill specifications for image-based e-commerce AI assistants, improving response quality and user engagement through iterative refinement and routing alignment.

Bayesian-Agent: Posterior-Guided Skill Evolution for LLM Agent Harnesses

Hugging Face Daily Papers · 2026-06-06

Bayesian-Agent treats reusable skills and SOPs as hypotheses and uses Bayesian inference over them to guide agent behavior, optimizing the agent harness from the resulting posterior. It reports significant improvements on multiple benchmarks when run with deepseek-v4-flash.

Verilog-Evolve: Feedback-Driven and Skill-Evolving Verilog Generation

arXiv cs.CL · 2026-05-27

Verilog-Evolve is a feedback-driven framework that iteratively refines Verilog code generated by large language models, using functional simulation, synthesis, and timing metrics to promote better candidates and evolve reusable repair skills across tasks.

@9hills: After trying many Agent Memory implementations, I found only two that are somewhat useful: 1. Hermes-style strictly length-limited entry-level memory and session recall, used to address personal assistant memory needs. But this has nothing to do with coding. 2. Skills precipitated from trajectories and skill evolution...

X AI KOLs Timeline · 2026-05-25

After trying various Agent Memory implementations, the author concludes that only two are somewhat useful: strictly length-limited, entry-level memory with session recall (Hermes-style), and skills distilled from trajectories together with skill evolution. Other graph-based or card-based methods are judged ineffective.

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

Hugging Face Daily Papers · 2026-05-18

SkillsVote is a governance framework for long-horizon LLM agents that manages reusable skills through structured collection, recommendation, and evolution, improving performance on Terminal-Bench 2.0 and SWE-Bench Pro without model updates.

SkillFlow: Flow-Driven Recursive Skill Evolution for Agentic Orchestration

arXiv cs.AI · 2026-05-15

SkillFlow proposes a flow-driven recursive skill evolution framework for LLM-based agentic orchestration, using Tempered Trajectory Balance to prevent strategy collapse and provide transparent credit assignment. Experiments on 14 datasets show significant improvements over baselines in QA, math, code, and decision-making tasks.

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Papers with Code Trending · 2026-04-09

SkillClaw introduces a framework for collective skill evolution in multi-user LLM agent systems, enabling autonomous updates and cross-user knowledge transfer by aggregating interactions and feedback to improve performance across the ecosystem.
