ground-truth-free

SkillAudit: Ground-Truth-Free Skill Evolution via Paired Trajectory Auditing

arXiv cs.AI · 5d ago

SkillAudit introduces a framework for evolving LLM agent skills without ground-truth feedback by using paired trajectory auditing and contrastive evaluation. It achieves 73.9% average task reward across 89 tasks, outperforming baseline methods.
