ground-truth-free

SkillAudit: Ground-Truth-Free Skill Evolution via Paired Trajectory Auditing

arXiv cs.AI · 5d ago

SkillAudit introduces a framework for evolving LLM agent skills without ground-truth feedback by using paired trajectory auditing and contrastive evaluation. It achieves 73.9% average task reward across 89 tasks, outperforming baseline methods.
