This paper introduces TraceLock, a lightweight plug-in controller that learns a token-commitment policy for frozen diffusion language models, improving the quality-step tradeoff across various tasks without retraining the underlying model.
EVE-Agent is a framework for self-evolving search agents that ensures evidence verifiability: it generates questions, answers, and evidence spans, and trains on the marginal accuracy gain contributed by the evidence. This improves grounded correctness without human annotations.