Done, But Not Sure: Disentangling World Completion from Self-Termination in Embodied Agents

arXiv cs.AI · 2026-05-12

This paper introduces Vigil, an evaluation framework for embodied agents that disentangles task execution success from the agent's ability to correctly recognize and report task completion.
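To make the distinction concrete, here is a minimal sketch of how the two signals could be scored separately. It is not taken from the paper: the `Episode` record, its field names, and the `summarize` function are hypothetical and assume only that each episode logs whether the world reached the goal state and whether the agent declared itself done.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    # Hypothetical per-episode record; field names are illustrative assumptions.
    world_complete: bool  # did the environment actually reach the goal state?
    declared_done: bool   # did the agent report that the task was complete?


def summarize(episodes):
    """Return the fraction of episodes in each completion/report combination."""
    if not episodes:
        raise ValueError("need at least one episode")
    counts = {
        "done_and_reported": 0,   # task completed and agent said so
        "done_not_reported": 0,   # task completed but agent never stopped
        "reported_not_done": 0,   # agent stopped before the task was complete
        "neither": 0,             # task incomplete and agent did not claim otherwise
    }
    for ep in episodes:
        if ep.world_complete and ep.declared_done:
            counts["done_and_reported"] += 1
        elif ep.world_complete:
            counts["done_not_reported"] += 1
        elif ep.declared_done:
            counts["reported_not_done"] += 1
        else:
            counts["neither"] += 1
    total = len(episodes)
    return {name: count / total for name, count in counts.items()}
```

A single success rate would merge the first and fourth buckets into "correct" and hide whether failures come from execution or from the stopping decision; keeping four buckets is one simple way to keep them apart.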
