cognitive-robotics

Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?

arXiv cs.AI · 2026-05-12

This paper introduces a 3D benchmark that evaluates whether Vision-Language Model (VLM) agents can achieve mirror self-recognition, a proxy for higher-order cognition. The study finds that stronger VLMs can use reflected evidence to guide action, whereas weaker models often fail to extract self-relevant information or misattribute reflections. The results highlight the distinction between linguistic compliance and grounded self-identification.
