Tag
This paper investigates whether spatial geometry improves memory recall for language agents. It shows that geometry must take priority over recency and importance in recall, and that a ray-tracing visibility predicate is crucial for handling occlusion in 3D voxel worlds.
The paper introduces SpatialUncertain, a benchmark that evaluates whether vision-language models recognize when they cannot answer spatial questions because of occlusion or perspective ambiguity. It reveals that these models are overconfident and rarely abstain when they should.