Tag
This paper explores grounding multi-hop textual-spatial stories in geometry-aware modalities such as grids. It reports a 42% performance improvement when switching from language-only to grid-based reasoning, and it introduces a switching metric for modality selection in LLMs.
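
To make the idea of grounding a story into a grid concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's method: the relation names (`left_of`, `right_of`, `above`, `below`), the unit offsets, and the helpers `ground_story` and `relation` are all assumptions chosen for this example. The sketch places entities on an integer grid from pairwise facts, then answers a multi-hop question by comparing coordinates instead of chaining sentences.

```python
# Hypothetical illustration: relation names and unit offsets are assumptions,
# not the representation used in the paper.
OFFSETS = {
    "left_of": (-1, 0),
    "right_of": (1, 0),
    "above": (0, 1),
    "below": (0, -1),
}


def ground_story(facts):
    """Place entities on an integer grid.

    Each fact is (head, relation, tail), read as "head is <relation> tail",
    so the head sits at the tail's position plus the relation's offset.
    """
    coords = {}
    pending = list(facts)
    if pending:
        # Anchor the tail of the first fact at the origin.
        coords[pending[0][2]] = (0, 0)

    progress = True
    while pending and progress:
        progress = False
        remaining = []
        for head, rel, tail in pending:
            dx, dy = OFFSETS[rel]
            if tail in coords and head not in coords:
                x, y = coords[tail]
                coords[head] = (x + dx, y + dy)
                progress = True
            elif head in coords and tail not in coords:
                x, y = coords[head]
                coords[tail] = (x - dx, y - dy)
                progress = True
            elif head in coords and tail in coords:
                progress = True  # both already placed; the fact is consumed
            else:
                remaining.append((head, rel, tail))  # retry on the next pass
        pending = remaining
    return coords


def relation(coords, a, b):
    """Return the grid relations of entity a with respect to entity b."""
    ax, ay = coords[a]
    bx, by = coords[b]
    parts = []
    if ax < bx:
        parts.append("left_of")
    elif ax > bx:
        parts.append("right_of")
    if ay > by:
        parts.append("above")
    elif ay < by:
        parts.append("below")
    return parts


story = [("A", "left_of", "B"), ("B", "above", "C"), ("C", "left_of", "D")]
grid = ground_story(story)
print(relation(grid, "A", "D"))  # ['left_of', 'above']
```

The question "where is A relative to D?" needs three hops in text but only one coordinate comparison once the story is on the grid. The sketch does not check for contradictory facts and does not model the switching metric, which the summary names but does not define.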