PanoWorld introduces spherical spatial cross-attention for panoramic reasoning, addressing the limitations of multimodal large language models (MLLMs) in 360-degree spatial understanding. It also builds a large-scale pipeline for geometry-aware supervision, proposes a diagnostic benchmark, and reports state-of-the-art results on multiple benchmarks.