RemoteZero: Geospatial Reasoning with Zero Human Annotations

Hugging Face Daily Papers 05/06/26, 12:00 AM Papers

Summary

RemoteZero is a framework that eliminates the need for human-annotated box supervision in geospatial reasoning by leveraging the semantic verification capabilities of multimodal large language models (MLLMs) to enable self-evolving localization from unlabeled remote sensing data.

Geospatial reasoning requires models to resolve complex spatial semantics and user intent into precise target locations for Earth observation. Recent progress has liberated the reasoning path from manual curation, allowing models to generate their own inference chains. Yet a final dependency remains: they are still supervised by human-annotated ground-truth coordinates. This leaves the reasoning process autonomous, but not its spatial endpoint, and prevents true self-evolution on abundant unlabeled remote sensing data. To break this bottleneck, we introduce RemoteZero, a box-supervision-free framework for geospatial reasoning. RemoteZero is motivated by a simple asymmetry: an MLLM is typically better at verifying whether a region satisfies a query than at directly generating precise coordinates. Leveraging this stronger discriminative ability, RemoteZero replaces geometric supervision with intrinsic semantic verification and enables GRPO training without box annotations. The resulting framework further supports iterative self-evolution, allowing the model to improve from unlabeled remote sensing imagery through its own verification signal. Experiments show that RemoteZero achieves competitive performance against strong supervised methods, demonstrating the potential of self-verifying training for geospatial reasoning localization.


Cached at: 05/08/26, 06:56 AM


Source: https://huggingface.co/papers/2605.04451

Abstract

RemoteZero enables geospatial reasoning without box supervision by leveraging semantic verification capabilities of MLLMs for self-evolving localization from unlabeled remote sensing data.

Geospatial reasoning requires models to resolve complex spatial semantics and user intent into precise target locations for Earth observation. Recent progress has liberated the reasoning path from manual curation, allowing models to generate their own inference chains. Yet a final dependency remains: they are still supervised by human-annotated ground-truth coordinates. This leaves the reasoning process autonomous, but not its spatial endpoint, and prevents true self-evolution on abundant unlabeled remote sensing data. To break this bottleneck, we introduce RemoteZero, a box-supervision-free framework for geospatial reasoning. RemoteZero is motivated by a simple asymmetry: an MLLM is typically better at verifying whether a region satisfies a query than at directly generating precise coordinates. Leveraging this stronger discriminative ability, RemoteZero replaces geometric supervision with intrinsic semantic verification and enables GRPO training without box annotations. The resulting framework further supports iterative self-evolution, allowing the model to improve from unlabeled remote sensing imagery through its own verification signal. Experiments show that RemoteZero achieves competitive performance against strong supervised methods, demonstrating the potential of self-verifying training for geospatial reasoning localization.
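The abstract does not spell out the reward or training loop, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical verifier function that returns a score in [0, 1] for how well a candidate box satisfies the query (in a real system this would be an MLLM call), and it applies the standard GRPO step of normalizing rewards within a group of sampled candidates. The names Box, verification_rewards, group_relative_advantages, and toy_verify are illustrative assumptions.

```python
from statistics import mean, pstdev
from typing import Callable, List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def verification_rewards(boxes: List[Box],
                         verify: Callable[[Box], float]) -> List[float]:
    """Score each sampled box with a verifier instead of IoU against a label.

    `verify` is an assumed stand-in for asking an MLLM whether the region
    satisfies the query; no ground-truth box is consulted.
    """
    return [verify(box) for box in boxes]


def group_relative_advantages(rewards: List[float],
                              eps: float = 1e-6) -> List[float]:
    """Group-relative advantage as used in GRPO: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the sampled group
    return [(r - mu) / (sigma + eps) for r in rewards]


def toy_verify(box: Box) -> float:
    """Toy verifier (assumption): accepts boxes at least 50 pixels wide."""
    x1, _, x2, _ = box
    return 1.0 if (x2 - x1) >= 50 else 0.0


if __name__ == "__main__":
    # Four candidate boxes sampled for one image/query pair.
    candidates = [(0, 0, 60, 60), (0, 0, 10, 10), (5, 5, 20, 20), (10, 10, 90, 90)]
    rewards = verification_rewards(candidates, toy_verify)
    advantages = group_relative_advantages(rewards)
    for box, r, a in zip(candidates, rewards, advantages):
        print(box, r, round(a, 3))
```

Candidates the verifier accepts receive positive advantages and rejected ones receive negative advantages, so the policy update needs no annotated coordinates. If every candidate in a group gets the same score, all advantages are zero and that group contributes no learning signal.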


Get this paper in your agent:

hf papers read 2605.04451

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

TRN-R1-Zero: Text-rich Network Reasoning via LLMs with Reinforcement Learning Only

Built Environment Reasoning from Remote Sensing Imagery Using Large Vision--Language Models

Retrieve, Integrate, and Synthesize: Spatial-Semantic Grounded Latent Visual Reasoning

G-Zero: Self-Play for Open-Ended Generation from Zero Data

DiZiNER: Disagreement-guided Instruction Refinement via Pilot Annotation Simulation for Zero-shot Named Entity Recognition
