Tag
This paper introduces a distribution-aware reinforcement learning framework that improves the performance of multimodal large language models (MLLMs) on long-tailed numerical regression tasks by using batch-level, comparison-based supervision.