multimodal-rl

SeePhys Pro: Diagnosing Modality Transfer and Blind-Training Effects in Multimodal RLVR for Physics Reasoning

Hugging Face Daily Papers · 2026-05-10

The paper introduces SeePhys Pro, a benchmark to diagnose modality transfer issues in multimodal RL for physics reasoning, revealing that models struggle with representation-invariant reasoning and often rely on residual textual cues rather than visual evidence.

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

Papers with Code Trending · 2026-05-01

The paper introduces PRISM, a method that inserts a distribution-alignment stage between supervised fine-tuning and reinforcement learning to mitigate distributional drift in multimodal models. It uses a black-box adversarial game with an MoE discriminator to improve RLVR performance on models like Qwen3-VL.
