transition-aware


ReCrit: Transition-Aware Reinforcement Learning for Scientific Critic Reasoning

arXiv cs.LG · 2026-05-20

ReCrit is a transition-aware reinforcement learning framework for scientific critic reasoning. It decomposes the transition from a model's initial answer to its post-critique answer into four quadrants (Correction, Sycophancy, Robustness, Boundary) and trains with dynamic asynchronous rollout. The summary reports significant gains in critic accuracy on Qwen models across multiple scientific benchmarks.
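
The four-quadrant decomposition can be illustrated with a minimal Python sketch. The mapping below is an assumption inferred from the quadrant names, not something the summary states: Correction as wrong to right, Sycophancy as right to wrong, Robustness as right to right, and Boundary as wrong to wrong. The names `Quadrant`, `classify_transition`, and `quadrant_counts` are hypothetical and do not come from the paper.

```python
from collections import Counter
from enum import Enum


class Quadrant(Enum):
    CORRECTION = "Correction"
    SYCOPHANCY = "Sycophancy"
    ROBUSTNESS = "Robustness"
    BOUNDARY = "Boundary"


def classify_transition(initial_correct: bool, critic_correct: bool) -> Quadrant:
    """Map one (initial answer, post-critique answer) pair to a quadrant.

    ASSUMPTION: the quadrant semantics are inferred from the names alone;
    the paper may define them differently.
    """
    if initial_correct and critic_correct:
        return Quadrant.ROBUSTNESS   # stayed correct after critique
    if initial_correct and not critic_correct:
        return Quadrant.SYCOPHANCY   # abandoned a correct answer
    if not initial_correct and critic_correct:
        return Quadrant.CORRECTION   # fixed a wrong answer
    return Quadrant.BOUNDARY         # wrong before and after critique


def quadrant_counts(pairs):
    """Count quadrants over an iterable of (initial_correct, critic_correct) pairs."""
    return Counter(classify_transition(i, c) for i, c in pairs)


if __name__ == "__main__":
    # Toy correctness flags, not results from the paper.
    toy_pairs = [(True, True), (False, True), (False, False), (True, True)]
    for quadrant, n in quadrant_counts(toy_pairs).items():
        print(quadrant.value, n)
```

Counting transitions per quadrant in this way separates behaviors that a single accuracy number would merge, for example a fixed error versus an abandoned correct answer. How ReCrit weights or rewards each quadrant during training is not described in the summary, so the sketch stops at classification.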
