rl-benchmark

World Feedback for Clinical Agents: Diagnosing RL in FHIR Environments

arXiv cs.AI · 12h ago

This paper examines reinforcement learning from world feedback for clinical protocol-execution tasks in FHIR environments. It identifies structural barriers to RL, such as high silent-finish ceilings and zero-gradient tasks, and introduces MedAgentBench-v3, which has a lower ceiling. It shows that pure RL underperforms rule-based SFT because of these barriers and proposes a combined SFT+RL approach.
