Tag
This paper proposes a Variance-Aware Reward Framework that uses Group Relative Policy Optimization (GRPO) to improve LLM performance on heart-focused medical question answering, and reports significant accuracy and F1 gains on a HealthBench subset.
Researchers introduce DoseBench, a benchmark of 81 OTC dosing scenarios to evaluate LLM decision-making under temporal uncertainty for acetaminophen and ibuprofen use. Results show LLMs frequently struggle with rolling-window reasoning and can produce confident but medically unsupported responses.
A tutorial and accompanying project that demonstrate LoRA fine-tuning of Qwen3-1.7B for clinical question answering on an AMD MI300X GPU using ROCm, providing a CUDA-free alternative for medical AI development.
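
The rolling-window reasoning mentioned in the DoseBench summary can be illustrated with a minimal sketch. It contrasts a calendar-day total with a trailing 24-hour total over the same dose log. The function names, the dose log, and the `limit_mg` value are all hypothetical placeholders chosen for illustration; none of them come from DoseBench or from any product label.

```python
from datetime import datetime, timedelta

def rolling_window_total(doses, now, window_hours=24):
    """Sum milligrams taken in the trailing window (now - window_hours, now]."""
    start = now - timedelta(hours=window_hours)
    return sum(mg for taken_at, mg in doses if start < taken_at <= now)

def calendar_day_total(doses, now):
    """Sum milligrams taken on the same calendar date as `now` (the naive view)."""
    return sum(mg for taken_at, mg in doses
               if taken_at.date() == now.date() and taken_at <= now)

def fits_limit(total_mg, next_mg, limit_mg):
    """True if adding the next dose keeps the total at or below the limit."""
    return total_mg + next_mg <= limit_mg

# Hypothetical dose log: (time taken, milligrams).
doses = [
    (datetime(2024, 1, 1, 22, 0), 1000),
    (datetime(2024, 1, 2, 4, 0), 1000),
    (datetime(2024, 1, 2, 10, 0), 1000),
]
now = datetime(2024, 1, 2, 16, 0)
limit_mg = 3000  # illustrative placeholder, not a recommended value

rolling = rolling_window_total(doses, now)   # includes the 22:00 dose from the previous evening
calendar = calendar_day_total(doses, now)    # misses the 22:00 dose from the previous evening

rolling_ok = fits_limit(rolling, 1000, limit_mg)
calendar_ok = fits_limit(calendar, 1000, limit_mg)
```

In this made-up log the two views disagree: the calendar-day total ignores the dose from the previous evening, while the trailing window still counts it. That kind of disagreement is one plausible way a model could slip on rolling-window questions.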