medical-qa

A Mechanistic View of Authority Hierarchy in LLM Sycophancy

arXiv cs.CL · 2026-07-02

This paper investigates authority bias in LLMs using a controlled medical QA setting, revealing that models override correct answers in a graded manner proportional to perceived authority. The effect is localized to a critical late layer where correct answer representations are actively erased.

Clinically Structured Rank-Gated LoRA for Cross-Benchmark Medical Question Answering

arXiv cs.CL · 2026-07-01

The paper proposes BiRG-LoRA, a rank-gated LoRA method for medical question answering that uses clinically structured priors to select sparse rank subsets, achieving 69.31% macro-average accuracy across four benchmarks while using fewer parameters than mixture-of-experts approaches.

Let LLMs Judge Each Other: Multi-Agent Peer-Reviewed Reasoning for Medical Question Answering

arXiv cs.CL · 2026-06-16

This paper introduces a multi-agent peer-reviewed reasoning method where multiple LLMs independently generate chain-of-thought reasoning and then evaluate each other's outputs to select the best answer. The method outperforms single-model reasoning and majority voting on medical QA benchmarks.

Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO

arXiv cs.CL · 2026-06-05

This paper proposes a Variance-Aware Reward Framework using GRPO to improve LLM performance on heart-focused medical question answering, achieving significant accuracy and F1 gains on a HealthBench subset.

Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA

arXiv cs.CL · 2026-06-04

Researchers introduce DoseBench, a benchmark of 81 OTC dosing scenarios to evaluate LLM decision-making under temporal uncertainty for acetaminophen and ibuprofen use. Results show LLMs frequently struggle with rolling-window reasoning and can produce confident but medically unsupported responses.

MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required

Hugging Face Blog · 2026-05-08

A tutorial and project demonstrating LoRA fine-tuning of Qwen3-1.7B on AMD MI300X using ROCm for clinical question answering, providing a CUDA-free alternative for medical AI development.
