clinical-decision-making

ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models

arXiv cs.AI · 17h ago

ClinicalMC is a benchmark designed to evaluate large language models in multi-course clinical decision-making, featuring datasets in Chinese and English and a multi-agent evaluation framework.

AI Rater Discrimination Depends on Scoring Protocol in Complex Clinical Decision-Making

arXiv cs.CL · 17h ago

This study examines how AI raters (LLMs) score clinical AI outputs under different protocols in complex type 2 diabetes pharmacotherapy, finding that rubric-anchored scoring provides greater discriminative power than rubric-free scoring.

EHRBench: An Automated and Reliable EHR-based Benchmark for Clinical Decision Making with LLMs

arXiv cs.AI · 2d ago

EHRBench is an automated and reliable benchmark for evaluating LLMs on clinical decision-making using real-world electronic health records, covering nearly 1M QA items across diagnosis, treatment, and prognosis tasks.
