diagnostic-uncertainty

Possible or Definite? A Benchmark for Evaluating Diagnostic Uncertainty Preservation in Clinical Text

arXiv cs.CL · 2026-06-18

This paper introduces a benchmark of 1,200 clinical documents with 9,184 uncertainty annotations for evaluating whether LLMs preserve diagnostic uncertainty in clinical text. It finds that LLMs often fail to preserve the original uncertainty cues and struggle with nuanced distinctions.
