Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why

Hugging Face Daily Papers 06/17/26, 12:00 AM Papers

Summary

ACIE, an agentic RAG system for clinical information extraction, achieved a 96.5% acceptance rate across 7,326 judgments by nuclear-medicine physicians while handling heterogeneous patient contexts and missing metadata.

Patient contexts span hundreds of heterogeneous documents and thousands of structured data points, yet the document-level metadata that AI systems need for retrieval and triage is absent or incomplete. Standard retrieval-augmented generation fails on this data, mishandling temporal reasoning, cross-document dependencies, and missing metadata. We deploy ACIE (Agentic Clinical Information Extraction) at University Medicine Essen: an on-premise agentic RAG pipeline that reasons over complete patient contexts and grounds every answer in source passages for clinician verification. We quantify the metadata gap, trace the architectural decisions it shaped, and evaluate extraction alongside an independent retrospective lymphoma registry study, in which nuclear-medicine physicians verify every extracted value against its cited sources. Across 7,326 judgments, clinicians accepted 96.5% of extractions, with per-type acceptance ranging from 80% to 99%.
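As a rough illustration of how figures like these are tallied (not the authors' code), the sketch below turns a list of clinician verdicts into an overall and a per-type acceptance rate. The `Judgment` record, its field names, the type labels, and the toy data are all assumptions made for this example; the page does not describe the study's evaluation schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One clinician verdict on one extracted value (hypothetical schema)."""
    field_type: str  # illustrative label such as "stage"; not from the paper
    accepted: bool   # True if the clinician accepted the extraction


def acceptance_rates(judgments):
    """Return (overall_rate, {field_type: rate}) as fractions in [0, 1]."""
    if not judgments:
        raise ValueError("no judgments to score")
    totals = defaultdict(int)
    accepted = defaultdict(int)
    for j in judgments:
        totals[j.field_type] += 1
        accepted[j.field_type] += int(j.accepted)
    # Overall rate pools all judgments, so large types weigh more.
    overall = sum(accepted.values()) / sum(totals.values())
    per_type = {t: accepted[t] / totals[t] for t in totals}
    return overall, per_type


# Toy data only, not the study's judgments.
toy = (
    [Judgment("stage", True)] * 9 + [Judgment("stage", False)]
    + [Judgment("therapy", True)] * 4 + [Judgment("therapy", False)]
)
overall, per_type = acceptance_rates(toy)
print(f"overall: {overall:.1%}")
for field_type, rate in sorted(per_type.items()):
    print(f"{field_type}: {rate:.1%}")
```

Applied to the reported numbers, 96.5% of 7,326 judgments corresponds to roughly 7,070 accepted extractions; the exact count is not stated here, and the percentage is rounded.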

Original Article

Cached at: 06/20/26, 02:26 PM

Source: https://huggingface.co/papers/2606.19602

Abstract

ACIE, an agentic RAG system deployed in a clinical setting, extracts medical information from complex patient contexts with high accuracy, achieving a 96.5% acceptance rate by nuclear-medicine physicians across 7,326 judgments.

Patient contexts span hundreds of heterogeneous documents and thousands of structured data points, yet the document-level metadata that AI systems need for retrieval and triage is absent or incomplete. Standard retrieval-augmented generation fails on this data, mishandling temporal reasoning, cross-document dependencies, and missing metadata. We deploy ACIE (Agentic Clinical Information Extraction) at University Medicine Essen: an on-premise agentic RAG pipeline that reasons over complete patient contexts and grounds every answer in source passages for clinician verification. We quantify the metadata gap, trace the architectural decisions it shaped, and evaluate extraction alongside an independent retrospective lymphoma registry study, in which nuclear-medicine physicians verify every extracted value against its cited sources. Across 7,326 judgments, clinicians accepted 96.5% of extractions, with per-type acceptance ranging from 80% to 99%.


