Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why

Hugging Face Daily Papers

Summary

ACIE, an agentic RAG system for clinical information extraction, reached a 96.5% acceptance rate across 7,326 judgments by nuclear-medicine physicians, while handling heterogeneous patient contexts and missing document metadata.

Patient contexts span hundreds of heterogeneous documents and thousands of structured data points, yet the document-level metadata that AI systems need for retrieval and triage is absent or incomplete. Standard retrieval-augmented generation fails on this data, mishandling temporal reasoning, cross-document dependencies, and missing metadata. We deploy ACIE (Agentic Clinical Information Extraction) at University Medicine Essen: an on-premise agentic RAG pipeline that reasons over complete patient contexts and grounds every answer in source passages for clinician verification. We quantify the metadata gap, trace the architectural decisions it shaped, and evaluate extraction alongside an independent retrospective lymphoma registry study, in which nuclear-medicine physicians verify every extracted value against its cited sources. Across 7,326 judgments, clinicians accepted 96.5% of extractions, with per-type acceptance ranging from 80% to 99%.
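The evaluation above reports an overall acceptance rate and a per-type range. As a rough illustration of how such figures can be tallied from clinician judgments, here is a minimal Python sketch. The `Judgment` schema, the field names, the extraction type names, and the toy records are all assumptions made for this example; none of them come from the paper, and the toy numbers are not the study's data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One clinician verdict on one extracted value (hypothetical schema)."""
    extraction_type: str   # illustrative type name, e.g. "stage"
    value: str             # extracted value shown to the clinician
    cited_passages: tuple  # source passages the value was grounded in
    accepted: bool         # verdict after checking the value against citations


def acceptance_rates(judgments):
    """Return (overall_rate, {type: rate}) as fractions in [0, 1]."""
    totals = defaultdict(int)
    accepted = defaultdict(int)
    for j in judgments:
        totals[j.extraction_type] += 1
        accepted[j.extraction_type] += int(j.accepted)
    if not totals:
        raise ValueError("no judgments given")
    per_type = {t: accepted[t] / totals[t] for t in totals}
    overall = sum(accepted.values()) / sum(totals.values())
    return overall, per_type


# Toy records for illustration only: 6 accepted dates, 3 of 4 accepted stages.
toy = [
    Judgment("diagnosis_date", "2021-03-04", ("report A, paragraph 2",), True)
    for _ in range(6)
] + [
    Judgment("stage", "II", ("report B, paragraph 1",), ok)
    for ok in (True, True, True, False)
]

overall, per_type = acceptance_rates(toy)
print(f"overall: {overall:.1%}")
for name, rate in sorted(per_type.items()):
    print(f"{name}: {rate:.1%}")
```

Keeping the cited passages on each record mirrors the idea that every verdict is made against the sources an answer was grounded in; a real deployment would presumably store document identifiers and offsets instead of free-text labels.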

Cached at: 06/20/26, 02:26 PM

Source: https://huggingface.co/papers/2606.19602

Similar Articles

AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases

arXiv cs.AI

This paper introduces AgenticRAG, a framework from Microsoft that enhances enterprise knowledge base retrieval by equipping LLMs with tools for iterative search, document navigation, and analysis. It demonstrates significant improvements in recall and factuality over standard RAG pipelines on multiple benchmarks.

Agentic Document Extraction

Product Hunt

Agentic Document Extraction is a tool that uses AI agents to make documents computable by extracting structured data from unstructured documents.