Tag
BCL is the first optimization framework that uses particle filtering with Bayesian updates to systematically refine label representations for information extraction tasks, showing consistent improvements over existing methods.
ACIE, an agentic RAG system for clinical information extraction, achieves a 96.5% acceptance rate in nuclear-medicine physicians' judgments across 7,326 instances, addressing the challenges of heterogeneous patient contexts and missing metadata.
AAbAAC is a manually annotated corpus of 115 PubMed abstracts for autoimmunity information extraction, focusing on entities like autoimmune diseases and autoantibodies. The study demonstrates improved NER performance after fine-tuning on this corpus.
This paper presents a fully local, two-stage LLM pipeline using MedGemma-27B for filling Case Report Forms from clinical notes, achieving a macro-F1 of 0.55 on the English test track and securing second place among local open-source submissions.
This paper benchmarks four large language models (Gemini 1.5 Pro, GPT-4o, Claude 3.7 Sonnet, Llama 3.1-70B) for extracting structured information from Safety Data Sheets. Text-based extraction with chain-of-thought prompting yields the highest accuracy (84%, by Gemini 1.5 Pro), but no model surpasses the 90% threshold required for reliable industrial deployment.
This paper presents a deployment-focused study comparing LoRA fine-tuning of 24 model variants (270M–8B parameters) for merchant information extraction from financial transaction strings. The authors find that smaller models like Qwen 3.5 4B achieve 96.6% F1, within 0.35 points of the 8B baseline, while offering significant reductions in latency and cost.
This paper evaluates the open-weight LLM LLaMA 3.1 for automatic extraction of structured data from Dutch brain MRI reports, achieving high performance on visual rating scores and accurate detection of findings, with few-shot prompting improving extraction of numerical variables.
SMADE-IE is a sparse multi-agent framework for zero-shot information extraction that uses an Adaptive Mode Selector and an Evidence-Driven Debate mechanism with Toulmin-style argumentation and Bayesian updates. It outperforms existing baselines on 9 benchmarks across NER, RE, and JERE tasks while improving token efficiency.
This paper introduces EURO-5K, a sentence-level dataset for extracting reporting obligations from EU legislation, and benchmarks discriminative and generative transformer models under full fine-tuning and parameter-efficient QLoRA. Results show that legal pretraining primarily benefits models with limited adaptation capacity, and all approaches converge around 3K samples.
This paper introduces EPPC-OASIS, an ontology-aware adaptation method for extracting structured communication behaviors from secure patient-provider messages. The approach combines Wasserstein alignment during fine-tuning with inference refinement procedures, achieving modest improvements over baselines on a de-identified corpus.
This paper proposes an LLM-based, retrieval-augmented framework for extracting segment disclosures from 10-K filings, improving their completeness and comparability for longitudinal and cross-firm analysis.
This post explains how to fix agent memory by defining an ontology with Pydantic schemas, which enables structured extraction into knowledge graphs for multi-hop reasoning, and presents an open-source solution (Zep).
This post announces the open-source release of Marlin-2B, a tiny vision-language model (VLM) for extracting structured information from videos, fine-tuned to answer 'what is happening and when'. The authors describe it as the best open model in its weight class and competitive with Gemini-2.5-flash.
This paper presents a method for comparing concordances of local grammars to optimize Named Entity Recognition for person names in Portuguese, achieving improved F-measure scores on the HAREM dataset.
This paper compares a domain-trained small language model (Olava Extract) against frontier LLMs for structured contract extraction, showing that the specialized model achieves higher F1 scores and dramatically lower cost.
Researchers from Bangladesh University of Engineering and Technology present CBRS, a multi-platform framework that filters and parses blood donation requests from social media using a dual-layer architecture and a novel 11K bilingual dataset in Bengali and English. Their LoRA fine-tuned Llama-3.2-3B model achieves 99% filtering accuracy and 92% zero-shot parsing accuracy, outperforming GPT-4o-mini and other LLMs while using 35× fewer tokens.
Researchers from Banting Health AI present an AI system using generative LLMs with Retrieval-Augmented Generation (RAG) for automated clinical trial protocol information extraction, achieving 89% accuracy compared to 62.6% for standalone LLMs, with AI-assisted workflows completing tasks 40% faster and reducing cognitive demand.
DiZiNER is a framework that uses disagreement between multiple LLMs to refine task instructions for zero-shot named entity recognition, achieving state-of-the-art results on 14 out of 18 benchmarks and significantly reducing the performance gap between zero-shot and supervised systems.
PIIBench presents a unified multi-source benchmark corpus for detecting personally identifiable information (PII) across diverse data sources. This resource addresses the need for standardized evaluation in PII detection tasks, which is critical for privacy-preserving NLP applications.