IRC-Bench: Recognizing Entities from Contextual Cues in First-Person Reminiscences

arXiv cs.CL Papers

Summary

This paper introduces IRC-Bench, a benchmark for recognizing implicit entities in first-person reminiscences from contextual cues rather than explicit mentions. It evaluates 19 LLM and retrieval configurations, finding a QLoRA-adapted Llama 3.1 8B to be the top performer in the open-world setting and fine-tuned DPR the strongest closed-world retriever.

arXiv:2605.06142v1 (Announce Type: new)

Abstract: When people recount personal memories, they often refer to people, places, and events indirectly, relying on contextual cues rather than explicit names. Such implicit references are central to reminiscence narratives: first-person accounts of lived experience used in therapeutic, archival, and social settings. They pose a difficult computational problem because the intended entity must be inferred from dispersed narrative evidence rather than from a local mention. We introduce IRC-Bench, the Implicit Reminiscence Context Benchmark, for evaluating implicit entity recognition in reminiscence transcripts. The benchmark targets non-locality: entity-identifying cues are distributed across multiple, non-contiguous clauses, unlike named entity recognition, entity linking, or coreference resolution. IRC-Bench comprises 25,136 samples constructed from 12,337 Wikidata-linked entities across 1,994 transcripts spanning 11 thematic domains. Each sample pairs an Entity-Grounded Narrative, in which the target entity is explicitly mentioned, with an Entity-Elided Narrative, in which direct mentions are removed. We evaluate 19 configurations across LLM generation, dense retrieval, RAG, and fine-tuning. QLoRA-adapted Llama 3.1 8B performs best in the open-world setting (38.94% exact match; 51.59% Jaccard), while fine-tuned DPR leads closed-world retrieval (35.38% Hit@1; 71.49% Hit@10). We release IRC-Bench with data, code, and evaluation tools.
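The abstract reports four metrics: exact match and Jaccard for open-world generation, and Hit@1/Hit@10 for closed-world retrieval. The paper's own scoring code is released with the benchmark and may differ in details (normalization, tokenization); the sketch below only illustrates how these metrics are conventionally computed, using a hypothetical "Eiffel Tower" target entity as the example.

```python
def exact_match(pred: str, gold: str) -> float:
    """1.0 if the predicted entity string equals the gold label after
    lowercasing and whitespace stripping (one common normalization)."""
    return float(pred.strip().lower() == gold.strip().lower())

def jaccard(pred: str, gold: str) -> float:
    """Token-level Jaccard similarity: |intersection| / |union| of the
    two labels' word sets. Gives partial credit for near-misses."""
    p, g = set(pred.lower().split()), set(gold.lower().split())
    return len(p & g) / len(p | g) if p | g else 1.0

def hit_at_k(ranked: list[str], gold: str, k: int) -> float:
    """1.0 if the gold entity appears among the top-k retrieved candidates."""
    return float(gold in ranked[:k])

# Hypothetical example: scoring predictions for one Entity-Elided Narrative.
candidates = ["Louvre", "Eiffel Tower", "Arc de Triomphe"]
print(exact_match("eiffel tower", "Eiffel Tower"))        # exact after normalization
print(jaccard("the Eiffel Tower", "Eiffel Tower"))        # partial token overlap
print(hit_at_k(candidates, "Eiffel Tower", 10))           # gold is in the top 10
```

Corpus-level scores like those in the abstract would then be averages of these per-sample values over all 25,136 samples.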

Source: [https://arxiv.org/abs/2605.06142](https://arxiv.org/abs/2605.06142)
[View PDF](https://arxiv.org/pdf/2605.06142)

## Submission history

From: Yehudit Aperstein [[view email](https://arxiv.org/show-email/806637db/2605.06142)] **[v1]** Thu, 7 May 2026 12:39:49 UTC (1,211 KB)

Similar Articles

Beyond Static Personas: Situational Personality Steering for Large Language Models

arXiv cs.CL

This paper introduces IRiS, a training-free framework for situational personality steering in LLMs that moves beyond static persona modeling by identifying and leveraging situation-dependent persona neurons. The approach demonstrates that LLM behavior varies contextually and proposes neuron-based identification, retrieval, and weighted steering methods validated on PersonalityBench and a new SPBench benchmark.

PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

arXiv cs.CL

Researchers propose PRISM, a diagnostic benchmark that breaks down LLM hallucinations into four dimensions (missing knowledge, knowledge errors, reasoning errors, and instruction-following errors) across three generation stages (memory, instruction, reasoning), evaluating 24 LLMs to reveal trade-offs in mitigation strategies.