LLM Anonymization Against Agentic Re-Identification
Summary
AURA is an LLM-powered anonymization framework that balances privacy protection against agentic web-search re-identification while preserving contextual utility through adaptive privacy scopes and mask-reconstruct methods.
Cached at: 06/05/26, 06:09 PM
Source: https://huggingface.co/papers/2605.30848
Abstract
Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text for formal privacy, or test rewritten text against non-web inference models, leaving underexplored the operating region between resistance to agentic web-search re-identification and utility retention. We introduce AURA (Anonymization with Utility-Retention Adaptation), an LLM-powered mask-reconstruct framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier by using adaptive privacy scope to strengthen resistance to agentic re-identification and using a mask-reconstruct anonymization method to better preserve contextual utility under fixed privacy scope.
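To make the mask-reconstruct idea concrete, the toy sketch below walks through the three stages the abstract names: localize sensitive spans, reconstruct candidate rewrites, and select a candidate that passes both a privacy check and a utility check. It is an illustration under stated assumptions, not the authors' implementation. Every function name, the `[MASK]` token, the example sentence, and the threshold are hypothetical, and simple string rules stand in for the LLM calls and the web-search attacker. Adaptive privacy scope is not modeled here.

```python
from typing import Callable, List, Optional

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def localize(text: str, sensitive_spans: List[str]) -> str:
    """Privacy localization: replace each sensitive span with a mask."""
    for span in sensitive_spans:
        text = text.replace(span, MASK)
    return text


def reconstruct(masked: str, fillers: List[str]) -> str:
    """Reconstruction: fill masks left to right with replacement phrases."""
    out = masked
    for filler in fillers:
        out = out.replace(MASK, filler, 1)
    return out


def utility(text: str, facts: List[str]) -> float:
    """Fraction of required facts still present (toy substring check)."""
    return sum(fact in text for fact in facts) / len(facts)


def select(
    candidates: List[str],
    attacker: Callable[[str], bool],
    facts: List[str],
    min_utility: float = 0.5,
) -> Optional[str]:
    """Keep candidates the attacker fails on; return the most useful one."""
    best, best_score = None, -1.0
    for cand in candidates:
        if attacker(cand):  # adversarial privacy check: re-identified
            continue
        score = utility(cand, facts)  # utility-retention check
        if score >= min_utility and score > best_score:
            best, best_score = cand, score
    return best


# Toy stand-in for a web-search agent: it "re-identifies" the speaker
# only when both cues survive in the rewritten text.
def toy_attacker(text: str) -> bool:
    return "robotics" in text and "Northfield" in text


original = (
    "I run the robotics lab at Northfield College "
    "and interview nurses about scheduling tools."
)
masked = localize(original, ["robotics lab", "Northfield College"])

candidates = [
    reconstruct(masked, ["robotics lab", "a college in Northfield"]),
    reconstruct(masked, ["research group", "a university"]),
    reconstruct(masked, ["robotics lab", "a university"]),
]
facts = ["robotics", "interview nurses", "scheduling tools"]
chosen = select(candidates, toy_attacker, facts)
print(chosen)
```

In this sketch the first candidate is rejected because the toy attacker still finds both cues, and the third wins over the second because it retains all three utility facts. Separating `localize` from `reconstruct` mirrors the decoupling the abstract describes: which spans to hide and how to rewrite them can be varied independently.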
Similar Articles
AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents
AURA introduces an intent-directed probing step for LLM agents to surface implicit user needs behind situated queries, improving coverage on a benchmark while reducing unnecessary tool calls and preventing privacy violations.
LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability
LLM-FACETS is an open-source evaluation framework designed to help practitioners assess LLM transparency and accountability with a focus on privacy and data flow transparency. It provides a browser interface, plugin architecture, and supports multiple auditing mechanisms including token-level log-probability visualization and RAG Triad metrics.
Can Multi-Agent LLMs Identify Their Peers? Stylometric Fingerprinting in Role-Constrained Political Analysis
This paper investigates whether LLMs can identify their own model family from stylometric fingerprints in role-constrained political analysis texts, even after prompt-level anonymization. The findings confirm that anonymization is insufficient and have implications for EU AI Act compliance and multi-agent system validation.
PrivacyAkinator: Articulating Key Privacy Design Decisions by Answering LLM-Generated Multiple-choice Questions
This paper presents PrivacyAkinator, an interactive tool that helps novice developers articulate privacy design decisions via LLM-generated multiple-choice questions, achieving 47% more key decisions in 73% less time compared to NIST's PRAM methodology.
LLM-as-a-Discriminator: When Synthetic Tables Still Look Real
This paper proposes an LLM-as-Discriminator method to audit privacy of synthetic tabular data by asking an LLM to classify samples as real or synthetic, showing that LLM discrimination can serve as a practical privacy audit signal.