LLM Anonymization Against Agentic Re-Identification

Hugging Face Daily Papers 06/01/26, 12:00 AM Papers

anonymization re-identification llm privacy utility web-search mask-reconstruct

Summary

AURA is an LLM-powered anonymization framework that balances privacy protection against agentic web-search re-identification while preserving contextual utility through adaptive privacy scopes and mask-reconstruct methods.

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text for formal privacy, or test rewritten text against non-web inference models, leaving underexplored the operating region between resistance to agentic web-search re-identification and utility retention. We introduce AURA (Anonymization with Utility-Retention Adaptation), an LLM-powered mask-reconstruct framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier by using adaptive privacy scope to strengthen resistance to agentic re-identification and using a mask-reconstruct anonymization method to better preserve contextual utility under fixed privacy scope.

Original Article

Cached at: 06/05/26, 06:09 PM

Source: https://huggingface.co/papers/2605.30848

Abstract

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text for formal privacy, or test rewritten text against non-web inference models, leaving underexplored the operating region between resistance to agentic web-search re-identification and utility retention. We introduce AURA (Anonymization with Utility-Retention Adaptation), an LLM-powered mask-reconstruct framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier by using adaptive privacy scope to strengthen resistance to agentic re-identification and using a mask-reconstruct anonymization method to better preserve contextual utility under fixed privacy scope.
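The mask-reconstruct-select flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name here (Candidate, mask_spans, reconstruct, toy_risk, toy_utility, select) is hypothetical, and the two keyword-matching scorers are stand-ins, assumed for illustration only, for the web-search agent attacker and the fact-based utility evaluation that the paper actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional

MASK = "[MASK]"  # hypothetical mask token


@dataclass
class Candidate:
    text: str
    privacy_risk: float  # lower is better (toy adversary score)
    utility: float       # higher is better (fraction of facts retained)


def mask_spans(text: str, sensitive: List[str]) -> str:
    """Privacy localization: replace each sensitive span with a mask token."""
    for span in sensitive:
        text = text.replace(span, MASK)
    return text


def reconstruct(masked: str, fillers: List[str]) -> str:
    """Reconstruction: fill the masks, in order, with generalized phrases."""
    for filler in fillers:
        masked = masked.replace(MASK, filler, 1)
    return masked


def toy_risk(text: str, cues: List[str]) -> float:
    """Toy adversary (assumption): fraction of identifying cues still present."""
    return sum(c.lower() in text.lower() for c in cues) / len(cues)


def toy_utility(text: str, facts: List[str]) -> float:
    """Toy utility check (assumption): fraction of analytic facts still present."""
    return sum(f.lower() in text.lower() for f in facts) / len(facts)


def select(cands: List[Candidate], max_risk: float) -> Optional[Candidate]:
    """Keep candidates that pass the privacy check, then maximize utility."""
    safe = [c for c in cands if c.privacy_risk <= max_risk]
    return max(safe, key=lambda c: c.utility) if safe else None


# Invented example transcript line; not taken from the paper's data.
source = "I am a nurse at Mercy Hospital in Tulsa and I use the app for night shifts."
cues = ["Mercy Hospital", "Tulsa"]
facts = ["nurse", "hospital", "night shifts"]

masked = mask_spans(source, cues)
filler_sets = [
    ["an organization", "somewhere"],        # safe, but loses the "hospital" fact
    ["a hospital", "a mid-sized city"],      # safe and keeps all facts
    ["Mercy Hospital", "a mid-sized city"],  # leaks one identifying cue
]
candidates = []
for fillers in filler_sets:
    text = reconstruct(masked, fillers)
    candidates.append(Candidate(text, toy_risk(text, cues), toy_utility(text, facts)))

best = select(candidates, max_risk=0.0)
print(best.text if best else "no candidate passed the privacy check")
```

Separating mask_spans from reconstruct mirrors the decoupling the abstract describes: the set of masked spans (the privacy scope) can be widened or narrowed without changing how the masks are filled, and the selection step only ever sees scored candidates.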

Similar Articles

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability

Can Multi-Agent LLMs Identify Their Peers? Stylometric Fingerprinting in Role-Constrained Political Analysis

PrivacyAkinator: Articulating Key Privacy Design Decisions by Answering LLM-Generated Multiple-choice Questions

LLM-as-a-Discriminator: When Synthetic Tables Still Look Real
