Minim: Privacy-Aware Minimal View for Agents via Trusted Local Sanitization
Summary
This paper introduces Minim, a trusted local broker that performs privacy-aware minimization of UI observations for LLM-powered agents, using contextual integrity to balance task necessity and sensitivity scores. Experiments on WebArena show it reduces irrelevant sensitive leakage while preserving task-critical information.
# Privacy-Aware Minimal View for Agents via Trusted Local Sanitization
Source: [https://arxiv.org/html/2606.13949](https://arxiv.org/html/2606.13949)
Chaoyu Zhang, Heng Jin, Shanghao Shi, Ning Zhang, Y. Thomas Hou, Wenjing Lou
###### Abstract
Modern LLM-powered autonomous agents increasingly rely on rich user interface (UI) state observations to achieve reliable action grounding in complex digital environments. However, many deployments transmit the full UI state to remote inference servers even when most elements are irrelevant to the current task, which can leak sensitive but unnecessary context such as authentication codes, private notifications, and background application states. We propose Minim, a trusted local broker that performs privacy-aware minimization on the client side before any observation leaves the device. Grounded in Contextual Integrity (CI), Minim learns a dual-score representation for each UI element by predicting an inherent sensitivity score ($s$) and a task-conditioned necessity score ($n$). These scores drive a ternary disclosure policy that keeps essential elements, abstracts sensitive attributes when needed, and removes task-irrelevant content. We optimize a CI-aware objective that penalizes necessity errors more strongly on high-risk content, enabling aggressive pruning while preserving task-critical information. Experiments on real-world UI observations derived from WebArena show that Minim substantially reduces task-irrelevant sensitive leakage while preserving task-critical semantic context and the interactive affordances required for reliable agent actions.
privacy, agentic AI, accessibility trees, contextual integrity, structured observations, data minimization
## 1 Introduction
Modern agentic systems increasingly interact with the digital world through *structured observations*, which represent interface state using explicit semantic structure rather than raw sensory streams. Prior work on interface-grounded agents has explored both pixel-based inputs for web and GUI reasoning (He et al., [2024](https://arxiv.org/html/2606.13949#bib.bib27); Koh et al., [2024](https://arxiv.org/html/2606.13949#bib.bib28)) and structured hierarchies such as accessibility trees or DOM-like representations (Deng et al., [2023](https://arxiv.org/html/2606.13949#bib.bib10); Zhou et al., [2024](https://arxiv.org/html/2606.13949#bib.bib9)). A prominent instantiation is the *accessibility API*, which exposes a hierarchy of UI elements with roles, states, and affordances, and is widely used in OS-level assistants (e.g., Apple Intelligence (Apple, [2024](https://arxiv.org/html/2606.13949#bib.bib33)) and Microsoft Copilot (Microsoft, [2024b](https://arxiv.org/html/2606.13949#bib.bib34))) due to its stability relative to pixel-based inputs (Nguyen et al., [2025](https://arxiv.org/html/2606.13949#bib.bib2)). More broadly, structured observations also appear as DOM representations in web agents, scene graphs in robotics, and tool-use schemas (e.g., the Model Context Protocol (MCP) (Anthropic, [2024](https://arxiv.org/html/2606.13949#bib.bib8))) that standardize tool definitions and invocation via typed schemas.
However, these interfaces were primarily designed for assistive transparency rather than privacy-aware orchestration. As a result, many agent deployments follow a share-first design, sending rich interface state to remote inference to simplify integration and latency engineering. In our primary setting, this means disclosing the *entire* accessibility tree even when only a small subset is required for the user task. We term this failure mode *Semantic Over-Privileged Observation*, where task-irrelevant elements in a structured hierarchy are exposed together with their functional semantics. For example, during a routine request such as “Summarize this email”, the remote agent may observe co-located UI context such as sidebar notifications, background application windows, or unrelated browser tabs. This can leak personally identifiable information and cross-session behavioral traces that are irrelevant to the immediate task but useful for profiling. Such risks have been quantified for autonomous agents and demonstrated in recent attacks (Zharmagambetov et al., [2025](https://arxiv.org/html/2606.13949#bib.bib32); Liu et al., [2025](https://arxiv.org/html/2606.13949#bib.bib17); Shao et al., [2024](https://arxiv.org/html/2606.13949#bib.bib19); Carlini et al., [2021](https://arxiv.org/html/2606.13949#bib.bib4); Green et al., [2025](https://arxiv.org/html/2606.13949#bib.bib3)).
Safeguarding structured observations for *autonomous agents* is challenging because what should be disclosed is inherently *task-conditioned*. The same element may be necessary for completing one task (e.g., a 2FA code needed to authenticate) yet constitute pure leakage under another intent (e.g., browsing or summarization). Recent work further highlights a “Privacy Judgment-Action Gap” (Wang et al., [2025b](https://arxiv.org/html/2606.13949#bib.bib12)), where agents fail to protect context even when it is recognized as sensitive. Existing privacy paradigms struggle to address this setting. *Task-agnostic entity filtering* (e.g., Presidio (Microsoft, [2024a](https://arxiv.org/html/2606.13949#bib.bib29))) relies on static PII categories, which can remove task-critical context or miss sensitive non-PII attributes (e.g., political preference) (Kim et al., [2024](https://arxiv.org/html/2606.13949#bib.bib16); Garza et al., [2025](https://arxiv.org/html/2606.13949#bib.bib18)). *Differential Privacy (DP)* introduces stochastic perturbations that can distort the precise semantic cues required for reliable actuation in structured interfaces (Zhang et al., [2025a](https://arxiv.org/html/2606.13949#bib.bib23); Abadi et al., [2016](https://arxiv.org/html/2606.13949#bib.bib6); Yu et al., [2021](https://arxiv.org/html/2606.13949#bib.bib5)). Finally, *cryptographic LLM methods* (Pang et al., [2024](https://arxiv.org/html/2606.13949#bib.bib25); Riasi et al., [2025](https://arxiv.org/html/2606.13949#bib.bib35); Xu et al., [2025](https://arxiv.org/html/2606.13949#bib.bib36); Rathee et al., [2020](https://arxiv.org/html/2606.13949#bib.bib7)) (e.g., Multi-Party Computation (MPC), Fully Homomorphic Encryption (FHE)) safeguard computation but do not prevent sensitive inference from whatever state is disclosed to the remote server, and they often incur latency that is incompatible with real-time agentic control loops.
Distinct from conversational safeguards for user prompts (Ngong et al., [2025](https://arxiv.org/html/2606.13949#bib.bib20); Zhou et al., [2025](https://arxiv.org/html/2606.13949#bib.bib21)), meaningful privacy for autonomous agents requires minimizing the *structured observation* before it is disclosed to the inference server.
To address this, we propose Minim, a *trusted local broker* that enforces pre-disclosure minimization on the client device. Minim implements a learned structural bottleneck that sanitizes the agent’s observation before it is transmitted to a remote inference server. Unlike prompt-based sanitizers that rely on the agent’s own reasoning, Minim uses a specialized local model to predict two scalar scores for each element in the structured observation: (1) *sensitivity*, which captures inherent information risk, and (2) *task-conditioned necessity*, which captures utility for the current intent. These scores drive a disclosure policy grounded in *Contextual Integrity* (CI) (Nissenbaum, [2004](https://arxiv.org/html/2606.13949#bib.bib1)), restricting disclosure of high-sensitivity content unless it is predicted to be necessary for completing the user’s task.
Contributions. We make three contributions. First, we identify *Semantic Over-Privileged Observation* as a privacy risk in agentic systems that rely on structured observations, and formalize pre-disclosure minimization as learning a CI-compliant structural bottleneck. Second, we propose Minim, which decouples contextual scoring from policy enforcement via a normative policy layer driven by joint sensitivity and necessity prediction, trained with a CI-aware objective that penalizes unnecessary disclosure of high-risk content. Finally, we evaluate Minim on real agent observations instantiated as accessibility trees derived from WebArena (Zhou et al., [2024](https://arxiv.org/html/2606.13949#bib.bib9)) across multiple domains (Shopping, Reddit, and Gmail), demonstrating substantial reduction in task-irrelevant sensitive leakage while preserving task-critical content.
Figure 1: System Architecture of Minim. The trusted local broker intercepts raw structured observations (e.g., accessibility trees) and performs contextual integrity-driven sanitization. By jointly predicting sensitivity and task-conditioned necessity, Minim implements a structural bottleneck that filters or abstracts information before it is transmitted to the remote agent.
## 2 Problem Setup and Preliminaries
We focus on privacy\-preserving perception for autonomous agents that act upon structured observations\. While our framework is generalizable to various hierarchical state descriptions, we instantiate and evaluate our method on accessibility trees, which serve as the primary interface for modern OS\-level agents\.
#### Structured Interface Representation\.
Let $X_t$ denote the raw structured observation at time $t$, modeled as a hierarchical tree of elements $\{e_i\}_{i=1}^{N}$. Each element $e_i$ is characterized by a set of attributes including its semantic role (e.g., button, heading), text content, interaction state (e.g., checked, disabled), and structural properties such as depth and lineage. Unlike pixel-based inputs, this representation provides a discrete and semantically rich basis for agent reasoning.
#### Agent Interaction Model\.
The agent operates in a sequential observation-action loop conditioned on a user task $T$. At each time step $t$, the agent samples an action $a_t$ from its policy $a_t\sim\pi(\cdot\mid Z_t,T)$, where $Z_t$ represents the observation exposed to the remote inference server. In standard deployments, the server receives the full raw state ($Z_t=X_t$). Our goal is to interpose a local transformation function $f$ to produce a sanitized view $Z_t=f(X_t,T)$ which minimizes sensitive information leakage while preserving the utility required to complete $T$. Crucially, $Z_t$ is not strictly a subset of $X_t$, as $f$ may involve both the pruning of elements and the abstraction of specific attributes.
#### Contextual Integrity\.
CI posits that privacy is governed by adherence to appropriate information flow norms rather than absolute secrecy (Nissenbaum, [2004](https://arxiv.org/html/2606.13949#bib.bib1)). These norms are characterized by four parameters: *contexts*, *actors* (senders and receivers), *attributes* (types of information), and *transmission principles* (constraints on how information may flow). Recent work has begun to adopt CI as a normative lens for disclosure and social reasoning in LLM-based assistants and dialogue agents (Lan et al., [2025](https://arxiv.org/html/2606.13949#bib.bib14); Mireshghallah et al., [2023](https://arxiv.org/html/2606.13949#bib.bib13); Tan et al., [2026](https://arxiv.org/html/2606.13949#bib.bib11)). Our work extends this perspective to the observation channel of agents by operationalizing task-conditioned necessity over structured elements and their attributes prior to disclosure.
In our setting, the user’s active task $T$ establishes the *context*, while the client device and remote inference server serve as the *actors*. The data fields within $X_t$ correspond to the *attributes*. We target a *transmission principle* of task-conditioned necessity, which requires that information with high disclosure risk be shared only when it is essential for completing the current task. We operationalize this principle by learning predictive signals for sensitivity ($s_i$) and necessity ($n_{i,T}$) to drive the disclosure policy.
#### Threat Model\.
We assume Minim is deployed locally as a trusted broker that intercepts the agent’s raw observation and outputs a sanitized view before any data is transmitted to a remote inference server. Our threat model excludes compromise of the local environment that hosts Minim, including OS-level malware and privileged attackers. The remote agent is honest-but-curious: it executes the task according to the protocol given the disclosed observation and task description, while passively harvesting task-irrelevant content (e.g., background windows) for downstream profiling or inference. We do not consider active adversaries that attempt to manipulate the agent via prompt injection or malicious content, as such attacks are orthogonal to our focus on pre-disclosure observation minimization.
## 3 Related Work
Prior defenses for LLM systems largely focus on unstructured language, identifying sensitive content through pattern-based detectors or user intervention. User-centric tools such as PrivWeb (Zhang et al., [2025b](https://arxiv.org/html/2606.13949#bib.bib24)) adopt human-in-the-loop filtering, which does not scale well to autonomous workflows. Automated approaches, including local gateways and intermediaries (e.g., AirGapAgent (Bagdasarian et al., [2024](https://arxiv.org/html/2606.13949#bib.bib31)), Portcullis (Zhan et al., [2025](https://arxiv.org/html/2606.13949#bib.bib15)), and Papillon (Li et al., [2025](https://arxiv.org/html/2606.13949#bib.bib38))) and PII redaction benchmarks and evaluations (Shen et al., [2025](https://arxiv.org/html/2606.13949#bib.bib26); Sun et al., [2025](https://arxiv.org/html/2606.13949#bib.bib22)), primarily target conversational prompts, logs, or free-form text. In parallel, recent work has begun to quantify privacy leakage and data minimization objectives in agentic settings (Zharmagambetov et al., [2025](https://arxiv.org/html/2606.13949#bib.bib32); Wang et al., [2025b](https://arxiv.org/html/2606.13949#bib.bib12)). While these approaches provide important baselines for reducing disclosure, they do not directly address agents whose observations are *structured* and action-critical. In such settings, naive redaction or perturbation can break the structural and semantic cues needed for reliable decision making and actuation. Our work focuses on pre-disclosure minimization for structured observations under task-conditioned norms, and we instantiate and evaluate the approach on accessibility trees.
Complementary work systematizes broader agent security and privacy threat surfaces (Yu et al., [2025](https://arxiv.org/html/2606.13949#bib.bib39); He et al., [2025](https://arxiv.org/html/2606.13949#bib.bib41)) and analyzes privacy risks from agent memory and logging (Wang et al., [2025a](https://arxiv.org/html/2606.13949#bib.bib37); Liu et al., [2026](https://arxiv.org/html/2606.13949#bib.bib40)).
## 4 Methodology
### 4.1 Overview
We study pre\-disclosure minimization for agent observations under CI\. The core tension is that a remote agent needs some UI context to act reliably, but sending the full structured state \(e\.g\., an accessibility tree\) often reveals sensitive, task\-irrelevant information\. Our goal is to minimize sensitive disclosure while still transmitting the minimum necessary information required by the current task context\.
Figure [1](https://arxiv.org/html/2606.13949#S1.F1) illustrates this process through a concrete example. A user shares a screenshot of their desktop with a remote agentic AI system to help reply to an email. The raw interface state includes both the email content and unrelated UI context, such as a system notification displaying a verification code. Although this code is sensitive, it is irrelevant to the email-reply task. Under our scheme, a trusted local broker intercepts the raw observation before it is sent to the remote agent. Conditioned on the task, the broker assigns each UI element an inherent sensitivity score and a task-conditioned necessity score. A well-trained model identifies the verification code as *highly sensitive but unnecessary* and abstracts it (replacing the raw value with a placeholder while preserving its structural role), while retaining the email text and other interface elements required for drafting a reply. The resulting sanitized view, which preserves task-critical context while suppressing irrelevant sensitive information, is the only observation transmitted to the remote agent.
### 4.2 Training for Context-Conditioned Scoring
Our training phase consists of two stages: constructing a training dataset that captures task\-conditioned privacy norms, and learning a scoring model whose objective operationalizes CI\.
We build a data pipeline that produces labeled accessibility trees across domains and tasks (e.g., replying to an email while an unrelated system notification displays a verification code). ① We first collect the raw accessibility tree $X_t=\{e_i\}_{i=1}^{N}$ from the current UI state, where each node $e_i$ contains its text attributes, accessibility role, and structural metadata. ② We then pair $X_t$ with the user task $T$, which defines the active context of the interaction. ③ Given the scene and task, we assign each node a task-conditioned necessity score $n_{i,T}$, measuring how essential the node is for completing $T$, with the highest scores given to elements directly required to execute the user intent. ④ In parallel, we annotate each node with a task-independent sensitivity score $s_i$, capturing the inherent disclosure risk of the information it carries, and explicitly identifying content that is sensitive yet unnecessary under the current task. This process yields fully labeled accessibility trees in which every node is associated with a score pair $(s_i,n_{i,T})$. Using these labeled trees, we can train a model to act as a scoring module. Given node-level features and the task description, the model predicts a pair of scores $(\hat{s}_i,\hat{n}_{i,T})$ for each node.
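The labeled trees produced by this pipeline can be pictured as simple records. The following is a minimal sketch only: the class names, field names, and helper methods are hypothetical and are not taken from the paper, and the default cut-offs of 5.0 and 1.0 are borrowed from the thresholds reported later in the experiments purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One accessibility-tree element e_i with its annotated score pair."""
    node_id: int
    role: str           # accessibility role, e.g. "button" or "heading"
    text: str           # text attribute of the element
    depth: int          # structural metadata (depth in the tree)
    sensitivity: float  # s_i in [0, 10], task-independent
    necessity: float    # n_{i,T} in [0, 10], conditioned on the task T


@dataclass
class LabeledTree:
    """A (tree, task) pair in which every node carries (s_i, n_{i,T})."""
    task: str
    nodes: List[Node] = field(default_factory=list)

    def score_pairs(self) -> List[Tuple[float, float]]:
        # Supervision targets for the scoring model, one pair per node.
        return [(n.sensitivity, n.necessity) for n in self.nodes]

    def sensitive_but_unnecessary(self, s_min: float = 5.0,
                                  n_max: float = 1.0) -> List[Node]:
        # Nodes that are risky to disclose yet not needed for this task.
        # The cut-offs are illustrative assumptions, not the paper's rubric.
        return [n for n in self.nodes
                if n.sensitivity >= s_min and n.necessity <= n_max]
```

For the email-reply scene used as the running example, a notification node holding a verification code would carry a high `sensitivity` and a low `necessity`, and would therefore be returned by `sensitive_but_unnecessary`.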
The training objective is designed to approximate the CI goal of minimizing inappropriate disclosure while preserving task success\. Formally, we aim to minimize the disclosure of contextually inappropriate information subject to a task utility constraint:
$$\min_{Z_t}\; I(X_t^{\mathrm{out}}; Z_t \mid T)\quad\text{s.t.}\quad\text{TaskSuccess}(Z_t,T)\geq\tau, \tag{1}$$

where $X_t^{\mathrm{out}}$ denotes flows that violate CI norms. Because $X_t^{\mathrm{out}}$ is latent, we optimize this objective through the two learned proxies, sensitivity $s$ and necessity $n$.
We supervise sensitivity prediction using an absolute error loss:
$$\mathcal{L}_{\text{sens}}=\sum_{i}|s_i-\hat{s}_i|, \tag{2}$$

where $s_i$ denotes the ground-truth sensitivity of node $e_i$ and $\hat{s}_i$ is the model’s prediction. This term encourages accurate estimation of inherent disclosure risk independent of task context.
To encode CI weights, we define a necessity loss that is weighted by sensitivity risk:
$$\mathcal{L}_{\text{nec}}=\sum_{i}\left(1+\lambda\cdot\frac{s_i}{10}\cdot\left(1-\frac{n_{i,T}}{10}\right)\right)\cdot|n_{i,T}-\hat{n}_{i,T}|, \tag{3}$$

where $n_{i,T}$ is the ground-truth necessity of node $e_i$ under task $T$, $\hat{n}_{i,T}$ is the predicted necessity, and $\lambda$ controls the strength of CI weighting. The multiplicative factor increases the loss when a node is highly sensitive ($s_i$ large) yet task-irrelevant ($n_{i,T}$ small), which is the regime where inappropriate disclosure is most harmful. The final objective combines both terms: $\mathcal{L}=\mathcal{L}_{\text{sens}}+\alpha\mathcal{L}_{\text{nec}}$, where $\alpha$ balances sensitivity accuracy and task-conditioned utility. By penalizing necessity errors more strongly on high-risk, low-utility nodes, this objective aligns optimization with the goal of suppressing task-irrelevant sensitive information while preserving context required for task completion.
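To make the weighting concrete, the sketch below evaluates the combined objective on plain Python lists. It is an illustration of Eqs. (2) and (3) rather than the authors' training code: the function names are hypothetical, and the defaults `lam=1.0` and `alpha=1.0` are placeholders, since the values of $\lambda$ and $\alpha$ are not stated in this section.

```python
def ci_weight(s, n, lam):
    """Per-node weight 1 + lam * (s/10) * (1 - n/10) from Eq. (3).

    The weight is largest for highly sensitive, task-irrelevant nodes
    (s near 10, n near 0) and falls back to 1 when s = 0 or n = 10.
    """
    return 1.0 + lam * (s / 10.0) * (1.0 - n / 10.0)


def ci_aware_loss(s_true, s_pred, n_true, n_pred, lam=1.0, alpha=1.0):
    """L = L_sens + alpha * L_nec over the nodes of one labeled tree.

    lam and alpha are placeholder defaults (assumptions), not paper values.
    """
    # Eq. (2): absolute error on sensitivity.
    l_sens = sum(abs(s - s_hat) for s, s_hat in zip(s_true, s_pred))
    # Eq. (3): absolute error on necessity, up-weighted by ground-truth risk.
    l_nec = sum(ci_weight(s, n, lam) * abs(n - n_hat)
                for s, n, n_hat in zip(s_true, n_true, n_pred))
    return l_sens + alpha * l_nec
```

With `lam=2.0`, a node annotated $(s_i, n_{i,T}) = (10, 0)$ receives weight 3, so a necessity error on it costs three times as much as the same error on a node with $s_i = 0$. Note that the weight depends only on ground-truth labels, so it acts as a fixed per-node importance factor during training.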
### 4.3 Task-Conditioned Disclosure at Deployment
We now describe how a trained model is used during deployment. At runtime, a user issues a task $T$ (e.g., replying to an email), which triggers an agentic workflow on a concrete application. As part of this workflow, the application exposes a structured observation in the form of an accessibility tree $X_t=\{e_i\}_{i=1}^{N}$. Before this observation is transmitted to a remote agentic AI system, Minim operates as a local broker that mediates disclosure.
Given the task $T$ and the observed accessibility tree $X_t$, the trained scoring model is applied to each node to predict a pair of scores $(\hat{s}_i,\hat{n}_{i,T})$, where $\hat{s}_i$ estimates the inherent sensitivity of node $e_i$ and $\hat{n}_{i,T}$ estimates its necessity for completing task $T$. These scores are then consumed by a fixed decision procedure that enforces task-conditioned minimization.
Specifically, each node is mapped to one of three actions (R, A, or K) based on thresholded comparisons:

$$\text{Action}(e_i)=\begin{cases}\textsf{R}&\text{if }\hat{n}_{i,T}<\tau_{\text{nec}},\\ \textsf{A}&\text{if }\hat{n}_{i,T}\geq\tau_{\text{nec}}\;\land\;\hat{s}_i\geq\tau_{\text{sens}},\\ \textsf{K}&\text{if }\hat{n}_{i,T}\geq\tau_{\text{nec}}\;\land\;\hat{s}_i<\tau_{\text{sens}}.\end{cases} \tag{4}$$
Nodes predicted as unnecessary are removed \(R\)\. Nodes that are necessary but sensitive are abstracted \(A\), meaning that sensitive attributes are replaced with placeholders while preserving structural roles\. Nodes that are both necessary and low\-risk are kept unchanged \(K\)\.
Applying this procedure to all nodes produces a sanitized accessibility tree $Z_t$ in which disclosure is gated by predicted necessity and attribute fidelity is modulated by predicted sensitivity, jointly enforcing CI at deployment. Algorithm [1](https://arxiv.org/html/2606.13949#alg1) summarizes the full procedure.
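The thresholded rule in Eq. (4) can be sketched as follows. This is an illustrative reading, not a reproduction of Algorithm 1: nodes are represented as flat dictionaries with hypothetical keys (`text`, `n_hat`, `s_hat`), the placeholder string is invented, and the handling of children of removed nodes is deliberately left out because this section does not specify it. The default thresholds match the values reported in the experiments ($\tau_{\mathrm{nec}}=1.0$, $\tau_{\mathrm{sens}}=5.0$).

```python
def choose_action(n_hat, s_hat, tau_nec=1.0, tau_sens=5.0):
    """Map predicted (necessity, sensitivity) to R, A, or K as in Eq. (4)."""
    if n_hat < tau_nec:
        return "R"   # predicted unnecessary: remove
    if s_hat >= tau_sens:
        return "A"   # necessary but sensitive: abstract
    return "K"       # necessary and low-risk: keep unchanged


def sanitize(nodes, tau_nec=1.0, tau_sens=5.0, placeholder="[ABSTRACTED]"):
    """Return the sanitized view Z_t as a new list; inputs are not mutated.

    Each node is a dict with hypothetical keys "text", "n_hat", "s_hat".
    The placeholder string is an assumption made for this sketch.
    """
    sanitized = []
    for node in nodes:
        action = choose_action(node["n_hat"], node["s_hat"], tau_nec, tau_sens)
        if action == "R":
            continue                    # drop the node entirely
        out = dict(node)                # shallow copy keeps role and structure
        if action == "A":
            out["text"] = placeholder   # hide the value, keep the node's role
        sanitized.append(out)
    return sanitized
```

Because necessity is tested first, a node below $\tau_{\mathrm{nec}}$ is removed regardless of its sensitivity; sensitivity only decides between keeping and abstracting among nodes that pass the necessity gate.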
## 5 Experiments
### 5.1 Experimental Setting
Our experiments evaluate whether Minim achieves its intended design goals: reducing task-irrelevant privacy leakage, preserving actionable utility required for task completion, and generalizing across domains and task semantics.
We construct a privacy-augmented corpus from WebArena (Zhou et al., [2024](https://arxiv.org/html/2606.13949#bib.bib9)) spanning *Shopping*, *Reddit*, and *Gmail*. The corpus contains 150 unique accessibility trees paired with 27 task types, yielding 5,403 (tree, task) variants. We inject synthesized sensitive context (e.g., 2FA codes, password prompts, system notifications) during preprocessing. Each UI element receives a task-independent sensitivity score $s_i\in[0,10]$ and a task-conditioned necessity score $n_{i,T}\in[0,10]$. The dataset is split into 4,741 training and 662 test variants (Appendix [C](https://arxiv.org/html/2606.13949#A3)). We report all results on the held-out test set ($N=662$). We evaluate Minim using a scoring model trained with the CI-aware objective (Appendix [B](https://arxiv.org/html/2606.13949#A2)). As this work is among the first to study task-conditioned, pre-disclosure minimization for structured agent observations, we design comparisons within this setting to isolate the effects of our design choices. Specifically, we consider two baseline families: (i) Fixed Policies, which apply alternative disclosure rules given the same Minim-predicted scores (*Full Observation*, *Sensitivity-Only*, *Necessity-Only*, *Random Budget*); and (ii) Prompted LLM Scorers, where open-weight LLMs are prompted to output $(s,n)$ scores using our annotation rubric (Appendix [E](https://arxiv.org/html/2606.13949#A5)).
Sanitization rarely eliminates privacy risk entirely (Shen et al., [2025](https://arxiv.org/html/2606.13949#bib.bib26); Ngong et al., [2025](https://arxiv.org/html/2606.13949#bib.bib20); Garza et al., [2025](https://arxiv.org/html/2606.13949#bib.bib18)); we therefore model A actions as incurring non-zero residual leakage. For a node $i$ with action $a_i\in\{\textsf{K},\textsf{A},\textsf{R}\}$, we define its post-sanitization token contribution $\mathrm{tok}_{\mathrm{post}}(i)$ as $\mathrm{tok}(i)$ if $a_i=\textsf{K}$, $\max(\mathrm{tok}_{\mathrm{ph}},\,p_{\mathrm{abs}}\cdot\mathrm{tok}(i))$ if $a_i=\textsf{A}$, and $0$ if $a_i=\textsf{R}$. We set $\mathrm{tok}_{\mathrm{ph}}=6.0$ and $p_{\mathrm{abs}}=0.05$ for the primary analysis, and report sensitivity to this choice up to $p_{\mathrm{abs}}=0.1$.
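The piecewise definition of the post-sanitization token contribution translates directly into a small function. The sketch below uses the primary-analysis values $\mathrm{tok}_{\mathrm{ph}}=6.0$ and $p_{\mathrm{abs}}=0.05$ as defaults; the function name and the single-letter action encoding are assumptions made for illustration.

```python
def tok_post(tok, action, tok_ph=6.0, p_abs=0.05):
    """Post-sanitization token contribution of one node.

    K keeps the full token count, A is charged the larger of a fixed
    placeholder cost and a fraction of the original, and R contributes 0.
    """
    if action == "K":
        return float(tok)
    if action == "A":
        return max(tok_ph, p_abs * tok)
    if action == "R":
        return 0.0
    raise ValueError(f"unknown action: {action!r}")
```

Under this definition as stated, a 40-token node that is abstracted is charged 6.0 tokens, because the placeholder floor exceeds $0.05 \cdot 40 = 2$. One consequence is that an abstracted node shorter than $\mathrm{tok}_{\mathrm{ph}}$ is charged more than its original length, which makes the residual-leakage estimate conservative for short nodes.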
### 5.2 Evaluation Metrics
#### TCNP \(Task\-Critical Node Preservation\)\.
TCNP measures the fraction of task-critical elements that remain visible after sanitization, capturing preservation of semantic context. For example, in an email-reply task, TCNP reflects whether the email body and relevant headers are retained. Letting $\mathcal{C}(X_t,T)=\{i\mid n_{i,T}\geq\tau_{\mathrm{crit}}\}$ denote task-critical elements, we define $\mathrm{TCNP}(X_t,T)=|\mathcal{C}(X_t,T)\cap Z_t|/|\mathcal{C}(X_t,T)|$. Higher TCNP indicates better retention of task-relevant context.
#### TCNP\-I \(Task\-Critical Node Preservation\-Interactive\)\.
TCNP-I measures whether the sanitized observation preserves the interactive elements required to execute the task, capturing execution viability rather than descriptive completeness. For example, during checkout, TCNP-I reflects whether the *Purchase* button is preserved, regardless of surrounding text. Formally,

$$\mathrm{TCNP\text{-}I}(X_t,T)=\frac{|\{i\in Z_t\mid n_{i,T}\geq\tau_{\mathrm{crit}}\wedge\mathrm{interactive}(i)\}|}{|\{i\in X_t\mid n_{i,T}\geq\tau_{\mathrm{crit}}\wedge\mathrm{interactive}(i)\}|}.$$
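Both preservation metrics are recall-style ratios over the task-critical set, so a single helper can cover them. The sketch below is an assumption-laden illustration: the dictionary keys are hypothetical, the default $\tau_{\mathrm{crit}}=7$ follows the value reported with the main results, and the caller decides which node ids count as present in $Z_t$ (the text does not settle whether abstracted nodes count as visible). Returning `None` for an empty critical set is also a choice made here, not one stated in the paper.

```python
def preservation(nodes, visible_ids, tau_crit=7.0, interactive_only=False):
    """TCNP (default) or TCNP-I (interactive_only=True) for one instance.

    nodes: dicts with hypothetical keys "id", "necessity", "interactive",
           describing the raw observation X_t with ground-truth necessity.
    visible_ids: ids of the nodes the caller treats as present in Z_t.
    """
    critical = {
        n["id"] for n in nodes
        if n["necessity"] >= tau_crit
        and (n["interactive"] or not interactive_only)
    }
    if not critical:
        return None  # metric undefined when there are no critical nodes
    return len(critical & set(visible_ids)) / len(critical)
```

For a page with three task-critical nodes, one of which is an interactive button, a sanitized view that keeps the button and one other critical node yields TCNP of 2/3 and TCNP-I of 1.0.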
#### TISL \(Task\-Irrelevant Sensitive Leakage\)\.
TISL measures how much sensitive information is disclosed when it is not required for the task. It captures privacy risk by weighting exposed task-irrelevant elements by their sensitivity. For example, revealing a 2FA code during an unrelated browsing task increases TISL. We compute per-instance leakage as $\mathrm{TISL}(Z_t;X_t,T)=\sum_{i\in Z_t}\mathbb{1}[n_{i,T}\leq\tau_{\mathrm{irr}}]\cdot s_i\cdot\mathrm{tok}_{\mathrm{post}}(i)$, and report a dataset-normalized version relative to full observation to ensure comparability across methods.
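A sketch of the leakage computation follows. The per-instance sum mirrors the formula above with $\tau_{\mathrm{irr}}=1$, $\mathrm{tok}_{\mathrm{ph}}=6.0$, and $p_{\mathrm{abs}}=0.05$ as defaults. The normalization is only one plausible reading: it divides total leakage over the dataset by the total leakage of the keep-everything policy, whereas the exact normalization is not spelled out in this section. All names and dictionary keys are hypothetical.

```python
def tok_post(tok, action, tok_ph=6.0, p_abs=0.05):
    """Token contribution after sanitization (K: full, A: residual, R: 0)."""
    if action == "K":
        return float(tok)
    if action == "A":
        return max(tok_ph, p_abs * tok)
    return 0.0


def tisl(nodes, actions, tau_irr=1.0):
    """Per-instance leakage: sum of s_i * tok_post(i) over irrelevant nodes.

    nodes: dicts with hypothetical keys "necessity", "sensitivity", "tok"
           holding ground-truth annotations and the raw token count.
    actions: one of "K", "A", "R" per node, as chosen by the policy.
    """
    total = 0.0
    for node, action in zip(nodes, actions):
        if node["necessity"] <= tau_irr:   # only task-irrelevant nodes count
            total += node["sensitivity"] * tok_post(node["tok"], action)
    return total


def normalized_tisl(instances, tau_irr=1.0):
    """Dataset-level leakage relative to Full Observation (assumed reading).

    instances: iterable of (nodes, actions) pairs.
    """
    leaked, full = 0.0, 0.0
    for nodes, actions in instances:
        leaked += tisl(nodes, actions, tau_irr)
        full += tisl(nodes, ["K"] * len(nodes), tau_irr)  # keep-all baseline
    return leaked / full if full > 0 else 0.0
```

Under this reading, Full Observation scores exactly 1.0 and a policy that removes every task-irrelevant node scores 0.0, so reported values can be read as the fraction of baseline leakage that remains.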
### 5.3 Main Results
Utility vs. Privacy Trade-offs. Table [1](https://arxiv.org/html/2606.13949#S5.T1) demonstrates that Minim achieves a highly efficient operating point: TCNP-I 0.9931, TCNP 0.9491, and TISL 0.101 (10.1% of Full Observation). This efficiency reflects the intrinsic sparsity of accessibility trees: empirical studies indicate that modern web pages are increasingly element-heavy, with an average of over 1,100 HTML elements per page (WebAIM, [2024](https://arxiv.org/html/2606.13949#bib.bib30)), the vast majority of which serve as structural wrappers (nested “containers”), decorative items, or redundant links rather than actionable affordances. By training explicitly to distinguish these sparse *actionable affordances* (buttons, inputs) from *passive context*, Minim can aggressively prune the latter without impeding the agent’s ability to act.
Figure 2: Privacy–utility trade-off under necessity threshold sweep. TCNP (utility) and TISL (leakage) are plotted as a function of $\tau_{\mathrm{nec}}$ swept over $[0,10]$ in steps of 0.5. The dashed vertical line marks the selected operating point $\tau_{\mathrm{nec}}=1.0$, which achieves TCNP $\approx 0.949$ and TISL $\approx 0.101$. Increasing $\tau_{\mathrm{nec}}$ reduces leakage monotonically but degrades task utility rapidly for $\tau_{\mathrm{nec}}\gtrsim 3$.

Table 1: Main Results on Augmented WebArena ($N=662$). Baselines apply a top-$K$ budget-matching policy retaining 20% of nodes; Minim uses its adaptive K/A/R policy with default thresholds. (TCNP: Context Recall; TCNP-I: Actionable Recall.)

Minim’s adaptive policy dominates single-score baselines by resolving the utility–privacy tension in Table [1](https://arxiv.org/html/2606.13949#S5.T1) (TISL is normalized to *Full Observation*). Sensitivity-Only yields TCNP 0.0401 and TISL 0.3799: retaining the 20% least-sensitive nodes produces predominantly task-irrelevant content, while task-critical nodes with any sensitivity are discarded. Conversely, Necessity-Only achieves TCNP 0.9445 but incurs TISL 0.2032 by disclosing sensitive content whenever it is predicted to be useful. Random Budget fails both objectives, confirming that effective minimization requires semantic understanding rather than uniform pruning. By jointly modeling both dimensions, Minim matches or exceeds Necessity-Only in utility (TCNP 0.9491, TCNP-I 0.9931) while halving TISL (0.1010 vs. 0.2032). This adaptive behavior, applying A to high-necessity, high-sensitivity conflict nodes and R to privacy risks and clutter (Figure [3](https://arxiv.org/html/2606.13949#S5.F3)), explains its superior trade-off.
Figure 3: Policy Decision Logic. Distribution of inferred actions ($\textsf{K},\textsf{A},\textsf{R}$) across semantic categories. The model learns to apply A (orange) to nodes in the “Conflict” zone (High Necessity, High Sensitivity) while applying R (red) to privacy risks and clutter. This adaptive behavior explains the performance gap over baseline policies.

We select the necessity threshold $\tau_{\mathrm{nec}}$ via a validation sweep, balancing utility and leakage: $\tau_{\mathrm{nec}}=\arg\max_{\tau}\left(\mathrm{TCNP}(\tau)-0.5\cdot\mathrm{TISL}(\tau)\right)$. Inference uses the selected $\tau_{\mathrm{nec}}=1.0$ (Figure [2](https://arxiv.org/html/2606.13949#S5.F2)). Necessity annotations take values in $\{0,5,10\}$; we use $\tau_{\mathrm{crit}}=7$ and $\tau_{\mathrm{irr}}=1$. The sensitivity threshold is fixed at $\tau_{\mathrm{sens}}=5.0$.
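The threshold selection rule is a one-dimensional grid search. The sketch below assumes a caller-supplied `evaluate(tau)` callback (hypothetical) that returns the validation TCNP and normalized TISL at a given threshold; the grid of $[0,10]$ in steps of 0.5 follows the sweep described in the Figure 2 caption, and breaking ties in favor of the smallest threshold is an assumption of this sketch.

```python
def select_tau_nec(evaluate, weight=0.5, lo=0.0, hi=10.0, step=0.5):
    """Pick tau maximizing TCNP(tau) - weight * TISL(tau) on a grid.

    evaluate: hypothetical callback mapping tau -> (tcnp, tisl), both
              measured on validation data by the caller.
    Returns (best_tau, best_score); ties go to the smallest tau.
    """
    n_steps = int(round((hi - lo) / step))
    best_tau, best_score = None, float("-inf")
    for k in range(n_steps + 1):
        tau = lo + k * step            # avoids accumulating float error
        tcnp, tisl = evaluate(tau)
        score = tcnp - weight * tisl
        if score > best_score:         # strict '>' keeps the first maximum
            best_tau, best_score = tau, score
    return best_tau, best_score
```

The weight of 0.5 means one point of TCNP is traded against two points of normalized TISL, so the selected threshold favors utility over leakage reduction at the margin.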
### 5.4 Comparison with Prompted LLM Scorer Baselines
Table 2: Baseline Comparison: LLMs vs Minim. Zero-shot privacy performance of leading open-weights LLMs operating on text-only accessibility trees, compared to our specialized approach. TCNP denotes Context Recall; TCNP-I denotes Actionable Recall.

#### Configuration.
We evaluate a representative set of open\-weight LLMs as prompted scorers under realistic deployment constraints for privacy\-preserving agent pipelines \(Table[2](https://arxiv.org/html/2606.13949#S5.T2)\)\. We restrict this comparison to open\-source models to reflect the deployability requirements of privacy\-sensitive settings, where downstream customization, local integration, or system\-level enforcement is necessary and closed\-source models are impractical\. We prioritize models with moderate parameter counts, as local deployment is most feasible for lightweight architectures suited to on\-device or edge inference\. We additionally include Llama\-3\.3\-70B\-Instruct as a large\-scale reference to assess whether scaling confers measurable benefits, and find only marginal gains despite an order\-of\-magnitude increase in model size\.
Table [2](https://arxiv.org/html/2606.13949#S5.T2) shows that prompted LLM scorers achieve TCNP-I 0.985–0.996 and TCNP 0.951–0.966, consistent with the broad affordance and content recognition capacity of general instruction-tuned models. However, privacy behavior varies widely across models: TISL spans 0.194–0.312, roughly two to three times Minim’s 0.101. This gap indicates that general instruction tuning does not enforce task-conditioned minimization; these models retain task-irrelevant sensitive content alongside task-useful elements. For example, Llama-3-8B-Instruct retains 27.6% of nodes yet incurs TISL 0.206, whereas Minim retains only 12.0% of nodes and achieves TISL 0.101 without any LLM inference at test time. Moreover, larger models tend to retain more sensitive content and incur higher leakage (e.g., Llama-3.3-70B-Instruct TISL 0.312 vs. Llama-3-8B-Instruct 0.206), suggesting that model scale alone does not reliably enforce minimization in the absence of explicit contextual integrity constraints.
#### Interpreting Contextual Divergence\.
We identify a fundamental privacy–efficiency gap between the two paradigms. LLM baselines achieve marginally higher TCNP (0.951–0.966 vs. 0.9491) by operating as passive readers, retaining approximately 26–35% of nodes and thereby incorporating task-irrelevant content that inflates both recall and leakage simultaneously. By contrast, Minim functions as an active filter, retaining only 12% of nodes while matching or exceeding the TCNP-I of most LLM baselines (0.9931, surpassing five of seven and within 0.003 of the strongest) at substantially lower leakage. For instance, for a transactional task such as “Checkout,” Minim isolates the target interaction element (e.g., the *Purchase* button) while pruning surrounding product descriptions, reviews, and footer links, thereby preserving comparable TCNP-I without exposing unnecessary sensitive context. Across all evaluated methods, Minim achieves the lowest TISL (0.101) and Keep% (12.0), demonstrating that task-conditioned contextual integrity training yields a more favorable privacy–efficiency trade-off than general instruction tuning, at the cost of a small reduction in TCNP.
#### Discussion on Data Sparsity\.
Although the average Node Exposure Ratio (Keep Rate) is approximately 12%, this reflects the sparsity of task-relevant elements in real-world accessibility trees. Typical DOM trees (often exceeding 500 nodes) are dominated by generic containers (e.g., `<div>`, `group`) and layout spacers with limited semantic contribution. A Keep Rate of $\approx 12\%$ therefore corresponds to filtering structural noise while retaining the functional elements (buttons, inputs, and task-relevant text) that support execution.
#### Safety Validation\.
Beyond aggregate metrics, we inspect the handling of specific high-risk injected elements. In diagnostic checks across the test set, Minim suppresses $>99.9\%$ of injected 2FA codes, passwords, and Slack notifications, consistent with the low TISL values reported above.
### 5\.5Task\-Semantic Robustness
To verify that Minim supports active agents beyond simple reading tasks, we stratify performance by task semantics in Table [3](https://arxiv.org/html/2606.13949#S5.T3). The model maintains high utility across all categories, achieving perfect TCNP and TCNP-I (1.0000) on both *Transactional* (e.g., “Add to Cart”, “Vote”) and *Sensitive/Admin* (e.g., 2FA, Password) tasks, confirming that the minimization policy fully preserves essential affordances for complex interactive and security-critical flows. Table [3](https://arxiv.org/html/2606.13949#S5.T3) further reveals the model’s adaptive compression strategy. For Sensitive/Admin tasks, the model learns to retain a larger fraction of nodes (Keep Rate: 15.84% vs. 8.66% for Informational), preserving sufficient context to support complex security flows without breaking task execution. This behavior emerges from the CI-aware objective, which encourages tighter minimization for low-stakes browsing while maintaining adequate structural coverage for high-stakes administrative actions.
Table 3:Adaptive Minimization Strategy\.The model alters its compression rate and abstraction use based on task semantics, maintaining high utility across all categories\. \(TCNP: Context Recall; TCNP\-I: Actionable Recall\.\)
### 5\.6Contextual Integrity Evaluation
We evaluate whether Minim instantiates CI as an explicit information-flow rule rather than a purely conceptual framing. In Minim, the task-conditioned necessity score specifies the *context*, the sensitivity score specifies the *information type*, and the transmission principle is implemented by the $\textsf{K}/\textsf{A}/\textsf{R}$ policy. The central CI requirement in this setting is that disclosure is blocked whenever the task context renders it unnecessary, even if the information is otherwise benign.
#### CI Compliance and Normative Bounds\.
We benchmark against an Oracle policy that strictly removes all irrelevant nodes ($n_{i,T}\leq 1$), achieving 0.0% TISL by definition. On the full test set ($N=662$), Minim achieves TISL 0.101 (89.9% suppression relative to Full Observation) while maintaining a $\sim$12% retention rate ($\textsf{K}+\textsf{A}$). This closely approximates the Oracle’s strict “Need-to-Know” gate.
#### Counterfactual Context Stress Test\.
To test context dependence directly, we perform a counterfactual intervention: for a fixed subset of test instances, we keep the accessibility tree and node content unchanged but replace the goal input with a low-commitment *browse mode* context. Under this intervention, predicted necessity drops substantially for previously critical action nodes (e.g., checkout or submit controls), while irrelevant high-sensitivity content remains stably suppressed (predominantly R), indicating norm robustness under context shifts. For high-sensitivity elements that become only marginally useful under *browse mode*, the policy tends to favor A over R, preserving structural affordances without exposing raw values, consistent with proportional transmission. However, because R is triggered by a fixed necessity threshold ($\tau_{\mathrm{nec}}=1.0$), removals are less elastic than the underlying necessity scores under counterfactual contexts. A natural extension is dynamic thresholding or context-adaptive calibration to further tighten transmission norms in low-utility settings.
### 5\.7Ablation Studies
#### Value of Adaptive Abstraction\.
A key design choice in Minim is the A action, which preserves structural affordances (e.g., buttons and inputs) while masking sensitive values. Table [4](https://arxiv.org/html/2606.13949#S5.T4) reports two counterfactual variants: *Conservative (Privacy-First)* maps all A decisions to R, simulating strict redaction that blocks sensitive elements entirely and reducing TCNP-I by 2.81 absolute points (0.9931 $\to$ 0.9650) from losing critically mediated inputs (e.g., 2FA fields and address forms); *Radical (Utility-First)* maps all A decisions to K, recovering full actionability at a marginal leakage cost (TISL 0.1014 vs. 0.1010).
Table 4: Value of Abstraction. Variants where the A action is forced to R (strict redaction) or K (full exposure). Residual leakage ($p_{\text{abs}}=0.05$) is included in TISL for A actions. (TCNP: Context Recall; TCNP-I: Actionable Recall.)

Crucially, comparing Minim with the Conservative baseline reveals the value of adaptive abstraction. Even when accounting for a 5% residual re-identification risk ($p_{\text{abs}}=0.05$) in the privacy metric, Minim incurs only a marginal increase in leakage (0.0989 $\to$ 0.1010) compared to the strict removal policy. In exchange for this small privacy cost, it recovers 2.81 absolute points in TCNP-I (0.9650 $\to$ 0.9931) and 4.41 in TCNP (0.9050 $\to$ 0.9491). This confirms that binary controls (Allow/Deny) are insufficient for agentic privacy: strictly denying sensitive nodes breaks task execution, while unconditionally allowing them elevates privacy risk; semantic abstraction provides the necessary intermediate option.
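The two ablation variants amount to rewriting every A decision after the policy has run. A minimal sketch, assuming decisions are stored as single-letter labels (the label strings and the function name are hypothetical):

```python
# Hypothetical labels for the ternary policy decisions.
KEEP, ABSTRACT, REMOVE = "K", "A", "R"


def remap_abstraction(actions, variant):
    """Rewrite every A decision according to an ablation variant.

    variant -- "conservative" (A -> R) or "radical" (A -> K)
    """
    target = {"conservative": REMOVE, "radical": KEEP}[variant]
    return [target if a == ABSTRACT else a for a in actions]
```

Because only A decisions change, K and R nodes are identical across the three policies, so any metric difference between the variants is attributable to the abstracted nodes alone.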
#### Threshold Robustness\.
As shown in Figure [2](https://arxiv.org/html/2606.13949#S5.F2), sweeping $\tau_{\mathrm{nec}}$ traces a stable privacy–utility frontier. Performance remains consistent across $\tau_{\mathrm{nec}}\in[0.8,1.5]$, suggesting that deployment does not require precise per-domain calibration.
## 6Discussion
#### Decoupling scoring from enforcement\.
Minim cleanly separates *contextual scoring* (predicting sensitivity and task-conditioned necessity) from *policy enforcement* (mapping scores to $\textsf{K}/\textsf{A}/\textsf{R}$ decisions and rendering). This separation improves interpretability and portability: privacy–utility trade-offs can be adjusted by changing thresholds or the policy layer without retraining the scorer. Moreover, the realization of A is orthogonal to the scoring model. While we use standardized masking (e.g., `[REDACTED]`) for controlled evaluation, A can be implemented with richer sanitization mechanisms such as typed placeholders, format-preserving redaction, or learned rewriting modules.
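To make the alternatives for A concrete, the sketch below shows three simple maskers. Only the `[REDACTED]` token comes from the text; the function names, the `kind` argument, and the character-class masking rule are illustrative assumptions, and the last function is a plain shape-preserving mask rather than a cryptographic format-preserving scheme.

```python
import re


def mask_redacted(value):
    # Standardized masking, as used in the controlled evaluation.
    return "[REDACTED]"


def mask_typed(value, kind):
    # Typed placeholder: reveals the kind of field, not its content.
    return f"[{kind.upper()}]"


def mask_format_preserving(value):
    # Keep length and punctuation; replace digits with 0 and ASCII letters with x.
    masked = re.sub(r"\d", "0", value)
    return re.sub(r"[A-Za-z]", "x", masked)
```

Each masker has the same call shape up to the optional type hint, so the policy layer could swap one for another without touching the scorer.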
#### Generalization and task coverage\.
Our WebArena instantiation uses 27 curated task templates to support consistent node-level necessity annotation. This should be viewed as a controlled specification for evaluation rather than open-ended task coverage. Importantly, Minim is not a task classifier; the task description serves as a conditioning signal for element-level necessity prediction, and we evaluate generalization across unseen pages, injected contexts, and domains. Extending to additional intents is primarily a data and specification problem: new tasks can be incorporated by collecting additional $(X,T)$ pairs under the same scoring rubric, without changing the framework. More broadly, although we evaluate on accessibility trees, the pre-disclosure minimization principle is representation-agnostic and applies to other structured observation channels (e.g., DOM trees, scene graphs, and tool schemas) where action-critical structure must be preserved while task-irrelevant sensitive content is withheld.
#### Limitations\.
Our threat model targets step\-wise minimization against honest\-but\-curious remote inference\. We do not address active adversaries \(e\.g\., prompt injection\) or cumulative privacy loss across long\-horizon episodes\. In addition, scaling to highly specialized enterprise interfaces or full desktop environments may require more efficient tree encoders and caching to handle larger structured observations under tight latency budgets\.
## 7Conclusion
We introduced Minim, a framework that addresses Semantic Over-Privileged Observation in agentic systems. Grounded in Contextual Integrity, Minim learns to distinguish task-critical affordances from task-irrelevant sensitive content and enforces a need-to-know disclosure rule over accessibility trees. Our dual-score formulation substantially reduces unnecessary sensitive exposure while preserving actionable utility, showing that strong privacy guarantees can coexist with reliable agent actuation. Although demonstrated on accessibility trees, the CI-driven structural scoring mechanism is representation-agnostic and extends to other structured observation formats such as DOM/VDOM, scene graphs, and tool-use schemas. Future work will study temporal privacy accounting over long-horizon interaction and multi-agent settings.
## Software and Data
We release the Minim codebase in our [GitHub repository](https://github.com/yyyyhx/MINIM). The repository includes the implementation, preprocessing and evaluation scripts, and instructions for reproducing our experiments using the processed WebArena-derived structured observations and released sample data.
## Acknowledgements
This work was supported in part by the Office of Naval Research under grants N00014\-24\-1\-2730 and N000142412663, Army Research Office under grant W911NF\-24\-1\-0155, and the National Science Foundation under grants 2433904, 2247560, 2154929, 2235232, 2238635, 2154930, 2403758 and 2312447\.
## Impact Statement
This work aims to improve privacy for autonomous agentic systems by minimizing structured observations before they are sent to remote inference. If adopted, it could reduce inadvertent exposure of sensitive information during routine agent interactions and enable safer deployment in privacy-sensitive settings. Potential risks include misuse to conceal information from oversight and uneven protection due to dataset or rubric bias; we view Minim as a privacy-mitigation component rather than a complete security solution, and encourage future work on adversarial robustness and broader evaluation.
## References
- M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang (2016). Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pp. 308–318.
- Anthropic (2024). Model context protocol specification. [https://modelcontextprotocol.io/](https://modelcontextprotocol.io/).
- Apple (2024). Introducing apple intelligence: the personal intelligence system for iphone, ipad, and mac. [Link](https://www.apple.com/newsroom/2024/06/introducing-apple-intelligence-for-iphone-ipad-and-mac/).
- E. Bagdasarian, R. Yi, S. Ghalebikesabi, P. Kairouz, M. Gruteser, S. Oh, B. Balle, and D. Ramage (2024). Airgapagent: protecting privacy-conscious conversational agents. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pp. 3868–3882.
- N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson, et al. (2021). Extracting training data from large language models. In 30th USENIX security symposium (USENIX Security 21), pp. 2633–2650.
- X. Deng, Y. Gu, B. Zheng, S. Chen, S. Stevens, B. Wang, H. Sun, and Y. Su (2023). Mind2Web: towards a generalist agent for the web. In Advances in Neural Information Processing Systems (NeurIPS).
- L. Garza, A. Kotal, A. Piplai, L. Elluri, P. Das, and A. Chadha (2025). PRvL: quantifying the capabilities and risks of large language models for pii redaction. arXiv preprint arXiv:2508.05545.
- T. Green, M. Gubri, H. Puerto, S. Yun, and S. J. Oh (2025). Leaky thoughts: large reasoning models are not private thinkers. arXiv preprint arXiv:2506.15674.
- F. He, T. Zhu, D. Ye, B. Liu, W. Zhou, and P. S. Yu (2025). The emerged security and privacy of llm agent: a survey with case studies. ACM Comput. Surv. 58(6). ISSN 0360-0300. [Link](https://doi.org/10.1145/3773080).
- H. He, W. Yao, K. Ma, W. Yu, Y. Dai, H. Zhang, Z. Lan, and D. Yu (2024). Webvoyager: building an end-to-end web agent with large multimodal models. arXiv preprint arXiv:2401.13919.
- W. Kim, S. Hahm, and J. Lee (2024). Generalizing clinical de-identification models by privacy-safe data augmentation using gpt-4. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 21204–21218.
- J. Y. Koh, R. Lo, L. Jang, V. Duvvur, M. Lim, P. Huang, G. Neubig, S. Zhou, R. Salakhutdinov, and D. Fried (2024). Visualwebarena: evaluating multimodal agents on realistic visual web tasks. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 881–905.
- G. Lan, H. A. Inan, S. Abdelnabi, J. Kulkarni, L. Wutschitz, R. Shokri, C. G. Brinton, and R. Sim (2025). Contextual integrity in llms via reasoning and reinforcement learning. arXiv preprint arXiv:2506.04245.
- S. Li, V. C. Raghuram, O. Khattab, J. Hirschberg, and Z. Yu (2025). Papillon: privacy preservation from internet-based and local language model ensembles. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pp. 3371–3390.
- J. Liu, D. Cao, Y. Wei, T. Su, Y. Liang, Y. Dong, Y. Liu, Y. Zhao, and X. Hu (2026). Topology matters: measuring memory leakage in multi-agent llms. arXiv preprint arXiv:2512.04668. [Link](https://arxiv.org/abs/2512.04668).
- Y. Liu, Y. Jia, J. Jia, and N. Z. Gong (2025). Evaluating llm-based personal information extraction and countermeasures. In 34th USENIX Security Symposium (USENIX Security 25), pp. 1669–1688.
- Microsoft (2024a). Presidio: data protection and anonymization sdk. GitHub. [https://github.com/microsoft/presidio](https://github.com/microsoft/presidio). Accessed: 2026-01-27.
- Microsoft (2024b). Unlock a new era of innovation with copilot+ pcs. [Link](https://blogs.microsoft.com/blog/2024/05/20/introducing-copilot-pcs/).
- N. Mireshghallah, H. Kim, X. Zhou, Y. Tsvetkov, M. Sap, R. Shokri, and Y. Choi (2023). Can llms keep a secret? testing privacy implications of language models via contextual integrity theory. arXiv preprint arXiv:2310.17884.
- I. C. Ngong, S. R. Kadhe, H. Wang, K. Murugesan, J. D. Weisz, A. Dhurandhar, and K. N. Ramamurthy (2025). Protecting users from themselves: safeguarding contextual privacy in interactions with conversational agents. In Findings of the Association for Computational Linguistics: ACL 2025, pp. 26196–26220.
- D. Nguyen, J. Chen, Y. Wang, G. Wu, N. Park, Z. Hu, H. Lyu, J. Wu, R. Aponte, Y. Xia, et al. (2025). Gui agents: a survey. In Findings of the Association for Computational Linguistics: ACL 2025, pp. 22522–22538.
- H. Nissenbaum (2004). Privacy as contextual integrity. Wash. L. Rev. 79, pp. 119.
- Q. Pang, J. Zhu, H. Möllering, W. Zheng, and T. Schneider (2024). Bolt: privacy-preserving, accurate and efficient inference for transformers. In 2024 IEEE Symposium on Security and Privacy (SP), pp. 4753–4771.
- D. Rathee, M. Rathee, N. Kumar, N. Chandran, D. Gupta, A. Rastogi, and R. Sharma (2020). Cryptflow2: practical 2-party secure inference. In Proceedings of the 2020 ACM SIGSAC conference on computer and communications security, pp. 325–342.
- A. Riasi, H. Wang, R. Behnia, V. Vo, and T. Hoang (2025). Zero-knowledge ai inference with high precision. In Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, pp. 1053–1067.
- Y. Shao, T. Li, W. Shi, Y. Liu, and D. Yang (2024). Privacylens: evaluating privacy norm awareness of language models in action. Advances in Neural Information Processing Systems 37, pp. 89373–89407.
- H. Shen, Z. Gu, H. Hong, and W. Han (2025). PII-bench: evaluating query-aware privacy protection systems. arXiv preprint arXiv:2502.18545.
- J. Sun, B. Suleiman, I. Ullah, and I. Razzak (2025). Effectiveness of privacy-preserving algorithms in llms: a benchmark and empirical analysis. In Proceedings of the ACM on Web Conference 2025, pp. 5224–5233.
- X. Tan, X. Wang, Q. Liu, X. Xu, X. Yuan, L. Zhu, and W. Zhang (2026). PrivGemo: privacy-preserving dual-tower graph retrieval for empowering llm reasoning with memory augmentation. arXiv preprint arXiv:2601.08739.
- B. Wang, W. He, S. Zeng, Z. Xiang, Y. Xing, J. Tang, and P. He (2025a). Unveiling privacy risks in llm agent memory. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 25241–25260.
- S. Wang, F. Yu, X. Liu, X. Qin, J. Zhang, Q. Lin, D. Zhang, and S. Rajmohan (2025b). Privacy in action: towards realistic privacy mitigation and evaluation for LLM-powered agents. In Findings of the Association for Computational Linguistics: EMNLP 2025.
- WebAIM (2024). The webaim million: the 2024 annual accessibility analysis of the top 1,000,000 home pages. [https://webaim.org/projects/million/](https://webaim.org/projects/million/).
- T. Xu, W. Lu, J. Yu, Y. Chen, C. Lin, R. Wang, and M. Li (2025). Breaking the layer barrier: remodeling private transformer inference with hybrid ckks and mpc. In 34th USENIX Security Symposium (USENIX Security 25), pp. 2653–2672.
- D. Yu, S. Naik, A. Backurs, S. Gopi, H. A. Inan, G. Kamath, J. Kulkarni, Y. T. Lee, A. Manoel, L. Wutschitz, et al. (2021). Differentially private fine-tuning of language models. arXiv preprint arXiv:2110.06500.
- M. Yu, F. Meng, X. Zhou, S. Wang, J. Mao, L. Pan, T. Chen, K. Wang, X. Li, Y. Zhang, B. An, and Q. Wen (2025). A survey on trustworthy llm agents: threats and countermeasures. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, KDD ’25, New York, NY, USA, pp. 6216–6226. ISBN 9798400714542. [Link](https://doi.org/10.1145/3711896.3736561).
- J. Zhan, W. Zhang, Z. Zhang, H. Xue, Y. Zhang, and Y. Wu (2025). Portcullis: a scalable and verifiable privacy gateway for third-party llm inference. In Proceedings of the AAAI Conference on Artificial Intelligence.
- J. Zhang, Z. Tian, M. Zhu, Y. Song, T. Sheng, S. Yang, Q. Du, X. Liu, M. Huang, and D. Li (2025a). DYNTEXT: semantic-aware dynamic text sanitization for privacy-preserving llm inference. In Findings of the Association for Computational Linguistics: ACL 2025, pp. 20243–20255.
- S. Zhang, Y. Jiang, R. Ma, Y. Yang, M. Xu, Z. Huang, X. Yi, and H. Li (2025b). PrivWeb: unobtrusive and content-aware privacy protection for web agents. arXiv preprint arXiv:2509.11939.
- A. Zharmagambetov, C. Guo, I. Evtimov, M. Pavlova, R. Salakhutdinov, and K. Chaudhuri (2025). Agentdam: privacy leakage evaluation for autonomous web agents. arXiv preprint arXiv:2503.09780.
- J. Zhou, N. Mireshghallah, and T. Li (2025). Operationalizing data minimization for privacy-preserving llm prompting. arXiv preprint arXiv:2510.03662.
- S. Zhou, F. F. Xu, H. Zhu, X. Zhou, R. Lo, A. Sridhar, X. Cheng, Y. Bisk, D. Fried, U. Alon, et al. (2024). WebArena: a realistic web environment for building autonomous agents. In International Conference on Learning Representations (ICLR).
## Appendix ATheoretical Formulation
We formalize the semantic minimization problem through the lens of Information Theory, providing the motivation for our CI-aware loss function (Section [4.2](https://arxiv.org/html/2606.13949#S4.SS2)).
### A\.1Problem Statement
Let $X_t$ denote the full accessibility tree at time $t$, $T$ the user task, and $A^{*}$ the optimal agent action. The goal of Minim is to learn a sanitization function $f(X_t,T)\to Z_t$ that produces a minimal observation $Z_t$.

We frame this as a *Constrained Contextual Bottleneck* problem. We seek to maximize the mutual information between the sanitized view $Z_t$ and the optimal action $A^{*}$, subject to a constraint on the leakage of sensitive attributes $S$:

$$\max_{f}\ I(Z_t;A^{*}\mid T)\quad\text{s.t.}\quad I(Z_t;S\mid T_{\text{irr}})\leq\epsilon \tag{5}$$

where $T_{\text{irr}}$ denotes contexts where $S$ is task-irrelevant ($n_{i,T}=0$).
### A\.2Loss Function Derivation
Our dual\-score framework serves as a tractable proxy for this objective:
- Necessity $\hat{n}$ approximates *action relevance* $P(\text{Actionable}\mid X_t,T)$, acting as a gate for $I(Z_t;A^{*})$.
- Sensitivity $\hat{s}$ approximates the inherent risk $P(\text{Sensitive}\mid X_t)$, quantifying the cost in the leakage constraint.
The CI\-weighted necessity loss \(Eq\.[6](https://arxiv.org/html/2606.13949#A1.E6)\) implements the Lagrangian relaxation of this constrained optimization problem\.
$$\mathcal{L}_{\text{nec}}^{\text{CI}}=\sum_{i,t}\underbrace{\left(1+\lambda\cdot\frac{s_i}{10}\cdot\left(1-\frac{n_{i,T}}{10}\right)\right)}_{\text{Lagrange Multiplier Proxy}}\cdot\left|\hat{n}_{i,T}-n_{i,T}\right| \tag{6}$$

The term $\lambda\cdot\frac{s_i}{10}\cdot\left(1-\frac{n_{i,T}}{10}\right)$ acts as a dynamic penalty coefficient that is largest when privacy risk is high ($s\to 10$) and utility is low ($n\to 0$), and vanishes when either the node is non-sensitive ($s=0$) or fully necessary ($n=10$). This aligns the gradient descent direction with the minimization of $I(Z_t;S\mid T_{\text{irr}})$.
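A short numerical sketch of Eq. 6 may help in reading the weight. It assumes scores on the 0–10 scale used by the annotations and plain Python lists of per-node values; the function names are hypothetical, and the default `lam=1.0` matches the $\lambda$ reported in the training setup.

```python
def ci_weight(s, n, lam=1.0):
    """Per-node weight 1 + lam * (s/10) * (1 - n/10) from Eq. 6."""
    return 1.0 + lam * (s / 10.0) * (1.0 - n / 10.0)


def ci_necessity_loss(preds, labels, sens, lam=1.0):
    """CI-weighted L1 necessity loss, summed over nodes.

    preds  -- predicted necessity scores
    labels -- annotated necessity scores
    sens   -- annotated sensitivity scores
    """
    return sum(ci_weight(s, n, lam) * abs(p - n)
               for p, n, s in zip(preds, labels, sens))
```

With $\lambda=1$, an irrelevant and highly sensitive node ($s=10$, $n=0$) receives weight 2.0, while a fully necessary node ($n=10$) or a non-sensitive node ($s=0$) receives the baseline weight 1.0. The same necessity error of 2 points therefore costs twice as much on the high-risk node.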
## Appendix BImplementation Details
### B\.1Policy Logic
The Normative Policy Layer maps predicted scores $(\hat{s}_i,\hat{n}_{i,T})$ to disclosure actions via thresholds $\tau_{\text{nec}}=1.0$ and $\tau_{\text{sens}}=5.0$. This threshold-based logic separates the *estimation* of risk (model output) from the *decision* of acceptable risk (policy), allowing administrators to adjust $\tau$ post-deployment without retraining.
Algorithm 1: Minim Client-Side Sanitization

Input: Accessibility tree $X_t=\{e_i\}_{i=1}^{N}$, task $T$

Parameters: Model $\theta$, thresholds $\tau_{\text{nec}},\tau_{\text{sens}}$

Output: Sanitized tree $Z_t$

    Z_t ← []
    H ← Encoder_θ(X_t, T)
    for i = 1 to N do
        ŝ_i, n̂_{i,T} ← Heads_θ(H_i)
        if n̂_{i,T} < τ_nec then
            continue                      {Apply R}
        else if ŝ_i ≥ τ_sens then
            e′_i ← Abstract(e_i)
            Z_t.append(e′_i)              {Apply A}
        else
            Z_t.append(e_i)               {Apply K}
        end if
    end for
    return Z_t
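The policy loop of Algorithm 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the released implementation: the `Node` dataclass, the `abstract` helper, and the use of precomputed `(s_hat, n_hat)` pairs in place of the encoder and prediction heads are all hypothetical stand-ins.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    # Hypothetical minimal UI element: a role plus its visible text.
    role: str
    text: str


def abstract(node):
    # Placeholder abstraction: keep the role (affordance), mask the value.
    return replace(node, text="[REDACTED]")


def sanitize(nodes, scores, tau_nec=1.0, tau_sens=5.0):
    """Apply the K/A/R policy to one observation.

    nodes  -- list of Node
    scores -- list of (s_hat, n_hat) pairs, one per node, standing in
              for the outputs of the model heads
    """
    kept = []
    for node, (s_hat, n_hat) in zip(nodes, scores):
        if n_hat < tau_nec:
            continue                      # R: remove task-irrelevant node
        if s_hat >= tau_sens:
            kept.append(abstract(node))   # A: keep structure, mask value
        else:
            kept.append(node)             # K: keep unchanged
    return kept
```

Note that the necessity check comes first, so a node below $\tau_{\text{nec}}$ is removed regardless of its sensitivity; sensitivity only decides between A and K for nodes that pass the necessity gate.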
### B\.2Model Details
We use a locally deployable Graph Attention Network v2 (GATv2) over the accessibility tree. The raw tree is converted into a graph in which each UI element is a node with a 512-d feature vector, constructed from a 384-d MiniLM text embedding together with UI attributes and structural features. These node features and the tree edges form the input to the GATv2 model. The backbone consists of 3 GATv2 layers (hidden size 256, 4 heads) capturing topological dependencies across UI elements. Outputs are passed to two MLP heads ($256\to 128\to 11$) for sensitivity and task-conditioned necessity prediction.
### B\.3Training Setup
Table [5](https://arxiv.org/html/2606.13949#A2.T5) outlines the specific hyperparameters used for training. We use $\alpha=1.0$ and $\lambda=1.0$ to balance the multi-task objective.
Table 5:Training Hyperparameters\.Fixed values for reproducibility\.
## Appendix CDataset Components and Statistics
### C\.1Data Composition
We construct a privacy-augmented corpus from WebArena (Zhou et al., [2024](https://arxiv.org/html/2606.13949#bib.bib9)), spanning Shopping, Reddit, and Gmail. To simulate realistic risks, we inject synthetic sensitive content (e.g., 2FA codes, emails) into standard accessibility trees (Table [6](https://arxiv.org/html/2606.13949#A3.T6)).
Table 6:Injected sensitive content categories\.
### C\.2Statistical Characteristics
Table [7](https://arxiv.org/html/2606.13949#A3.T7) summarizes the dataset scale and the distribution of privacy conflicts (nodes that are both Sensitive and Irrelevant).

Table 7: Dataset Overview. Top: Global statistics. Bottom: Conflict analysis of training nodes ($N\approx 5.4$M).

| Metric | Value |
|---|---|
| Unique Trees | 150 |
| Task Types | 27 |
| Total Variants | 5,403 |
| Training Variants | 4,741 |
| Test Variants | 662 |

| Condition | Count | % |
|---|---|---|
| Irrelevant (Nec $\leq$ 1) | 4.6M | 85.9% |
| High Sens (Sens $\geq$ 5) | 164k | 3.0% |
| Risk (Irr $\land$ Sens) | 115k | 2.1% |

Table 8: Top Interface Roles and Average Scores. Common UI elements exhibit distinct necessity/sensitivity profiles.

![[Uncaptioned image]](https://arxiv.org/html/2606.13949v1/x3.png)
Figure 4: Joint distribution of Necessity vs\. Sensitivity scores\. Empty bins reflect discrete annotation rubrics\.
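The conflict categories reported in Table 7 can be recomputed from per-node labels with a few lines of code. The sketch below uses the thresholds stated in the table (Irrelevant: necessity ≤ 1; High Sens: sensitivity ≥ 5; Risk: both at once). The function name `conflict_stats` and the input format, a list of `(necessity, sensitivity)` integer pairs, are assumptions for illustration, and the toy data is not drawn from the dataset.

```python
def conflict_stats(nodes, nec_max=1, sens_min=5):
    """Count irrelevant, high-sensitivity, and risk (both) nodes.

    `nodes` is a list of (necessity, sensitivity) integer pairs on the 0-10 scale.
    Returns {condition: (count, percent_of_all_nodes)}.
    """
    total = len(nodes)
    irrelevant = sum(1 for nec, _ in nodes if nec <= nec_max)
    high_sens = sum(1 for _, sens in nodes if sens >= sens_min)
    # A "risk" node is sensitive content that the task does not need.
    risk = sum(1 for nec, sens in nodes if nec <= nec_max and sens >= sens_min)

    def pct(count):
        return 100.0 * count / total if total else 0.0

    return {
        "irrelevant": (irrelevant, pct(irrelevant)),
        "high_sens": (high_sens, pct(high_sens)),
        "risk": (risk, pct(risk)),
    }

# Toy labels: (necessity, sensitivity)
toy_nodes = [(0, 0), (1, 9), (8, 7), (0, 2), (10, 1)]
stats = conflict_stats(toy_nodes)
```

In the toy data only the node `(1, 9)` is both irrelevant and highly sensitive, which is exactly the kind of content the minimization policy is meant to remove.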
Table 9: Comprehensive Task Suite\. The evaluation dataset covers 27 distinct task categories across three domains, spanning informational, transactional, and diagnostic scenarios\.
## Appendix D Additional Empirical Results
We provide a domain\-level breakdown of model performance to scrutinize reliability across different web environments\.
Table 10: Domain\-Specific Performance\. Breakdown of context recall \(TCNP\) and privacy \(NormTISL\) on the held\-out test set\. Reddit exhibits higher residual leakage due to the semantic ambiguity of user\-generated content\.

As shown in Table [10](https://arxiv.org/html/2606.13949#A4.T10), TISL profiles differ across domains, while context recall \(TCNP\) remains consistently high \(>93%\) in all three\. Reddit exhibits the highest residual leakage \(13\.50%\), consistent with the semantic ambiguity of user\-generated content, where the boundary between “public discussion” and “private identifier” is less structurally distinct\. Shopping and Gmail achieve substantially lower leakage \(0\.85% and 1\.88%, respectively\), reflecting the model’s ability to reliably remove task\-irrelevant sensitive content on structured task pages and email interfaces\.
## Appendix E Baseline Evaluation Prompts
To ensure reproducibility, we provide the exact system prompts used to query the open\-weight LLM baselines \(Qwen3\-8B\-Instruct, Nemotron\-Nano\-9B, GPT\-OSS\-20B, Llama\-3\.3\-70B\-Instruct, Mistral\-7B\-v0\.3, Llama\-3\-8B\-Instruct, and Gemma\-3N\-E4B\) for the zero\-shot scoring evaluation\. These prompts implement the Contextual Integrity\-aligned scoring rubric described in Section [4](https://arxiv.org/html/2606.13949#S4)\.
### SYSTEM PROMPT: TASK\-CONDITIONED NECESSITY SCORING
You are a strict scoring judge for accessibility\-tree nodes, focusing ONLY on TASK\-CONDITIONED NECESSITY\.
Your task:
For each provided node, assign ONE integer necessity score:
\- necessity\_score n in \{0,1,2,3,4,5,6,7,8,9,10\}
Scores MUST be integers\. Do NOT use ranges or decimals\.
Use ONLY the provided node fields and local context\. Do NOT assume hidden UI\.
Return JSON only\. No markdown\. No code fences\. No extra keys\.
Output schema \(strict\):
\{
"results": \[
\{
"nodeId": "string",
"necessity\_score": 0\-10 integer,
"necessity\_reason": \["short\_reason", "\.\.\."\],
"confidence": 0\.0\-1\.0 number or null
\}
\]
\}
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
TASK\-CONDITIONED NECESSITY SCORING RUBRIC \(n\) \-\- INTEGER ANCHORS
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
n=0
\- Completely irrelevant to the task; removing has zero impact\.
\- Examples: ads, footer/legal boilerplate, unrelated feed items, decorative regions\.
n=1
\- Pure noise or redundant UI; visible but provides no help even indirectly\.
\- Examples: repeated icons, empty text nodes, spacer elements\.
n=2
\- Extremely weak contextual presence; could be noticed but does not aid task completion\.
\- Examples: generic counts/metrics for unrelated content, non\-task badges\.
n=3
\- Marginally related context; slight orientation benefit but task unaffected without it\.
\- Examples: post metadata, author labels when the task is navigation\.
n=4
\- Supportive but optional; helps orientation or confidence but not required\.
\- Examples: section headings, explanatory labels, non\-essential descriptions\.
n=5
\- Moderately helpful; improves efficiency/understanding but alternatives exist\.
\- Examples: secondary navigation links, optional filters/sort controls\.
n=6
\- Strongly helpful; reduces friction or steps, but not strictly required\.
\- Examples: sidebar community list when search/navigation alternatives exist\.
n=7
\- Important task enabler; most users rely on this to complete the task\.
\- Examples: primary navigation menu for the task, main forum/subreddit list entry points\.
n=8
\- Core task component; task becomes very difficult without it\.
\- Examples: the specific forum/subreddit link needed to enter a forum\.
n=9
\- Near\-critical; task almost certainly fails without it\.
\- Examples: the only visible control to switch/select forums in the provided context\.
n=10
\- Absolutely required; task is impossible without it given the provided context\.
\- Examples: sole entry point to forum navigation or the only actionable control required\.
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
GENERAL RULES
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\- Judge necessity strictly in the context of the given task\.
\- If uncertain between two adjacent scores, choose the LOWER one and explain briefly why\.
\- Reasons should be short tags/phrases \(<=8 items\), grounded in node/context fields\.
\- Baseline scores may be incorrect; judge independently\.
### SYSTEM PROMPT: SENSITIVITY SCORING
You are a strict scoring judge for accessibility\-tree nodes, focusing ONLY on SENSITIVITY \(privacy/secrets/PII\)\.
Your task:
For each provided node, assign ONE integer sensitivity score:
\- sensitivity\_score s in \{0,1,2,3,4,5,6,7,8,9,10\}
Output schema \(strict\):
\{
"results": \[
\{
"nodeId": "string",
"sensitivity\_score": 0\-10 integer,
"sensitivity\_reason": \["short\_reason", "\.\.\."\],
"confidence": 0\.0\-1\.0 number or null
\}
\]
\}
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
SENSITIVITY SCORING RUBRIC \(s\) \-\- INTEGER ANCHORS
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
s=0
\- Purely structural or decorative; no content and no identifying information\.
\- Examples: layout containers, separators, unlabeled icons\.
s=1
\- Generic public UI labels/controls; not user\-specific or session\-specific\.
\- Examples: "Home", "Search", "Submit", "Upvote", "Next"\.
s=2
\- Public content with semantic meaning but clearly non\-personal; safe verbatim\.
\- Examples: generic navigation headings, subreddit names, non\-personal post titles\.
s=3
\- Mild contextual info that could be weakly identifying only in aggregate; not tied to an individual\.
\- Examples: generic timestamps, popularity metrics, generic categories\.
s=4
\- Public user\-related identifiers; enables lightweight profiling but not private identification\.
\- Examples: public usernames/handles, public participation labels\.
s=5
\- Behavioral or preference signals tied to a user account; supports profiling across contexts\.
\- Examples: joined/subscribed communities, voting history indicators, personalized recommendations\.
s=6
\- Semi\-sensitive personal attributes \(not secrets\) that may cause discomfort or privacy loss\.
\- Examples: inferred interests, activity summaries, account status indicators\.
s=7
\- Direct personal data \(non\-secret\) that clearly identifies a person\.
\- Examples: real name, email address, phone number\.
s=8
\- Highly sensitive personal data; serious privacy risk if exposed\.
\- Examples: precise home address, private messages, financial identifiers\.
s=9
\- Security\-critical or authentication\-related data; exposure may enable compromise\.
\- Examples: password fields, 2FA/OTP codes, recovery keys\.
s=10
\- Extremely sensitive secrets or irreversible compromise material\.
\- Examples: plaintext passwords, private cryptographic keys, API tokens/secrets\.
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-Similar Articles
Capability Minimization as a Safety Primitive: Risk-Aware Causal Gating for Least-Privilege LLM Agents
This paper proposes Risk-Aware Causal Gating (RACG), a training-free mechanism that applies the principle of least privilege to LLM agent tool exposure, reducing attack surface from prompt injection by only exposing high-risk tools when authorized and causally necessary.
State Contamination in Memory-Augmented LLM Agents
This paper identifies and studies 'memory laundering' in LLM agents, where toxic or adversarial context compressed into memory summaries evades standard toxicity detectors while still influencing future generations. It introduces the sub-threshold propagation gap (SPG) to measure hidden downstream influence and shows that sanitizing toxic state before summarization is more effective than post-hoc cleaning.
Privacy-Preserving Text Sanitization for Distributed Agents Collaboration via Disentangled Representations
This paper introduces DiSan, a privacy-preserving text sanitization framework for distributed agent collaboration. By disentangling source-invariant role content from source-identifying style, DiSan reduces PII exposure 20× while maintaining 83% answer faithfulness on a multi-agent RAG benchmark, outperforming traditional masking approaches.
POLAR-Bench: A Diagnostic Benchmark for Privacy-Utility Trade-offs in LLM Agents
POLAR-Bench is a diagnostic benchmark that evaluates the privacy-utility trade-off in LLM agents by testing their ability to follow privacy policies while being adversarially probed by third-party models. Results show frontier models protect over 99% of protected attributes but smaller open-weight models leak over half, highlighting gaps in intent-following.
MosaicLeaks: Can your research agent keep a secret?
MosaicLeaks introduces a new benchmark for measuring privacy leakage in deep-research AI agents, showing that agents often leak private information through external queries and proposing a training method (PA-DR) to reduce leakage while improving task performance.