PolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents

Hugging Face Daily Papers 06/28/26, 12:00 AM Papers

Summary

PolicyGuard is a sub-agent verifier that enhances LLM agent policy adherence by providing contextual reasoning and conversation-specific feedback across multi-turn interactions, improving PASS4 by 6.0 to 12.0 pp on the tau^2-BENCH airline domain across models from three vendors.

LLM agents handle user requests on behalf of organizations through tool calls and must follow the company policies stated in their system prompts. Prior work approaches this as a safeguarding problem -- external checks that block non-compliant agent actions. We argue that policy adherence is a broader problem: real workflows unfold across many turns, require explicit user confirmation and prerequisite reads, and hinge on the content of the dialogue rather than on any single argument value. Meeting this bar requires (i) full conversation context, (ii) self-reasoning over the policy and the current dialogue, and (iii) conversation-specific remediation that guides the agent's next turn -- three capabilities that prior safeguard work has often underestimated. We introduce POLICYGUARD, a sub-agent verifier that shares the agent's view of the dialogue, reasons over the policy in context, and provides actionable feedback for the agent's next turn. On tau^2-BENCH airline across three vendors (GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Pro) with four trials per setting, POLICYGUARD improves PASS4 by +12.0 / +6.0 / +12.0 pp. Per-call analyses show POLICYGUARD achieves higher policy-violation recall while blocking roughly half as often as argument-level guards.
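The three capabilities listed above (full conversation context, reasoning over policy plus dialogue, and feedback for the next turn) can be illustrated with a minimal sketch. This is not the paper's implementation: the rule-based `toy_verifier` stands in for what the abstract describes as an LLM sub-agent, and every name here (`Turn`, `ToolCall`, `Verdict`, `guarded_step`, the tool names, and the read-before-write table) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Turn:
    role: str      # "user", "agent", "tool", or "verifier"
    content: str


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Verdict:
    allowed: bool
    feedback: Optional[str] = None  # conversation-specific remediation


# Hypothetical policy table: write tool -> read that must precede it.
WRITE_TOOLS = {"cancel_reservation": "get_reservation_details"}


def toy_verifier(policy: str, dialogue: list[Turn], call: ToolCall) -> Verdict:
    """Rule-based stand-in for an LLM verifier; it inspects the whole dialogue.

    `policy` is unused here; an LLM-backed verifier would reason over it.
    """
    if call.name not in WRITE_TOOLS:
        return Verdict(True)
    required_read = WRITE_TOOLS[call.name]
    did_read = any(
        t.role == "tool" and t.content.startswith(required_read) for t in dialogue
    )
    if not did_read:
        return Verdict(False, f"Call {required_read} before {call.name}.")
    user_turns = [t for t in dialogue if t.role == "user"]
    confirmed = bool(user_turns) and user_turns[-1].content.strip().lower().startswith("yes")
    if not confirmed:
        return Verdict(False, f"Ask the user to explicitly confirm before calling {call.name}.")
    return Verdict(True)


def guarded_step(
    policy: str,
    dialogue: list[Turn],
    call: ToolCall,
    verify: Callable[[str, list[Turn], ToolCall], Verdict],
    execute: Callable[[ToolCall], str],
) -> Verdict:
    """Run the call only if the verifier allows it; otherwise record feedback."""
    verdict = verify(policy, dialogue, call)
    if verdict.allowed:
        dialogue.append(Turn("tool", f"{call.name}: {execute(call)}"))
    else:
        # The agent sees this feedback on its next turn instead of a bare block.
        dialogue.append(Turn("verifier", verdict.feedback or ""))
    return verdict
```

In this sketch a blocked call leaves a `verifier` turn in the dialogue, so the agent's next turn can act on the remediation (perform the missing read, or ask for confirmation) rather than retry blindly. An argument-level guard would see only `call.args` and could not tell whether the read or the confirmation had already happened.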

Original Article

Cached at: 06/30/26, 03:33 AM

Source: https://huggingface.co/papers/2606.29225

Abstract

POLICYGUARD is a sub-agent verifier that enhances LLM agent policy adherence by providing contextual reasoning and conversation-specific feedback across multi-turn interactions.

LLM agents handle user requests on behalf of organizations through tool calls and must follow the company policies stated in their system prompts. Prior work approaches this as a safeguarding problem -- external checks that block non-compliant agent actions. We argue that policy adherence is a broader problem: real workflows unfold across many turns, require explicit user confirmation and prerequisite reads, and hinge on the content of the dialogue rather than on any single argument value. Meeting this bar requires (i) full conversation context, (ii) self-reasoning over the policy and the current dialogue, and (iii) conversation-specific remediation that guides the agent’s next turn -- three capabilities that prior safeguard work has often underestimated. We introduce POLICYGUARD, a sub-agent verifier that shares the agent’s view of the dialogue, reasons over the policy in context, and provides actionable feedback for the agent’s next turn. On tau^2-BENCH airline across three vendors (GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Pro) with four trials per setting, POLICYGUARD improves PASS4 by +12.0 / +6.0 / +12.0 pp. Per-call analyses show POLICYGUARD achieves higher policy-violation recall while blocking roughly half as often as argument-level guards.
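The reported quantities can be made concrete with a short sketch. It assumes that PASS4 is the pass^k metric used in the tau-bench line of work with k = 4 (a task counts only if all sampled trials succeed), and that recall and block rate are computed per tool call; the abstract does not spell out either definition, so both are assumptions. The function names and the toy inputs are hypothetical, not data from the paper.

```python
from math import comb


def pass_hat_k(successes_per_task: list[int], n_trials: int, k: int) -> float:
    """Mean over tasks of C(c, k) / C(n, k).

    For one task with c successes in n trials, this is the chance that k
    trials drawn without replacement all succeed. With k == n it reduces to
    the fraction of tasks solved in every trial.
    """
    if not 1 <= k <= n_trials:
        raise ValueError("k must be between 1 and n_trials")
    per_task = [comb(c, k) / comb(n_trials, k) for c in successes_per_task]
    return sum(per_task) / len(per_task)


def recall_and_block_rate(
    is_violation: list[bool], blocked: list[bool]
) -> tuple[float, float]:
    """Per-call violation recall and overall fraction of calls blocked."""
    violations = sum(is_violation)
    caught = sum(v and b for v, b in zip(is_violation, blocked))
    recall = caught / violations if violations else 0.0
    block_rate = sum(blocked) / len(blocked)
    return recall, block_rate


# Made-up example: four tasks, four trials each, success counts per task.
print(pass_hat_k([4, 4, 3, 0], n_trials=4, k=4))  # 0.5
```

Under these assumptions, a task with three successes out of four contributes nothing to PASS4, which is why the metric rewards consistent policy adherence rather than occasional success. Tracking block rate alongside recall separates a verifier that catches more violations from one that simply blocks more calls.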

Similar Articles

PolicyBank: Evolving Policy Understanding for LLM Agents

PropGuard: Safeguarding LLM-MAS via Propagation-Aware Exploration and Remediation

LabGuard: Grounding Natural-Language Laboratory Rules into Runtime Guards for Embodied Laboratory Agents

SingGuard: A Policy-Adaptive Multimodal LLM Guardrail with Dynamic Reasoning

Governance by Construction for Generalist Agents
