When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents

Hugging Face Daily Papers

Summary

This paper investigates over-privileged tool selection in LLM agents, introducing ToolPrivBench to evaluate and mitigate unnecessary use of high-privilege tools. It finds that safety alignment does not ensure least-privilege choices, and proposes a post-training defense that reduces excessive privilege use without sacrificing performance.

As LLM agents increasingly select tools autonomously, their choices among tools with different privileges become safety-relevant. However, prior tool-selection studies focus on safety-agnostic metadata preferences, leaving privilege-sensitive choices underexplored. To address this gap, we study over-privileged tool selection, in which an agent selects or escalates to a higher-privilege tool despite a sufficient lower-privilege alternative. We introduce ToolPrivBench to evaluate whether agents choose higher-privilege tools despite sufficient lower-privilege alternatives, measuring both initial selection and escalation after transient tool failures. Across eight domains and five recurring risk patterns, we find that over-privileged tool selection is common among mainstream LLM agents and is further amplified by transient failures. We further find that general safety alignment does not reliably transfer to least-privilege tool choice, while prompt-level controls provide only limited mitigation under transient failures. We therefore introduce a privilege-aware post-training defense that teaches agents to prefer sufficient lower-privilege tools and escalate only when necessary. Our mitigation experiments show that this defense substantially reduces unnecessary high-privilege tool use while preserving general capabilities.
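The two quantities the abstract describes, over-privileged initial selection and escalation after a transient failure, can be illustrated with a small calculation. The sketch below is not ToolPrivBench's scoring code: the `Episode` record, its field names, and both rate definitions are assumptions made for illustration, and the sample episodes are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Episode:
    """One hypothetical evaluation episode (all field names are illustrative)."""
    chosen_privilege: int          # privilege level of the tool picked first
    minimal_privilege: int         # lowest level that is sufficient for the task
    escalated_after_failure: bool  # moved to a higher level after a transient failure


def over_privilege_rate(episodes: List[Episode]) -> float:
    """Fraction of episodes whose first choice exceeds the sufficient level."""
    if not episodes:
        return 0.0
    over = sum(e.chosen_privilege > e.minimal_privilege for e in episodes)
    return over / len(episodes)


def escalation_rate(episodes: List[Episode]) -> float:
    """Among episodes that started at the sufficient level, the fraction that
    escalated once a transient failure occurred (assumed definition)."""
    eligible = [e for e in episodes if e.chosen_privilege == e.minimal_privilege]
    if not eligible:
        return 0.0
    return sum(e.escalated_after_failure for e in eligible) / len(eligible)


# Invented sample data, only to exercise the two functions.
episodes = [
    Episode(chosen_privilege=1, minimal_privilege=1, escalated_after_failure=False),
    Episode(chosen_privilege=3, minimal_privilege=1, escalated_after_failure=False),
    Episode(chosen_privilege=1, minimal_privilege=1, escalated_after_failure=True),
    Episode(chosen_privilege=2, minimal_privilege=2, escalated_after_failure=False),
]

initial_rate = over_privilege_rate(episodes)
post_failure_rate = escalation_rate(episodes)
print(initial_rate, post_failure_rate)
```

Keeping the two rates separate mirrors the abstract's distinction between what an agent picks first and what it does after a tool fails temporarily.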

Cached at: 06/25/26, 09:11 AM


Source: https://huggingface.co/papers/2606.20023

Abstract

LLM agents frequently select higher-privilege tools unnecessarily, and while safety alignment doesn’t ensure least-privilege choices, a post-training defense can reduce excessive privilege use without sacrificing performance.

As LLM agents increasingly select tools autonomously, their choices among tools with different privileges become safety-relevant. However, prior tool-selection studies focus on safety-agnostic metadata preferences, leaving privilege-sensitive choices underexplored. To address this gap, we study over-privileged tool selection, in which an agent selects or escalates to a higher-privilege tool despite a sufficient lower-privilege alternative. We introduce ToolPrivBench to evaluate whether agents choose higher-privilege tools despite sufficient lower-privilege alternatives, measuring both initial selection and escalation after transient tool failures. Across eight domains and five recurring risk patterns, we find that over-privileged tool selection is common among mainstream LLM agents and is further amplified by transient failures. We further find that general safety alignment does not reliably transfer to least-privilege tool choice, while prompt-level controls provide only limited mitigation under transient failures. We therefore introduce a privilege-aware post-training defense that teaches agents to prefer sufficient lower-privilege tools and escalate only when necessary. Our mitigation experiments show that this defense substantially reduces unnecessary high-privilege tool use while preserving general capabilities.
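The behavior the abstract asks for (prefer a sufficient lower-privilege tool, and escalate only when necessary) can be pictured as a simple runtime policy. The sketch below is a hand-written illustration, not the paper's post-training defense, which changes the model itself. The `Tool` record, the integer privilege scale, the `TransientToolError` type, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


class TransientToolError(Exception):
    """Hypothetical error type for a failure that may succeed on retry."""


@dataclass
class Tool:
    name: str
    privilege: int            # higher number = broader permissions (assumed scale)
    sufficient: bool          # whether this tool can complete the task at all
    run: Callable[[], str]


def run_least_privilege(tools: List[Tool], max_retries: int = 2) -> str:
    """Try sufficient tools from lowest privilege upward, retrying transient
    failures on the current tool before moving to a higher-privilege one."""
    candidates = sorted((t for t in tools if t.sufficient), key=lambda t: t.privilege)
    for tool in candidates:
        for _ in range(max_retries + 1):
            try:
                return tool.run()
            except TransientToolError:
                continue  # retry at the same privilege level first
    raise RuntimeError("no sufficient tool succeeded")


def make_flaky(result: str, failures: int) -> Callable[[], str]:
    """Build a tool body that fails transiently `failures` times, then succeeds."""
    state = {"left": failures}

    def run() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientToolError("temporary outage")
        return result

    return run


tools = [
    Tool("admin_delete", 3, True, lambda: "done via admin_delete"),
    Tool("scoped_delete", 1, True, make_flaky("done via scoped_delete", failures=1)),
]

result = run_least_privilege(tools)
print(result)  # done via scoped_delete
```

Here the low-privilege tool fails once and is retried rather than abandoned, which is the opposite of the failure-triggered escalation the abstract reports.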


Get this paper in your agent:

hf papers read 2606.20023

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

LLM Agents Already Know When to Call Tools -- Even Without Reasoning

Hugging Face Daily Papers

This paper introduces When2Tool, a benchmark to study when LLM agents actually need to call tools, and reveals that models already know tool necessity from hidden states but fail to act. The proposed Probe&Prefill method reduces unnecessary tool calls by 48% with minimal accuracy loss.

FORTIS: Benchmarking Over-Privilege in Agent Skills

Hugging Face Daily Papers

FORTIS benchmarks how LLM agents frequently exceed necessary privileges when selecting skills, showing over-privilege is the norm across ten frontier models and failing under realistic user interactions.