This paper investigates over-privileged tool selection in LLM agents, introducing ToolPrivBench to evaluate and mitigate unnecessary use of high-privilege tools. It finds that safety alignment does not ensure least-privilege choices, and proposes a post-training defense that reduces excessive privilege use without sacrificing performance.
FORTIS benchmarks how often LLM agents exceed the privileges a task requires when selecting skills, showing that over-privilege is the norm across ten frontier models, which also fail under realistic user interactions.