Tag
FORTIS is a benchmark that measures how often LLM agents select skills carrying more privileges than the task requires. It shows that over-privilege is the norm across ten frontier models, and that the models fail under realistic user interactions.