The Agent Instruction Protocol (AIP) proposes modeling AI agent skills as directed execution graphs with schema-validated YAML specifications, replacing free-form prose instructions. Experiments show AIP compilation raised Claude Sonnet's task reward from 0.60 to 0.71 and pass rate from 53% to 67% across 27 real agent tasks.
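To make the idea of a schema-validated execution graph concrete, here is a minimal sketch. It is not the AIP schema: the field names (`entry`, `steps`, `action`, `next`) and the `validate_skill` helper are hypothetical, and a plain dictionary stands in for a parsed YAML file. The sketch only checks that the entry step exists, that every edge points to a defined step, and that every step is reachable.

```python
from collections import deque

# Hypothetical skill spec, standing in for a parsed YAML document.
# The field names are illustrative assumptions, not the AIP schema.
SKILL = {
    "entry": "fetch",
    "steps": {
        "fetch": {"action": "download the report", "next": ["parse"]},
        "parse": {"action": "extract the tables", "next": ["summarize"]},
        "summarize": {"action": "write the summary", "next": []},
    },
}


def validate_skill(spec):
    """Return a list of problems; an empty list means these checks passed."""
    errors = []
    steps = spec.get("steps", {})
    entry = spec.get("entry")

    if entry not in steps:
        errors.append(f"entry step {entry!r} is not defined")

    for name, step in steps.items():
        # Every step needs a textual action.
        if not isinstance(step.get("action"), str):
            errors.append(f"step {name!r} is missing a string 'action'")
        # Every outgoing edge must point to a defined step.
        for target in step.get("next", []):
            if target not in steps:
                errors.append(f"step {name!r} points to unknown step {target!r}")

    # Breadth-first walk from the entry step to flag unreachable steps.
    if entry in steps:
        seen = {entry}
        queue = deque([entry])
        while queue:
            current = queue.popleft()
            for target in steps[current].get("next", []):
                if target in steps and target not in seen:
                    seen.add(target)
                    queue.append(target)
        for name in steps:
            if name not in seen:
                errors.append(f"step {name!r} is unreachable from the entry step")

    return errors


print(validate_skill(SKILL))
```

A free-form prose instruction offers nothing comparable to check before execution, which is presumably part of the motivation for a structured specification.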
This paper presents GRID, an end-to-end framework for constructing security knowledge graphs from cyber threat intelligence (CTI) articles using LLMs, introducing a task-bank reward training method to improve precision and recall without expensive LLM-as-judge rewards. The approach achieves strong results on a benchmark of 249 CTI articles from five sources.
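As a rough illustration of what precision and recall mean for an extracted knowledge graph, the sketch below scores predicted (subject, relation, object) triples against a reference set by exact match. This is an assumption for illustration only: the summary does not say how GRID matches triples or how its task-bank reward is computed, and the `triple_scores` helper and the placeholder triples are hypothetical.

```python
def triple_scores(predicted, gold):
    """Exact-match precision and recall over (subject, relation, object) triples."""
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Placeholder triples, not taken from any real CTI article.
gold = [
    ("actor_a", "uses", "malware_b"),
    ("malware_b", "exploits", "vuln_c"),
    ("malware_b", "targets", "system_d"),
    ("actor_a", "targets", "sector_e"),
]
predicted = [
    ("actor_a", "uses", "malware_b"),
    ("malware_b", "exploits", "vuln_c"),
    ("actor_a", "uses", "vuln_c"),  # not in the reference set
]

precision, recall = triple_scores(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three predicted triples are correct and two of the four reference triples are recovered, so precision is 2/3 and recall is 1/2.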
This paper introduces Agent-BOM, a unified graph representation for security auditing in LLM-based agentic systems. It addresses the semantic gap in post-hoc auditing by modeling static capabilities and dynamic runtime states to detect complex attack chains like memory poisoning and tool misuse.