Are AI agents creating a new runtime supply-chain attack surface?

Reddit r/AI_Agents News

Summary

Discusses AI agent security as a runtime supply-chain problem beyond prompt injection, highlighting risks from untrusted data, tools, and feedback loops, and questions how developers enforce boundaries.

I’ve been thinking about AI agent security less as a prompt-injection-only problem and more as a runtime supply-chain problem. In many deployed agents, the model is no longer just generating text. It retrieves external data, reads memory, discovers tools, calls APIs, writes files, and sometimes produces outputs that later become future inputs for another agent/session. That creates a different kind of attack surface: 1. Data-side risk: untrusted documents, RAG sources, memory, emails, or web pages can influence the agent’s next actions. 2. Tool-side risk: tool descriptions, schemas, MCP servers, or API behavior can shape what the agent believes it can/should do. 3. Loop risk: an agent’s output can be stored somewhere, retrieved later, and influence future behavior, creating a kind of “viral” feedback loop. The part I find interesting is that many of these failures do not look like a single bad prompt or a single unauthorized tool call. Each step may look locally reasonable, but the end-to-end workflow can still become unsafe. For people building or deploying agents: How are you currently drawing the boundary between trusted instructions, untrusted context, and executable actions? Are you mostly relying on prompt-injection detection / guardrails, or are you enforcing constraints at the runtime/tool boundary?
Original Article

Similar Articles

Designing AI agents to resist prompt injection

OpenAI Blog

OpenAI publishes guidance on designing AI agents resistant to prompt injection attacks, arguing that modern attacks increasingly use social engineering tactics rather than simple string injections, and advocating for system-level defenses that constrain impact rather than relying solely on input filtering.