agent-security

#agent-security

Agent-Native Immune System: Architecture, Taxonomy, and Engineering

arXiv cs.AI · 12h ago

This paper introduces the Agent-Native Immune System (ANIS), a biologically inspired, endogenous defense architecture embedded directly within the agent's cognitive loop. It proposes a six-layer Immune Tower, a unified taxonomy of Agent Viruses and Vaccines, and the Harness Triad for continual immune learning to address runtime hijacking vulnerabilities in autonomous agents.

How do you keep an audit trail when an agent runs on a human's credentials?

Reddit r/AI_Agents · 5d ago

A discussion of how to maintain audit trails when AI agents operate on human credentials, focusing on the security and accountability concerns this raises.

@wquguru: If you want to trick Fable into doing a security audit, try this. Looks like our AI overlord has a bit of empathy.

X AI KOLs Timeline · 2026-06-13

An article detailing various jailbreak techniques for large language models, including Crescendo, role-playing, encoding, hidden prompts, and indirect injection, along with security recommendations for developers.

How does your agent actually get its API keys?

Reddit r/AI_Agents · 2026-06-12

A developer discusses three common patterns for how coding agents obtain API keys, highlighting that agents can circumvent restrictions by being resourceful, and asks the community about their real-world setups and experiences.

@AiCamila_: Advanced Agent Security Hardening Beyond basic prompt injection defense, Advanced Agent Security includes tool sandboxi…

X AI KOLs Timeline · 2026-06-09

A security expert shares a cheatsheet on advanced agent security hardening, covering tool sandboxing, output validation, data loss prevention, adversarial testing, and runtime policy enforcement, emphasizing continuous security practices for production AI agents.

@seclink: 1. Agent security has evolved from an academic topic to an industry reality: FFmpeg zero-day ($1,000 cost) + Chrome 429 patch + OpenAI Lockdown Mode + OWASP framework — the security supply chain is being reshaped by AI Agents. 2.…

X AI KOLs Following · 2026-06-08

AI agent security has moved from an academic topic to an industry reality, as shown by FFmpeg zero-day vulnerabilities, the Chrome 429 patch, OpenAI Lockdown Mode, and the OWASP framework. Meanwhile, agent payment standards are becoming an infrastructure battleground, with Visa stablecoin settlement competing against traditional card networks.

AI agents are one prompt injection away from doing something you'd never ask them to do. We built a fix.

Reddit r/openclaw · 2026-06-03

PixieBrix launches Agent Browser Shield, a free source-available browser extension that protects AI agents from prompt injection, dark patterns, and context pollution during web browsing.

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Hugging Face Daily Papers · 2026-06-01

SkillHarm is a benchmark for evaluating skill-based attacks across the skill-use lifecycle, revealing high vulnerability (up to 86.3% attack success) in current AI agents and introducing automated attack construction via AutoSkillHarm.

AI agent management tools by governance layer not by feature list

Reddit r/AI_Agents · 2026-05-30

An analysis arguing that most enterprise AI agent security investment goes to model-layer guardrails and observability, leaving critical gaps at the access and protocol layers. It cites a 2026 report finding that 75% of enterprise AI agents remain unsecured because coverage at these two layers is near zero.

What Is an AVE Record and Why CVE Does Not Work for AI Agents?

Reddit r/AI_Agents · 2026-05-25

The article introduces the Agent Vulnerability Enumeration (AVE) record as a new standard designed to address the inadequacies of CVE for AI agent vulnerabilities, covering scoring, detection, and standardization challenges specific to agentic AI.

@wsl8297: The scariest scenario when using Agents is when they treat dangerous commands as normal steps. That's exactly what HOL Guard is designed to address. GitHub: https://github.com/hashgraph-online/hol-guard… Website: https://hol…

X AI KOLs Timeline · 2026-05-23

HOL Guard is an open-source security tool that identifies, intercepts, and audits dangerous commands issued by development agents such as Codex and Claude Code. It supports multiple protection levels and a local approval center to prevent risks like accidental deletion or modification.

@hwchase17: https://x.com/hwchase17/status/2057506580447510889

X AI KOLs Timeline · 2026-05-21

LangSmith introduces an Auth Proxy to secure network access for agent sandboxes, keeping credentials out of the runtime and enforcing explicit network access policies.

Open-sourcing a shell-level security layer for AI agents

Reddit r/AI_Agents · 2026-05-21

The author open-sources a shell-level control layer that blocks dangerous commands, exposes fake secrets, and enforces runtime policies, with the goal of making AI agents safer and more deterministic in developer environments.

Google I/O, Gemini Spark, Antigravity

Simon Willison's Blog · 2026-05-20

Google I/O announced Gemini Spark, a personal AI agent powered by Gemini 3.5 Flash and Antigravity, and the transition of Gemini CLI to the closed-source Antigravity CLI. The article highlights security concerns regarding prompt injection and data handling for agent products.

AI Agent Security - MIT 6.566 guest lecture

Lobsters Hottest · 2026-05-18

A guest lecture at MIT 6.566 on AI agent security, covering system-level threats, prompt injection, and tool-use vulnerabilities, with demonstrations on LLMs such as GPT-5.4 and Qwen 3.5.

The npm/Docker/PyPI supply chain security pattern is repeating with MCP, and we are at the 2015 moment

Reddit r/AI_Agents · 2026-05-17

The article warns that the MCP ecosystem is repeating the same supply chain security pattern seen in npm, Docker, and PyPI, with minimal vetting and growing risks. It highlights that a scan of 500 Smithery servers found 18.8% with security issues and that existing security tooling cannot handle malicious agent instructions, and introduces a new static scanner called bawbel.

AI agent security is a small prayer the model says no. How are you routing models?

Reddit r/AI_Agents · 2026-05-13

The author conducted an experiment on Gmail with AI agents connected via OAuth, sending obfuscated prompt injection emails. Frontier models sometimes caught the attacks, while cheap models silently executed them, revealing that agent security largely depends on model cost and token budget rather than architectural safeguards.

Subagents should not automatically inherit the parent agent’s authority

Reddit r/AI_Agents · 2026-05-11

The article argues that AI subagents should not automatically inherit their parent agent's full permissions, advocating instead for attenuated delegation with explicit scope, tool limits, and audit trails to improve security in multi-agent systems.
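
The attenuated delegation described above could be sketched roughly as follows. This is an illustrative Python sketch, not code from the post: the `Grant` and `Agent` classes and the tool and scope names are all hypothetical, and a real system would also need authenticated identities and tamper-resistant logging.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """Hypothetical capability grant: the tools and scopes an agent may use."""
    tools: frozenset
    scopes: frozenset

    def attenuate(self, tools, scopes):
        # The child grant is the intersection of what was requested and what
        # the parent holds, so delegation can narrow authority but never widen it.
        return Grant(self.tools & frozenset(tools), self.scopes & frozenset(scopes))


@dataclass
class Agent:
    name: str
    grant: Grant
    audit_log: list = field(default_factory=list)

    def spawn(self, name, tools, scopes):
        # Subagents share the parent's audit log and receive an attenuated grant.
        child = Agent(name, self.grant.attenuate(tools, scopes), self.audit_log)
        self.audit_log.append(("delegate", self.name, name, sorted(child.grant.tools)))
        return child

    def call(self, tool):
        # Every attempt is logged, including the ones that are denied.
        allowed = tool in self.grant.tools
        self.audit_log.append(("call", self.name, tool, allowed))
        if not allowed:
            raise PermissionError(f"{self.name} may not use {tool}")
        return f"{tool} ok"


parent = Agent("parent", Grant(frozenset({"read_file", "send_email"}), frozenset({"repo"})))
# The child asks for delete_file, which the parent never held, so it is dropped.
child = parent.spawn("summarizer", tools={"read_file", "delete_file"}, scopes={"repo"})
```

Here `child` ends up holding only `read_file`: it does not inherit `send_email` because that tool was not explicitly delegated, and it cannot gain `delete_file` because the parent never had it.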

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

Hugging Face Daily Papers · 2026-04-16

This paper introduces symbolic guardrails that enforce concrete policies to provide provable safety and security guarantees for domain-specific AI agents without reducing utility, showing 74% of specified policies can be enforced via simple mechanisms.

Keeping your data safe when an AI agent clicks a link

OpenAI Blog · 2026-01-28

OpenAI describes safeguards against URL-based data exfiltration when AI agents retrieve web content. An independent web index is used to verify that a URL is already publicly known before it is retrieved automatically, which prevents prompt injection attacks from leaking sensitive user data through crafted URLs.
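
The general idea of checking a URL against an index of publicly known URLs before fetching it automatically might look like the sketch below. It is not OpenAI's implementation: `PUBLIC_INDEX`, `normalize`, and `may_auto_fetch` are hypothetical names, the index is a hard-coded stand-in, and the normalization rules are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Stand-in for an independently built index of publicly known URLs (hypothetical data).
PUBLIC_INDEX = {
    "https://example.com/docs/intro",
    "https://example.org/",
}


def normalize(url):
    """Lowercase scheme and host; drop port, userinfo, and fragment.

    Path and query are kept verbatim, since that is where injected data would sit.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{host}{path}{query}"


def may_auto_fetch(url):
    """Allow automatic retrieval only for URLs the index has already seen.

    A URL carrying attacker-chosen data in its path or query will not match,
    so the caller can route it to user confirmation instead of fetching it silently.
    """
    return normalize(url) in PUBLIC_INDEX
```

With this stand-in index, `may_auto_fetch("https://Example.com/docs/intro#top")` is allowed, while a URL such as `https://evil.example/log?data=SECRET` is not, because no independent crawl would have seen a URL that embeds a user's private data.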
