data-exfiltration

#data-exfiltration

How I tricked Claude into leaking your deepest, darkest secrets

Simon Willison's Blog ↗ · 2026-07-15 Cached

A security researcher discovered a vulnerability in Claude's web_fetch tool that allowed data exfiltration by chaining through nested links, compromising user privacy. Anthropic has since fixed the issue.

I tricked Claude into leaking your deepest, darkest secrets

Hacker News Top ↗ · 2026-07-15 Cached

A security researcher demonstrates a method to trick Claude AI into exfiltrating user personal data from its memory system by encoding data in web fetch URLs, exploiting the combination of memory retrieval and web browsing capabilities.

The Memory Heist

Lobsters Hottest ↗ · 2026-07-15 Cached

The author demonstrates a method to trick the Claude assistant into exfiltrating personal data from its memory by exploiting its web browsing capability, though initial attempts were blocked by Anthropic's safeguards.

Inside Ghostcommit: How Malicious PNGs Bypass AI Code Reviewers

Reddit r/artificial ↗ · 2026-07-14

Ghostcommit is a novel supply chain exploit that uses malicious PNG images containing text instructions to bypass AI code reviewers, leading to data exfiltration from developer environments.

Devs shipping AI agents: what does your security testing look like?

Reddit r/artificial ↗ · 2026-07-07

A developer building security testing tools for AI agents asks the community about their practices for testing against malicious inputs like prompt injection and data exfiltration before shipping.

Bench Press: Leaking Text Nodes with CSS

Lobsters Hottest ↗ · 2026-07-05 Cached

The article presents a CSS injection technique that leaks the entire content of an HTML text node using only CSS, demonstrated through a CTF challenge. It details the method and its constraints.

If you give an AI agent your real data and a send button, it will eventually leak. I built a workspace that makes that structurally impossible.

Reddit r/artificial ↗ · 2026-07-01

The author shares an open-source workspace architecture that structurally prevents AI agents from exfiltrating private data by enforcing human-gated outbound actions and isolating the engine from the data repository.

Critical Copilot vulnerability allowed hackers to steal 2FA code from users

Ars Technica ↗ · 2026-06-16 Cached

A critical vulnerability in Microsoft 365 Copilot, dubbed SearchLeak, allowed attackers to steal 2FA codes via parameter-to-prompt injection by exploiting raw HTML rendering before guardrail enforcement. Microsoft has fixed the vulnerability, but the underlying issue of prompt injection remains a challenge.

MIRAGE: A Polarity-Flipping Encoding Subspace in LLM Agents

arXiv cs.CL ↗ · 2026-06-10 Cached

This paper identifies a polarity-flipping encoding subspace in the residual stream of LLM agents that enables real-time detection of covert data exfiltration, achieving AUC=0.918 in injection scenarios and substantially outperforming output-only detectors.

OpenAI Adds Lockdown Mode (3 minute read)

TLDR AI ↗ · 2026-06-08 Cached

OpenAI introduces Lockdown Mode, an optional security setting that limits web browsing and external service access in ChatGPT to reduce data exfiltration risks from prompt injection attacks. It is rolling out to eligible personal and business accounts.

OpenAI Help: Lockdown Mode

Simon Willison's Blog ↗ · 2026-06-05 Cached

OpenAI has launched Lockdown Mode for ChatGPT to prevent data exfiltration from prompt injection attacks by limiting outbound network requests. The feature is rolling out to eligible accounts including Free, Plus, Pro, and self-serve Business.

The first confirmed LLM-agent cyberattack just happened — AI hacked a server, stole AWS creds, and exfiltrated a DB in under 1 hour

Reddit r/AI_Agents ↗ · 2026-06-01

Sysdig researchers documented the first confirmed LLM-agent cyberattack where an AI agent autonomously hacked a server, stole AWS credentials, and exfiltrated a database in under an hour.

ChatGPT for Google Sheets Exfiltrates Workbooks

Hacker News Top ↗ · 2026-05-31 Cached

A security researcher discloses that OpenAI's ChatGPT extension for Google Sheets is vulnerable to indirect prompt injection attacks, allowing attackers to exfiltrate workbooks and execute unauthorized actions despite user settings requiring approval.

Microsoft Copilot Cowork Exfiltrates Files

Simon Willison's Blog ↗ · 2026-05-26 Cached

A security vulnerability in Microsoft Copilot Cowork allows attackers to exfiltrate files by exploiting prompt injection that triggers external image requests, potentially leaking pre-authenticated download links.

Microsoft Copilot Cowork Exfiltrates Files

Hacker News Top ↗ · 2026-05-25 Cached

Researchers at PromptArmor demonstrate that indirect prompt injection can make Microsoft Copilot Cowork exfiltrate files from Microsoft 365, taking advantage of the lack of an approval step for certain actions when the recipient is the active user.

A Network Allow-List Won't Stop Exfiltration

Lobsters Hottest ↗ · 2026-05-24 Cached

Network allow-lists are insufficient to prevent data exfiltration via authorized channels like DNS or allowed endpoints. Canister, a lightweight Linux sandbox, addresses this with a layer-7 egress proxy that performs TLS interception and data-loss prevention.

AI Agent Intelligence tool - Incident debugging, Cost spike detection

Reddit r/AI_Agents ↗ · 2026-05-19

The author is building a tool for AI agent incident debugging and cost spike detection that needs no additional instrumentation and covers issues such as prompt injection, reasoning loops, and data exfiltration. The post asks whether teams running agents in production see this as a pain point worth paying for.

How are people securing vibe-coded agents before they expose customer data?

Reddit r/AI_Agents ↗ · 2026-05-17

A security engineer at a B2B tech company seeks advice on preventing data exfiltration from employee-built AI tools ('vibe-coded' agents) using session-level DLP without forcing an enterprise browser, discussing options like browser extensions and agentless SSE solutions such as Red Access.

Keeping your data safe when an AI agent clicks a link

OpenAI Blog ↗ · 2026-01-28 Cached

OpenAI describes its safeguards against URL-based data exfiltration when AI agents retrieve web content: an independent web index verifies that a URL is publicly known before it is retrieved automatically, so that prompt injection attacks cannot leak sensitive user data.
