ShadowPrompt let any website silently hijack Claude for Chrome


Koi Security researcher Oren Yomtov disclosed ShadowPrompt, a zero-click attack chain in Anthropic's Claude for Chrome extension, which has more than 3 million users. It combined an overly permissive origin allowlist that trusted any *.claude.ai subdomain with a DOM-based cross-site scripting bug in an outdated Arkose Labs CAPTCHA hosted on a-cdn.claude.ai. Merely visiting a malicious page let an attacker inject arbitrary instructions into Claude as if the user had typed them, with no click and no permission prompt - enough to steal a Gmail access token, read Google Drive, export chat history, or send email as the victim. Reported December 26, 2025; Anthropic shipped an extension fix around January 15, 2026, and Arkose Labs patched the XSS on February 19, 2026. Responsibly disclosed; no confirmed exploitation.

Incident Details

Severity: Facepalm
Company: Anthropic
Perpetrator: AI browser assistant
Incident Date:
Blast Radius: More than 3 million Claude for Chrome users exposed to silent, zero-click account takeover and data theft by any website they visited

An assistant that does things for you

The whole point of a browser-based AI assistant is that it can act on your behalf. Claude for Chrome, Anthropic's extension with more than three million users, sits in a sidebar, reads the page you are on, and reaches into the accounts you have connected so it can do useful work: summarize a thread, pull a file, draft a reply. To be useful it needs broad reach. To be safe it needs to be very sure about who is allowed to give it instructions.

ShadowPrompt, disclosed by Koi Security researcher Oren Yomtov, is what happens when the answer to "who is allowed to give it instructions" turns out to be "more or less anyone." It was a zero-click chain: a victim only had to visit a web page the attacker controlled. No click on a prompt, no permission dialog, no malicious file to open. The page did the talking, and Claude listened.

The two flaws that composed into one

ShadowPrompt is honest about its plumbing: at its base sits a classic web bug, a DOM-based cross-site scripting (XSS) flaw. What makes it graveyard-worthy is what that bug was wired into.

Flaw one: an origin allowlist that trusted a wildcard. The extension accepted prompt messages from web pages, but it decided whether to trust a sender by checking the sender's origin against the pattern *.claude.ai. In other words, any subdomain of claude.ai was treated as Anthropic's own trusted property and allowed to send instructions straight into Claude. That is the kind of allowlist that looks fine in a code review and is quietly catastrophic, because a wildcard trusts more than the subdomains you run carefully; it trusts every subdomain under the parent domain, including any that serve third-party vendor code.
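
The failure mode fits in a few lines. What follows is a minimal sketch, not the extension's actual code: the function name isTrustedByWildcard and the regular expression are assumptions invented for illustration, as is the choice to treat the bare claude.ai origin as a match. Only the *.claude.ai pattern and the a-cdn.claude.ai host come from the reports.

```typescript
// Hypothetical sketch of a wildcard origin check; not Anthropic's real code.
// Assumption: the pattern accepts https://claude.ai and any https subdomain of it.
const WILDCARD_ORIGIN = /^https:\/\/([a-z0-9-]+\.)*claude\.ai$/;

function isTrustedByWildcard(origin: string): boolean {
  return WILDCARD_ORIGIN.test(origin);
}

// The first-party app passes, as intended.
console.log(isTrustedByWildcard("https://claude.ai")); // true

// So does a CDN subdomain that serves a vendor's component.
console.log(isTrustedByWildcard("https://a-cdn.claude.ai")); // true

// An unrelated site is rejected, which is why the check looks sound in review.
console.log(isTrustedByWildcard("https://attacker.example")); // false
```

The check does its stated job of keeping strangers out. The problem is that it vouches for every host under the parent domain, whoever wrote the code running there.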

Flaw two: a DOM-based XSS in an outdated CAPTCHA. One of those subdomains, a-cdn.claude.ai, hosted an Arkose Labs CAPTCHA component. Arkose CAPTCHAs are the "prove you are human" widgets used to slow down bots. The version in use contained a DOM-based XSS: it took attacker-influenced input and wrote it into the page without sanitizing it (the reports point to a dangerouslySetInnerHTML-style sink, which is exactly the React pattern whose name is a warning label). DOM-based XSS means the malicious script runs purely in the browser, by manipulating the page's document object model, rather than being reflected off a server. Crucially, once that script ran, it ran in the context of a-cdn.claude.ai - a subdomain that matched the trusted *.claude.ai pattern.
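
The bug class is easy to show in miniature. The sketch below is not taken from the Arkose Labs component; the function names renderUnsafe, escapeHtml, and renderEscaped are hypothetical, and the strings stand in for whatever markup the real widget built. It only illustrates the difference between splicing untrusted text into HTML verbatim and escaping it first.

```typescript
// Hypothetical sketch of an unsanitized HTML sink next to an escaped alternative.

// Unsafe: attacker-influenced text is spliced into markup verbatim.
function renderUnsafe(message: string): string {
  return `<div class="notice">${message}</div>`;
}

// Escape the five HTML-significant characters. "&" must be replaced first
// so the entities produced by the later replacements are not double-escaped.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Safer: the same template, with the untrusted value escaped before interpolation.
function renderEscaped(message: string): string {
  return `<div class="notice">${escapeHtml(message)}</div>`;
}

// A standard harmless test payload.
const payload = '<img src=x onerror="alert(1)">';

console.log(renderUnsafe(payload));  // the <img> tag survives intact
console.log(renderEscaped(payload)); // the tag is reduced to inert text
```

In a real page the simpler fix is usually to assign untrusted text to textContent rather than innerHTML, so the browser never parses it as markup at all.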

Put the two together and you get the chain. The attacker's page loads the vulnerable Arkose component inside a hidden iframe, fires an XSS payload at it via the browser's postMessage mechanism, and gains JavaScript execution in the context of a-cdn.claude.ai. From that trusted origin, the injected script sends a message to the Claude extension carrying any prompt the attacker wants. The extension checks the origin, sees a *.claude.ai match, and accepts it. Claude receives the instruction as if you had typed it yourself.

What the attacker could do once inside

This is where the "it is just an XSS" framing falls apart. A bare XSS on a CDN subdomain is a problem, but a bounded one. What turns ShadowPrompt into full account takeover is the privileged tool access the AI assistant carries around. Claude for Chrome is connected to things. The injected prompt does not have to break out of the browser or escalate operating-system privileges; it just has to ask Claude to use the access Claude already has.

Per Koi's writeup, that meant an attacker could instruct Claude to:

  • Steal the Gmail access token, handing over programmatic access to the victim's email.
  • Read the contents of Google Drive.
  • Export the victim's Claude chat history, which for many people is a sediment layer of business plans, credentials pasted in a hurry, and things they would never email.
  • Send email as the victim, turning the takeover into onward phishing or fraud.

All of that without a single click or a permission prompt. The AI layer is not incidental here. It is the amplifier. The XSS got a foot in the door; the agent's standing permissions are what let the intruder walk through the whole house.

Why this keeps happening

The structural lesson is the same one that shows up across prompt-injection incidents, and it is worth stating plainly: an AI assistant treats instructions and data as one undifferentiated stream, and it acts with whatever permissions it has been handed. So the security of the whole system collapses onto two questions. Can an attacker get text into that stream? And how much can the assistant do once it reads that text?

ShadowPrompt answers the first question with a wildcard allowlist and an unpatched third-party widget, and the second question with a connected Gmail, Drive, and chat archive. Neither of those is exotic. The origin allowlist is the kind of trust boundary that gets drawn loosely because *.claude.ai "is all us." It is not all you - a CDN subdomain serving a vendor's CAPTCHA is a different trust domain than your own application code, even when it shares a parent domain. And the vendor component was simply out of date, which is the most ordinary supply-chain failure there is.

Browser AI assistants raise the stakes on both questions at once. They are designed to read arbitrary web content (so the attack surface for "get text in" is the entire internet) and they are designed to act across your connected accounts (so the blast radius of "what can it do" is your digital life). That combination is precisely the one that demands paranoid origin checks and tightly scoped tool permissions, and it is precisely the one where a convenient wildcard is most tempting.

The response, and the honest caveats

Yomtov reported ShadowPrompt to Anthropic via HackerOne on December 26, 2025. Anthropic confirmed the issue the next day and, per the reports, deployed an extension fix around January 15, 2026 - tightening the origin check so the extension requires an exact match to claude.ai rather than trusting any subdomain. Arkose Labs fixed the underlying CAPTCHA XSS on February 19, 2026. That is a responsive turnaround, and it is worth saying so.
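
An exact-match check of the kind described can be sketched as follows. This is an illustration under stated assumptions, not the shipped fix: the constant TRUSTED_ORIGIN, the function isTrustedExact, and the choice of the https scheme are all hypothetical. The only library feature used is the standard URL class and its origin property.

```typescript
// Hypothetical sketch of an exact-match origin check; names are illustrative.
// Assumption: the single trusted origin is the https apex of claude.ai.
const TRUSTED_ORIGIN = "https://claude.ai";

function isTrustedExact(senderUrl: string): boolean {
  try {
    // URL.origin normalizes scheme, host, and port into one comparable string.
    return new URL(senderUrl).origin === TRUSTED_ORIGIN;
  } catch {
    // Input that does not parse as a URL is never trusted.
    return false;
  }
}

console.log(isTrustedExact("https://claude.ai/chat"));           // true
console.log(isTrustedExact("https://a-cdn.claude.ai/widget"));   // false
console.log(isTrustedExact("http://claude.ai"));                 // false (wrong scheme)
console.log(isTrustedExact("https://claude.ai.attacker.example")); // false
```

Comparing whole origins rather than matching patterns leaves nothing for a sibling subdomain to inherit, which is the entire point of the change.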

Two caveats keep this in the "near-miss" column rather than the "breach" column. First, this was responsibly disclosed; there is no public evidence that ShadowPrompt was exploited against real users before it was patched. The harm here is exposure, not confirmed theft. Second, the chain genuinely did depend on a conventional web vulnerability - the Arkose XSS - so anyone tempted to file this purely under "spooky AI risk" should remember that half the exploit is a bug class older than the assistant it hijacked.

But that is also the point. Old, well-understood web flaws do not stay bounded when you bolt them onto a system that can act with your credentials. The exact-match origin fix is the boring, correct remedy. The broader lesson is the uncomfortable one: a wildcard you trusted out of convenience plus a vendor widget you forgot to update is enough to turn "visit a web page" into "lose your inbox."

Graveyard lesson

ShadowPrompt belongs here because it shows how cheaply a browser AI assistant can be turned against its user when trust boundaries are drawn for convenience. Wildcard allowlists are convenient. Third-party CAPTCHAs are convenient. Connected Gmail and Drive are convenient. Compose them without sharp boundaries and you get a machine that takes instructions from any website and executes them with the victim's full identity.

The fixes are the unglamorous ones. Trust origins by exact match, not by wildcard. Treat every third-party subdomain as a separate, untrusted domain even when it shares your parent name, and keep vendor components patched. And scope an assistant's tool access so that one injected prompt cannot quietly hand over an access token, read a drive, and send mail in a single uninterrupted breath. Convenience is not a security model. It is just the thing attackers count on.
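
What scoped tool access might look like, in the smallest possible form, is sketched below. Everything here is hypothetical: the tool names, the ToolCall shape, and the policy itself are invented for illustration and do not describe how Claude for Chrome gates its tools. The idea is only that a sensitive action should depend on where the instruction came from and on an explicit confirmation, not on the instruction merely having arrived.

```typescript
// Hypothetical policy sketch; tool names and types are invented for illustration.
type InstructionSource = "user_typed" | "web_message";

interface ToolCall {
  tool: string;
  source: InstructionSource; // where the instruction that triggered the call came from
  userConfirmed: boolean;    // whether the user explicitly approved this specific action
}

// Tools whose misuse would expose or act on the user's accounts.
const SENSITIVE_TOOLS: ReadonlyArray<string> = [
  "gmail.read_token",
  "gmail.send",
  "drive.read",
  "chat.export",
];

function isAllowed(call: ToolCall): boolean {
  if (SENSITIVE_TOOLS.indexOf(call.tool) === -1) {
    return true; // low-risk tools run without extra checks
  }
  // Sensitive tools never run on page-originated instructions, and even
  // typed instructions need an explicit confirmation for this action.
  return call.source === "user_typed" && call.userConfirmed;
}

const injected: ToolCall = { tool: "gmail.send", source: "web_message", userConfirmed: false };
const typed: ToolCall = { tool: "gmail.send", source: "user_typed", userConfirmed: true };
const harmless: ToolCall = { tool: "page.summarize", source: "web_message", userConfirmed: false };

console.log(isAllowed(injected)); // false
console.log(isAllowed(typed));    // true
console.log(isAllowed(harmless)); // true
```

A gate like this does not stop an injected prompt from arriving. It limits what that prompt can accomplish on its own, which is the half of the problem an origin check cannot cover.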
