A security firm graded 100 AI agents and found almost all of them carrying the lethal trifecta

There is a phrase that has quietly become the most useful idea in AI security, coined by the developer Simon Willison: the lethal trifecta. The point is simple and ruthless. An AI agent becomes genuinely dangerous when three things are true at once. It has access to private data. It is exposed to untrusted content, things like web pages, emails, or documents it did not write. And it can take outbound actions, meaning it can send data out or do something in the world. Any one of those is fine. All three together is a loaded gun, because the untrusted content can carry hidden instructions that hijack the agent into using its private-data access and its outbound actions against you. In June 2026, the security firm Adversa AI went and measured how many real agents are holding that loaded gun. The answer is almost all of them.
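The three conditions are simple enough to write down as a check. What follows is a minimal sketch, not anything from Willison or Adversa: the `AgentProfile` class, its field names, and the two example agents are all hypothetical, chosen only to show that the danger comes from the conjunction rather than from any single capability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical capability summary for one agent (illustrative names only)."""
    name: str
    reads_private_data: bool
    ingests_untrusted_content: bool
    can_act_outbound: bool


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    # The trifecta holds only when all three conditions are true at once.
    return (
        agent.reads_private_data
        and agent.ingests_untrusted_content
        and agent.can_act_outbound
    )


# Two made-up agents: one with all three capabilities, one missing outbound actions.
inbox_assistant = AgentProfile("inbox-assistant", True, True, True)
offline_summarizer = AgentProfile("offline-summarizer", True, True, False)

for agent in (inbox_assistant, offline_summarizer):
    print(agent.name, has_lethal_trifecta(agent))
```

Removing any one of the three fields flips the result, which is the whole argument in miniature: the cheapest fix is often to take one capability away.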

What AIRQ measured

Adversa published something it calls AIRQ, the AI Risk Quadrant, billed as the largest independent agentic-AI security assessment to date. The team scored 100 commercial and public AI agents along three axes: attack surface (how exposed the agent is), blast radius (how much damage it could do if compromised), and defenses (what actually stands in the way). Plot agents on those axes and they fall into quadrants, from the well-defended corner down to the agents that combine maximum reach with minimum protection.
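To make the idea of plotting three axes into quadrants concrete, here is a rough sketch. It is not Adversa's scoring method, which this article does not describe in detail. The 0-to-1 scales, the averaging of attack surface and blast radius into a single "reach" figure, the 0.5 threshold, and the descriptive quadrant labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentScores:
    """Hypothetical scores on the three axes, each assumed to run from 0.0 to 1.0."""
    attack_surface: float
    blast_radius: float
    defenses: float


def reach(scores: AgentScores) -> float:
    # Assumption: collapse exposure and potential damage into one number by averaging.
    return (scores.attack_surface + scores.blast_radius) / 2


def quadrant(scores: AgentScores, threshold: float = 0.5) -> str:
    # Assumption: a single cut-off splits each dimension into "high" and "low".
    reach_label = "high reach" if reach(scores) >= threshold else "low reach"
    defense_label = "high defense" if scores.defenses >= threshold else "low defense"
    return f"{reach_label}, {defense_label}"


# Made-up example: broad access, large potential damage, weak protections.
print(quadrant(AgentScores(attack_surface=0.9, blast_radius=0.8, defenses=0.2)))
```

Under these assumptions, the corner to worry about is "high reach, low defense": the agents that can do the most with the least standing in the way.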

The framework was reviewed by people who do this for a living. Reviewers were drawn from OWASP, the Cloud Security Alliance, NIST, Cisco, and CrowdStrike, which matters for a study that, as we will get to, comes from a vendor with something to sell.

The numbers

The findings are blunt enough to quote without much translation.

Ninety-eight percent of the assessed agents exhibited the lethal trifecta. Nearly every agent tested can read private data, ingest untrusted content, and act outward, which is to say nearly every agent tested has the precise structure that prompt injection exploits.

Only 11% landed in the top, well-defended quadrant Adversa calls Fortified Leaders. Roughly one in nine agents has defenses worthy of the access it has been granted.

Forty percent fell into the worst group, the high-reach, low-defense agents, and that group accounts for the majority of total risk in the sample, on the order of 60%. The danger is concentrated in the agents that can do the most and protect the least.
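A quick back-of-the-envelope calculation shows how lopsided that is. The sketch below treats "on the order of 60%" as exactly 60%, which is an assumption, and the function name is made up for illustration. It compares the average share of risk per agent in the worst group with the average for everyone else.

```python
from fractions import Fraction


def risk_per_agent(agents_pct: int, risk_pct: int) -> Fraction:
    # Average risk carried per agent, relative to the sample-wide average of 1.
    return Fraction(risk_pct, agents_pct)


# Assumption: "on the order of 60%" is taken as exactly 60 for the arithmetic.
worst_group = risk_per_agent(agents_pct=40, risk_pct=60)              # 3/2
everyone_else = risk_per_agent(agents_pct=100 - 40, risk_pct=100 - 60)  # 2/3

ratio = worst_group / everyone_else
print(f"worst group: {float(worst_group):.2f}x the average")
print(f"everyone else: {float(everyone_else):.2f}x the average")
print(f"ratio: {float(ratio):.2f}")
```

On those figures, an agent in the worst group carries 1.5 times the average risk, and about 2.25 times the risk of an agent outside it.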

And 83% of the security claims vendors make about their agents had no independent verification. Most of the reassurance in this market is the vendor's own word.

Within the pack, the worst performers were the agents you would most want to trust: coding agents and computer-use agents, the ones granted hands on your codebase, your terminal, or your desktop. The agents with the most power had, on average, the least adequate defenses for it.

The honest caveat

This is a study published by a company that sells AI security services, and a framework with a vendor's name on it is always at least partly a marketing artifact. A skeptic is right to weigh that. Two things keep it from being dismissible. Help Net Security, an independent outlet, covered the findings rather than just reprinting a press release, and the methodology was reviewed by people from established security organizations. The lethal-trifecta framing it leans on is not Adversa's invention; it is a widely accepted way of describing why agentic systems are hard to secure. The specific percentages deserve a grain of salt. The shape of the result, that agent capability has raced far ahead of agent security, is consistent with everything else the field has been finding.

Why a study earns a place on a failures site

No single agent here got breached on camera. This is a measurement, not an incident, which is exactly why it belongs in the same drawer as the studies showing AI-generated code ships with more vulnerabilities or that AI customer service fails more often than other AI uses. The value of a study is that it tells you the individual horror stories are not bad luck; they are the visible tip of a structural problem.

And the structural problem AIRQ describes is a procurement problem as much as a technical one. Organizations are wiring agents into email, code, cloud consoles, and customer systems on the strength of vendor assurances, and 83% of those assurances are unverified. The agents being trusted with the most are the ones least equipped to be trusted. That is not a prediction about some future risk; it is a snapshot of what is already deployed.

What to actually do with this

The practical reading is not "never use agents." It is to treat the lethal trifecta as a checklist before you hand an agent real access. Does this thing touch private data? Does it read content from sources you do not control? Can it act or send data outward? If all three are yes, you have a prompt-injection target, and the mitigations are boring and necessary: constrain what it can reach, separate trusted instructions from untrusted input, require human approval for consequential actions, and do not accept "we are secure" without something more than the vendor's word. AIRQ's contribution is a number to wave at people who think this is hypothetical. Ninety-eight percent. Almost all of them.
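Two of those mitigations, constraining what the agent can reach and requiring human approval for consequential actions, can be sketched as a small gate in front of the agent's tool calls. This is an illustration under stated assumptions, not a real framework's API: the tool names, the `ToolCall` type, and the `gate` function are all hypothetical, and the approval callback stands in for whatever human review step an organization actually uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical request from an agent to use a tool."""
    tool: str
    argument: str


# Assumption: a small allowlist, with outbound actions flagged for human review.
ALLOWED_TOOLS = {"search_docs", "send_email"}
NEEDS_APPROVAL = {"send_email"}


def gate(call: ToolCall, approve: Callable[[ToolCall], bool]) -> str:
    # Constrain reach: anything not on the allowlist is refused outright.
    if call.tool not in ALLOWED_TOOLS:
        return "blocked"
    # Consequential actions run only if a human says yes.
    if call.tool in NEEDS_APPROVAL and not approve(call):
        return "denied"
    return "allowed"


def always_refuse(call: ToolCall) -> bool:
    # Stand-in for a human reviewer who rejects every request.
    return False


print(gate(ToolCall("delete_repo", "main"), always_refuse))
print(gate(ToolCall("send_email", "report to external address"), always_refuse))
print(gate(ToolCall("search_docs", "quarterly numbers"), always_refuse))
```

The point of the design is that the gate sits outside the model. Injected instructions can change what the agent asks for, but not what the gate is willing to let through.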

Vibe Graveyard
