Faros study finds AI coding throughput rose while bugs and incidents rose faster

Acceleration with an audit trail

Faros AI's 2026 engineering report landed with the kind of numbers that make AI coding adoption look both useful and expensive. The report, titled "The Acceleration Whiplash," analyzed two years of telemetry from 22,000 developers across more than 4,000 teams. Faros compared periods of low AI adoption with periods of high AI adoption inside the same organizations and tracked what changed across task systems, IDE activity, version control, static analysis, CI/CD, incident tools, and HR metadata.

The acceleration part is not fake. Faros found that task completion per developer was up 33.7%, epics completed per developer were up 66.2%, and PR merge rate per developer was up 16.2%. In plain English, more work moved through the system. More tickets closed. More code entered the codebase. Roadmaps looked less like decorative fiction.

Then the rest of the workflow had to absorb it.

Bugs per developer rose 54%. Incidents per pull request rose 242.7%. Monthly incidents rose 57.9%. Median time in PR review rose 441.5%. Code churn, measured as the ratio of lines deleted to lines added, rose 861%. Faros also reported that 31.3% more pull requests merged with no review at all.

That last number is the part that should make engineering leaders sit up a little straighter. The report describes review systems built around human-paced code production getting force-fed machine-paced output, rather than reviewers simply getting lazy. Once the queue grows faster than reviewers can reason about it, the options get ugly: reviews become slower, reviews become shallower, or reviews disappear.
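The queue dynamic is easy to see in a toy model. This is a minimal sketch, assuming a constant number of pull requests arriving per day and a fixed daily review capacity. The function name and every number in it are invented for illustration and do not come from the Faros report.

```python
def review_backlog(arrivals_per_day: int, capacity_per_day: int, days: int) -> list[int]:
    """Return the size of the unreviewed PR queue at the end of each day.

    Hypothetical model: a fixed number of PRs arrives daily and reviewers
    clear at most `capacity_per_day` of whatever is waiting.
    """
    backlog = 0
    history = []
    for _ in range(days):
        backlog += arrivals_per_day                 # new PRs join the queue
        backlog -= min(backlog, capacity_per_day)   # reviewers clear what they can
        history.append(backlog)
    return history


# Illustrative numbers only: arrivals match capacity, then exceed it by 20%.
human_paced = review_backlog(arrivals_per_day=10, capacity_per_day=10, days=5)
machine_paced = review_backlog(arrivals_per_day=12, capacity_per_day=10, days=5)
```

In this sketch the first queue stays empty while the second grows by two pull requests every day and never recovers. Under these assumptions, the only ways to stop the growth are the ones already listed: add review time, spend less of it per change, or let changes through unreviewed.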

More authoring, less finishing

Faros argues that AI has crossed a threshold from assistant to primary author in many organizations. Its data says 80% of teams exceed a 50% weekly active-user threshold for AI tools, and acceptance rates for AI-generated code have risen from 20% to 60%. About a quarter of pull requests are reviewed by AI agents, but fewer than 1% are opened autonomously by agents. The report is therefore mostly about humans shipping AI-authored work, not fully autonomous software development.

That distinction matters. The humans are still on the hook. The incident report does not care whether the bug started as a model completion, a copied Stack Overflow answer, or an inspired act of typing at 1:00 a.m. The production system only sees the merged code.

The Faros data suggests AI makes starting work much easier than finishing it. Daily PR contexts per developer rose 67.4%. Daily task contexts rose 17.7%. Work restarts rose 13.8%. In-progress tasks with no activity for seven or more days rose 26%. That is the shape of a team opening more threads than it can close.
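For teams that want to watch the stalled-work signal in their own tracker, here is a minimal sketch of the seven-day rule described above. The `Task` record, its field names, and the `"in_progress"` status string are hypothetical stand-ins for whatever a real task system exports; the report does not describe how Faros computes the figure.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Task:
    # Hypothetical record shape; real trackers will differ.
    key: str
    status: str
    last_activity: date


def stale_in_progress(tasks: list[Task], today: date, threshold_days: int = 7) -> list[Task]:
    """Return in-progress tasks with no activity for `threshold_days` or more."""
    cutoff = timedelta(days=threshold_days)
    return [
        task
        for task in tasks
        if task.status == "in_progress" and today - task.last_activity >= cutoff
    ]


# Invented sample data for illustration.
today = date(2026, 1, 15)
tasks = [
    Task("A-1", "in_progress", date(2026, 1, 8)),   # 7 days idle: stale
    Task("A-2", "in_progress", date(2026, 1, 10)),  # 5 days idle: still active
    Task("A-3", "done", date(2026, 1, 1)),          # finished: ignored
]
stale = stale_in_progress(tasks, today)
```

Tracking the count of such tasks over time, rather than a single snapshot, is what would show whether a team is opening more threads than it closes.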

The code also got harder to review. Pull request size rose 51.3%, files edited per pull request rose 59.7%, and files touched per developer per month rose 149.9%. Larger changes are slower to understand, harder to roll back, and more likely to hide the one wrong assumption that slips past a tired reviewer. AI-generated code can look polished while still being structurally wrong, which is a spectacularly annoying failure mode because the code does not smell bad enough to warn anyone.

Contrast with DORA

The 2025 DORA report framed AI as an amplifier: strong teams get stronger, weak systems expose their weaknesses faster. Faros pushes against that optimism. Its 2026 report says high-performing teams were not insulated from downstream quality deterioration. Better foundations helped, but they did not make the review bottleneck go away.

That does not mean DORA was wrong and Faros was right in some clean scoreboard sense. They measured different things. DORA's report is survey-heavy and focused on organizational capabilities. Faros leans on telemetry and workflow instrumentation. One tells you how teams experience AI adoption. The other tells you what happened to tickets, reviews, bugs, incidents, and churn after adoption crossed a threshold.

Both can be true. A competent team may get more useful work out of AI than a brittle one. The same competent team may still produce enough extra code to overload review, testing, and release systems that were already running close to capacity.

Why it belongs here

This is a Vibe Graveyard story because the failure is not "AI coding tools exist" or "developers used a new productivity tool." The failure pattern is measurable: more AI-assisted output entered engineering systems, and downstream quality controls did not scale with it. The result was more bugs, more incidents, more churn, and longer review cycles.

Faros has a commercial interest here. It sells engineering intelligence software, so its report should be read with the usual vendor-research squint. The report also describes statistically significant correlations, not proof that every extra bug was individually caused by an AI model. ADTmag's coverage flagged a related limitation: the 2025 and 2026 Faros reports are independent cross-sections, not a single longitudinal study of identical teams.

Even so, the telemetry is hard to wave away. A 54% increase in bugs per developer and a 242.7% increase in incidents per PR are not vibes. They are the operational residue of an engineering system that learned to generate code faster than it learned to verify code.
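Percentage increases this large are easier to read as multipliers. The sketch below assumes churn is computed exactly as the report defines it, lines deleted divided by lines added, and uses invented line counts chosen only to reproduce an increase of roughly 861%. The helper names are illustrative and the report's underlying counts are not reproduced here.

```python
def churn_ratio(lines_deleted: int, lines_added: int) -> float:
    """Code churn as the ratio of lines deleted to lines added."""
    if lines_added <= 0:
        raise ValueError("lines_added must be positive")
    return lines_deleted / lines_added


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


def growth_multiplier(percent_increase: float) -> float:
    """Convert a percent increase into a 'times the old value' multiplier."""
    return 1.0 + percent_increase / 100.0


# Invented line counts, not data from the report.
baseline_churn = churn_ratio(lines_deleted=50, lines_added=1_000)
later_churn = churn_ratio(lines_deleted=4_805, lines_added=10_000)
churn_increase = percent_change(baseline_churn, later_churn)  # roughly 861

# Multipliers implied by the percentages quoted in the article.
bugs_multiplier = growth_multiplier(54.0)        # about 1.54 times
incidents_multiplier = growth_multiplier(242.7)  # about 3.43 times
churn_multiplier = growth_multiplier(861.0)      # about 9.61 times
```

Read this way, a 242.7% rise means incidents per PR are at roughly 3.4 times their earlier level, and an 861% rise in churn means the deleted-to-added ratio is nearly ten times what it was.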

For teams adopting AI coding tools, the warning is simple enough to fit on a post-it and painful enough to ignore anyway: if code generation speeds up and verification does not, the bottleneck does not vanish. It moves downstream, puts on a pager, and waits for Friday afternoon.
