KPMG pulled an agentic AI report after the citations and case studies started evaporating


In June 2026, KPMG removed its October 2025 report "Total Experience: Redefining Excellence in the Age of Agentic AI" after GPTZero and the Financial Times found widespread citation and case-study problems. GPTZero said only five of 45 citations accurately matched their sources, while organizations including UBS, the UK's NHS, Swiss Federal Railways, and Transport for London reportedly disputed claims about their AI deployments. KPMG said it was reviewing the publication. A Big Four firm managed to publish a report selling confidence in agentic AI, and the report became evidence that confidence should come with a receipt.

Incident Details

Severity: Facepalm
Company: KPMG
Perpetrator: Consulting firm
Incident Date:
Blast Radius: KPMG report withdrawn from websites; named organizations disputed case studies; public credibility hit to a major AI consulting publication

KPMG's "Total Experience: Redefining Excellence in the Age of Agentic AI" report was supposed to show how organizations were using agentic AI to improve customer experience. Instead, it became a tidy demonstration of the most embarrassing failure mode in the category it was promoting: plausible-looking text with citations that collapse when touched.

The report was published in October 2025. In June 2026, GPTZero published a forensic review of the document. The headline finding was ugly: out of 45 citations, GPTZero said only five accurately pointed to real supporting sources. Many others were described as mangled, paraphrased, partly fabricated, misattributed, or too vague to verify. The Financial Times then checked several of the underlying claims, and TechCrunch reported that KPMG removed the report while investigating.

The result was not a small typo correction. It was a major consulting brand pulling a report about AI business adoption after the report itself appeared to suffer from AI-style hallucination problems. There are easier ways to make a point about governance, but few are so economical.

Case studies with no case

The most damaging part was not merely bad footnotes. Reports from GPTZero, TechCrunch, and The Register described case studies or claims involving UBS, the UK's National Health Service, Swiss Federal Railways, Transport for London, and Emirates. Several organizations reportedly told the Financial Times that the report's claims about their AI usage were false, misleading, or unsupported.

This distinction matters. A broken citation is sloppy. A broken case study is reputationally radioactive. If a consulting report says a named organization deployed a shiny agentic AI system, readers may treat that as market evidence. Executives cite it. Vendors use it in decks. Journalists summarize it. Competitors feel pressure to catch up. Then the named organization says, in effect, "We did not do that." At that point, the report is no longer evidence of adoption. It is evidence of how adoption narratives get laundered through authoritative PDFs.

GPTZero used the term "vibe citing" for this pattern: references that look like citations but have been bent, fused, or invented enough that they no longer do the job citations exist to do. The phrase is useful because the failure is not always a completely fictional source. Sometimes the source exists, but the title is wrong. Sometimes the source is real, but the claim attached to it is not there. Sometimes fragments from different references get glued together into one scholarly-looking chimera. It is the citation equivalent of a cardboard fire door.

Why this is worse from a consulting firm

Consulting firms do not publish AI reports as casual blog posts. These documents are sales collateral, thought leadership, market signal, and executive persuasion machinery all wearing the same blazer. A Big Four logo tells readers that someone did the adult work: checked sources, verified claims, called named organizations, and made sure the numbers were not decorative.

That is why the KPMG incident lands differently from an ordinary hallucinated blog. KPMG sells advice on AI adoption, governance, risk, transformation, and operational discipline. The report's subject was agentic AI, a category whose pitch depends heavily on trust: let software take more initiative, touch more workflows, and make more decisions. Publishing unsupported claims in that context is not just embarrassing. It undercuts the advice being sold.

KPMG told outlets that it removed the report and was reviewing the circumstances around its publication. The firm also pointed to guidelines on responsible AI use, including human oversight, content validation, and source verification. That is the right set of nouns. The problem is that nouns are cheap. The report suggests the actual workflow did not reliably enforce them.

The old research problem at AI speed

Bad corporate reports existed before generative AI. So did lazy sourcing, over-claimed examples, and footnotes that wilt under inspection. The AI angle here is the shape and scale of the defects. Dozens of citations were reportedly affected. Claims about named organizations appeared to drift beyond what sources supported. The errors looked like the familiar LLM habit of producing exactly the reference-shaped object requested, with the truth value treated as an implementation detail.

That is what makes AI-assisted research dangerous in professional publishing. It can make weak sourcing look stronger than it is. A human who cannot find the perfect example may leave the claim out or mark it uncertain. A generative system may synthesize the missing bridge because the document expects a bridge. If the review process checks whether the prose reads well rather than whether each claim survives source inspection, the hallucination graduates from draft to publication.

The downstream cost is not limited to KPMG's embarrassment. Reports from large firms feed other reports. They show up in decks, procurement discussions, board briefings, journalism, and policy conversations. They can also become training material or retrieval content for future AI systems, letting one bad PDF teach the next generation of slop where to stand.

How this should have been caught

This was a basic verification failure. Every cited source should have been opened. Every quoted or paraphrased claim should have been checked against the source. Every named-company case study should have been confirmed with the company or with a primary source that clearly said what the report claimed. If AI helped draft, summarize, or locate sources, that should have increased the verification burden, not reduced it.
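The checklist above can be sketched as a pre-publication gate. This is a minimal illustration, not a description of KPMG's or anyone else's actual process: the `Claim` record, the `Status` values, the `kind` labels, and the rule that AI-assisted material needs two independent human checks are all assumptions made up for the example. The only idea taken from the text is that nothing ships unverified and that AI assistance raises the bar rather than lowering it.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNCHECKED = "unchecked"  # nobody has opened the source yet
    VERIFIED = "verified"    # a human opened the source and it supports the claim
    FAILED = "failed"        # source missing, or it does not say what the report claims


@dataclass
class Claim:
    text: str
    kind: str                 # hypothetical labels: "citation" or "case_study"
    status: Status = Status.UNCHECKED
    ai_assisted: bool = False
    reviewers: int = 0        # humans who independently checked the source


def blocking_claims(claims):
    """Return every claim that should stop publication."""
    blocked = []
    for claim in claims:
        # Assumed policy: AI-assisted material needs two independent checks.
        required = 2 if claim.ai_assisted else 1
        if claim.status is not Status.VERIFIED or claim.reviewers < required:
            blocked.append(claim)
    return blocked


def ready_to_publish(claims):
    """True only when no claim is unchecked, failed, or under-reviewed."""
    return not blocking_claims(claims)
```

The point of the design is that the default state is `UNCHECKED`, so a claim blocks publication until someone does the work. Prose that reads well never changes a status on its own.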

The lesson is tedious because quality control usually is. If your AI workflow produces citations, treat them as suspect until a human verifies them. If it produces named case studies, verify them with primary evidence. If a source title looks slightly off, do not shrug. That is often the little thread that unravels the sweater.
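The "slightly off title" check is one of the few steps here that a script can help with. The sketch below compares the title as cited against the title a reviewer recorded after opening the source, using Python's standard `difflib`. The function names and the 0.6 threshold are arbitrary assumptions for illustration, and the example titles are invented. A "match" only says the title is right; it says nothing about whether the source supports the claim, which still needs a human.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(title.lower().split())


def title_check(cited, actual, near=0.6):
    """Classify a cited title against the title found on the real source.

    Returns "match", "near-miss", or "mismatch". The `near` threshold is an
    assumed cutoff, not an established standard.
    """
    a, b = normalize(cited), normalize(actual)
    if a == b:
        return "match"
    ratio = SequenceMatcher(None, a, b).ratio()
    # A near-miss is the suspicious case: close enough to look real,
    # different enough that the reference may have been bent or fused.
    return "near-miss" if ratio >= near else "mismatch"


# Hypothetical titles, invented for this example.
print(title_check("The State of Customer Experience 2024",
                  "State of Customer Experience 2025"))
```

Both "near-miss" and "mismatch" should send a reviewer back to the source. The near-miss is arguably the more important flag, because it is the one most likely to be waved through.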

KPMG's report did not fail because AI is unusable for research assistance. It failed because the publication process appears to have let authoritative-looking AI-shaped material pass for verified research. For a firm advising clients on agentic AI, that is a painful demonstration. The PDF effectively became its own warning label: do not trust a polished output merely because it has citations and a logo.
