White House MAHA report shipped fake studies and OpenAI citation markers

Gold-standard science, bargain-bin citations

On May 22, 2025, the White House Make America Healthy Again Commission released a federal public-health report on childhood chronic disease. HHS announced it as a major assessment meant to identify root causes and help shape future health policy. The commission was chaired by Health and Human Services Secretary Robert F. Kennedy Jr., and the document leaned heavily on hundreds of citations to project scientific authority.

A week later, that authority started shedding parts.

On May 29, NOTUS reported that the report cited studies that did not appear to exist and misinterpreted other research, according to researchers and listed authors. One citation pointed to a supposed JAMA Pediatrics article about adolescent mental health and substance use during the COVID-19 pandemic. The title did not match a real paper in the cited issue, and the DOI link did not work. NOTUS found seven references to reports that did not exist, and the administration later updated the report to remove them.

PolitiFact followed with a detailed fact-check on May 30. It found that several citations contained titles of nonexistent papers or mischaracterized real ones. It also noted a key AI tell reported by the Washington Post: some references contained "oaicite" in their URLs, a marker that ChatGPT users associate with OpenAI citation output. Democrats on the House Oversight Committee later sent HHS a letter saying public reporting showed AI appeared to have been used in drafting the report, citing nonexistent studies, broken links, misrepresented findings, and those "oaicite" markers.

The White House responded by framing the problems as formatting issues, and the report was updated repeatedly. "Formatting" may be a politically convenient category, but a fake study is not a formatting issue. Formatting is when the comma is in the wrong place. A study that nobody wrote is a different species of problem.

What went wrong

The failure was not one stray typo in a 500-source bibliography. The pattern looked like machine-assisted citation generation without enough human verification.

Fake academic citations tend to have a recognizable shape. They use credible journal names. They give plausible article titles. They attach real researchers or realistic subject terms to claims that sound like they belong in the field. Sometimes they even generate DOI-shaped strings. The result is a reference that can survive a quick visual scan while being useless to anyone trying to verify the claim.

The MAHA report had several of those traits. PolitiFact documented fake or suspect references connected to claims about adolescent mental health, drug advertising, antidepressants, and asthma medication. It also described legitimate citations whose findings were overstated or misrepresented. The updated version replaced some false citations with real sources and revised some language tied to the replaced references.

The Washington Post reported that at least 37 footnotes appeared multiple times in an initial version reviewed by the paper, that several studies did not exist, and that some references included "oaicite" attached to URLs. The Post quoted White House press secretary Karoline Leavitt saying there were formatting issues being addressed and HHS saying minor citation and formatting errors had been corrected while the report's substance remained the same.
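Two of the tells described above, leftover "oaicite" markers and footnotes that repeat, are the kind of thing a few lines of code can surface before publication. The following is a minimal sketch, not a reconstruction of how any reporter or fact-checker actually checked the report. It assumes the references are already available as plain strings; the function names and the sample entries are hypothetical, invented purely for illustration.

```python
import re

# Substring reported in some of the report's reference URLs.
OAICITE = re.compile(r"oaicite", re.IGNORECASE)


def normalize(ref: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical entries compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", ref.lower()).strip()


def scan_references(refs):
    """Return footnote numbers (1-based) that carry an 'oaicite' marker,
    plus groups of footnote numbers whose normalized text is identical."""
    marker_hits = [num for num, ref in enumerate(refs, start=1) if OAICITE.search(ref)]

    seen = {}
    for num, ref in enumerate(refs, start=1):
        seen.setdefault(normalize(ref), []).append(num)
    duplicates = [nums for nums in seen.values() if len(nums) > 1]

    return {"oaicite": marker_hits, "duplicates": duplicates}


if __name__ == "__main__":
    # Made-up sample entries, NOT taken from the MAHA report.
    sample = [
        "Smith J. Example study. J Example. 2020. https://example.org/a",
        "Doe A. Other study. https://example.org/b?ref=oaicite:3",
        "smith j.  example study. J Example. 2020. https://example.org/a",
    ]
    print(scan_references(sample))
```

A scan like this only finds the clumsy residue. A fabricated reference with no marker and no duplicate passes it untouched, so it supplements reading the sources rather than replacing it.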

That defense misses the operational issue. Citations are not ornamental. In a public-health report, they are the route back to evidence. If the route points to nowhere, readers cannot tell whether the claim rests on real research, a misread abstract, a political preference, or a model filling in the blank because the prompt sounded confident.

Why the AI evidence matters

No official source in the public record plainly admitted, "we used ChatGPT to draft this report." The evidence is still strong enough to treat this as an AI-citation incident. The "oaicite" markers, the hallucinated paper titles, the broken DOIs, the duplicated and garbled references, and expert assessments all point in the same direction: someone appears to have used generative AI, or AI-assisted citation tooling, in a way that produced reference-shaped garbage.

That distinction is worth keeping. The story is not that every sentence in the report was generated by AI, but that a federal health-policy document displayed the failure mode associated with AI-generated references, and then had to be corrected after outside reporters and fact-checkers did the verification work its authors should have done before publication.

The House Oversight Democrats' June 2 letter sharpened the accountability problem. Acting Ranking Member Stephen Lynch asked HHS for information about drafting, review, publication, amendment, and possible AI reliance. The letter accused the White House of updating the report only after the press coverage surfaced, including removing "oaicite" markers and fixing citations. It also warned that using AI to manufacture fake science for a policy agenda would undermine scientific integrity.

That is partisan language from one side of Congress, so it should be read as oversight pressure rather than neutral adjudication. The underlying evidence it cites, though, comes from multiple reporters, fact-checkers, and named experts reviewing the report's references.

Why this belongs here

This is a public-sector hallucinated-citation story in a health-policy context. The commission published a report meant to inform federal action on children's health. The report used citations to claim scientific grounding. External reviewers found citations to nonexistent studies, mischaracterized research, broken links, and AI-associated markers. The government then updated the document.

That is a concrete failure. The blast radius was not a chatbot giving one silly answer to one user but an official report carrying fake evidence into public debate, media coverage, and policy discussion. Even if the corrected report retained many of its conclusions, the original review process failed at the part where evidence has to exist before it gets footnoted.

The incident also illustrates why hallucinated citations are especially corrosive in policy work. A fake citation does not merely embarrass the author. It creates a false trail for readers, journalists, scientists, and lawmakers. It lets a claim borrow authority from a paper that was never written. In public health, where decisions can influence funding, regulation, treatment norms, and public trust, that is not harmless sloppiness.

The fix is dull and unavoidable: every cited paper must be opened, read, and matched to the claim it supports. DOI links need to resolve. Titles need to exist. Authors need to have written the work attributed to them. If an AI tool is used anywhere in the drafting or reference-gathering process, the verification burden goes up, not down.
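The "DOI links need to resolve" part of that checklist is the one step that can be partly automated. Below is a rough sketch under stated assumptions: references are plain strings, DOIs are found with a loose regular expression that will miss unusual formats, and resolution is tested with a HEAD request to https://doi.org/ using only the Python standard library. The function names are hypothetical. Some publishers reject automated requests, so a failure here is a prompt for a human to look, not proof of fabrication.

```python
import re
import urllib.request

# Loose pattern for DOI-shaped strings; an assumption, not an official grammar.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text):
    """Pull DOI-shaped strings out of a reference, trimming trailing punctuation."""
    return [match.rstrip(".,;)") for match in DOI_PATTERN.findall(text)]


def doi_resolves(doi, timeout=10.0):
    """Return True if https://doi.org/<doi> answers without an HTTP or network error."""
    request = urllib.request.Request(
        f"https://doi.org/{doi}",
        method="HEAD",
        headers={"User-Agent": "citation-check-sketch"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except OSError:  # covers HTTPError, URLError, and socket timeouts
        return False


def unresolved_dois(references, resolver=doi_resolves):
    """Return (footnote number, doi) pairs for every DOI the resolver rejects."""
    problems = []
    for num, ref in enumerate(references, start=1):
        for doi in extract_dois(ref):
            if not resolver(doi):
                problems.append((num, doi))
    return problems
```

The `resolver` parameter exists so the check can be exercised offline with a stub. Note what this does not do: a DOI that resolves shows only that some paper exists at that address, not that its title, authors, or findings match the footnote. The opening and reading still have to be done by a person.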

A federal health report should not need reporters to discover that its scientific scaffolding includes imaginary beams. Yet here we are, watching citation QA become the latest public service someone tried to outsource to autocomplete and then relabel as formatting.

Vibe Graveyard
