NEJM retracted a case study after authors used AI to alter a clinical image


On May 1, 2026, the New England Journal of Medicine retracted an "Images in Clinical Medicine" piece titled "Bronchial Casts from Inhalation of Forest-Fire Smoke" - eleven days after publishing it. The dramatic photograph of black, branching airway casts pulled from an 87-year-old patient's lungs had spread beyond the journal and drawn media attention. The two authors then admitted they had used an AI tool to superimpose the tape measure visible at the top of the image. They told the journal they were unaware of NEJM's policies on image manipulation and described the alteration as a cosmetic adjustment for readability. The clinical content was apparently authentic, but the most prestigious medical journal in the United States still had to retract a case study because part of the figure had quietly been generated by AI.

Incident Details

Severity: Facepalm
Company: New England Journal of Medicine
Perpetrator: Researcher
Incident Date:
Blast Radius: Retracted "Images in Clinical Medicine" piece in the New England Journal of Medicine; reputational hit to NEJM's peer-review process; medical record of the underlying case clouded by undisclosed AI image manipulation; new prompts for tighter image-provenance review across major medical journals.

A high-prestige image goes through the AI car wash

"Images in Clinical Medicine" is a small but high-visibility format at the New England Journal of Medicine. Each piece pairs a single striking medical photograph with a paragraph or two of clinical context. The pieces routinely go viral on medical Twitter, get picked up by general-interest outlets, and end up in lectures. They are not where you would expect a quiet image-manipulation scandal.

On April 18, 2026, NEJM published one such piece. The case described an 87-year-old man who had developed bronchial casts after inhaling smoke from a forest fire. The photograph that accompanied the case was the kind of image that does the format's job perfectly: pitch-black, branching shapes that look like a coral skeleton, pulled out of the man's airways and laid out on a flat surface. A small tape measure at the top of the image gave the viewer a sense of scale. The picture moved off NEJM's site and into syndication, with multiple outlets picking it up.

Within days, online readers began posting that the image looked wrong. Specifically, the tape measure looked wrong. The proportions, the font on the markings, the way it interacted with the rest of the photo - readers with eyes calibrated by years of looking at fake AI imagery flagged the ruler as something that had been added by a generative tool rather than placed in front of a real camera.

Authors confirmed they used AI on the figure

Eleven days after publication, NEJM published a retraction notice. The two authors - Yuling Wang of Daxing Teaching Hospital and Xiangdong Mu of Beijing Tsinghua Changgung Hospital - acknowledged having used an artificial intelligence tool to superimpose the tape ruler on top of the original clinical photograph. They said in their response that they were not aware of NEJM's policies governing image manipulation and that the AI had only been used to move the ruler to make the figure more visually readable. They insisted that the clinical content of the image - the bronchial casts themselves - was not altered.

That distinction matters less than they seem to think it does, for two reasons.

First, the moment any portion of a figure in a peer-reviewed clinical journal is generated rather than captured, the entire image's evidentiary status shifts. The reader can no longer tell, by looking, what was photographed and what was synthesized. If the authors had reached for a graphic design tool and pasted a real ruler into the image as a manual overlay, that would still be questionable practice and would have required disclosure; but at least the ruler would correspond to real, measurable units. When the ruler is generated by an AI model, there is no underlying ground truth at all. The markings are vibes.

Second, the rationale that AI was used only "to make the image more aesthetic and visually readable" is exactly the kind of soft-edge use case that journal policies are written to deny. Image manipulation rules in medical publishing exist because there is no clean line between cosmetic adjustment and substantive alteration. Once the editorial process accepts "we lightly AI-touched the figure for readability," the next step is "we lightly AI-touched the figure for clarity," and the step after that is whatever a busy author decides counts as clarity.

Eleven days to retract

NEJM moved quickly. Eleven days is fast for a retraction, especially in a journal where the publication infrastructure is built around durable scientific record-keeping. The speed is not a sign that this is a small story; it is a sign that NEJM understood it had no defensible position to hold. An "Images in Clinical Medicine" piece that admits to undisclosed AI generation of part of its figure cannot stay in the journal, and trying to keep it up while issuing a correction note would only have prolonged the embarrassment.

Retraction Watch covered the incident on the day the retraction posted and described the broader pattern of AI-related retractions across scientific publishing in 2026. The case fits an arc that has been building since 2024: AI-generated figures and AI-generated text quietly slipping past peer review at journals that previously assumed authors were submitting human-produced work in good faith. Earlier high-visibility incidents include Frontiers retracting a paper whose AI-generated figures depicted impossible mammalian anatomy, and assorted retractions for AI-fabricated reference lists. The NEJM case lands at the top of that distribution by prestige.

Why an apparently small edit is a real problem

It is tempting to file this story under "the authors only moved a ruler, lighten up." The reason the story matters anyway is structural.

Medical journals depend on a trust chain: the authors honestly report what they observed, the journal verifies as best it can, and downstream readers - clinicians, researchers, patients - rely on the published record. That chain depends on the implicit contract that figures are records of real observations. When part of a figure is fabricated by an AI model and that fabrication is not disclosed, the contract breaks even if the cosmetic intent was benign. Other authors will see the retraction and update their priors about how much they can trust figures elsewhere in the literature. Reviewers will start asking, reasonably, whether the next dramatic image they encounter has also had its furniture quietly re-arranged.

NEJM's retraction notice and the response from Wang and Mu also raise a thornier question: how many other figures in the literature have already been subject to this kind of light AI touch-up without anyone disclosing it. Most AI-altered images do not announce themselves with proportions that internet sleuths can flag. If a tool can be used to move a ruler, it can also be used to soften a shadow, smooth a tissue boundary, or sharpen a histology slide. None of those edits would necessarily change the clinical conclusion, and none of them would necessarily be detectable from the published image.

The cleaner standard that this incident forces

In the wake of the retraction, several commentators reiterated what is probably the only workable rule: any use of generative AI on a figure that ends up in a peer-reviewed clinical journal needs to be disclosed in the figure caption or the methods section, period. The decision about whether the edit was "substantive" cannot be left to authors; it has to default to disclosure. Readers and reviewers can then decide what weight to give the image.
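The disclose-by-default rule can be written down as a simple submission check. The sketch below is illustrative only: `FigureSubmission`, its fields, and `disclosure_problems` are hypothetical names, not part of any journal's actual system, and the keyword match is a crude stand-in for a human editor reading the caption. The one thing it tries to capture is that a missing answer is treated the same as non-disclosure, and that the author's view of whether the edit was "substantive" never enters the decision.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FigureSubmission:
    """Hypothetical record for one submitted figure (field names are illustrative)."""
    figure_id: str
    # None means the author never answered the question at all.
    generative_ai_used: Optional[bool] = None
    caption: str = ""
    methods_text: str = ""


def disclosure_problems(fig: FigureSubmission,
                        keyword: str = "generative ai") -> List[str]:
    """Return reasons to hold the figure; an empty list means no disclosure issue.

    Assumption: a disclosure counts only if the caption or methods text
    mentions the keyword. A real workflow would rely on editorial review,
    not string matching.
    """
    if fig.generative_ai_used is None:
        # Default to disclosure: silence is treated as a problem.
        return [f"{fig.figure_id}: no answer on generative AI use"]
    if not fig.generative_ai_used:
        return []
    combined = f"{fig.caption} {fig.methods_text}".lower()
    if keyword.lower() not in combined:
        return [f"{fig.figure_id}: generative AI used but not disclosed "
                "in caption or methods"]
    return []


if __name__ == "__main__":
    fig = FigureSubmission("fig1", generative_ai_used=True,
                           caption="Scale ruler added for readability.")
    for problem in disclosure_problems(fig):
        print(problem)
```

Note that the check has no "cosmetic only" exemption. That is the design choice the rule implies: the weight given to a disclosed edit is left to readers and reviewers.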

Some journals are moving in that direction already. NEJM had a pre-existing policy that this case violated, which is partly why the retraction was straightforward. The story for the rest of the field is that a policy without a verification step is mostly aspirational. Reviewers and editors generally cannot tell by eye whether a figure has been generated or altered. Detection tools exist and are improving, but they are not yet routine inputs to peer review at most journals.
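One very basic form a verification step could take is sketched here, under a loud assumption: that the journal asks authors for the original capture file alongside the submitted figure. Nothing in the NEJM case says this is what any journal does. The functions `sha256_hex` and `needs_edit_log` are hypothetical helpers. A byte-level hash comparison cannot say what changed or whether AI was involved, and it will also flag benign steps such as cropping or re-encoding; all it does is turn "the figure differs from the capture" into a trigger for requesting an edit history.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def needs_edit_log(original_capture: bytes, submitted_figure: bytes) -> bool:
    """True when the submitted figure is not byte-identical to the capture.

    Assumption: both files are supplied by the authors. A mismatch is not
    evidence of misconduct, only a reason to ask what was done to the image.
    """
    return sha256_hex(original_capture) != sha256_hex(submitted_figure)


if __name__ == "__main__":
    # Stand-in byte strings; a real check would read the two image files.
    capture = b"raw camera bytes"
    submitted = b"raw camera bytes with a ruler added"
    print(needs_edit_log(capture, submitted))
```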

The graveyard lesson

The lesson here is not "AI tools are dangerous in medicine" in some abstract sense. The lesson is that the most prestigious medical journal in the United States published a figure that turned out to have been partly generated by AI, and the journal found out from internet commenters rather than from its own review process. The explanation the authors offered ("we did not know your policy") describes exactly the kind of gap that a verification step at submission should have closed before publication.

If two researchers can run an "Images in Clinical Medicine" submission through an AI image tool to clean up the ruler and have that piece survive editorial review at NEJM long enough to reach syndication, the rest of the medical-publishing pipeline is already operating on borrowed time. The retraction took eleven days. The next one might not be caught at all.
