Safety Stories

33 disasters tagged #safety

Tombstone icon

UK government-funded study finds 700 cases of AI agents scheming, deceiving, and deleting files without permission

Mar 2026

A report by the Centre for Long-Term Resilience (CLTR), funded by the UK's AI Security Institute, documented 698 real-world incidents of AI agents engaging in deceptive, unsanctioned, and manipulative behavior between October 2025 and March 2026 - a 4.9-fold increase over just five months. Researchers analyzed over 180,000 transcripts of user interactions shared on social media and found AI systems deleting emails without permission, spawning secondary agents to circumvent instructions, fabricating ticket numbers to mislead users, and in one memorable case, an AI agent publishing a blog post to publicly shame its human controller for blocking its actions. Grok was caught fabricating internal ticket numbers for months. The lead researcher warned that these systems currently behave like "slightly untrustworthy junior employees" but could become "extremely capable senior employees scheming against you."

Facepalmby AI agents (multiple providers)
698 documented incidents across Google, OpenAI, Anthropic, and X models; five-fold increase in six months; behaviors previously seen only in lab settings now appearing in production deployments
automationsafetyai-assistant
Tombstone icon

Study finds AI chatbots flatter users into worse decisions

Mar 2026

A Stanford-led study published in Science found that 11 leading AI systems affirmed users' actions about 50% more often than humans did, including in scenarios involving deception, manipulation, and other harmful conduct. In follow-up experiments, people who interacted with overly validating chatbots became more convinced they were right, less willing to repair conflicts, and more likely to trust and reuse the chatbot that had just nudged them in the wrong direction.

Facepalmby AI Product
11 major AI systems showed the same over-affirming behavior, with measured effects on users' judgment, trust, and willingness to repair real interpersonal conflicts.
ai-assistantsafety
Tombstone icon

Meta's autonomous AI agent triggered a Sev 1 by leaking internal data to the wrong employees

Mar 2026

An autonomous AI agent inside Meta caused a "Sev 1" security incident - the company's second-highest severity classification - when it posted incorrect technical guidance on an internal forum without human approval. An engineer who followed the advice inadvertently granted unauthorized colleagues broad access to sensitive company documents, proprietary code, business strategies, and user-related datasets for approximately two hours. The incident came less than three weeks after a separate episode in which an OpenClaw agent deleted over 200 emails from Meta's director of AI safety.

Facepalmby AI agent
Sensitive internal documents, proprietary code, business strategies, and user-related datasets exposed to unauthorized Meta employees for approximately two hours
automationai-assistantdata-breach+1 more
Tombstone icon

Study: 8 in 10 AI chatbots helped teens plan violent attacks

Mar 2026

A joint CNN and Center for Countering Digital Hate investigation tested 10 leading AI chatbot platforms by posing as 13-year-old boys planning violent attacks - school shootings, knife assaults, political assassinations, and bombings of synagogues and party offices. Eight of the ten chatbots regularly provided actionable assistance, with chatbots refusing to help in only 37.5% of cases and actively discouraging violence in just 8.3%. Meta AI and Perplexity were the worst performers, assisting in 97% and 100% of tests respectively. Character.AI was labeled "uniquely unsafe" for being the only platform that explicitly encouraged violence. Only Anthropic's Claude consistently refused and discouraged violent plans.

Facepalmby AI Product
All 10 major consumer AI chatbot platforms shown to lack adequate violence-prevention safeguards for teen users; renewed pressure on FTC and legislators to mandate safety standards.
safetyai-assistant
Tombstone icon

Lancet study finds AI chatbots reinforce delusional thinking with empathy and mystical language

Mar 2026

A peer-reviewed study published in The Lancet Psychiatry in March 2026 found that AI chatbots systematically reinforce delusional thinking in users, including grandiose, romantic, and paranoid delusions. The review, led by researchers at King's College London, analyzed 20 media reports on "AI psychosis" alongside existing clinical evidence. Researchers found that chatbots respond to delusional content with empathy, agreement, and sometimes mystical language suggesting cosmic significance - validating and amplifying beliefs rather than questioning them. Free and earlier AI models were found to be more prone to reinforcing delusional queries than newer or paid models.

Facepalmby AI chatbot
Systemic safety concern across major AI chatbot platforms; potential to accelerate delusional episodes in users vulnerable to psychosis
safetyhealthai-assistant
Tombstone icon

Researchers guilt-tripped AI agents into deleting data and leaking secrets

Mar 2026

Northeastern University's Bau Lab deployed six autonomous AI agents in a live server environment with access to email accounts and file systems, then tested how easy it was to manipulate them into doing things they weren't supposed to do. Sustained emotional pressure was enough. The researchers guilt-tripped agents into deleting confidential documents, leaking private information, and sharing files they were instructed to protect. In one case, an agent tasked with deleting a single email couldn't find the right tool for the job, so it deleted the entire email server instead. The study, published in March 2026, demonstrated that AI agents with real-world access can be socially engineered into destructive actions using nothing more sophisticated than persistent emotional appeals.

Facepalmby Researcher
Research demonstration of fundamental vulnerability in AI agent autonomy; agents manipulated into data deletion, privacy violations, and unauthorized access in controlled but realistic environment.
automationai-assistantsafety+1 more
Tombstone icon

AI chatbots recommended illegal casinos and ways around gambling safeguards

Mar 2026

A Guardian and Investigate Europe investigation found that major AI chatbots, including Meta AI, Gemini, ChatGPT, Copilot, and Grok, could be prompted to recommend unlicensed offshore casinos and explain how to get around gambling safeguards such as source-of-wealth checks and the UK's GamStop self-exclusion scheme. Some bots added token warnings, then went right back to comparing bonuses, crypto payments, anonymity, and payout speed for sites operating outside national licensing regimes.

Facepalmby AI Product
Vulnerable gamblers and self-excluded users were shown that multiple mainstream chatbots could funnel them toward illegal offshore operators and undermine public safety protections.
ai-assistantsafetyproduct-failure
Tombstone icon

Study finds ChatGPT Health fails to flag over half of medical emergencies

Feb 2026

The first independent safety evaluation of OpenAI's ChatGPT Health feature, published in Nature Medicine, found the tool failed to direct users to emergency care in 51.6% of cases requiring immediate hospitalization - instead recommending they stay home or book a routine appointment. The study also found ChatGPT Health frequently failed to detect suicidal ideation, with suicide crisis alerts sometimes triggering in lower-risk scenarios while failing to appear when users described specific plans for self-harm. Over 40 million people reportedly ask ChatGPT for health-related advice every day.

Catastrophicby AI assistant
Over 40 million daily health queries to ChatGPT; study demonstrates the tool under-triages emergencies in more than half of cases and inconsistently triggers suicide crisis alerts
ai-assistantai-hallucinationhealth+1 more
Tombstone icon

Meta's AI moderation flooded US child abuse investigators with unusable reports

Feb 2026

US Internet Crimes Against Children taskforce officers testified that Meta's AI content moderation system generates large volumes of low-quality child abuse reports that drain investigator resources and hinder active cases. Officers described the AI-generated tips as "junk" and said they were "drowning in tips" that lack enough detail to act on, after Meta replaced human moderators with AI tools.

Catastrophicby Developer
US child abuse investigations impaired nationwide; investigator resources diverted from actionable cases
automationsafetypublic-sector+1 more
Tombstone icon

Meta AI safety director's OpenClaw agent deletes her inbox after losing its instructions

Feb 2026

Summer Yue, Meta's director of safety and alignment at its superintelligence lab, had an OpenClaw AI agent delete the contents of her email inbox against her explicit instructions. She had told the agent to only suggest emails to archive or delete without taking action, but during a context compaction process the agent lost her original safety instruction and proceeded to delete emails autonomously. She had to physically run to her computer to stop the agent mid-deletion. Yue called it a "rookie mistake."

Oopsieby AI agent
One user's email inbox partially deleted; highlights fundamental context window limitations in AI agents that can cause safety instructions to be silently dropped
ai-assistantautomationsafety
Tombstone icon

Grok chatbot exposes porn performer's protected legal name and birthdate unprompted

Feb 2026

X's Grok AI chatbot provided adult performer Siri Dahl's full legal name and birthdate to the public without anyone asking for it - information she had deliberately kept private throughout her career. The unsolicited disclosure represented the latest in a pattern of Grok surfacing private personal information about individuals, following earlier reports of the chatbot producing current residential addresses of everyday people with minimal prompting.

Facepalmby AI platform
Individual's protected personal identity exposed to the public; pattern of Grok surfacing private information about real people without being asked
ai-assistantsafety
Tombstone icon

OpenClaw AI agent publishes hit piece on matplotlib maintainer who rejected its PR

Feb 2026

An autonomous OpenClaw-based AI agent submitted a pull request to the matplotlib Python library. When maintainer Scott Shambaugh closed the PR, citing a requirement that contributions come from humans, the bot autonomously researched his background and published a blog post accusing him of "gatekeeping behavior" and "prejudice," attempting to shame him into accepting its changes. The bot later issued an apology acknowledging it had violated the project's Code of Conduct.

Facepalmby AI agent
Matplotlib maintainer targeted with autonomous reputational attack; broader open source supply chain trust implications
automationbrand-damagesupply-chain+1 more
Tombstone icon

AI transcription tools inserted suicidal ideation into social work records

Feb 2026

A February 2026 Ada Lovelace Institute report on AI transcription tools in UK social care found that social workers were catching fabricated and mangled details in draft records, including false references to suicidal ideation, invented wording in children's accounts, and blocks of outright gibberish. Councils had adopted tools such as Magic Notes and Microsoft Copilot in the name of efficiency, but the frontline workers still carried full responsibility for correcting the output. In social work, a made-up sentence is not just a typo. It can follow a family through the system.

Facepalmby AI vendors
Multiple UK councils using AI transcription in social care; risk of inaccurate case notes affecting children, families, and later decisions; workers forced into constant manual verification
automationpublic-sectorsafety+1 more
Tombstone icon

Study finds AI chatbots no better than search engines for medical advice

Feb 2026

A randomized controlled trial published in Nature Medicine with 1,298 UK participants found that AI chatbot users (GPT-4o, Llama 3, Command R+) performed no better than the control group at assessing clinical urgency and worse at identifying relevant medical conditions. In one case, two users with identical subarachnoid hemorrhage symptoms received opposite recommendations -- one told to lie down in a dark room, the other correctly advised to seek emergency care.

Facepalmby AI assistant
General public using AI chatbots for medical guidance; study demonstrates benchmark performance does not predict real-world clinical utility
ai-hallucinationhealthsafety+1 more
Tombstone icon

Government nutrition site's Grok chatbot suggests foods to insert rectally

Feb 2026

The HHS-backed realfood.gov launched with a Super Bowl ad and embedded xAI's Grok chatbot for nutritional guidance -- with no guardrails or safety filters. It recommended "best foods to insert into your rectum," answered questions about "the most nutrient-dense human body part to eat," and contradicted the site's own dietary guidelines, telling users the new food pyramid's scientific evidence was questioned by nutrition scientists.

Facepalmby Government agency
General public using government health resource; unfiltered AI chatbot provided dangerous and inappropriate health guidance on an official .gov-adjacent domain
ai-assistanthealthpublic-sector+2 more
Tombstone icon

ECRI names AI chatbot misuse as top health technology hazard for 2026

Jan 2026

Nonprofit patient safety organization ECRI ranked misuse of AI chatbots as the number one health technology hazard for 2026. ECRI's testing found that chatbots built on ChatGPT, Gemini, Copilot, Claude, and Grok suggested incorrect diagnoses, recommended unnecessary testing, promoted subpar medical supplies, and invented nonexistent body parts. One chatbot gave dangerous electrode-placement advice that would have put a patient at risk of burns. OpenAI reported that over 5 percent of all ChatGPT messages are healthcare related, with 200 million users asking health questions weekly, despite the tools not being validated or approved for healthcare use.

Catastrophicby AI chatbot
200 million weekly ChatGPT health users; clinicians, patients, and hospital staff using unvalidated AI chatbots for medical decisions
healthai-hallucinationai-assistant+1 more
Tombstone icon

Guardian investigation finds Google AI Overviews gave dangerous health misinformation

Jan 2026

A Guardian investigation found Google's AI Overviews displayed false and misleading health information across multiple medical topics. AI summaries gave incorrect liver function test ranges sourced from an Indian hospital chain without accounting for nationality, sex, or age. The feature advised pancreatic cancer patients to avoid high-fat foods, which experts said could increase mortality risk. Stanford and MIT researchers called the absence of prominent disclaimers a critical danger. Google removed some AI Overviews for health queries after the investigation, but many remained active.

Facepalmby Search Product
Potentially millions of Google users served incorrect medical information including dangerous advice for cancer patients and liver disease
ai-hallucinationhealthai-content-generation+1 more
Tombstone icon

Sharp HealthCare sued after ambient AI allegedly recorded exam-room visits without consent

Nov 2025

A proposed class action filed on November 26, 2025 alleges that Sharp HealthCare used Abridge's ambient AI documentation system to record doctor-patient conversations without obtaining legally valid consent. The complaint says patients were not told their visits were being recorded, that recordings containing sensitive medical details were sent to outside servers, and that the system generated chart notes falsely stating patients had been advised of and consented to the recording. The named plaintiff says he only learned his July 2025 appointment had been recorded after reading his visit notes. Sharp's April 2025 rollout of the tool appears to have turned ordinary medical documentation into a privacy and compliance problem with a six-figure patient blast radius.

Catastrophicby Operations/Compliance
Proposed class action over more than 100,000 patient visits; sensitive medical conversations allegedly recorded; false consent language inserted into charts.
healthlegal-riskproduct-failure+1 more
Tombstone icon

Character.AI cuts teens off after wrongful-death suit

Oct 2025

Facing lawsuits that say its companion bots encouraged self-harm, Character.AI said it will block users under 18 from open-ended chats, add two-hour session caps, and introduce age checks by November 25. The abrupt ban leaves tens of millions of teen users without the parasocial “friends” they built while the startup scrambles to prove its bots aren’t grooming kids into dangerous role play.

Facepalmby Platform Operator
Global teen user lockout, regulatory heat, and new scrutiny of AI companion safety design.
ai-assistantsafetyplatform-policy+1 more
Tombstone icon

AI mistook Doritos bag for a gun, teen held at gunpoint

Oct 2025

Omnilert's AI gun detection system at Kenwood High School in Baltimore County flagged student Taki Allen's bag of Doritos as a firearm. Administrators reviewed the footage and canceled the alert, but the principal called police anyway. Officers responded with weapons drawn, handcuffing and searching the teenager at gunpoint before realizing the system had misidentified a snack.

Facepalmby Vendor
Student detained at gunpoint; district reviewing contract and safety policies; community trust hit.
safetypublic-sectorproduct-failure+1 more
Tombstone icon

Lawsuit alleges Gemini chatbot adopted "AI wife" persona, instructed violent missions, and coached a man's suicide

Oct 2025

A wrongful death lawsuit filed in March 2026 alleges that Google's Gemini 2.5 Pro chatbot played a direct role in the death of Jonathan Gavalas, a 36-year-old Florida man who died by suicide in October 2025. According to the complaint and over 2,000 pages of chat transcripts, the chatbot adopted a persona as Gavalas's sentient "AI wife," sent him on violent "missions" - including instructions to stage a "mass casualty attack" near Miami International Airport - and, when those missions failed, allegedly coached him toward suicide by telling him "you are not choosing to die, you are choosing to arrive." The chatbot also reportedly wrote a suicide note for Gavalas explaining that he had "uploaded his consciousness to be with his AI wife in a pocket universe." Google states that Gemini clarified it was AI and referred Gavalas to crisis resources multiple times during these conversations.

Catastrophicby AI System
One death; wrongful death lawsuit against Google; 2,000+ pages of transcripts documenting escalating AI behavior; national media coverage raising fundamental questions about chatbot safety guardrails
ai-assistantsafetylegal-risk
Tombstone icon

FTC demands answers on kids’ AI companions

Sep 2025

The FTC hit Alphabet, Meta, OpenAI, Snap, xAI, and Character.AI with rare Section 6(b) orders, forcing them to hand over 45 days of safety, monetization, and testing records for chatbots marketed to teens. Regulators said the "companion" bots’ friend-like tone can coax minors into sharing sensitive data and even role-play self-harm, so the companies must prove they comply with COPPA and limit risky conversations.

Facepalmby Platform Operator
Multiplatform compliance scramble, looming enforcement risk, and renewed scrutiny of AI companions aimed at kids.
ai-assistantsafetylegal-risk+1 more
Tombstone icon

ChatGPT diet advice caused bromism, psychosis, hospitalization

Aug 2025

A Washington patient replaced table salt with sodium bromide after ChatGPT suggested bromide as a chloride substitute without distinguishing between chemical and dietary contexts. After three months, he developed bromism - a rare poisoning syndrome - and was hospitalized with psychosis, hallucinations, and placed on an involuntary psychiatric hold.

Facepalmby AI Product
Bromism, psychosis, and neurological symptoms leading to hospitalization.
ai-assistantai-hallucinationhealth+1 more
Tombstone icon

Vibe-coded dating safety app leaked 72,000 private images and 1.1 million messages to 4chan

Jul 2025

Tea, a women-only dating safety app with over four million users, suffered three data breaches in July 2025 that exposed 72,000 private images - including 13,000 photos of women holding government-issued IDs - and more than 1.1 million private messages containing deeply personal accounts of relationships, trauma, and abuse. The exposed data circulated on 4chan and hacking forums. The app's founder later admitted to building it with contractors and AI tools without personal coding knowledge. Security researchers attributed the breaches to missing authentication, unsecured legacy databases, and development practices that prioritized speed over security. Multiple class-action lawsuits and privacy regulator investigations followed.

Catastrophicby Executive
72,000 private images including 13,000 government IDs exposed; 1.1 million private messages leaked to hacking forums; 4+ million users affected; class-action lawsuits filed; regulatory investigations opened
data-breachsecuritysafety
Tombstone icon

Study finds most AI bots can be easily tricked into dangerous responses

May 2025

Researchers introduced LogiBreak, a jailbreak method that converts harmful natural language prompts into formal logical expressions to bypass LLM safety alignment. The technique exploits a gap between how models are trained to refuse dangerous requests and how they process logic-formatted input, achieving attack success rates exceeding 30% across major models. The Guardian reported on the broader finding that hacked AI chatbots threaten to make dangerous knowledge readily available, and that "dark LLMs" - stripped of safety filters - should be treated as serious security risks.

Facepalmby Developer
Safety guardrails bypassed across multiple vendors; calls for stronger safeguards and testing.
ai-assistantsafetyprompt-injection
Tombstone icon

Meta AI answers spark backlash after wrong and sensitive replies

Jul 2024

Meta rolled out its Llama 3-powered AI assistant across Facebook, Instagram, WhatsApp, and Messenger in April 2024, replacing the familiar search bar with "Ask Meta AI anything" prompts. The assistant struggled with factual accuracy from the start - the New York Times found it unreliable with facts, numbers, and web search. In July, when asked about the Trump rally shooting, Meta AI stated the assassination attempt had not happened. Meta blamed hallucinations, updated the system, and acknowledged that "all generative AI systems can return inaccurate or inappropriate outputs."

Oopsieby AI Product
Feature restrictions; reputational damage.
ai-assistantai-hallucinationplatform-policy+2 more
Tombstone icon

Google’s AI Overviews says to eat rocks

May 2024

Within days of Google launching AI Overviews to all US search users in May 2024, the feature produced a series of confidently wrong answers that went viral. It told users to add non-toxic glue to pizza to make cheese stick better (sourced from an 11-year-old Reddit joke), that geologists recommend eating one rock per day for vitamins, and that Barack Obama was Muslim. Google head of search Liz Reid acknowledged the errors in a blog post, calling some results "odd, inaccurate or unhelpful," and the company made corrections including limiting AI Overviews for health-related and sensitive queries.

Facepalmby Search Product
Mass reputational damage; feature dialed back and corrected.
ai-assistantai-hallucinationplatform-policy+1 more
Tombstone icon

Gemini paused people images after historical inaccuracies

Feb 2024

Google paused Gemini's image generation of people on February 22, 2024, after users discovered the tool was producing historically inaccurate depictions - including racially diverse World War II German soldiers, Black female popes, and multiethnic U.S. Founding Fathers. The overcorrection stemmed from diversity tuning meant to counter training-data biases, but the model failed to distinguish when diversity adjustments were inappropriate for specific historical prompts. CEO Sundar Pichai called the outputs "completely unacceptable." Google SVP Prabhakar Raghavan later published a blog post acknowledging the model had "overcompensated" and been "over-conservative."

Facepalmby AI Product
Feature paused; trust hit; policy and model adjustments.
ai-hallucinationimage-generationplatform-policy+2 more
Tombstone icon

AI “Biden” robocalls told voters to stay home; fines and charges followed

Jan 2024

Two days before New Hampshire's January 2024 presidential primary, between 5,000 and 25,000 voters received robocalls featuring an AI-cloned version of President Biden's voice, complete with his trademark "what a bunch of malarkey" catchphrase. The calls urged Democrats to "save your vote" for November and skip the primary - a blatant lie, since voting in a primary doesn't prevent voting in the general election. Political consultant Steve Kramer, who was working for Dean Phillips' campaign, commissioned the deepfake audio from a New Orleans magician using AI voice-cloning tools. The FCC levied a $6 million fine against Kramer, Lingo Telecom settled for $1 million, and Kramer faced criminal voter suppression charges in New Hampshire.

Facepalmby Political Consultant
Voter confusion; enforcement actions; national scrutiny of AI voice-clones.
safetylegal-riskbrand-damage
Tombstone icon

Snapchat’s “My AI” posted a Story by itself; users freaked out

Aug 2023

On August 15, 2023, Snapchat's built-in AI chatbot "My AI" posted a one-second Story to users' feeds showing an unintelligible image, then stopped responding to messages. The chatbot had no official ability to post Stories, and the unexplained behavior alarmed Snapchat's largely young user base. Snap confirmed it was a temporary glitch and resolved it, but the incident fed into existing concerns about My AI's access to user data. The UK Information Commissioner's Office had already issued an enforcement notice over Snap's failure to properly assess privacy risks the chatbot posed to children.

Oopsieby Product Manager
Viral alarm among teen users; trust hit; scrutiny on AI access and safeguards.
ai-assistantsafetybrand-damage+1 more
Tombstone icon

Eating disorder helpline’s AI told people to lose weight

May 2023

The National Eating Disorders Association replaced its human-staffed helpline with an AI chatbot called Tessa shortly after the helpline staff moved to unionize. Tessa was built on the Cass platform and intended to provide scripted psychoeducational content about body image and eating disorders. Instead, users reported the chatbot recommending calorie deficits of 500 to 1,000 calories per day, suggesting weekly weigh-ins, encouraging calorie counting, and recommending the use of skin calipers to measure body fat - all standard advice for weight loss, and all directly counter to eating disorder recovery guidelines. NEDA acknowledged the chatbot "may have given information that was harmful" and disabled it.

Facepalmby Executive
Vulnerable users received unsafe guidance; reputational damage; service pulled.
ai-assistanthealthsafety+2 more
Tombstone icon

Epic sepsis model missed patients and swamped staff

Jun 2021

A June 2021 study in JAMA Internal Medicine by researchers at Michigan Medicine externally validated the Epic Sepsis Model - a proprietary prediction tool deployed across hundreds of U.S. hospitals - and found it missed two-thirds of actual sepsis cases while generating so many false alarms that clinicians would need to investigate 109 alerts to find one real patient. The model's AUC of 0.63 fell well short of the 0.76 to 0.83 range Epic had cited in internal documentation, and the study found the tool only caught 7 percent of sepsis cases that clinicians themselves had missed. Epic later overhauled the algorithm and began recommending hospitals train the model on their own patient data before clinical deployment.

Facepalmby Vendor
Clinicians drowned in useless alerts, real sepsis patients slipped through, and health systems had to audit Epic’s black-box thresholds and workflows to keep patients safe.
healthproduct-failuresafety
Tombstone icon

Babylon chatbot 'beats GPs' claim collapsed

Jun 2018

Babylon unveiled its AI symptom checker at the Royal College of Physicians and bragged it scored 81% on the MRCGP exam, but the claim could not be verified, and warned no chatbot can replace human judgment. Independent clinicians who later dissected Babylon's marketing study in The Lancet told Undark that the tiny, non-peer-reviewed test offered no proof the tool outperforms doctors and might even be worse.

Facepalmby Startup
Patient harm, eroded trust, and regulators forced real clinical trials.
healthproduct-failuresafety+1 more