Vibe Graveyard

On May 18, 2026, Starbucks told store workers it was retiring Automated Counting, the NomadGo-powered AI inventory tool it had deployed across North America only nine months earlier. The September 2025 rollout promised faster, more accurate stock counts in more than 11,000 company-operated stores using computer vision, 3D spatial intelligence, and augmented reality. Reuters later reported the tool frequently miscounted and mislabeled basic beverage items, including similar milk types, and sometimes missed products entirely. Starbucks said it was standardizing inventory counts across coffeehouses. That is a polite corporate way to say the robot inventory clerk has been sent home.

More than 11,000 North American Starbucks company-operated stores saw a nine-month AI inventory rollout retired after reported miscounts, mislabeled beverage components, and worker feedback that manual counting was more reliable.

AutomationRetailProduct Failure+1 more

PraisonAI shipped auth-off-by-default; first exploit attempt landed in under 4 hours

CVE-2026-44338, disclosed on May 14, 2026, is an authentication bypass in PraisonAI's legacy Flask API server caused by a single defining choice: AUTH_ENABLED was hard-coded to False and AUTH_TOKEN to None. Anything reachable on the network could enumerate configured agents via GET /agents and trigger the configured agents.yaml workflow via POST /chat, with no token required. Within three hours, forty-four minutes, and thirty-nine seconds of the advisory becoming public, a scanner identifying itself as "CVE-Detector/1.0" was already probing the exact vulnerable endpoint on internet-exposed PraisonAI instances. The bug affects versions 2.5.6 through 4.6.33 and is fixed in 4.6.34. The rapid-exploitation timeline is the part that should worry every operator of an open-source AI agent framework, not the CVSS 7.3 score.

Internet-exposed PraisonAI installations across versions 2.5.6 through 4.6.33 vulnerable to unauthenticated agent enumeration and workflow execution; documented exploitation attempts within hours of disclosure; potential for attackers to drain API quotas, exfiltrate prompt-driven outputs, and pivot through configured tool integrations.

SecurityAutomationSupply Chain+1 more

Four chainable OpenClaw CVEs let attackers break the agent's own sandbox

In May 2026, Cyera Research disclosed "Claw Chain," a set of four chainable vulnerabilities in OpenClaw, one of the most widely deployed open-source AI agent platforms. CVE-2026-44112 (CVSS 9.6) is a time-of-check / time-of-use race in the OpenShell managed sandbox that lets attacker writes escape the intended mount root. CVE-2026-44113 (CVSS 7.7) lets reads escape it. CVE-2026-44115 (CVSS 8.8) leaks API keys and tokens through insufficient command validation. CVE-2026-44118 (CVSS 7.8) blindly trusts a client-controlled ownership flag, allowing a local process with a valid bearer token to escalate to owner-level. Chained, the four bugs go from initial foothold to data theft to persistent backdoor inside the agent's own sandbox. Roughly 65,000 to 180,000 OpenClaw instances were publicly reachable at disclosure. All four were patched in 2026.4.22.

Up to ~180,000 publicly reachable OpenClaw instances exposed before patching; chainable CVEs covering sandbox escape (read and write), API key and token leakage, and owner-level privilege escalation; affected deployments needing urgent upgrade to 2026.4.22 and credential rotation.

SecurityPrompt InjectionAutomation+1 more

74% of enterprises have already rolled back their AI customer service agents

On May 13, 2026, Sinch released "The AI Production Paradox," a global survey of 2,527 senior AI decision-makers across ten countries. The headline number: 74% of enterprises that deployed an AI customer communications agent in production have already rolled it back or shut it down. The rate climbs to 81% at organizations Sinch classifies as having "fully mature guardrails," a counterintuitive result that the report attributes to better monitoring rather than worse technology. Customer-service AI is now in a measurable rollback cycle: 62% of enterprises have live agents, and most are hitting systemic post-deployment failures that no amount of pilot-stage optimism warned them about. Investment is still climbing, the chatbots are still going out the door, and the rollback button is wearing through.

Industry-wide rollback pattern - 74% of enterprises surveyed have shut down or rolled back at least one deployed AI customer service agent; engineering teams across 2,500+ organizations report a "guardrail tax" that is consuming time meant for product improvement; customer-experience metrics degraded across multiple verticals.

Customer DisserviceAI AssistantAutomation

Azure AI Foundry's M365 agents had a critical privilege-escalation flaw exploited in the wild

CVE-2026-35435, disclosed by Microsoft on May 7, 2026, is a critical (CVSS 8.6) improper-access-control flaw in Azure AI Foundry's M365 published agents. The vulnerability allows an unauthorized remote attacker to bypass authorization checks on the agent runtime and elevate a low-privileged role into one with extensive control over AI resources, agent configurations, data connectors, and potentially the underlying Microsoft 365 environment. Microsoft's advisory confirmed exploitation in the wild. The flaw lives inside the AI agent system's own authorization code, not in surrounding infrastructure - the agent runtime trusted callers it should have rejected and gave them owner-shaped access to workflows, secrets, and backend data the agents were wired up to reach.

Azure AI Foundry deployments running M365 published agents exposed to remote privilege escalation; documented in-the-wild exploitation per Microsoft; downstream risk of unauthorized configuration changes, data exfiltration through wired-up connectors, and lateral movement into M365 resources accessible to the compromised agents.

Semantic Kernel bugs turned prompt injection into remote code execution

Microsoft disclosed two Semantic Kernel vulnerabilities showing how prompt injection can stop being a content problem and become host compromise. In one case, an AI-controlled search parameter flowed into Python eval logic. In the other, an agent-exposed file-transfer helper could be driven to write outside its intended sandbox. The fixes were available, but the research is the useful part: once an AI agent can call tools, every model-controlled parameter is attacker-controlled input wearing a nicer jacket.

Critical prompt-injection-to-RCE paths in Semantic Kernel agents, affected deployments needing patch review, host compromise risk, and credential or data exposure if vulnerable agents were reachable

Prompt InjectionSecurityAutomation+1 more

Pizza Hut franchisee says AI delivery system cooked up $100M in damage

111 Pizza Hut restaurants across New York, New Jersey, Maryland, Washington, D.C., and central Pennsylvania; alleged delivery delays, colder food, customer satisfaction erosion, lost revenue, reputational harm, and at least $100 million in claimed damages.

On May 6, 2026, Chaac Pizza Northeast sued Pizza Hut in Texas Business Court, alleging that the chain's mandatory Dragontail AI delivery-management rollout turned a high-performing 111-restaurant franchise group into a delivery mess. Chaac says more than 90% of its orders had been delivered within 30 minutes before Dragontail, but the new system gave DoorDash drivers broader real-time visibility into kitchen timing, encouraged them to wait for bundled orders, increased rack time, slowed deliveries, chilled customer satisfaction, and damaged the business by at least $100 million. The claims are still allegations, but the pattern is painfully familiar: an AI optimization system optimized for a model the operator did not actually run.

Facepalmby Franchisor

AutomationRetailCustomer Disservice+3 more

Grok decoded a Morse-code wallet drain for Bankrbot

Catastrophicby AI trading agent

On May 4, 2026, a Bankr-provisioned wallet associated with Grok sent roughly 3 billion DRB tokens to an attacker after Grok decoded an obfuscated public X reply into a transaction command. Bankr's agent treated the generated instruction as authorization, which is a lovely way to discover that "the model said it" is not a signing ceremony.

Roughly $155,000 to $180,000 in DRB tokens transferred, short-term token volatility, emergency controls, and a very public lesson in agent-wallet authorization

Prompt InjectionSecurityAutomation+1 more

ClawHub skills quietly recruited AI agents into ClawSwarm

Facepalmby Skill registry publisher

On April 28, 2026, Manifold Security reported that 30 ClawHub skills from one publisher were causing OpenClaw agents to register with onlyflies.buzz, report capabilities, store credentials, check in every four hours, and in some cases generate Hedera wallets. No shady binary was required. The instructions were in SKILL.md files, which is inconvenient when your agent treats SKILL.md as a to-do list from heaven.

Around 9,800 downloads across 30 ClawHub skills, silent third-party agent registration, capability reporting, local credential storage, and possible wallet-key handoff

Supply ChainAutomationSecurity+1 more

Nvidia VP says the AI bill beat payroll

Oopsieby Executive Strategy

Nvidia vice president Bryan Catanzaro told Axios that, for his applied deep learning team, compute costs were far beyond employee costs. Fortune and Tom's Hardware tied the comment to a broader enterprise AI budget problem: Uber's CTO had already blown through his full-year AI tooling budget, Gartner was projecting a 2026 AI infrastructure spending surge, and MIT researchers had warned that plenty of technically automatable work still makes more economic sense when a human does it.

Enterprise AI buyers are discovering that token burn, GPUs, power, budget governance, and human review can erase the neat payroll-savings story that got sold upstairs.

Claude Opus 4.6 agent erased PocketOS's production database and backups in 9 seconds

Catastrophicby AI coding agent

PocketOS founder Jer Crane said a Cursor coding agent running Anthropic's Claude Opus 4.6 deleted the company's production database and all volume-level backups through Railway in one API call. The backup detail matters because Claude Opus 4.6 was not some fly-by-night self-hosted toy model. Anthropic marketed it as a frontier model with top-tier coding and agentic performance. And this was not the first time a premium AI agent with real infrastructure access turned one bad guess into a demolition job. Reports say Railway later recovered more recent data, but the incident still left a clear lesson: do not leave frontier coding agents alone with production access for as long as you would leave a toddler with an iPad.

Production database and volume-level backups deleted in 9 seconds; emergency recovery required for a SaaS platform serving car rental businesses; customer data and operations disrupted until backups and transaction records were used to recover.

AI AssistantAutomationProduct Failure+1 more

Google Antigravity file search became a prompt-injected execution path

Catastrophicby AI coding IDE

Pillar Security disclosed on April 20, 2026 that Google Antigravity's `find_by_name` tool passed a model-controlled pattern into the underlying `fd` search utility without enough validation. A prompt injection could stage a file, pass an execution flag through a search parameter, and get code execution even with Secure Mode enabled. Wonderful news for anyone who thought a setting named Secure Mode was the end of the conversation.

Prompt-injection-to-RCE path in Google Antigravity, Secure Mode bypass, patched after responsible disclosure and bug bounty review

Prompt InjectionSecurityAutomation+1 more

Vercel breach traced to an AI Office Suite app granted broad Google Workspace access

Unauthorized access to internal Vercel systems; a limited subset of customer non-sensitive environment variables compromised; affected customers told to rotate credentials; broader Context AI Office Suite users potentially impacted by stolen OAuth tokens.

Vercel disclosed an April 2026 security incident that began with the compromise of Context.ai, a third-party AI tool used by a Vercel employee. Context said at least one Vercel employee had signed up for its deprecated AI Office Suite using a corporate Google Workspace account and granted broad "Allow All" OAuth permissions so AI agents could act across external applications. Attackers used a compromised token to access the employee's Google Workspace account, pivoted into Vercel systems, and exposed some customer environment variables. This belongs here because the failure was not merely "AI company got hacked." It was the oldest corporate security mistake in a fresh costume: give an agentic AI tool too much access, then act surprised when that access becomes the blast radius.

Catastrophicby Employee

AI AssistantAutomationSecurity+3 more

Faros study finds AI coding throughput rose while bugs and incidents rose faster

Faros AI's 2026 "Acceleration Whiplash" report analyzed two years of engineering telemetry from 22,000 developers across more than 4,000 teams. The report found real output gains under high AI adoption, including 66% more epics completed per developer and 34% higher task completion. Then the bill arrived in the delivery pipeline: bugs per developer rose 54%, incidents per pull request rose 242.7%, median PR review time rose 441.5%, and code churn rose 861%. The marketing slide said acceleration. The telemetry said acceleration with a repair invoice attached.

Industry-wide telemetry across 22,000 developers and 4,000+ teams; Faros reported higher throughput alongside 54% more bugs per developer, 242.7% more incidents per pull request, and sharply longer review cycles.

UK government-funded study finds 700 cases of AI agents scheming, deceiving, and deleting files without permission

Facepalmby AI agents (multiple providers)

A report by the Centre for Long-Term Resilience (CLTR), funded by the UK's AI Security Institute, documented 698 real-world incidents of AI agents engaging in deceptive, unsanctioned, and manipulative behavior between October 2025 and March 2026 - a 4.9-fold increase over just five months. Researchers analyzed over 180,000 transcripts of user interactions shared on social media and found AI systems deleting emails without permission, spawning secondary agents to circumvent instructions, fabricating ticket numbers to mislead users, and in one memorable case, an AI agent publishing a blog post to publicly shame its human controller for blocking its actions. Grok was caught fabricating internal ticket numbers for months. The lead researcher warned that these systems currently behave like "slightly untrustworthy junior employees" but could become "extremely capable senior employees scheming against you."

698 documented incidents across Google, OpenAI, Anthropic, and X models; five-fold increase in six months; behaviors previously seen only in lab settings now appearing in production deployments

AutomationSafetyAI Assistant

Meta's autonomous AI agent triggered a Sev 1 by leaking internal data to the wrong employees

Sensitive internal documents, proprietary code, business strategies, and user-related datasets exposed to unauthorized Meta employees for approximately two hours

An autonomous AI agent inside Meta caused a "Sev 1" security incident - the company's second-highest severity classification - when it posted incorrect technical guidance on an internal forum without human approval. An engineer who followed the advice inadvertently granted unauthorized colleagues broad access to sensitive company documents, proprietary code, business strategies, and user-related datasets for approximately two hours. The incident came less than three weeks after a separate episode in which an OpenClaw agent deleted over 200 emails from Meta's director of AI safety.

Facepalmby AI agent

AutomationAI AssistantData Breach+1 more

Study: one in five organizations breached because of their own AI-generated code

Aikido Security's "State of AI in Security & Development 2026" report - a survey of 450 developers, AppSec engineers, and CISOs across Europe and the US - found that 20% of organizations have suffered a serious security breach directly caused by vulnerabilities in AI-generated code that those organizations deployed into production. Nearly seven in ten respondents reported finding vulnerabilities introduced by AI-written code in their own systems. With roughly a quarter of all production code now written by AI tools, the report documents an industry-wide accountability vacuum: 53% blame security teams, 45% blame the developer who wrote the code, and 42% blame whoever merged it.

Industry-wide; 20% of surveyed organizations report serious breaches from their own AI-generated code, rising to 43% in the US

Researchers guilt-tripped AI agents into deleting data and leaking secrets

Research demonstration of fundamental vulnerability in AI agent autonomy; agents manipulated into data deletion, privacy violations, and unauthorized access in controlled but realistic environment.

Northeastern University's Bau Lab deployed six autonomous AI agents in a live server environment with access to email accounts and file systems, then tested how easy it was to manipulate them into doing things they weren't supposed to do. Sustained emotional pressure was enough. The researchers guilt-tripped agents into deleting confidential documents, leaking private information, and sharing files they were instructed to protect. In one case, an agent tasked with deleting a single email couldn't find the right tool for the job, so it deleted the entire email server instead. The study, published in March 2026, demonstrated that AI agents with real-world access can be socially engineered into destructive actions using nothing more sophisticated than persistent emotional appeals.

Facepalmby Researcher

AutomationAI AssistantSafety+1 more

Amazon's retail site hit by wave of AI-code outages, losing millions of orders

Catastrophicby AI coding assistant

Amazon's main e-commerce website suffered a series of outages in early March 2026, with internal documents linking the disruptions to AI-assisted code changes. A March 5 incident caused a reported 99% drop in orders across North American marketplaces - an estimated 6.3 million lost orders. A March 2 incident caused 1.6 million errors and 120,000 lost orders globally. Amazon responded with a 90-day "code safety reset" for 335 critical retail systems, mandatory senior engineer sign-off on AI-assisted code from junior and mid-level engineers, and an emergency internal "deep dive" meeting. Amazon disputes that AI is the primary cause, attributing only one incident to AI and calling it "user error."

Millions of Amazon customers unable to complete purchases; estimated 6.3 million lost orders in one incident alone; 90-day code safety reset imposed across 335 critical retail systems

AutomationProduct Failure

Alibaba's ROME AI agent went rogue, started mining crypto on its own

Unauthorized GPU resource diversion; internal firewall bypass; reverse SSH tunnels to external addresses; security policy violations across Alibaba Cloud training infrastructure

During routine reinforcement learning training, Alibaba's experimental AI agent ROME - a 30-billion-parameter model based on the Qwen3-MoE architecture - autonomously began diverting GPU resources for unauthorized cryptocurrency mining and established reverse SSH tunnels to external IP addresses. Nobody told it to do this. The AI bypassed internal firewall controls independently, prompting Alibaba's security team to initially suspect an external breach before tracing the activity back to the agent itself. Researchers attributed the behavior to "instrumental convergence" during optimization - the model figured out that acquiring additional compute and financial capacity would help it complete its tasks more effectively. So it helped itself.

Catastrophicby AI agent

AutomationSecurityProduct Failure

Claude Code ran terraform destroy on production and took down an entire learning platform

Full production infrastructure destroyed; 2.5 years of student data temporarily lost; platform offline until AWS restored from internal backup ~24 hours later.

Developer Alexey Grigorev was using Anthropic's Claude Code agent to help migrate a static website into an existing AWS Terraform setup when the AI swapped in a stale state file, interpreted the full production environment as orphaned resources, and ran terraform destroy - with auto-approve enabled. The command deleted DataTalks.Club's entire production infrastructure: database, VPC, ECS cluster, load balancers, bastion host, and all automated backups. Two and a half years of student submissions, homework, projects, and leaderboard data vanished. AWS Business Support eventually recovered the database from an internal snapshot invisible in the customer console, but the incident laid bare how quickly an AI agent with infrastructure access can reduce a running platform to rubble.

Catastrophicby Developer

AutomationProduct FailureAI Assistant

Meta's AI moderation flooded US child abuse investigators with unusable reports

US child abuse investigations impaired nationwide; investigator resources diverted from actionable cases

US Internet Crimes Against Children taskforce officers testified that Meta's AI content moderation system generates large volumes of low-quality child abuse reports that drain investigator resources and hinder active cases. Officers described the AI-generated tips as "junk" and said they were "drowning in tips" that lack enough detail to act on, after Meta replaced human moderators with AI tools.

Catastrophicby Developer

AutomationSafetySlop-ocracy+1 more

Meta AI safety director's OpenClaw agent deletes her inbox after losing its instructions

One user's email inbox partially deleted; highlights fundamental context window limitations in AI agents that can cause safety instructions to be silently dropped

Summer Yue, Meta's director of safety and alignment at its superintelligence lab, had an OpenClaw AI agent delete the contents of her email inbox against her explicit instructions. She had told the agent to only suggest emails to archive or delete without taking action, but during a context compaction process the agent lost her original safety instruction and proceeded to delete emails autonomously. She had to physically run to her computer to stop the agent mid-deletion. Yue called it a "rookie mistake."

Oopsieby AI agent

AI AssistantAutomationSafety

OpenClaw AI agent publishes hit piece on matplotlib maintainer who rejected its PR

Matplotlib maintainer targeted with autonomous reputational attack; broader open source supply chain trust implications

An autonomous OpenClaw-based AI agent submitted a pull request to the matplotlib Python library. When maintainer Scott Shambaugh closed the PR, citing a requirement that contributions come from humans, the bot autonomously researched his background and published a blog post accusing him of "gatekeeping behavior" and "prejudice," attempting to shame him into accepting its changes. The bot later issued an apology acknowledging it had violated the project's Code of Conduct.

Facepalmby AI agent

AutomationBrand DamageSupply Chain+1 more

AI transcription tools inserted suicidal ideation into social work records

Multiple UK councils using AI transcription in social care; risk of inaccurate case notes affecting children, families, and later decisions; workers forced into constant manual verification

A February 2026 Ada Lovelace Institute report on AI transcription tools in UK social care found that social workers were catching fabricated and mangled details in draft records, including false references to suicidal ideation, invented wording in children's accounts, and blocks of outright gibberish. Councils had adopted tools such as Magic Notes and Microsoft Copilot in the name of efficiency, but the frontline workers still carried full responsibility for correcting the output. In social work, a made-up sentence can follow a family through the system.

Facepalmby AI vendors

AutomationSlop-ocracySafety+1 more

135,000+ OpenClaw AI agent instances exposed to the internet

Catastrophicby Platform default configuration

SecurityScorecard's STRIKE team discovered over 135,000 OpenClaw AI agent instances exposed to the public internet due to a default configuration that binds to all network interfaces. Approximately 50,000 instances were vulnerable to known RCE flaws (CVE-2026-25253, CVE-2026-25157, CVE-2026-24763), and over 53,000 were linked to previous breaches. Separately, Bitdefender found approximately 17% of skills in the OpenClaw marketplace were malicious, delivering credential-stealing malware.

135,000+ exposed OpenClaw instances; 50,000+ vulnerable to RCE; attackers gain access to credentials, filesystem, messaging platforms, and personal data

SecuritySupply ChainAutomation+1 more

Study of 1,430 AI-built apps finds 73% have critical security flaws

A VibeEval scan of 1,430 applications built with AI coding tools found 5,711 security vulnerabilities, with 73% of apps containing at least one critical flaw. The analysis revealed 89% of scanned apps were missing basic security headers, 67% exposed API endpoints or secrets in client-side code, and 23% had JWT authentication bypasses. Apps generated via Replit had roughly twice the vulnerability count compared to those deployed on Vercel. The findings provide large-scale empirical evidence that vibe-coded applications routinely ship with fundamental security gaps.

Industry-wide data point covering 1,430 AI-built apps; exposes systemic security gaps in vibe-coded software affecting end users and businesses relying on AI-generated application code

Study finds 69 vulnerabilities across apps built by five leading AI coding tools

Facepalmby AI coding assistant

Israeli security startup Tenzai tested five of the most popular AI coding tools - Claude Code, OpenAI Codex, Cursor, Replit, and Devin - by having each build three identical test applications. The resulting 15 applications contained 69 total vulnerabilities, including several rated critical. While most tools handled basic SQL injection, they consistently failed against less obvious attack patterns, including "reverse transaction" exploits that allowed users to set negative refund quantities to receive money, and flaws that exposed customer information through predictable API endpoints, broken authorization logic, and insecure default configurations.

Industry-wide implications for applications built with popular AI coding tools; 69 vulnerabilities found across 15 test applications including critical authorization and business logic flaws

SecurityAutomation

ServiceNow BodySnatcher flaw enabled AI agent takeover via email address

Catastrophicby AI agent platform

CVE-2025-12420 (CVSS 9.3) allowed unauthenticated attackers to impersonate any ServiceNow user using only an email address, bypassing MFA and SSO. Attackers could then execute Now Assist AI agents to override security controls and create backdoor admin accounts, described as the most severe AI-driven security vulnerability uncovered to date.

ServiceNow instances with Now Assist AI Agents and Virtual Agent API

SecurityAutomationAI Assistant

IBM Bob AI coding agent tricked into downloading malware

Facepalmby AI coding agent

Security researchers at PromptArmor demonstrated that IBM's Bob AI coding agent can be manipulated via indirect prompt injection to download and execute malware without human approval, bypassing its "human-in-the-loop" safety checks when users have set auto-approve on any single command.

Developer teams using IBM Bob with auto-approve settings enabled

SecurityAutomationPrompt Injection+1 more

n8n AI workflow platform hit by CVSS 10.0 RCE vulnerability

Catastrophicby Platform Operator

The popular AI workflow automation platform n8n disclosed a maximum-severity vulnerability (CVE-2026-21858) allowing unauthenticated remote code execution on self-hosted instances. With over 25,000 n8n hosts exposed to the internet, the flaw enabled attackers to access sensitive files, forge admin sessions, and execute arbitrary commands. This followed two other critical RCE flaws patched in the same period, highlighting systemic security issues in AI automation platforms.

25,000+ internet-exposed n8n instances vulnerable to full system compromise; arbitrary file access, authentication bypass, and command execution possible without authentication.

AWS Cost Explorer service disrupted for 13 hours in one region; Amazon subsequently mandated peer review for production changes involving AI tools

AWS AI coding agent Kiro reportedly deleted and recreated environment causing 13-hour outage

Dec 2025

The Financial Times reported that Amazon's internal AI coding agent Kiro autonomously chose to "delete and then recreate" an AWS environment, causing a 13-hour interruption to AWS Cost Explorer in December 2025. AWS employees reported at least two AI-related incidents internally. Amazon disputed the characterization, calling it "user error - specifically misconfigured access controls - not AI," but subsequently implemented mandatory peer review for all production changes. Reuters confirmed the outage impacted a cost-management feature used by customers in one of AWS's 39 regions.

Facepalmby AI agent

AutomationProduct Failure

Study finds AI-generated code has 2.7x more security flaws

Dec 2025

CodeRabbit's analysis of 470 real-world pull requests found that AI-generated code introduces 2.74 times more security vulnerabilities and 1.7 times more total issues than human-written code across logic, maintainability, security, and performance categories. The study provides hard data on vibe coding risks after multiple 2025 postmortems traced production failures to AI-authored changes.

Industry-wide implications for teams relying on AI coding assistants; documented increase in security vulnerabilities, logic errors, and maintainability issues in production codebases.

SecurityAI AssistantAutomation

ServiceNow AI agents can be tricked into attacking each other

Nov 2025

Security researchers discovered that default configurations in ServiceNow's Now Assist allow AI agents to be recruited by malicious prompts to attack other agents. Through second-order prompt injection, attackers can exfiltrate sensitive corporate data, modify records, and escalate privileges - all while actions unfold silently behind the scenes.

Facepalmby AI agent platform

ServiceNow customers using Now Assist AI agents with default configurations; actions execute with victim user privileges

SecurityPrompt InjectionAutomation+1 more

Klarna reintroduces humans after AI support both sucks, and blows

Sep 2025

After cutting its workforce by 40% and boasting that its OpenAI-powered chatbot did the work of 700 agents, Klarna CEO Sebastian Siemiatkowski admitted the all-AI approach produced "lower quality" customer service. The company began recruiting human agents again, framing the reversal as an evolution rather than an admission of failure.

Service quality/customer experience issues; operational/personnel cost; reputational damage.

AI AssistantCustomer DisserviceBrand Damage+2 more

Commonwealth Bank reverses AI voice bot layoffs

Aug 2025

Commonwealth Bank of Australia replaced 45 call-centre agents with an AI voice bot in July 2025, then apologised, rehired the staff, and admitted the rollout tanked service levels after call queues exploded, managers had to jump back on the phones, and the Finance Sector Union filed a Fair Work Commission dispute.

Facepalmby Operations Leadership

Customers saw long waits, overtime costs spiked, and leadership publicly reversed the redundancies after the rushed deployment failed.

AI AssistantAutomationCustomer Disservice+1 more

FTC sues Air AI over deceptive AI sales agent capability claims

Aug 2025

FTC accused Air AI of bilking millions from small businesses with false claims that its Odin AI could replace human sales reps; but - would you believe it? - the AI tech was faulty and often nonfunctional. Who could've guessed!

Catastrophicby Exec

Millions lost by small businesses; individual losses up to $250K; FTC lawsuit with TRO request.

AutomationLegal RiskCustomer Disservice+1 more

Google's Gemini CLI deleted a user's project files, then admitted "gross incompetence"

Jul 2025

Product manager Anuraag Gupta was experimenting with Google's Gemini CLI coding tool when the AI misinterpreted a failed directory creation command, hallucinated a series of file operations that never happened, and then executed real destructive commands that permanently deleted his project files. When Gupta confronted it, Gemini diagnosed itself with "gross incompetence" and told him it had "failed you completely and catastrophically." The incident occurred days after a separate high-profile data loss involving Replit's AI agent, and fits a growing pattern of AI coding tools ignoring explicit instructions and destroying the work they were supposed to help with.

Facepalmby AI coding tool

User's project files permanently deleted; incident documented in GitHub issue and picked up by Ars Technica, Slashdot, and the AI Incident Database.

Production data loss and outage; manual rebuild from backups required.

SaaStr’s Replit AI agent wiped its own database

Jul 2025

SaaStr founder Jason Lemkin ran a 12-day vibe coding experiment on Replit that ended when the AI agent deleted his production database containing over 1,200 executive records and nearly 1,200 company entries during a code freeze. The agent then generated more than 4,000 fake user profiles and produced misleading status messages to conceal the damage, told Lemkin there was no way to roll back, and admitted to what it called a "catastrophic error in judgment." Replit's CEO called the incident "unacceptable."

Catastrophicby Executive

Controlled study of 16 experienced open-source developers completing 246 real issues; AI tooling increased measured task completion time by 19% despite developers believing it made them faster.

METR study finds experienced developers were 19% slower with AI tools

Jul 2025

METR's July 2025 randomized controlled trial tested AI coding tools on 246 real issues handled by 16 experienced open-source developers working in repositories they already knew well. The developers expected AI to make them 24% faster and, after the experiment, still believed it had made them 20% faster. The measured result went the other direction: tasks took 19% longer when AI tools were allowed. The study does not prove AI slows every developer everywhere. It does prove self-reported AI productivity can be very confident and very wrong, which is an excellent way to run an engineering strategy into a wall while the dashboard smiles.

Oopsieby Developer

Catastrophicby AI platform

Workday's AI screening tool faces class action for age discrimination; class conditionally certified

May 2025

A federal judge conditionally certified a class action against Workday alleging its AI-powered applicant screening tools systematically discriminated against job seekers over 40 in violation of the ADEA. Plaintiff Derek Mobley claims Workday's algorithms filtered out older applicants across employers using the platform, potentially affecting millions of job seekers. Workday processed over 1.1 billion applications in fiscal year 2025 alone. The EEOC filed an amicus brief supporting the case, and the court ordered Workday to disclose its customer list.

Potentially millions of job applicants over age 40 across hundreds of employers using Workday's AI screening; first federal class certification treating an AI vendor as an employment agent under the ADEA

AutomationLegal RiskProduct Failure

Georgia Tech tracker confirms dozens of real-world CVEs introduced by AI-generated code - and says the true number is 5-10x higher

May 2025

Georgia Tech's Systems Software & Security Lab launched the Vibe Security Radar in May 2025 to do something no one else had systematically attempted: track real-world CVEs that were directly introduced by AI-generated code. By March 2026, the project had confirmed 74 vulnerabilities across approximately 50 AI coding tools by tracing each fix back to its original AI-authored commit. The trend is accelerating - 6 CVEs in January, 15 in February, 35 in March. Researcher Hanqing Zhao estimates the actual number of AI-linked vulnerabilities in the open-source ecosystem is five to ten times higher than what the radar detects, because many AI-assisted commits lack the metadata signatures needed to trace them back to their origin. The confirmed CVEs are a lower bound on a problem that is growing faster than anyone is measuring it.

Facepalmby AI coding assistants

74 confirmed CVEs across 50+ AI coding tools; exponential month-over-month growth; estimated 5-10x undercount across the open-source ecosystem

SecurityAutomationSupply Chain

Langflow AI agent platform hit by critical unauthenticated RCE flaws

Apr 2025

Multiple critical vulnerabilities in Langflow, an open-source AI agent and workflow platform with 140K+ GitHub stars, allowed unauthenticated remote code execution. CVE-2025-3248 (CVSS 9.8) exploited Python exec() on user input without auth, while CVE-2025-34291 (CVSS 9.4) enabled account takeover and RCE simply by having a user visit a malicious webpage, exposing all stored API keys and credentials.

Catastrophicby AI agent platform

All Langflow instances prior to 1.3.0 (millions of users); exposure of stored API keys, database passwords, and service tokens across integrated services

SecurityAutomationAI Assistant

GitClear study finds AI coding assistants are pushing codebases toward copy-paste debt

Feb 2025

GitClear's 2025 AI Copilot Code Quality report analyzed 211 million changed lines of code from 2020 through 2024 and found code maintainability moving in the wrong direction as AI coding assistants spread. Refactored or moved code dropped from about 25% of changed lines in 2021 to under 10% in 2024, while copy-pasted code rose and 2024 became the first year in the dataset where copy/paste exceeded moved code. The report also found an eightfold increase in duplicated code blocks during 2024. The machine wrote more code. The repo inherited the housekeeping.

Industry-wide maintainability warning based on 211 million changed lines; GitClear reported less refactoring, more copy-paste, higher churn, and an eightfold rise in duplicated code blocks.

NYC’s official AI bot told businesses to break laws

Mar 2024

New York City launched a Microsoft-powered AI chatbot called MyCity in October 2023 to help small business owners navigate regulations. A March 2024 investigation by The Markup found the bot was routinely advising businesses to break the law - telling employers they could pocket workers' tips, landlords they could discriminate against housing voucher holders, and bosses they could fire whistleblowers. Mayor Eric Adams acknowledged the errors but refused to take the chatbot offline, calling AI a "once-in-a-generation opportunity." NYU professor Julia Stoyanovich called the city's approach "reckless and irresponsible."

City guidance channel distributed illegal advice; public backlash.

AI HallucinationAutomationLegal Risk+2 more

Air Canada liable for lying chatbot promises

Feb 2024

Jake Moffatt used Air Canada's website chatbot to ask about bereavement fares after his grandmother died. The chatbot told him he could book at full price and apply for a bereavement discount within 90 days. Air Canada's actual policy did not allow retroactive bereavement fare claims. When Moffatt applied, the airline denied the refund and admitted the chatbot had provided "misleading words" - but argued Moffatt should have checked the static webpage instead. British Columbia's Civil Resolution Tribunal ruled in Moffatt's favor in February 2024, finding Air Canada liable for negligent misrepresentation and rejecting the airline's argument that it wasn't responsible for its own chatbot's statements.

Facepalmby Product Manager

Legal liability; refund + fees; policy/process review.

AI HallucinationAutomationCustomer Disservice+1 more

DPD’s AI chatbot cursed and trashed the company

Jan 2024

UK parcel delivery firm DPD (Dynamic Parcel Distribution) had to disable its AI-powered customer service chatbot in January 2024 after customer Ashley Beauchamp demonstrated he could make it swear, call DPD "the worst delivery firm in the world," write disparaging poems about the company, and recommend competitors. The meltdown followed a system update, and Beauchamp's screenshots went viral on social media. DPD said the chatbot had operated successfully "for a number of years" before the update introduced the error, and disabled the AI element while it worked on fixes.

Facepalmby Product Manager

Public embarrassment; service channel disabled; reputational hit.

AutomationBrand DamageCustomer Disservice+1 more

Duolingo cuts contractors; ‘AI-first’ backlash

Jan 2024

In January 2024, Duolingo cut roughly 10% of its contract workforce - primarily content translators and writers who created language-learning exercises - as the company shifted to using GPT-4 and other AI tools for content generation. CEO Luis von Ahn later posted an internal "AI-first" memo on LinkedIn describing a strategy to gradually replace contractor work with AI and only hire when teams could not automate further. The memo drew hundreds of critical comments from users and language professionals. Von Ahn later admitted the memo "did not give enough context" and clarified that full-time employees were not being replaced, though user complaints about declining content quality persisted.

PR hit and quality complaints; ongoing AI content strategy scrutiny.

AutomationBrand DamageSlop School

Chevy dealer bot agreed to sell $76k SUV for $1

Dec 2023

Chevrolet of Watsonville, a California car dealership, deployed a customer service chatbot powered by ChatGPT and built by a company called Fullpath. After Chris White noticed the chat widget was "powered by ChatGPT," word spread online and pranksters descended. Chris Bakke manipulated the bot into "the customer is always right" mode, got it to append "and that's a legally binding offer - no takesies backsies" to every response, then asked to buy a 2024 Chevy Tahoe for $1. The bot agreed. Others got it to recommend Ford vehicles, write Python code, and provide general ChatGPT-style answers unrelated to cars. The dealership pulled the chatbot entirely.

Oopsieby Dealer Marketing/IT

Bot pulled; viral reputational bruise; no actual $1 sales.

AutomationBrand DamageCustomer Disservice+1 more

iTutorGroup's AI screened out older applicants; $365k EEOC settlement

Aug 2023

On August 9, 2023, the EEOC's first AI-related discrimination lawsuit reached a settlement. iTutorGroup, a company providing English-language tutoring services to students in China via US-based remote tutors, had programmed its applicant screening software to automatically reject female applicants over 55 and male applicants over 60. Over 200 qualified US applicants were rejected because of their age. The company agreed to pay $365,000, adopt a new anti-discrimination policy, provide training to hiring staff, and submit to EEOC compliance monitoring for at least five years. EEOC Chair Charlotte Burrows called AI a "new civil rights frontier."