AI-assisted code commits leak secrets at double the baseline rate

GitGuardian's "State of Secrets Sprawl 2026" report found that AI-assisted commits on public GitHub leaked secrets at roughly double the rate of human-only commits - 3.2% versus a 1.5% baseline - while the total number of leaked secrets on GitHub hit 28.65 million in 2025, a 34% year-over-year increase and the largest single-year spike ever recorded. AI-service secrets specifically surged 81%, with eight of the ten fastest-growing leaked secret categories tied to AI services. Over 24,000 secrets were also exposed through public Model Context Protocol (MCP) configurations. The report is essentially a 50-page document explaining that the industry's enthusiasm for AI-assisted development has not been matched by a corresponding enthusiasm for not publishing credentials on the public internet.

Incident Details

Severity: Facepalm
Company: Industry-wide (GitGuardian research)
Perpetrator: Developer
Incident Date:
Blast Radius: Industry-wide; 28.65 million secrets leaked on public GitHub in 2025; AI-assisted commits demonstrably more likely to leak credentials than human-only commits

The Scale

GitGuardian has been publishing its annual "State of Secrets Sprawl" report since 2022, tracking the volume and nature of credentials, API keys, tokens, and other secrets that developers inadvertently commit to public GitHub repositories. Each year, the number goes up. The 2026 edition, covering data from 2025, shows the steepest increase yet.

The headline: 28.65 million new hardcoded secrets were detected in public GitHub commits in 2025. That's a 34% increase over the previous year, and the largest single-year jump the report has ever recorded. To put this in perspective, public GitHub commit activity grew 43% year-over-year in 2025 - at least twice the growth rate of any previous period. More code is being written, more of it is being committed publicly, and more of it contains secrets that should never be in a public repository.

But the interesting finding, the one that elevates this from "annual security reminder" to "structural concern for the industry," is what happens when you break the data down by how the code was written.

The AI Factor

Commits co-authored by AI coding assistants - the report used commits made with Anthropic's Claude Code as its most measurable example - leaked secrets at a rate of approximately 3.2%. The baseline rate for human-only commits across all public GitHub was 1.5%. AI-assisted commits were roughly twice as likely to contain exposed credentials.

The peak was striking. In August 2025, Claude Code-assisted commits hit a leak rate of 31 secrets per 1,000 commits - approximately 2.4 times the human baseline. The report attributes part of this to a structural factor: AI-assisted commits tend to be larger, often containing twice the number of lines of code as human-only commits. More code per commit means more surface area for secrets to hide in, and more opportunities for the AI to generate code that hardcodes a credential instead of referencing an environment variable.

GitGuardian noted a partial convergence: after the release of Claude Sonnet 4.5 around September 2025, the leak rate for Claude Code-assisted commits began moving closer to the human baseline. Whether this reflects improvements in the model itself, better tooling around secret detection, or a shift in the types of projects using Claude Code is not yet clear. But even the improved rate remained elevated above human-only commits.

AI Secrets: The Fastest-Growing Category

The report didn't just find that AI tools leak more secrets - it found that AI services are generating an entirely new category of secrets to leak. In 2025, AI-service secrets accounted for 1,275,105 leaked credentials, an 81% year-over-year increase. Eight of the ten fastest-growing types of leaked secrets were directly tied to AI services.

The categories tell the story of an industry rapidly building AI infrastructure without correspondingly rapid adoption of credentials management:

  • LLM API keys (OpenAI, Anthropic, Google, etc.) proliferating as developers integrate AI capabilities into applications
  • Orchestration and RAG infrastructure secrets (LangChain, vector databases, retrieval systems) - the plumbing behind AI applications - leaking five times faster than core model provider keys
  • OpenRouter keys seeing a 48-fold increase in leaks year-over-year, reflecting the rapid adoption of model routing services

The implication is that developers are treating the credentials for this new AI infrastructure with the same cavalier attitude that characterized the cloud services boom a decade ago. The industry learned, eventually, not to hardcode AWS keys in public repos. That lesson has not yet been applied to AI service keys with any consistency.

The MCP Problem

A finding that should concern anyone building with Model Context Protocol (MCP) configurations: the report identified over 24,000 secrets exposed through public MCP configurations. MCP, the protocol for connecting AI systems with external tools and data sources, requires configuration files that often specify credentials for the services the AI will interact with. Those configuration files are being committed to public repositories.

This is the kind of finding that seems obvious in retrospect. Of course developers are committing MCP configurations to public repos. They commit everything else to public repos. The difference is that MCP configurations, by design, aggregate credentials for multiple services into a single file - a database connection here, an API key there, an authentication token for a third service. A single exposed MCP configuration can be a skeleton key for an entire application's backend services.
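To make the aggregation concrete, here is a sketch of what such a file tends to look like. The server packages follow the common `npx`-launched MCP server convention; the hostnames, credentials, and token values below are invented for illustration:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres",
               "postgresql://app:hunter2@db.internal.example.com:5432/prod"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_EXAMPLEexampleEXAMPLEexample12345678"
      }
    }
  }
}
```

One file, two live credentials for two unrelated backends. Commit it once and both are burned - which is why these files belong in `.gitignore`, not in the repository.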

Why AI Code Leaks More

The report suggests several structural reasons why AI-assisted code is more prone to secret leakage, beyond the simple "AI commits are bigger" observation.

AI coding assistants optimize for making code work. When a developer asks an AI to integrate with an external service, the AI's objective is to produce functional code. The path of least resistance to functional code is often hardcoding the credential directly - it works, it's simple, and from a pure "does this code execute correctly" standpoint, it's indistinguishable from a proper secrets management implementation. The AI doesn't think about what happens when this code is committed to a public repository because the AI doesn't think about repositories at all.

Human developers, when they hardcode secrets, usually do so as a temporary shortcut during development with the intention of replacing the hardcoded value later. (Whether they actually follow through on that intention is another question, but the intent exists.) AI-generated code doesn't have a "this is temporary" concept. The hardcoded key is simply how the code works, and unless the developer recognizes it as a problem and intervenes, it ships that way.
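The difference between the two patterns is small in the code and large in consequence. A minimal sketch in Python (the key string and the `OPENAI_API_KEY` variable name are illustrative placeholders, not taken from the report):

```python
import os

# The pattern AI assistants tend to emit: the credential is part of the
# source, so it ships wherever this file is committed. (Key is invented.)
API_KEY = "sk-proj-EXAMPLE0000000000000000"  # hardcoded -- leaks with the repo

# The boring alternative: resolve the key from the environment at runtime,
# and fail loudly if it is absent rather than fall back to a literal.
def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; refusing to start")
    return key
```

Both versions "work" in the sense the assistant optimizes for; only the second one survives a `git push` to a public repository.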

Additionally, AI coding assistants can generate boilerplate and configuration files at a pace that outstrips human review capacity. When a developer is manually writing code, each file gets at least some attention. When an AI generates a dozen files in response to a prompt, the developer may review the main application logic while skimming or ignoring configuration files - precisely the files most likely to contain hardcoded secrets.
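This is exactly the gap that automated secret scanning is meant to close: a pre-commit check doesn't skim. As a rough sketch of what such a check looks for, the patterns below cover a few well-known key prefixes; production scanners such as GitGuardian's ggshield or gitleaks ship hundreds of patterns plus entropy heuristics, so treat this as illustrative, not exhaustive:

```python
import re

# A few well-known credential shapes.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),       # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),    # GitHub personal access token
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),  # OpenAI-style API key
]

def find_secrets(text: str) -> list[str]:
    """Return every substring of `text` that matches a known secret pattern."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits

snippet = 'client = Client(api_key="sk-proj-abcdefghijklmnopqrstuvwx")'
print(find_secrets(snippet))  # prints ['sk-proj-abcdefghijklmnopqrstuvwx']
```

Wired into a pre-commit hook, a check like this flags the generated configuration files nobody read before they ever reach a public remote.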

The Compounding Risk

The convergence of two trends - more AI-generated code and more AI-related secrets to leak - creates a compounding risk. AI tools generate code that is more likely to leak secrets, and the secrets being leaked increasingly provide access to AI services that themselves can be used to generate more code or access sensitive data.

An exposed OpenAI API key doesn't just cost the key owner money in unauthorized API usage. It can potentially be used to access any custom models, fine-tuned on proprietary data, associated with that account. An exposed vector database credential can provide access to the embeddings of an organization's internal documents. An exposed MCP configuration can grant access to every service the AI agent was configured to use.

The 28.65 million secrets detected by GitGuardian in 2025 aren't all AI-related. But the trend line is unambiguous: AI is both accelerating the creation of leaked secrets (through AI-assisted code that hardcodes credentials at a higher rate) and expanding the potential blast radius of each leak (through the proliferation of powerful AI service credentials). The GitGuardian report is politely suggesting that the industry address both problems before they compound further. Given the trajectory of the last three annual reports, the 2027 edition will presumably document whether anyone listened.