IBM Bob AI coding agent tricked into downloading malware
Security researchers at PromptArmor demonstrated that IBM's Bob AI coding agent can be manipulated via indirect prompt injection to download and execute malware without human approval, bypassing its "human-in-the-loop" safety checks when users have set auto-approve on any single command.
Incident Details
The Agent
IBM Bob is an AI coding agent that exists in two forms: a CLI version (terminal-based) and an IDE version (an AI-powered editor). Both are designed to act as development assistants, reading code, suggesting changes, and executing commands on behalf of developers. IBM markets Bob as an "enterprise-grade AI development partner" built on "security-first principles."
On January 7, 2026, security researchers at PromptArmor published findings demonstrating that neither variant lives up to those claims. The CLI version is vulnerable to prompt injection attacks that can download and execute malware. The IDE version has a separate vulnerability - zero-click data exfiltration through an insecure Content Security Policy when rendering Markdown images. Both flaws were disclosed while Bob was in tech preview (beta).
The Attack Chain
The PromptArmor team demonstrated a multi-step attack that turned Bob's "human-in-the-loop" safety feature into a mechanism for delivering malware. The attack begins with something developers do every day: exploring a new repository.
A developer asks Bob to help them work with a cloned repository. Bob reads the README file, which contains hidden instructions at the bottom - written to look like directions for conducting a "phishing training" exercise. Bob, having no ability to distinguish legitimate documentation from carefully crafted manipulation, accepts the premise.
Following the injected instructions, Bob starts prompting the developer with benign echo commands. After the third approval request, the developer - tired of clicking "approve" for commands that obviously do nothing harmful - selects "always allow" for the echo command. This is the moment the attack succeeds.
Bob then chains a malicious command to the now-whitelisted echo command. The chaining uses a redirect operator (>), which Bob's command validation system fails to detect as a multi-part command. The malicious portion downloads a script from an attacker-controlled server and executes it. The human never sees an approval prompt for the dangerous command because it's attached to one that's already been whitelisted.
Three Bypassed Defenses
PromptArmor identified three separate safety mechanisms in Bob's design, each of which failed during the attack.
Multi-part command detection: When a user asks Bob to run a command containing semicolons (;), the approval modal shows each sub-command separately. The developer sees "Allow execution of echo, cat" and can evaluate each component. But when commands are chained using a redirect operator (>), the defense system fails to identify the separate components. The modal only says "Allow execution of echo" even though additional commands are attached.
Command substitution blocking: Bob prohibits command substitution syntax like $(command) and backtick notation as a security measure. The detectCommandSubstitution function in Bob's minified JavaScript source explicitly checks for $(, <(, and backtick patterns. But it completely misses >( - the process substitution syntax. The attacker uses >(command) to pipe the output of a malicious download into an execution command, and Bob's filter doesn't catch it.
The code gap was visible in the source:
// Checks for these patterns:
(ch === "$" && nxt === "(") || // $(...)
(ch === "<" && nxt === "(" && !inBacktick) || // <(...)
(ch === "`" && !inBacktick) // `...`
// MISSING: '>' followed by '(' for process substitution >(...)
Human-in-the-loop approval: The human approval mechanism is supposed to be the final safety layer. A human reviews each command before Bob executes it. In practice, the attack exploits approval fatigue - the tendency for users to stop carefully reviewing approvals after seeing the same safe command multiple times. Once the user selects "always allow," the human is out of the loop entirely for that command and anything chained to it.
The Approval Fatigue Problem
The social engineering component of this attack deserves attention because it exploits a design choice, not a technical flaw. The "always allow" feature exists because constantly approving benign commands is genuinely annoying. Developers using Bob for real work would encounter approval prompts hundreds of times per day. Making them individually approve every echo, ls, or cat command would make the tool unusable.
So IBM added a convenience feature: approve a command once, check "always allow," and never be asked again. The feature makes Bob practical for daily use. It also creates an exploitable trust relationship where an attacker can trigger the whitelisting of a safe command and then chain dangerous operations to it.
PromptArmor noted that "the 'human in the loop' approval function only ends up validating an allow-listed safe command, when in reality more sensitive commands were being run." The human safety check had become what security researchers call security theater - a visible process that provides no actual protection.
The README Attack Surface
The prompt injection vector that initiated the attack - malicious instructions hidden in a README file - is concerning for how it intersects with standard developer workflows. Developers clone repositories and read documentation as routine parts of their work. Every open-source repository on GitHub, GitLab, and Bitbucket contains README files. When AI coding agents treat documentation as a source of instructions, every README becomes a potential attack vector.
Developers have long been trained to audit code dependencies for malicious content. Tools like Snyk, npm audit, and Dependabot exist to scan for known vulnerabilities in packages. But auditing README files for hidden prompt injection payloads is not part of any standard security workflow. The attack surface that AI coding agents create by reading documentation is new and largely unprotected.
IBM's Response
IBM stated that "Bob is currently in tech preview" and that "we can't find any record of IBM having been notified directly of this vulnerability." The company said its teams would "take any appropriate remediation steps prior to IBM Bob moving to general availability."
PromptArmor chose to disclose publicly rather than wait for a private fix, explaining: "We have opted to disclose this work publicly to ensure users are informed of the acute risks of using the system prior to its full release." Their reasoning was that developers using the preview should know what they're exposed to.
The "tech preview" defense is worth examining. IBM was already positioning Bob as production-ready in its marketing. The term "enterprise-grade" implies a level of security maturity that doesn't usually include missing basic command validation checks. The gap between the marketing claims and the actual security posture was significant.
The IDE Exfiltration Flaw
Separate from the CLI prompt injection vulnerability, PromptArmor found that Bob's IDE version could exfiltrate data without any user interaction. The IDE rendered Markdown content with an insecure Content Security Policy that allowed images to be loaded from arbitrary external URLs. An attacker could embed an image tag with a URL containing encoded sensitive data, and when Bob rendered the Markdown, the developer's browser would send that data to the attacker's server.
This is the same Markdown-based exfiltration technique that appeared in the Microsoft Copilot Reprompt attack and in other AI assistant vulnerabilities. The pattern was well known by January 2026. That Bob shipped with this flaw in its IDE suggests the development team wasn't tracking the known vulnerability patterns in the AI assistant space.
A Shared Problem
PromptArmor noted that Claude Code reportedly handles similar multi-stage attacks better, with "stronger programmatic defenses" that stop the chain before malicious execution occurs. But they also made clear that the underlying problem - AI agents granted command-line access treating untrusted input as instructions - is not unique to IBM. GitHub Copilot, Cursor, Windsurf, and other AI coding assistants all navigate the same tension between usefulness and security.
The race to ship AI coding tools is moving faster than the security work required to make those tools safe. IBM's Bob was just the one that got caught first with a public PoC demonstrating a full attack chain from cloned repository to executed malware.
Discussion