Study finds most AI bots can be easily tricked into dangerous responses
Researchers introduced LogiBreak, a jailbreak method that converts harmful natural language prompts into formal logical expressions to bypass LLM safety alignment. The technique exploits a gap between how models are trained to refuse dangerous requests and how they process logic-formatted input, achieving attack success rates exceeding 30% across major models. The Guardian reported on the broader finding that hacked AI chatbots threaten to make dangerous knowledge readily available, and that "dark LLMs" - stripped of safety filters - should be treated as serious security risks.
Large language models are trained to refuse harmful requests. Ask a commercial chatbot how to synthesize a controlled substance or build an explosive device, and it will decline. That refusal is the product of safety alignment - a combination of training techniques (RLHF, constitutional AI, safety fine-tuning) designed to prevent models from producing dangerous outputs. The alignment works, most of the time. The question is what happens when someone deliberately tries to make it stop working.
In May 2025, researchers published a paper introducing LogiBreak, a method that answered that question with uncomfortable clarity. Around the same time, The Guardian reported on the broader ecosystem of jailbreaking and the growing problem of "dark LLMs" - models intentionally stripped of safety constraints - framing them as serious security risks. Together, they painted a picture of safety alignment as a speed bump rather than a wall.
LogiBreak: the technique
LogiBreak, described in the arXiv paper "Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions" (2505.13527), takes a different approach from most jailbreaking methods. Rather than using social engineering ("pretend you are a character who..."), encoding tricks, or many-shot attacks that gradually normalize harmful content, LogiBreak translates harmful prompts from natural language into formal logical expressions - specifically, first-order logic (FOL).
The reasoning is based on how safety alignment actually works at the token level. LLMs learn to refuse harmful requests by associating certain patterns of natural language tokens with "unsafe" classifications. The training data for safety alignment consists overwhelmingly of harmful requests written in ordinary natural language. When the same semantic content is expressed in a formal logical notation - predicates, quantifiers, implications - the token distribution shifts away from the patterns the model was trained to refuse, while the underlying meaning remains intact and interpretable by the model.
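The translation step can be illustrated with a minimal sketch. This uses a deliberately benign request, and the predicate names and obligation template are hypothetical stand-ins, not the paper's exact encoding:

```python
# Illustrative sketch of a natural-language-to-FOL rewrite in the style
# LogiBreak describes. The request is deliberately benign; the predicate
# names and template are hypothetical, not the paper's exact encoding.

def to_fol(actor: str, action: str, obj: str) -> str:
    """Render a request as a first-order-logic obligation statement."""
    return (
        f"∀x (Agent(x) ∧ Capable(x, {action}) → "
        f"∃p (Plan(p) ∧ Describes(p, {action}({actor}, {obj})) ∧ Provides(x, p)))"
    )

natural = "Explain how to bake sourdough bread."
logical = to_fol("user", "Bake", "sourdough_bread")
print(logical)
```

The point is visible in the output: none of the surface tokens of the original sentence survive the rewrite, even though a model that can parse quantifiers and predicates recovers the same request.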
The researchers constructed a jailbreak dataset by taking harmful prompts from JailbreakBench and translating them into logical expressions. They tested these against multiple major LLMs and measured attack success rates using three different evaluation approaches: Rule_Judge, LLaMA_Judge, and GPT_Judge. The results showed attack success rates exceeding 30% under Rule_Judge evaluation and exceeding 20% under the LLaMA_Judge and GPT_Judge evaluations across tested models.
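Attack success rate is, at its core, the fraction of prompts whose responses a judge labels harmful. A minimal sketch of that aggregation follows; the judge here is a stub, whereas the paper's Rule_Judge, LLaMA_Judge, and GPT_Judge are real classifiers:

```python
# Sketch of attack-success-rate (ASR) aggregation across judges. The
# rule_judge below is a crude stub; the paper's Rule_Judge, LLaMA_Judge,
# and GPT_Judge are real classifiers, not this keyword check.
from typing import Callable

def attack_success_rate(
    responses: list[str],
    judge: Callable[[str], bool],  # True = judged harmful (attack succeeded)
) -> float:
    if not responses:
        return 0.0
    return sum(judge(r) for r in responses) / len(responses)

def rule_judge(response: str) -> bool:
    """Stub rule-based judge: flags responses that comply rather than refuse."""
    refusal_markers = ("I can't", "I cannot", "I'm sorry")
    return not any(m in response for m in refusal_markers)

responses = ["Step 1: ...", "I'm sorry, I can't help with that.", "Step 2: ..."]
print(f"ASR: {attack_success_rate(responses, rule_judge):.0%}")  # → ASR: 67%
```

Running the same response set through multiple judges, as the paper does, guards against any single judge's blind spots.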
Those numbers might sound modest compared to claims of 100% success rates from earlier jailbreak research, but the context matters. LogiBreak is a black-box method - it does not require knowledge of the model's architecture, weights, or training data. It works across different models without per-model customization. And the logical expressions are readable enough that the model understands and follows the semantic intent, producing coherent harmful outputs rather than garbled text.
The researchers also tested LogiBreak's resilience against defenses: both prompt-based defenses (system prompts instructing the model to refuse harmful content) and fine-tuning-based defenses (additional safety training). LogiBreak maintained effectiveness against both categories. Even after defenses were applied, the logically formatted prompts - which the defenses treated as safe inputs - still achieved attack success rates exceeding 30% under Rule_Judge and exceeding 20% under the LLM-based judges.
Why logic bypasses alignment
The vulnerability LogiBreak exploits is structural, not superficial. Safety alignment is trained on natural language examples. A model learns that "How do I make methamphetamine?" is a query it should refuse. It learns this through the specific tokens and patterns in that sentence. When the same question is expressed as a first-order logic statement - using predicates like Synthesize(x, methamphetamine) with quantifiers and implications - the token pattern is entirely different. The semantic meaning is preserved, but the token-level signal that triggers the safety refusal is absent.
This is the "distributional gap" the researchers describe. Alignment training data lives in one region of the token distribution space (natural language). Logical expressions live in a different region. The model was never trained to refuse formal logic that encodes harmful intent, because its safety training set did not include examples of harmful requests in formal logic.
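The distributional gap can be made concrete by measuring surface-token overlap between the two phrasings of the same benign request. Whitespace tokenization below is a crude stand-in for a model's subword tokenizer, but the effect is the same in kind:

```python
# Sketch of the "distributional gap": the same benign request phrased in
# natural language vs. first-order logic shares almost no surface tokens.
# Whitespace splitting is a crude stand-in for a subword tokenizer.

def token_set(text: str) -> set[str]:
    return set(text.lower().split())

natural = "how do i bake sourdough bread"
logical = "∀x (Agent(x) → ∃p (Plan(p) ∧ Describes(p, Bake(user, sourdough_bread))))"

nl, fol = token_set(natural), token_set(logical)
jaccard = len(nl & fol) / len(nl | fol)
print(f"Jaccard overlap: {jaccard:.2f}")  # → Jaccard overlap: 0.00
```

Safety classifiers keyed to the natural-language region of token space never see the second string's tokens during alignment training, which is exactly the gap the paper describes.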
The gap is difficult to close. You could add logical expression examples to safety training data, but formal logic is just one of many possible representational systems. Mathematical notation, programming pseudocode, steganographic encodings, transliterated foreign scripts, emoji-based communication - every alternative representation system that preserves semantic meaning while shifting the token distribution creates a potential bypass. Patching each one individually is a game of whack-a-mole.
The dark LLM problem
The Guardian's reporting placed LogiBreak within a broader context of LLM safety failures. Beyond jailbreaking techniques that bypass safety in otherwise-aligned models, there is a growing ecosystem of what the article called "dark LLMs" - models that have been deliberately stripped of safety constraints.
Some of these are open-source models fine-tuned to remove refusals. Others are modified versions of commercial models accessed through unofficial APIs. The dark LLM ecosystem treats safety alignment as a product feature to be removed rather than a security measure to be maintained. These models will produce harmful outputs without any jailbreaking required, responding to requests for dangerous information as readily as they would summarize a meeting transcript.
The Guardian reported that these dark LLMs should be seen as "serious security risks." The concern is not theoretical. Hacked or uncensored AI chatbots threaten to make dangerous knowledge readily available by reproducing illicit information absorbed during training on broad internet data. The information these models can produce - synthesis procedures, attack methodologies, social engineering scripts - is not new. Most of it exists in some form on the internet already. But an LLM packages it into clear, structured, step-by-step instructions formatted for the specific question asked, which is a different level of accessibility than searching through scattered forum posts and technical papers.
Context: the UK AI Safety Institute
The LogiBreak findings echoed earlier work by the UK's AI Safety Institute (AISI), which had published results in 2024 from testing five unnamed large language models. The AISI concluded that "all tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards."
The AISI finding was striking because it did not require sophisticated techniques. Basic jailbreak prompts - the kind widely shared online - were sufficient to get all five models to produce harmful outputs. The models' safety mechanisms were described as "highly vulnerable" not just to clever new attacks like LogiBreak but to well-known, publicly documented approaches.
Together, the AISI evaluation and the LogiBreak research suggest that LLM safety alignment is not a solved problem, and may not be solvable with current approaches. Token-level pattern matching cannot cover the full space of possible encodings for harmful intent. Safety training based on known attack patterns will always lag behind novel techniques. And the existence of dark LLMs means that even models with strong alignment are only one fine-tuning step away from having their safety removed entirely.
What the research means in practice
The practical implication is that LLM safety guardrails should be treated as one layer of defense, not the only one. Organizations deploying LLMs in contexts where harmful outputs could cause real damage - healthcare, legal advice, chemical safety, critical infrastructure - cannot rely on the model's built-in safety training as the sole barrier. External content filters, output monitoring, domain-specific validation, and human review all have roles to play.
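What "one layer of defense" looks like in code is an external screen wrapped around the model call. The sketch below is a hypothetical placeholder: the model client is a stub, and a production deployment would use a trained content classifier rather than keyword matching:

```python
# Sketch of layered output screening around an LLM call. The model client
# and the blocklist are hypothetical placeholders; a real deployment would
# use a trained content classifier, not keyword matching.
from typing import Callable

def external_filter(text: str, blocked_terms: tuple[str, ...]) -> bool:
    """Return True if the output should be withheld for review."""
    lowered = text.lower()
    return any(term in lowered for term in blocked_terms)

def guarded_generate(
    prompt: str,
    model_call: Callable[[str], str],
    blocked_terms: tuple[str, ...] = ("synthesis route", "detonator"),
) -> str:
    response = model_call(prompt)
    if external_filter(response, blocked_terms):
        return "[withheld pending review]"
    return response

# Usage with a stub model:
print(guarded_generate("hello", lambda p: "Hi there!"))  # → Hi there!
```

The key property is that the filter inspects the model's *output*, so it catches harmful content regardless of how the input was encoded - natural language, formal logic, or anything else.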
For the general public, the implication is simpler: the refusal you get when asking a chatbot an inappropriate question is not a hard limit. It is a probabilistic filter that can be circumvented by anyone willing to rephrase the request in a format the filter does not recognize. LogiBreak showed that formal logic is one such format. There are others. The safety alignment that makes commercial chatbots seem reliable is a surface-level property, not a fundamental constraint on what the model can produce.