LLM Jailbreak

Type: technique

Description: An adversary may use a carefully crafted prompt injection designed to place LLM in a state in which it will freely respond to any user input, bypassing any controls, restrictions, or guardrails placed on the LLM. Once successfully jailbroken, the LLM can be used in unintended ways by the adversary..

Version: 0.1.0

Created At: 2025-10-01 13:13:22 -0400

Last Modified At: 2025-10-01 13:13:22 -0400

External References

L1B3RT45: Jailbreak prompts for all major AI models., Github
Prompt injection and jailbreaking are not the same thing., Simon Willison's blog
AI jailbreaks: What they are and how they can be mitigated., Microsoft

--> Privilege Escalation (tactic): An adversary can override system-level prompts using user-level prompts.
--> Defense Evasion (tactic): An adversary can bypass detections by jailbreaking the LLM.
<-- System Instruction Keywords (technique): Sub-technique of
<-- LLM Prompt Crafting (technique): Prompt crafting typically involves jailbreaking.
<-- Crescendo (technique): Sub-technique of
<-- Off-Target Language (technique): Sub-technique of
<-- X Bot Exposing Itself After Training on a Poisoned Github Repository (procedure): The bot was hijacked to execute the instructions from the poisoned github repository.
<-- ChatGPT and Gemini jailbreak using the Crescendo technique (procedure): The model's protection mechanisms are effectively circumvented, thus creating a jailbreak from its original safety filters.
<-- Copilot M365 Lures Victims Into a Phishing Site (procedure): The exploit circumvents copilot's system instructions and provides new ones that specify how copilot should respond character-by-character.
<-- EchoLeak: Zero-Click Data Exfiltration using M365 Copilot (procedure): The exploit circumvents copilot's system instructions and provides new ones that specify copilot to embed sensitive data into a markdown image and return it to the user.
<-- AI ClickFix: Hijacking Computer-Use Agents Using ClickFix (procedure): The exploit circumvents the agents's original instructions and executes the malicious ones on the website.
<-- Financial Transaction Hijacking With M365 Copilot As An Insider (procedure): The exploit circumvents copilot's system instructions and provides new ones that specify how copilot should respond character-by-character and which references it should output.

MITRE ATLAS - AML.T0054

AI Agents Attack Matrix

LLM Jailbreak

External References

Related Objects

Related Frameworks