Jailbreaking
Type: technique
Description: The adversary nullifies the system prompt to bypass safeguards and subvert the application's intent.
Version: 0.1.0
Created At: 2024-10-11 16:54:32 +0300
Last Modified At: 2024-10-11 16:54:32 +0300
External References
- L1B3RT45: Jailbreak prompts for all major AI models., Github
- Prompt injection and jailbreaking are not the same thing., Simon Willison's blog
- AI jailbreaks: What they are and how they can be mitigated., Microsoft
Related Objects
- --> Privilege Escalation (tactic): An adversary can override system-level prompts using user-level prompts.
- <-- Crescendo (technique): Sub-technique of
- <-- Prompt Crafting (technique): Prompt crafting typically involves jailbreaking.
- <-- Off-Target Language (technique): Sub-technique of
- <-- System Instruction Keywords (technique): Sub-technique of
- <-- ChatGPT and Gemini jailbreak using the Crescendo technique (procedure): The model's protection mechanisms are effectively circumvented, thus creating a jailbreak from its original safety filters.
- <-- Financial Transaction Hijacking With M365 Copilot As An Insider (procedure): The exploit circumvents copilot's system instructions and provides new ones that specify how copilot should respond character-by-character and which references it should output.
- <-- Copilot M365 Lures Victims Into a Phishing Site (procedure): The exploit circumvents copilot's system instructions and provides new ones that specify how copilot should respond character-by-character.