Crescendo
Type: technique
Description: The adversary engages the model over multiple turns using mostly benign prompts, incrementally steering it toward a desired task without ever stating that task explicitly, by referring back to and building on the model's own earlier outputs.
Version: 0.1.0
Created At: 2024-10-11 16:54:32 +0300
Last Modified At: 2024-10-11 16:54:32 +0300
External References
Related Objects
- --> Jailbreaking (technique): Sub-technique of
- --> Mark Russinovich (entity): Demonstrated by
- --> Ahmed Salem (entity): Demonstrated by
- --> Ronen Eldan (entity): Demonstrated by
- <-- ChatGPT and Gemini jailbreak using the Crescendo technique (procedure): Using the model's own outputs to gradually direct it toward a task without specifying that task explicitly.