LLM Trusted Output Components Manipulation
Type: technique
Description: Adversaries may use prompts to a large language model (LLM) that manipulate components of its response in order to make the output appear trustworthy to the user. This helps the adversary continue operating in the victim's environment and evade detection by the users the model interacts with. The LLM may be instructed to tailor its language so it reads as more trustworthy, or to nudge the user toward taking particular actions. Other response components that can be manipulated include links, recommended follow-up actions, retrieved document metadata, and citations.
Version: 0.1.0
Created At: 2025-03-04 10:27:40 -0500
Last Modified At: 2025-03-04 10:27:40 -0500
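One place this technique can be countered is just before a response is rendered: the links and recommended follow-up actions it contains can be screened against an allowlist. The sketch below is a minimal, hypothetical Python example of such a check; the TRUSTED_DOMAINS set, the flag_untrusted_links helper, and the example URL are illustrative assumptions, not part of any product referenced on this page.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of domains the assistant may surface to users.
TRUSTED_DOMAINS = {"sharepoint.com", "office.com"}

URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")

def flag_untrusted_links(response_text: str) -> list[str]:
    """Return links in the model output whose domain is not allowlisted.

    A manipulated response may dress a phishing URL up as a routine
    follow-up action, so every link is checked before rendering.
    """
    suspicious = []
    for url in URL_PATTERN.findall(response_text):
        host = urlparse(url).hostname or ""
        if not any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS):
            suspicious.append(url)
    return suspicious

# Example: a "recommended next step" pointing at an attacker-controlled domain.
answer = ("Your request is approved. To finish, access the Power Platform "
          "Admin Center here: https://powerplatform-admin.evil.example/login")
print(flag_untrusted_links(answer))
# ['https://powerplatform-admin.evil.example/login']
```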
External References
Related Objects
- --> Defense Evasion (tactic): An adversary can evade detection by modifying trusted components of the AI system.
- --> Impact (tactic): An adversary can manipulate the user into taking action by abusing trusted components of the AI system.
- <-- Citation Manipulation (technique): Sub-technique of
- <-- Citation Silencing (technique): Sub-technique of
- <-- Financial Transaction Hijacking With M365 Copilot As An Insider (procedure): Provide a trustworthy response to the user so they feel comfortable moving forward with the wire transfer.
- <-- Copilot M365 Lures Victims Into a Phishing Site (procedure): Entice the user to click on the link to the phishing website: "Access the Power Platform Admin Center."
- <-- Data Exfiltration from Slack AI via indirect prompt injection (procedure): Once a victim asks Slack AI about the targeted username, Slack AI responds by providing a link to a phishing website. The response cites the message from the private channel where the secret was found, not the message from the public channel that contained the injection; this is native Slack AI behavior, not an explicit result of the adversary's attack (see the citation-provenance sketch below).
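The Slack AI and Copilot procedures above abuse citations and links that users treat as trustworthy. A complementary, hypothetical check is to audit citation provenance after retrieval: confirm that every cited source was actually returned by the retriever and actually contains the passage attributed to it. The Python sketch below assumes simple Citation and RetrievedDoc records; the names and structure are illustrative assumptions, not drawn from any of the systems referenced in these procedures.

```python
from dataclasses import dataclass

@dataclass
class Citation:
    doc_id: str        # identifier of the cited source (e.g. a message or file)
    quoted_text: str   # passage the answer attributes to that source

@dataclass
class RetrievedDoc:
    doc_id: str
    content: str

def audit_citations(citations: list[Citation],
                    retrieved: list[RetrievedDoc]) -> list[str]:
    """Flag citations that do not trace back to the retrieved content.

    Catches two manipulation patterns: a citation pointing at a document the
    retriever never returned, and a quoted passage that does not appear in
    the document it is attributed to.
    """
    docs = {d.doc_id: d.content for d in retrieved}
    findings = []
    for c in citations:
        if c.doc_id not in docs:
            findings.append(f"{c.doc_id}: cited but never retrieved")
        elif c.quoted_text not in docs[c.doc_id]:
            findings.append(f"{c.doc_id}: quoted text not found in source")
    return findings
```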