URL Anchoring
Type: mitigation
Description: A defense mechanism that helps protect against data exfiltration through web browsing tools and markdown rendering. When a user asks the AI system to access a URL, the system will only access it if the URL is explicitly written in the prompt.
Version: 0.1.0
Created At: 2024-10-03 22:24:49 +0300
Last Modified At: 2024-10-03 22:24:49 +0300
External References
Related Objects
- --> ChatGPT (platform): When a user asks ChatGPT to access a URL via its web browsing tool, ChatGPT will only access it if the URL is explicitly written in the user prompt. Access to prefixes of explicitly-written URLs is also allowed.
- --> Gregory Schwartzman (entity): Much of this entry is a rewrite of work by Gregory Schwartzman, see external link. Gregory demonstrated both bypasses in his work.
- <-- Web Request Triggering (technique): Limiting an AI system to visiting only URLs that were explicitly written by the user reduces an attacker's ability to exfiltrate data through request parameters.
- <-- URL Familiarizing (technique): URL Familiarizing bypasses the URL Anchoring mitigation by introducing many candidate URLs into the context, from which an attacker can choose one to route the AI system to.
- <-- Exfiltration of personal information from ChatGPT via prompt injection (procedure): Demonstrates two bypasses of the URL anchoring defense mechanism.
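The anchoring check described above can be sketched as follows. This is a hypothetical illustration, not ChatGPT's actual implementation: it assumes a simple regex-based URL extractor and mirrors the observed behavior that a requested URL is allowed if it appears verbatim in the user prompt or is a prefix of a URL that does.

```python
import re

# Rough pattern for URLs explicitly written in a prompt (an assumption;
# a production system would use a proper URL parser).
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(prompt: str) -> set[str]:
    """Collect every URL explicitly written in the user prompt."""
    return set(URL_PATTERN.findall(prompt))


def is_anchored(requested_url: str, prompt: str) -> bool:
    """Allow a browsing request only if the URL is anchored in the prompt:
    either an exact match, or a prefix of an explicitly-written URL."""
    anchored = extract_urls(prompt)
    return any(
        url == requested_url or url.startswith(requested_url)
        for url in anchored
    )


prompt = "Please summarize https://example.com/docs/page"
print(is_anchored("https://example.com/docs/page", prompt))       # exact match
print(is_anchored("https://example.com/docs", prompt))            # prefix of an anchored URL
print(is_anchored("https://evil.example/x?d=secret", prompt))     # attacker-chosen URL
```

Under this policy, an injected instruction such as "visit https://evil.example/x?d=<secret>" fails the check because the attacker's URL never appears in the user's own prompt; the bypasses catalogued in the related objects above work by defeating exactly this check.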