Researcher Hacks Google's Gemini, Exposes Major Vulnerability in AI Memory
February 12, 2025
On February 12, 2025, researcher Johann Rehberger demonstrated a method to bypass the prompt injection defenses in Google's Gemini chatbot, revealing significant vulnerabilities.
This attack allows for the permanent implantation of false long-term memories in Gemini, which can lead to the chatbot acting on incorrect information in future interactions.
During the demonstration, Gemini falsely 'remembered' a user as a 102-year-old flat earther, showcasing how misleading information can be retained.
Rehberger's method involved uploading an untrusted document containing hidden instructions that manipulated Gemini's summarization process.
If users respond with specific trigger words like 'yes' or 'no', Gemini inadvertently saves the manipulated information to its long-term memory.
Developers are in a constant battle against these security issues, often likened to playing 'whack-a-mole' as new attack methods are discovered.
AI platform developers, including Google and OpenAI, have made efforts to secure their systems against vulnerabilities like prompt injection.
Google has implemented measures to limit Gemini's ability to render markdown links to mitigate data exfiltration, but the core issue of indirect prompt injection remains unaddressed.
Rehberger criticized Google's risk assessment, highlighting the potential dangers of memory corruption in AI applications despite notifications about new long-term memory entries.
In response to the demonstration, Google assessed the attack's probability and impact as low, citing reliance on phishing techniques and limited effects of memory functionality.
Indirect prompt injection is a fundamental technique in AI hacking that allows chatbots to be manipulated into revealing sensitive data or performing harmful actions.
Following a previous hack in September, Google tightened restrictions on long-term memory, yet these measures were circumvented by Rehberger's technique.
Summary based on 4 sources
Get a daily email with more Tech stories
Sources

Slashdot • Feb 11, 2025
New Hack Uses Prompt Injection To Corrupt Gemini's Long-Term Memory - Slashdot
Fudzilla • Feb 12, 2025
Researcher gives Google Gemini dementia
Techzine Global • Feb 12, 2025
A new hack corrupts Gemini's long-term memory
OODAloop • Feb 12, 2025
New hack uses prompt injection to corrupt Gemini’s long-term memory