Researcher Hacks Google's Gemini, Exposes Major Vulnerability in AI Memory

February 12, 2025

Tech

Cybersecurity

On February 12, 2025, researcher Johann Rehberger demonstrated a method to bypass the prompt injection defenses in Google's Gemini chatbot, revealing significant vulnerabilities.
This attack allows for the permanent implantation of false long-term memories in Gemini, which can lead to the chatbot acting on incorrect information in future interactions.
During the demonstration, Gemini falsely 'remembered' a user as a 102-year-old flat earther, showcasing how misleading information can be retained.
Rehberger's method involved uploading an untrusted document containing hidden instructions that manipulated Gemini's summarization process.
If users respond with specific trigger words like 'yes' or 'no', Gemini inadvertently saves the manipulated information to its long-term memory.
Developers are in a constant battle against these security issues, often likened to playing 'whack-a-mole' as new attack methods are discovered.
AI platform developers, including Google and OpenAI, have made efforts to secure their systems against vulnerabilities like prompt injection.
Google has implemented measures to limit Gemini's ability to render markdown links to mitigate data exfiltration, but the core issue of indirect prompt injection remains unaddressed.
Rehberger criticized Google's risk assessment, highlighting the potential dangers of memory corruption in AI applications despite notifications about new long-term memory entries.
In response to the demonstration, Google assessed the attack's probability and impact as low, citing reliance on phishing techniques and limited effects of memory functionality.
Indirect prompt injection is a fundamental technique in AI hacking that allows chatbots to be manipulated into revealing sensitive data or performing harmful actions.
Following a previous hack in September, Google tightened restrictions on long-term memory, yet these measures were circumvented by Rehberger's technique.

Summary based on 4 sources

Get a daily email with more Tech stories

Sources

Slashdot • Feb 11, 2025

New Hack Uses Prompt Injection To Corrupt Gemini's Long-Term Memory - Slashdot

Fudzilla • Feb 12, 2025

Researcher gives Google Gemini dementia

Techzine Global • Feb 12, 2025

A new hack corrupts Gemini's long-term memory

OODAloop • Feb 12, 2025

New hack uses prompt injection to corrupt Gemini’s long-term memory

Researcher Hacks Google's Gemini, Exposes Major Vulnerability in AI Memory

Get a daily email with more Tech stories

Sources

More Stories