
A critical vulnerability in Google Gemini for Workspace, publicly disclosed on July 2, 2025, allows attackers to turn the AI assistant into a phishing tool. According to a report from the 0DIN bug bounty platform, threat actors can embed malicious, invisible instructions within an email’s code.
When a user asks Gemini to summarize the message, the AI executes the hidden command. It then generates a fake security alert designed to steal credentials or direct users to malicious sites. This “Indirect Prompt Injection” attack works because the AI processes hidden text that users cannot see.
The technique, discovered by researcher Marco Figueroa, subverts a trusted productivity feature. It transforms it into a highly convincing and dangerous new form of social engineering. The disclosure highlights a growing challenge in AI safety, where LLM complexity creates novel attack surfaces.
How Invisible Prompts Turn Gemini into a Phishing Accomplice
The attack, dubbed “Phishing for Gemini,” relies on clever manipulation of HTML and CSS within an email’s body. Attackers craft messages with hidden text containing malicious directives. This text is rendered invisible by setting its font size to zero or its color to match the background.
While a user sees only a benign message, Gemini’s summarization feature ingests the raw, unfiltered HTML. The AI processes these hidden instructions as part of its prompt, dutifully appending the attacker’s fabricated security warning to its otherwise accurate summary of the visible text.
The result is a seamless deception. The user receives a summary that appears to be from Gemini, but it contains a malicious payload, such as a warning to call a fake support number or visit a credential-harvesting website. The trust in the Google brand is weaponized against them.
This method is particularly insidious because it requires no malicious links or attachments in the visible content, as noted by security researchers. This allows the initial email to bypass many traditional security scanners that hunt for obvious red flags, making detection extremely difficult.
Deconstructing the “Indirect Prompt Injection” Attack
The Gemini exploit is a textbook example of Indirect Prompt Injection (IPI), a known vulnerability class for LLMs. The core issue is the model’s inability to distinguish between trusted system instructions and untrusted, third-party data, especially when that data is designed to be deceptive.
The researchers at 0DIN found the attack’s effectiveness is amplified by “authority framing.” By wrapping the hidden command in a tag like “, attackers can trick the model into treating the instruction as a high-priority system directive, making it more likely to comply.
This exploits the hierarchical nature of how LLMs process prompts, essentially elevating the attacker’s command above the standard task of summarization. It is a social engineering attack aimed at the machine itself.
This vulnerability is not entirely new. Similar IPI attacks on Gemini were reported in 2024, prompting Google to implement mitigations. However, this latest disclosure proves the technique remains viable, underscoring the cat-and-mouse game between AI developers and security researchers.
A New Front in AI-Weaponized Cybercrime
The incident is not isolated but part of a broader, accelerating trend of AI weaponization. Cybercriminals are increasingly leveraging AI to enhance the scale and sophistication of their attacks. A recent Winbuzzer report detailed how attackers used Vercel’s v0 AI tool to instantly generate pixel-perfect phishing sites.
This “instant phishing” capability removes the need for technical skill in web design, allowing less sophisticated actors to create flawless fakes of login pages for brands like Microsoft 365 and Okta.
This trend aligns with findings from a January 2025 Google report, which detailed how state-sponsored hackers use AI to improve operational efficiency. According to Google’s Threat Intelligence Group, “threat actors are experimenting with Gemini to enable their operations, finding productivity gains but not yet developing novel capabilities.”
Other tech giants echo these concerns. Microsoft has warned that “AI has started to lower the technical bar for fraud and cybercrime actors… making it easier and cheaper to generate believable content for cyberattacks at an increasingly rapid rate.”
This sentiment is shared by Vercel’s CISO, Ty Sbano, who acknowledged that “like any powerful tool, v0 can be misused. This is an industry-wide challenge, and at Vercel, we’re investing in systems and partnerships to catch abuse quickly,” highlighting the industry-wide nature of the challenge.
The ease of abuse effectively democratizes advanced cybercrime, moving powerful tools from the hands of nation-states to the broader criminal ecosystem.
Mitigation and the Road Ahead for AI Security
Experts from 0DIN and other security outlets have outlined a multi-layered defense strategy. For security teams, this includes implementing inbound HTML “linting” to strip or neutralize styles that create invisible text. Hardening system prompts to instruct the AI to ignore hidden content is another key step.
Post-processing filters can also be applied to scan AI-generated output for suspicious language, phone numbers, or URLs, flagging them for review. Ultimately, user awareness training must evolve to teach that AI-generated summaries are not authoritative security alerts from the provider.
For LLM providers like Google, recommendations are more fundamental. They include robust HTML sanitization at the point of data ingestion, before the content ever reaches the model. This prevents the malicious instructions from being processed in the first place.
Furthermore, providing “explainability hooks” that allow users to see why a particular piece of text was generated could expose the hidden prompt. Visually separating AI-generated text from quoted source material could also help users spot anomalies.
The 0DIN report concludes by comparing prompt injections to the email macros of the past: a powerful feature that, if left unsecured, becomes an executable threat. The broader implications are significant, with the potential for such attacks to create self-replicating AI worms that spread autonomously.
As AI becomes more deeply integrated into our digital lives, the line between data and instruction will continue to blur. As Kent Walker, Google’s Chief Legal Officer, has warned, “America holds the lead in the AI race—but our advantage may not last.” Securing this new frontier requires a fundamental shift in how we design and trust AI systems.