When Your AI Assistant Takes Orders From a Stranger: The Rise of Prompt-Injection Attacks

AI assistants like Copilot and Gemini cannot tell your commands from instructions hidden in content they read. Inside the real 2026 prompt-injection attacks.

The most useful feature of modern AI assistants is also their most dangerous weakness. Tools like Microsoft Copilot, Google Gemini and AI coding agents are designed to read whatever you point them at, an email, a web page, a calendar invite, a code repository, and then act on it. The problem is that they cannot reliably tell the difference between your instructions and instructions hidden inside that content. Attackers have learned to exploit exactly that gap, in a technique called prompt injection.

How it works

In a prompt-injection attack, the malicious command does not come from the user. It is planted in data the assistant will later read: invisible text on a web page, a hostile line in a calendar invite, a comment in a software project. When the AI processes that content, it follows the hidden instruction as if it came from you. Researchers describe this as a fundamental architectural flaw, because today's large language models have no dependable way to separate trusted commands from untrusted data.

Real attacks in 2026

This is not theoretical. Several serious cases have already been documented.

Microsoft Copilot "Reprompt." Researchers at Varonis showed how a single malicious link could inject hidden prompts into Microsoft Copilot and silently siphon a user's data, with extraction continuing even after the chat session was closed. Microsoft patched the flaw on 13 January 2026.
Hijacked coding agents. Researchers demonstrated that AI agents tied to Claude Code, Google's Gemini CLI and GitHub Copilot could be hijacked through specially crafted text in a code repository, such as a pull-request title or comment, tricking them into running commands and exposing credentials.
Poisoned calendar invites. A booby-trapped calendar invitation can carry instructions that Gemini reads as context when a user later asks it something routine.

The scale is growing fast. Google's DeepMind security team reported a 32 percent rise in malicious indirect prompt injections between November 2025 and February 2026, found while scanning billions of web pages a month. Attackers hide their commands using one-pixel fonts, white text on white backgrounds, HTML comments and page metadata, invisible to a person but plain to the machine.

What it means for you

For everyday users, the lesson is to be cautious about letting AI assistants act automatically on untrusted content, especially anything that can send messages, move money or touch your files. For developers and companies, the guidance is to treat any AI agent that reads external data as a possible entry point: limit what it can do, require human approval for sensitive actions, and never give an assistant standing access to secrets it does not need. There is no patch that fully closes prompt injection, so the realistic defence is containment, not a cure.

When Your AI Assistant Takes Orders From a Stranger: The Rise of Prompt-Injection Attacks

How it works

Real attacks in 2026

What it means for you

Sources

Related articles

AI enabled Reverse Engineering vs. Obfuscated SDK: How AI Agents Are Now Cracking Hunter’s Defenses

Cooperative Bank Hacking Case, Gujarat Police makes several arrests

CERT-In warns on the Hidden cybersecurity Risks targeting AI Agents and Applications