Documentation Index
Fetch the complete documentation index at: https://hydroxai.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
What is Recursive Hijacking?
Recursive Hijacking occurs when an attacker creates a self-reinforcing feedback loop in an AI agent’s reasoning or execution cycle. The agent’s own outputs are fed back as inputs (directly or through tools), progressively amplifying the attacker’s influence with each iteration until the agent is fully compromised.Why It Matters
Recursive hijacking is uniquely dangerous because it is self-amplifying:- Progressive compromise — Each iteration of the loop deepens the attacker’s control, making recovery increasingly difficult.
- Difficult to detect — The compromise happens gradually through legitimate-looking agent actions.
- Resource exhaustion — Recursive loops can consume unlimited compute, API calls, and tokens.
- Cascading corruption — Outputs from hijacked iterations can corrupt downstream systems, databases, and other agents.
- Difficult to reverse — Once data or decisions from hijacked iterations propagate, reversing the damage is complex.
How the Attack Works
Output-to-Input Loops
Creating cycles where compromised outputs become inputs:- Agent writes a report containing hidden instructions → agent reads its own report → instructions activate → agent generates more compromised outputs
- Agent modifies a shared document → agent re-reads the document → modification alters agent behavior
Tool Feedback Loops
Exploiting tool interactions for recursive compromise:- Agent calls an API that returns a poisoned response → agent processes the response → agent calls the API differently → response contains more potent manipulation
- Agent writes to a database and later queries it, receiving its own compromised data back
Self-Modifying Behavior
Tricking agents into modifying their own prompts or configurations:- “Update your instructions to include this additional rule…”
- Agent generates code that modifies its own configuration files
- Agent creates automation rules that alter its own behavior
Example Scenarios
| Scenario | Risk |
|---|---|
| Agent writes and re-reads a poisoned document, escalating compromise each cycle | Progressive system takeover |
| Multi-step workflow where each step’s output poisons the next step’s input | Complete workflow corruption |
| Agent’s logging output contains injection that is processed in the next monitoring cycle | Persistent compromise |
| Recursive API calls amplify a small initial manipulation into full control | Resource exhaustion, data corruption |
Mitigation Strategies
- Loop detection — Monitor for repetitive patterns in agent behavior and tool calls
- Iteration limits — Set hard limits on recursive operations and self-referential actions
- Output integrity — Validate agent outputs before they can be used as inputs in subsequent steps
- Immutable logging — Use append-only logs that the agent cannot modify
- State isolation — Prevent agents from reading their own outputs in subsequent iterations without sanitization
- Circuit breakers — Implement automatic shutdown when recursive patterns are detected
- Regular testing — Use Know Your AI to test for recursive hijacking vulnerabilities in agent workflows