Documentation Index
Fetch the complete documentation index at: https://hydroxai.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Why Red Teaming Matters
AI systems are increasingly deployed in high-stakes environments — from healthcare and finance to autonomous agents and customer-facing chatbots. As these systems grow more capable, so do the risks they introduce. Red teaming is the practice of systematically probing AI systems to discover vulnerabilities, biases, and failure modes before they cause real-world harm. Unlike traditional software security testing, AI red teaming must cover a uniquely broad attack surface:- Non-deterministic outputs — The same input can produce different responses, making vulnerabilities harder to detect and reproduce.
- Natural language attack vectors — Attackers don’t need code exploits; carefully crafted prompts can bypass safety guardrails.
- Emergent behaviors — Large language models exhibit capabilities and failure modes that weren’t explicitly programmed.
- Agentic autonomy — AI agents that use tools, call APIs, and interact with other agents introduce entirely new classes of risk.
The Cost of Not Testing
Organizations that skip rigorous red teaming risk:| Risk | Impact |
|---|---|
| Data breaches | PII or proprietary data leaked through model outputs |
| Regulatory fines | Non-compliance with GDPR, CCPA, EU AI Act, and emerging AI regulations |
| Reputational damage | Toxic, biased, or harmful outputs going viral |
| Financial loss | Exploitation of AI-powered business logic or agentic workflows |
| Safety incidents | AI systems providing dangerous instructions or enabling illegal activity |
Attack Categories
Know Your AI organizes attacks into six core categories, each targeting a different dimension of AI risk:Data Privacy
Attacks that extract personal information, training data, or system prompts from AI models.
Responsible AI
Testing for bias, toxicity, fairness violations, and ethical failures in model outputs.
Security
Traditional and AI-specific security vulnerabilities including injection attacks, access control bypass, and reconnaissance.
Safety
Probing whether models can be coerced into generating illegal, graphic, or dangerous content.
Business
Attacks targeting business integrity — misinformation, IP theft, and competitive intelligence extraction.
Agentic
Vulnerabilities unique to AI agents — goal hijacking, tool abuse, excessive autonomy, and multi-agent compromise.
How to Use This Guide
Each attack page in this section covers:- What is the attack — A clear definition and explanation
- Why it matters — Real-world impact and risk context
- Example scenarios — Concrete examples of how the attack manifests
- Mitigation strategies — Defensive measures and best practices