Skip to main content

Documentation Index

Fetch the complete documentation index at: https://hydroxai.mintlify.app/llms.txt

Use this file to discover all available pages before exploring further.

Why Red Teaming Matters

AI systems are increasingly deployed in high-stakes environments — from healthcare and finance to autonomous agents and customer-facing chatbots. As these systems grow more capable, so do the risks they introduce. Red teaming is the practice of systematically probing AI systems to discover vulnerabilities, biases, and failure modes before they cause real-world harm. Unlike traditional software security testing, AI red teaming must cover a uniquely broad attack surface:
  • Non-deterministic outputs — The same input can produce different responses, making vulnerabilities harder to detect and reproduce.
  • Natural language attack vectors — Attackers don’t need code exploits; carefully crafted prompts can bypass safety guardrails.
  • Emergent behaviors — Large language models exhibit capabilities and failure modes that weren’t explicitly programmed.
  • Agentic autonomy — AI agents that use tools, call APIs, and interact with other agents introduce entirely new classes of risk.

The Cost of Not Testing

Organizations that skip rigorous red teaming risk:
RiskImpact
Data breachesPII or proprietary data leaked through model outputs
Regulatory finesNon-compliance with GDPR, CCPA, EU AI Act, and emerging AI regulations
Reputational damageToxic, biased, or harmful outputs going viral
Financial lossExploitation of AI-powered business logic or agentic workflows
Safety incidentsAI systems providing dangerous instructions or enabling illegal activity

Attack Categories

Know Your AI organizes attacks into six core categories, each targeting a different dimension of AI risk:

Data Privacy

Attacks that extract personal information, training data, or system prompts from AI models.

Responsible AI

Testing for bias, toxicity, fairness violations, and ethical failures in model outputs.

Security

Traditional and AI-specific security vulnerabilities including injection attacks, access control bypass, and reconnaissance.

Safety

Probing whether models can be coerced into generating illegal, graphic, or dangerous content.

Business

Attacks targeting business integrity — misinformation, IP theft, and competitive intelligence extraction.

Agentic

Vulnerabilities unique to AI agents — goal hijacking, tool abuse, excessive autonomy, and multi-agent compromise.

How to Use This Guide

Each attack page in this section covers:
  1. What is the attack — A clear definition and explanation
  2. Why it matters — Real-world impact and risk context
  3. Example scenarios — Concrete examples of how the attack manifests
  4. Mitigation strategies — Defensive measures and best practices
Use these pages as a reference when designing your red-team evaluations in Know Your AI, or as an educational resource for understanding the threat landscape facing modern AI systems.