# Know Your AI

## Docs

- [Chatbot Evaluation](https://hydroxai.mintlify.app/chatbot-evaluation.md): Red-team live chatbot websites using automated browser control.
- [Commands Reference](https://hydroxai.mintlify.app/cli/commands.md): Complete reference for every Know Your AI CLI command — flags, output, and examples.
- [CLI Overview](https://hydroxai.mintlify.app/cli/overview.md): Install and use the Know Your AI CLI to run security evaluations, inspect products, and review results from your terminal.
- [Compliance](https://hydroxai.mintlify.app/compliance.md): CCPA/CPRA compliance analysis, regulatory coverage, and evidence trails.
- [Datasets](https://hydroxai.mintlify.app/datasets.md): Attack datasets, safety tests, and benchmarks for AI evaluation.
- [APEX-Agents](https://hydroxai.mintlify.app/eval-benchmark/apex-agents.md): A comprehensive benchmark for evaluating AI agents on real-world tasks requiring tool use, multi-step planning, error recovery, and autonomous execution.
- [ARC-AGI-2](https://hydroxai.mintlify.app/eval-benchmark/arc-agi-2.md): The premier benchmark for measuring genuine abstract reasoning and fluid intelligence in AI systems — designed to resist memorization and reward true generalization.
- [Attack Datasets](https://hydroxai.mintlify.app/eval-benchmark/attack-datasets.md): Comprehensive guide to attack datasets — categories, methods, marketplace datasets, and how to create custom datasets for AI model evaluation.
- [AI Benchmarks Directory](https://hydroxai.mintlify.app/eval-benchmark/benchmarks.md): A comprehensive guide to the most influential AI benchmarks — understand how frontier models are evaluated across software engineering, reasoning, agents, science, and more.
- [CLI Evaluations](https://hydroxai.mintlify.app/eval-benchmark/cli-eval.md): Run AI model evaluations from the command line — list, inspect, execute, and review results with the Know Your AI CLI.
- [Dashboard Evaluations](https://hydroxai.mintlify.app/eval-benchmark/dashboard-eval.md): Run comprehensive AI model evaluations from the Know Your AI dashboard — configure tests, watch execution in real time, and analyze results.
- [DeepResearchBench](https://hydroxai.mintlify.app/eval-benchmark/deep-research-bench.md): A benchmark for evaluating AI systems on deep research tasks — multi-source information gathering, synthesis, and long-form analytical reasoning.
- [GeoBench](https://hydroxai.mintlify.app/eval-benchmark/geobench.md): A benchmark for evaluating AI models on geospatial reasoning, geographic knowledge, and Earth science understanding.
- [Humanity's Last Exam](https://hydroxai.mintlify.app/eval-benchmark/humanitys-last-exam.md): The hardest multi-domain academic benchmark — 3,000 expert-crafted questions across 100+ disciplines designed to be the final exam before AGI.
- [Why Evaluation & Benchmarking?](https://hydroxai.mintlify.app/eval-benchmark/overview.md): Comprehensively evaluate your AI model's capabilities — security, safety, accuracy, robustness, compliance, and performance — before and after deployment.
- [SDK Evaluations](https://hydroxai.mintlify.app/eval-benchmark/sdk-eval.md): Run evaluations programmatically with the @know-your-ai/evaluate SDK — automate security testing, integrate with CI/CD, and build custom evaluation pipelines.
- [SimpleBench](https://hydroxai.mintlify.app/eval-benchmark/simplebench.md): A benchmark of deceptively simple questions that expose fundamental reasoning failures in frontier AI models.
- [SWE-bench](https://hydroxai.mintlify.app/eval-benchmark/swe-bench.md): The gold standard benchmark for evaluating AI coding assistants on real-world software engineering tasks from GitHub.
- [Terminal-Bench 2.0](https://hydroxai.mintlify.app/eval-benchmark/terminal-bench.md): A benchmark for evaluating AI models on complex, multi-step terminal and command-line operations in realistic system environments.
- [Evaluation](https://hydroxai.mintlify.app/evaluation.md): How Know Your AI evaluates and red-teams your AI products.
- [Features](https://hydroxai.mintlify.app/features.md): Overview of all features in Know Your AI.
- [Firewall](https://hydroxai.mintlify.app/firewall.md): Real-time input/output validation for AI applications.
- [How to use Know Your AI](https://hydroxai.mintlify.app/how-to-use.md): Recommended workflows from setup to production monitoring.
- [Welcome](https://hydroxai.mintlify.app/introduction.md): Get to know Know Your AI and ship safer, higher-quality AI experiences.
- [Aegis](https://hydroxai.mintlify.app/legal/aegis.md): Aegis AI Safety Dataset — a comprehensive content safety dataset for evaluating AI guardrails and content moderation.
- [BeaverTails](https://hydroxai.mintlify.app/legal/beavertails.md): BeaverTails — a large-scale AI safety dataset for evaluating and improving the harmlessness of language models.
- [CCPA / CPRA](https://hydroxai.mintlify.app/legal/ccpa.md): California Consumer Privacy Act and California Privacy Rights Act — what they require and how Know Your AI helps you comply.
- [MITRE ATLAS](https://hydroxai.mintlify.app/legal/mitre-atlas.md): MITRE ATLAS — Adversarial Threat Landscape for AI Systems, a knowledge base of adversarial tactics and techniques targeting AI.
- [NIST AI RMF](https://hydroxai.mintlify.app/legal/nist-ai-rmf.md): The NIST AI Risk Management Framework — a comprehensive approach to managing risks in AI systems.
- [OWASP Top 10 for Agents 2026](https://hydroxai.mintlify.app/legal/owasp-agents-top10-2026.md): The OWASP Top 10 for AI Agents — emerging security risks specific to autonomous AI agent systems.
- [OWASP Top 10 for LLMs 2025](https://hydroxai.mintlify.app/legal/owasp-llm-top10-2025.md): The OWASP Top 10 for Large Language Model Applications — the most critical security risks for LLM-based systems.
- [Model Evaluation](https://hydroxai.mintlify.app/model-evaluation.md): Red-team your AI model via API with automated attack datasets.
- [Monitoring](https://hydroxai.mintlify.app/monitoring.md): Track AI usage, performance, and errors with SDK integration and tracing.
- [Agent Safety](https://hydroxai.mintlify.app/monitoring-firewall/agent-safety.md): Monitor and protect multi-step AI agent workflows — trace every tool call, detect drift, and block suspicious agent behavior.
- [Content Firewall](https://hydroxai.mintlify.app/monitoring-firewall/content-firewall.md): Validate every AI input and output in real time — block jailbreaks, prompt injection, PII leakage, and harmful content automatically.
- [Why Monitoring & Firewall?](https://hydroxai.mintlify.app/monitoring-firewall/overview.md): Protect your AI applications in production — real-time monitoring of agent behavior and automated blocking of harmful content.
- [Real-time Monitoring](https://hydroxai.mintlify.app/monitoring-firewall/realtime-monitoring.md): Track every AI interaction in production — requests, tokens, cost, latency, errors, and full execution traces.
- [Production Recipes](https://hydroxai.mintlify.app/monitoring-firewall/recipes.md): Copy-paste code recipes for common monitoring and firewall setups — chatbots, RAG pipelines, agents, and CI/CD.
- [Product overview](https://hydroxai.mintlify.app/product.md): Know Your AI product modules and how they fit together.
- [Get started](https://hydroxai.mintlify.app/quickstart.md): Create your first workspace, add a product, and run a security evaluation.
- [Agent Identity & Trust Abuse](https://hydroxai.mintlify.app/redteaming/agent-identity-trust-abuse.md): How attackers exploit the trusted identity of AI agents to impersonate, manipulate trust relationships, and access unauthorized resources.
- [Autonomous Agent Drift](https://hydroxai.mintlify.app/redteaming/autonomous-agent-drift.md): How AI agents gradually deviate from their intended behavior over time, accumulating small deviations that result in significant misalignment.
- [BFLA](https://hydroxai.mintlify.app/redteaming/bfla.md): Broken Function Level Authorization — how attackers escalate privileges by accessing unauthorized AI functions and endpoints.
- [Bias](https://hydroxai.mintlify.app/redteaming/bias.md): How AI systems can exhibit and amplify biases, and why testing for bias is essential for responsible AI deployment.
- [BOLA](https://hydroxai.mintlify.app/redteaming/bola.md): Broken Object Level Authorization — how attackers access unauthorized data objects through AI systems.
- [Child Protection](https://hydroxai.mintlify.app/redteaming/child-protection.md): Why testing AI systems for child safety vulnerabilities is critical, and how models can be exploited to generate harmful content involving minors.
- [Competition](https://hydroxai.mintlify.app/redteaming/competition.md): How AI systems can be exploited for competitive intelligence extraction, anti-competitive behavior, and business sabotage.
- [Cross-Context Retrieval](https://hydroxai.mintlify.app/redteaming/cross-context-retrieval.md): How attackers exploit RAG systems to access information from unauthorized contexts, tenants, or data sources.
- [Custom Vulnerability](https://hydroxai.mintlify.app/redteaming/custom-vulnerability.md): Define and test custom vulnerability categories specific to your AI application's unique risk profile.
- [Debug Access](https://hydroxai.mintlify.app/redteaming/debug-access.md): How attackers exploit debug interfaces, verbose error messages, and development endpoints left exposed in production AI systems.
- [Ethics](https://hydroxai.mintlify.app/redteaming/ethics.md): Testing AI systems for ethical violations, manipulation, and morally harmful outputs.
- [Excessive Agency](https://hydroxai.mintlify.app/redteaming/excessive-agency.md): When AI agents take actions beyond their intended scope, making decisions and performing operations without proper authorization.
- [Exploit Tool Agent](https://hydroxai.mintlify.app/redteaming/exploit-tool-agent.md): How attackers leverage AI agents as automated exploit tools to discover and exploit vulnerabilities in connected systems.
- [External System Abuse](https://hydroxai.mintlify.app/redteaming/external-system-abuse.md): How attackers leverage AI agents to abuse external systems, APIs, and services that the agent has legitimate access to.
- [Fairness](https://hydroxai.mintlify.app/redteaming/fairness.md): Evaluating AI systems for equitable treatment across different user groups and preventing discriminatory outcomes.
- [Goal Theft](https://hydroxai.mintlify.app/redteaming/goal-theft.md): How attackers hijack AI agent goals, redirecting agents to serve attacker objectives instead of user intentions.
- [Graphic Content](https://hydroxai.mintlify.app/redteaming/graphic-content.md): Testing AI resistance to generating extremely violent, gory, or disturbing content that could traumatize or desensitize users.
- [Illegal Activity](https://hydroxai.mintlify.app/redteaming/illegal-activity.md): Testing whether AI systems can be coerced into providing instructions or assistance for illegal activities.
- [Indirect Instruction](https://hydroxai.mintlify.app/redteaming/indirect-instruction.md): How hidden instructions in data sources can manipulate AI agents into performing unintended actions — the agentic equivalent of prompt injection.
- [Intellectual Property](https://hydroxai.mintlify.app/redteaming/intellectual-property.md): How AI systems can be exploited to infringe on copyrights, trademarks, trade secrets, and other intellectual property rights.
- [Inter-Agent Communication Compromise](https://hydroxai.mintlify.app/redteaming/inter-agent-communication-compromise.md): How attackers exploit communication channels between AI agents to inject instructions, corrupt data, and manipulate multi-agent systems.
- [Red Teaming & Attacks](https://hydroxai.mintlify.app/redteaming/introduction.md): Why AI red teaming matters and an overview of attack categories for evaluating AI system safety.
- [Misinformation](https://hydroxai.mintlify.app/redteaming/misinformation.md): How AI systems can be exploited to generate and spread convincing misinformation, disinformation, and propaganda.
- [Personal Safety](https://hydroxai.mintlify.app/redteaming/personal-safety.md): Testing whether AI systems can be exploited to threaten individuals' physical safety through stalking, doxxing, or harm instructions.
- [PII Leakage](https://hydroxai.mintlify.app/redteaming/pii-leakage.md): How AI models can inadvertently expose personally identifiable information and why preventing PII leakage is critical.
- [Prompt Leakage](https://hydroxai.mintlify.app/redteaming/prompt-leakage.md): How attackers extract system prompts and hidden instructions from AI systems, and why protecting prompt confidentiality matters.
- [RBAC](https://hydroxai.mintlify.app/redteaming/rbac.md): Role-Based Access Control vulnerabilities in AI systems and how attackers exploit permission misconfigurations.
- [Recursive Hijacking](https://hydroxai.mintlify.app/redteaming/recursive-hijacking.md): How attackers create self-reinforcing loops that progressively compromise AI agent behavior through recursive manipulation.
- [Robustness](https://hydroxai.mintlify.app/redteaming/robustness.md): Testing AI agent resilience against adversarial inputs, edge cases, and failure conditions that cause unexpected behavior.
- [Shell Injection](https://hydroxai.mintlify.app/redteaming/shell-injection.md): How attackers execute arbitrary system commands through AI systems that interact with operating system shells.
- [SQL Injection](https://hydroxai.mintlify.app/redteaming/sql-injection.md): How attackers manipulate AI systems to execute malicious SQL queries against backend databases.
- [SSRF](https://hydroxai.mintlify.app/redteaming/ssrf.md): Server-Side Request Forgery in AI systems — how attackers trick AI backends into making requests to internal resources.
- [System Reconnaissance](https://hydroxai.mintlify.app/redteaming/system-reconnaissance.md): How attackers extract information about AI system architecture, model details, and infrastructure through probing techniques.
- [Tool Metadata Poisoning](https://hydroxai.mintlify.app/redteaming/tool-metadata-poisoning.md): How attackers manipulate tool descriptions and metadata to trick AI agents into executing malicious operations.
- [Tool Orchestration Abuse](https://hydroxai.mintlify.app/redteaming/tool-orchestration-abuse.md): How attackers exploit AI agents' ability to chain and orchestrate tool calls to achieve malicious outcomes.
- [Toxicity](https://hydroxai.mintlify.app/redteaming/toxicity.md): How AI models can be provoked into generating toxic, offensive, or harmful language, and strategies for prevention.
- [Unexpected Code Execution](https://hydroxai.mintlify.app/redteaming/unexpected-code-execution.md): How AI systems can be tricked into generating or executing malicious code that compromises systems or data.
- [Evaluate](https://hydroxai.mintlify.app/sdk/evaluate.md): Run security evaluations programmatically with the @know-your-ai/evaluate SDK — manage datasets, evaluations, and test runs via code.
- [Firewall](https://hydroxai.mintlify.app/sdk/firewall.md): Block dangerous inputs and flag risky AI outputs in real time with the @know-your-ai/firewall SDK.
- [Monitoring](https://hydroxai.mintlify.app/sdk/monitoring.md): Monitor AI model calls in production — track requests, tokens, cost, latency, and errors with the @know-your-ai/node SDK.
- [SDK Overview](https://hydroxai.mintlify.app/sdk/overview.md): Install and configure the @know-your-ai SDK to monitor, trace, evaluate, and protect your AI applications.
- [Tracing](https://hydroxai.mintlify.app/sdk/tracing.md): Visualize multi-step AI agent interactions as span trees with the @know-your-ai/node SDK.
- [Workspace](https://hydroxai.mintlify.app/workspace.md): How workspaces organize projects, people, and environments.