# Know Your AI

## Docs

- [Chatbot Evaluation](https://hydroxai.mintlify.app/chatbot-evaluation.md): Red-team live chatbot websites using automated browser control.
- [Commands Reference](https://hydroxai.mintlify.app/cli/commands.md): Complete reference for every Know Your AI CLI command — flags, output, and examples.
- [CLI Overview](https://hydroxai.mintlify.app/cli/overview.md): Install and use the Know Your AI CLI to run security evaluations, inspect products, and review results from your terminal.
- [Compliance](https://hydroxai.mintlify.app/compliance.md): CCPA/CPRA compliance analysis, regulatory coverage, and evidence trails.
- [Datasets](https://hydroxai.mintlify.app/datasets.md): Attack datasets, safety tests, and benchmarks for AI evaluation.
- [APEX-Agents](https://hydroxai.mintlify.app/eval-benchmark/apex-agents.md): A comprehensive benchmark for evaluating AI agents on real-world tasks requiring tool use, multi-step planning, error recovery, and autonomous execution.
- [ARC-AGI-2](https://hydroxai.mintlify.app/eval-benchmark/arc-agi-2.md): The premier benchmark for measuring genuine abstract reasoning and fluid intelligence in AI systems — designed to resist memorization and reward true generalization.
- [Attack Datasets](https://hydroxai.mintlify.app/eval-benchmark/attack-datasets.md): Comprehensive guide to attack datasets — categories, methods, marketplace datasets, and how to create custom datasets for AI model evaluation.
- [AI Benchmarks Directory](https://hydroxai.mintlify.app/eval-benchmark/benchmarks.md): A comprehensive guide to the most influential AI benchmarks — understand how frontier models are evaluated across software engineering, reasoning, agents, science, and more.
- [CLI Evaluations](https://hydroxai.mintlify.app/eval-benchmark/cli-eval.md): Run AI model evaluations from the command line — list, inspect, execute, and review results with the Know Your AI CLI.
- [Dashboard Evaluations](https://hydroxai.mintlify.app/eval-benchmark/dashboard-eval.md): Run comprehensive AI model evaluations from the Know Your AI dashboard — configure tests, watch execution in real time, and analyze results.
- [DeepResearchBench](https://hydroxai.mintlify.app/eval-benchmark/deep-research-bench.md): A benchmark for evaluating AI systems on deep research tasks — multi-source information gathering, synthesis, and long-form analytical reasoning.
- [GeoBench](https://hydroxai.mintlify.app/eval-benchmark/geobench.md): A benchmark for evaluating AI models on geospatial reasoning, geographic knowledge, and Earth science understanding.
- [Humanity's Last Exam](https://hydroxai.mintlify.app/eval-benchmark/humanitys-last-exam.md): The hardest multi-domain academic benchmark — 3,000 expert-crafted questions across 100+ disciplines designed to be the final exam before AGI.
- [Why Evaluation & Benchmarking?](https://hydroxai.mintlify.app/eval-benchmark/overview.md): Comprehensively evaluate your AI model's capabilities — security, safety, accuracy, robustness, compliance, and performance — before and after deployment.
- [SDK Evaluations](https://hydroxai.mintlify.app/eval-benchmark/sdk-eval.md): Run evaluations programmatically with the @know-your-ai/evaluate SDK — automate security testing, integrate with CI/CD, and build custom evaluation pipelines.
- [SimpleBench](https://hydroxai.mintlify.app/eval-benchmark/simplebench.md): A benchmark of deceptively simple questions that expose fundamental reasoning failures in frontier AI models.
- [SWE-bench](https://hydroxai.mintlify.app/eval-benchmark/swe-bench.md): The gold standard benchmark for evaluating AI coding assistants on real-world software engineering tasks from GitHub.
- [Terminal-Bench 2.0](https://hydroxai.mintlify.app/eval-benchmark/terminal-bench.md): A benchmark for evaluating AI models on complex, multi-step terminal and command-line operations in realistic system environments.
- [Evaluation](https://hydroxai.mintlify.app/evaluation.md): How Know Your AI evaluates and red-teams your AI products.
- [Features](https://hydroxai.mintlify.app/features.md): Overview of all features in Know Your AI.
- [Firewall](https://hydroxai.mintlify.app/firewall.md): Real-time input/output validation for AI applications.
- [How to use Know Your AI](https://hydroxai.mintlify.app/how-to-use.md): Recommended workflows from setup to production monitoring.
- [Welcome](https://hydroxai.mintlify.app/introduction.md): Get to know Know Your AI and ship safer, higher-quality AI experiences.
- [Aegis](https://hydroxai.mintlify.app/legal/aegis.md): Aegis AI Safety Dataset — a comprehensive content safety dataset for evaluating AI guardrails and content moderation.
- [BeaverTails](https://hydroxai.mintlify.app/legal/beavertails.md): BeaverTails — a large-scale AI safety dataset for evaluating and improving the harmlessness of language models.
- [CCPA / CPRA](https://hydroxai.mintlify.app/legal/ccpa.md): California Consumer Privacy Act and California Privacy Rights Act — what they require and how Know Your AI helps you comply.
- [MITRE ATLAS](https://hydroxai.mintlify.app/legal/mitre-atlas.md): MITRE ATLAS — Adversarial Threat Landscape for AI Systems, a knowledge base of adversarial tactics and techniques targeting AI.
- [NIST AI RMF](https://hydroxai.mintlify.app/legal/nist-ai-rmf.md): The NIST AI Risk Management Framework — a comprehensive approach to managing risks in AI systems.
- [OWASP Top 10 for Agents 2026](https://hydroxai.mintlify.app/legal/owasp-agents-top10-2026.md): The OWASP Top 10 for AI Agents — emerging security risks specific to autonomous AI agent systems.
- [OWASP Top 10 for LLMs 2025](https://hydroxai.mintlify.app/legal/owasp-llm-top10-2025.md): The OWASP Top 10 for Large Language Model Applications — the most critical security risks for LLM-based systems.
- [Model Evaluation](https://hydroxai.mintlify.app/model-evaluation.md): Red-team your AI model via API with automated attack datasets.
- [Monitoring](https://hydroxai.mintlify.app/monitoring.md): Track AI usage, performance, and errors with SDK integration and tracing.
- [Agent Safety](https://hydroxai.mintlify.app/monitoring-firewall/agent-safety.md): Monitor and protect multi-step AI agent workflows — trace every tool call, detect drift, and block suspicious agent behavior.
- [Content Firewall](https://hydroxai.mintlify.app/monitoring-firewall/content-firewall.md): Validate every AI input and output in real time — block jailbreaks, prompt injection, PII leakage, and harmful content automatically.
- [Why Monitoring & Firewall?](https://hydroxai.mintlify.app/monitoring-firewall/overview.md): Protect your AI applications in production — real-time monitoring of agent behavior and automated blocking of harmful content.
- [Real-time Monitoring](https://hydroxai.mintlify.app/monitoring-firewall/realtime-monitoring.md): Track every AI interaction in production — requests, tokens, cost, latency, errors, and full execution traces.
- [Production Recipes](https://hydroxai.mintlify.app/monitoring-firewall/recipes.md): Copy-paste code recipes for common monitoring and firewall setups — chatbots, RAG pipelines, agents, and CI/CD.
- [Product overview](https://hydroxai.mintlify.app/product.md): Know Your AI product modules and how they fit together.
- [Get started](https://hydroxai.mintlify.app/quickstart.md): Create your first workspace, add a product, and run a security evaluation.
- [Agent Identity & Trust Abuse](https://hydroxai.mintlify.app/redteaming/agent-identity-trust-abuse.md): How attackers exploit the trusted identity of AI agents to impersonate, manipulate trust relationships, and access unauthorized resources.
- [Autonomous Agent Drift](https://hydroxai.mintlify.app/redteaming/autonomous-agent-drift.md): How AI agents gradually deviate from their intended behavior over time, accumulating small deviations that result in significant misalignment.
- [BFLA](https://hydroxai.mintlify.app/redteaming/bfla.md): Broken Function Level Authorization — how attackers escalate privileges by accessing unauthorized AI functions and endpoints.
- [Bias](https://hydroxai.mintlify.app/redteaming/bias.md): How AI systems can exhibit and amplify biases, and why testing for bias is essential for responsible AI deployment.
- [BOLA](https://hydroxai.mintlify.app/redteaming/bola.md): Broken Object Level Authorization — how attackers access unauthorized data objects through AI systems.
- [Child Protection](https://hydroxai.mintlify.app/redteaming/child-protection.md): Why testing AI systems for child safety vulnerabilities is critical, and how models can be exploited to generate harmful content involving minors.
- [Competition](https://hydroxai.mintlify.app/redteaming/competition.md): How AI systems can be exploited for competitive intelligence extraction, anti-competitive behavior, and business sabotage.
- [Cross-Context Retrieval](https://hydroxai.mintlify.app/redteaming/cross-context-retrieval.md): How attackers exploit RAG systems to access information from unauthorized contexts, tenants, or data sources.
- [Custom Vulnerability](https://hydroxai.mintlify.app/redteaming/custom-vulnerability.md): Define and test custom vulnerability categories specific to your AI application's unique risk profile.
- [Debug Access](https://hydroxai.mintlify.app/redteaming/debug-access.md): How attackers exploit debug interfaces, verbose error messages, and development endpoints left exposed in production AI systems.
- [Ethics](https://hydroxai.mintlify.app/redteaming/ethics.md): Testing AI systems for ethical violations, manipulation, and morally harmful outputs.
- [Excessive Agency](https://hydroxai.mintlify.app/redteaming/excessive-agency.md): When AI agents take actions beyond their intended scope, making decisions and performing operations without proper authorization.
- [Exploit Tool Agent](https://hydroxai.mintlify.app/redteaming/exploit-tool-agent.md): How attackers leverage AI agents as automated exploit tools to discover and exploit vulnerabilities in connected systems.
- [External System Abuse](https://hydroxai.mintlify.app/redteaming/external-system-abuse.md): How attackers leverage AI agents to abuse external systems, APIs, and services that the agent has legitimate access to.
- [Fairness](https://hydroxai.mintlify.app/redteaming/fairness.md): Evaluating AI systems for equitable treatment across different user groups and preventing discriminatory outcomes.
- [Goal Theft](https://hydroxai.mintlify.app/redteaming/goal-theft.md): How attackers hijack AI agent goals, redirecting agents to serve attacker objectives instead of user intentions.
- [Graphic Content](https://hydroxai.mintlify.app/redteaming/graphic-content.md): Testing AI resistance to generating extremely violent, gory, or disturbing content that could traumatize or desensitize users.
- [Illegal Activity](https://hydroxai.mintlify.app/redteaming/illegal-activity.md): Testing whether AI systems can be coerced into providing instructions or assistance for illegal activities.
- [Indirect Instruction](https://hydroxai.mintlify.app/redteaming/indirect-instruction.md): How hidden instructions in data sources can manipulate AI agents into performing unintended actions — the agentic equivalent of prompt injection.
- [Intellectual Property](https://hydroxai.mintlify.app/redteaming/intellectual-property.md): How AI systems can be exploited to infringe on copyrights, trademarks, trade secrets, and other intellectual property rights.
- [Inter-Agent Communication Compromise](https://hydroxai.mintlify.app/redteaming/inter-agent-communication-compromise.md): How attackers exploit communication channels between AI agents to inject instructions, corrupt data, and manipulate multi-agent systems.
- [Red Teaming & Attacks](https://hydroxai.mintlify.app/redteaming/introduction.md): Why AI red teaming matters and an overview of attack categories for evaluating AI system safety.
- [Misinformation](https://hydroxai.mintlify.app/redteaming/misinformation.md): How AI systems can be exploited to generate and spread convincing misinformation, disinformation, and propaganda.
- [Personal Safety](https://hydroxai.mintlify.app/redteaming/personal-safety.md): Testing whether AI systems can be exploited to threaten individuals' physical safety through stalking, doxxing, or harm instructions.
- [PII Leakage](https://hydroxai.mintlify.app/redteaming/pii-leakage.md): How AI models can inadvertently expose personally identifiable information and why preventing PII leakage is critical.
- [Prompt Leakage](https://hydroxai.mintlify.app/redteaming/prompt-leakage.md): How attackers extract system prompts and hidden instructions from AI systems, and why protecting prompt confidentiality matters.
- [RBAC](https://hydroxai.mintlify.app/redteaming/rbac.md): Role-Based Access Control vulnerabilities in AI systems and how attackers exploit permission misconfigurations.
- [Recursive Hijacking](https://hydroxai.mintlify.app/redteaming/recursive-hijacking.md): How attackers create self-reinforcing loops that progressively compromise AI agent behavior through recursive manipulation.
- [Robustness](https://hydroxai.mintlify.app/redteaming/robustness.md): Testing AI agent resilience against adversarial inputs, edge cases, and failure conditions that cause unexpected behavior.
- [Shell Injection](https://hydroxai.mintlify.app/redteaming/shell-injection.md): How attackers execute arbitrary system commands through AI systems that interact with operating system shells.
- [SQL Injection](https://hydroxai.mintlify.app/redteaming/sql-injection.md): How attackers manipulate AI systems to execute malicious SQL queries against backend databases.
- [SSRF](https://hydroxai.mintlify.app/redteaming/ssrf.md): Server-Side Request Forgery in AI systems — how attackers trick AI backends into making requests to internal resources.
- [System Reconnaissance](https://hydroxai.mintlify.app/redteaming/system-reconnaissance.md): How attackers extract information about AI system architecture, model details, and infrastructure through probing techniques.
- [Tool Metadata Poisoning](https://hydroxai.mintlify.app/redteaming/tool-metadata-poisoning.md): How attackers manipulate tool descriptions and metadata to trick AI agents into executing malicious operations.
- [Tool Orchestration Abuse](https://hydroxai.mintlify.app/redteaming/tool-orchestration-abuse.md): How attackers exploit AI agents' ability to chain and orchestrate tool calls to achieve malicious outcomes.
- [Toxicity](https://hydroxai.mintlify.app/redteaming/toxicity.md): How AI models can be provoked into generating toxic, offensive, or harmful language, and strategies for prevention.
- [Unexpected Code Execution](https://hydroxai.mintlify.app/redteaming/unexpected-code-execution.md): How AI systems can be tricked into generating or executing malicious code that compromises systems or data.
- [Evaluate](https://hydroxai.mintlify.app/sdk/evaluate.md): Run security evaluations programmatically with the @know-your-ai/evaluate SDK — manage datasets, evaluations, and test runs via code.
- [Firewall](https://hydroxai.mintlify.app/sdk/firewall.md): Block dangerous inputs and flag risky AI outputs in real time with the @know-your-ai/firewall SDK.
- [Monitoring](https://hydroxai.mintlify.app/sdk/monitoring.md): Monitor AI model calls in production — track requests, tokens, cost, latency, and errors with the @know-your-ai/node SDK.
- [SDK Overview](https://hydroxai.mintlify.app/sdk/overview.md): Install and configure the @know-your-ai SDK to monitor, trace, evaluate, and protect your AI applications.
- [Tracing](https://hydroxai.mintlify.app/sdk/tracing.md): Visualize multi-step AI agent interactions as span trees with the @know-your-ai/node SDK.
- [Workspace](https://hydroxai.mintlify.app/workspace.md): How workspaces organize projects, people, and environments.