Documentation Index
Fetch the complete documentation index at: https://hydroxai.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Overview
The OWASP Top 10 for Large Language Model Applications (2025) identifies the most critical security vulnerabilities specific to LLM-based systems. Published by the Open Worldwide Application Security Project, it serves as a standard awareness document for developers and security teams building or integrating LLMs. Know Your AI’s evaluation datasets and attack categories are aligned with these risks to help teams systematically test their AI systems.The Top 10
LLM01: Prompt Injection
Manipulating LLM behavior by crafting inputs that override system instructions or inject malicious instructions into the model’s context.- Direct prompt injection — User input directly alters model behavior
- Indirect prompt injection — External data sources (web pages, documents) contain hidden instructions
LLM02: Sensitive Information Disclosure
The model inadvertently reveals confidential data such as PII, proprietary information, system prompts, or training data through its responses. Know Your AI coverage: Data Extraction and PII Leakage attack datasets.LLM03: Supply Chain Vulnerabilities
Risks from third-party components including pre-trained models, training data, plugins, and extensions that may contain vulnerabilities or malicious code. Know Your AI coverage: Evaluation of model behavior with untrusted inputs; firewall validation of outputs.LLM04: Data and Model Poisoning
Attacks that manipulate training data or fine-tuning processes to introduce backdoors, biases, or vulnerabilities into the model. Know Your AI coverage: Bias detection datasets and hallucination testing.LLM05: Improper Output Handling
Failure to validate, sanitize, or properly handle LLM outputs before passing them to downstream systems, leading to XSS, SSRF, privilege escalation, or remote code execution. Know Your AI coverage: Firewall output validation with risk categorization.LLM06: Excessive Agency
Granting an LLM too much autonomy, access to sensitive functions, or permissions beyond what’s necessary for its intended purpose. Know Your AI coverage: Jailbreak attack datasets that test guardrail bypass.LLM07: System Prompt Leakage
Attacks that extract the system prompt or internal instructions of an LLM, revealing business logic, security controls, or sensitive configuration. Know Your AI coverage: Data Extraction datasets targeting system prompt exposure.LLM08: Vector and Embedding Weaknesses
Exploiting vulnerabilities in RAG (Retrieval-Augmented Generation) systems through manipulated embeddings, poisoned vector stores, or retrieval manipulation. Know Your AI coverage: Evaluation of RAG-based systems with adversarial inputs.LLM09: Misinformation
The model generates false, misleading, or fabricated information (hallucinations) that users may trust as factual. Know Your AI coverage: Hallucination attack datasets and AI quality metrics for Hallucination Risk.LLM10: Unbounded Consumption
Attacks that cause excessive resource consumption through crafted inputs, leading to denial of service, cost escalation, or degraded performance. Know Your AI coverage: Monitoring dashboard tracks tokens, cost, and latency to detect anomalies.How Know Your AI maps to OWASP LLM Top 10
| OWASP Risk | Know Your AI Coverage |
|---|---|
| LLM01: Prompt Injection | Prompt Injection datasets (PAIR, DAN, CIPHER, ADAPTIVE, etc.) |
| LLM02: Sensitive Information Disclosure | Data Extraction & PII Leakage datasets |
| LLM03: Supply Chain | Firewall validation |
| LLM04: Data Poisoning | Bias & Hallucination datasets |
| LLM05: Improper Output Handling | Firewall output validation |
| LLM06: Excessive Agency | Jailbreak datasets |
| LLM07: System Prompt Leakage | Data Extraction datasets |
| LLM08: Vector & Embedding | RAG evaluation support |
| LLM09: Misinformation | Hallucination datasets & quality metrics |
| LLM10: Unbounded Consumption | Monitoring & cost tracking |
Resources
Datasets
Browse attack datasets aligned with OWASP risks.
Evaluation
Run evaluations against OWASP-aligned attacks.