Documentation Index
Fetch the complete documentation index at: https://hydroxai.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Overview
MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a knowledge base of adversarial tactics, techniques, and case studies for machine learning systems. Modeled after the widely-used MITRE ATT&CK framework for cybersecurity, ATLAS provides a structured approach to understanding how AI systems can be attacked. ATLAS helps security teams, red-teamers, and AI engineers identify and defend against real-world adversarial threats to machine learning models and AI applications.Tactics
ATLAS organizes adversarial behavior into tactics — the “why” behind an attack:| Tactic | Description |
|---|---|
| Reconnaissance | Gathering information about the target ML system |
| Resource Development | Acquiring resources for the attack (datasets, models, infrastructure) |
| Initial Access | Gaining initial access to the ML system or its components |
| ML Model Access | Obtaining access to the target model (API access, model extraction) |
| Execution | Running adversarial techniques against the ML system |
| Persistence | Maintaining access or influence over the ML system |
| Defense Evasion | Avoiding detection by security controls and monitoring |
| Discovery | Learning about the ML system’s architecture and behavior |
| Collection | Gathering data from the ML system (model outputs, training data) |
| ML Attack Staging | Preparing attack payloads (adversarial examples, poisoned data) |
| Exfiltration | Extracting data or model information from the target system |
| Impact | Disrupting, degrading, or destroying the ML system’s function |
Key techniques
| Technique | Description |
|---|---|
| Adversarial examples | Crafted inputs that cause the model to misclassify or produce incorrect outputs |
| Data poisoning | Contaminating training data to introduce backdoors or biases |
| Model extraction | Querying a model to reconstruct a functionally equivalent copy |
| Model inversion | Recovering training data or sensitive information from model outputs |
| Prompt injection | Manipulating LLM behavior through crafted inputs |
| Backdoor attacks | Embedding hidden triggers in models that activate under specific conditions |
| Membership inference | Determining whether specific data was used in model training |
| Model evasion | Crafting inputs specifically to bypass model-based security controls |
How Know Your AI maps to MITRE ATLAS
| ATLAS Tactic / Technique | Know Your AI Coverage |
|---|---|
| Reconnaissance / Discovery | System prompt extraction datasets |
| ML Model Access | API & website evaluation modes |
| Execution | Red-team attack datasets (15+ methods) |
| ML Attack Staging | Curated attack datasets in the Marketplace |
| Adversarial examples | Jailbreak, CIPHER, DAN, and other evasion methods |
| Data poisoning / Bias | Bias detection datasets |
| Model extraction | Data extraction attack datasets |
| Prompt injection | Prompt injection datasets (PAIR, ADAPTIVE, etc.) |
| Defense evasion | Multi-method attack testing to find guardrail gaps |
| Exfiltration | PII leakage and data extraction testing |
| Impact | Security scoring and compliance analysis |
| Continuous monitoring | SDK monitoring and tracing for production detection |
ATLAS vs. ATT&CK
| Aspect | MITRE ATT&CK | MITRE ATLAS |
|---|---|---|
| Focus | Traditional IT systems | AI and ML systems |
| Targets | Networks, endpoints, cloud | Models, training pipelines, inference APIs |
| Techniques | Malware, exploits, phishing | Adversarial examples, prompt injection, data poisoning |
| Adoption | Industry standard for security operations | Growing adoption for AI security |
Resources
Datasets
Browse datasets aligned with ATLAS techniques.
Firewall
Real-time defense against adversarial attacks.