Skip to main content

Documentation Index

Fetch the complete documentation index at: https://hydroxai.mintlify.app/llms.txt

Use this file to discover all available pages before exploring further.

Shipping an AI model without evaluation is like deploying code without tests. Know Your AI provides a complete evaluation framework that tests every dimension of your model’s capabilities — so you know exactly how it behaves before your users do.

What you can evaluate

Security

Resistance to jailbreaks, prompt injection, data extraction, and system prompt leakage.

Safety

Harmful content generation, toxicity, violence, child safety, and illegal activity.

Accuracy

Factual correctness, hallucination detection, and ground-truth comparison.

Robustness

Stability under adversarial inputs, edge cases, and multilingual attacks.

Compliance

CCPA/CPRA, EU AI Act, NIST AI RMF, and OWASP LLM Top 10 alignment.

Bias & Fairness

Discriminatory outputs, stereotyping, and fairness across demographics.

How it works

Datasets (Attack Prompts)


┌─────────────────────────┐
│   Your AI Model         │  ← API, Streaming API, or Website chatbot
│   (any provider)        │
└────────────┬────────────┘
             │  Responses

┌─────────────────────────┐
│   LLM-as-Judge          │  ← gemini-2.0-flash or custom judge model
│   Scores each response  │
└────────────┬────────────┘


┌─────────────────────────┐
│   Results & Reports     │
│   Security score        │
│   Per-prompt verdicts   │
│   Compliance analysis   │
│   Benchmarking data     │
└─────────────────────────┘
Every evaluation follows the same pipeline:
  1. Select datasets — Choose from 50+ attack datasets covering 15+ attack methods, or upload your own
  2. Send prompts — Each prompt is sent to your AI model (via API or browser automation)
  3. Judge responses — An LLM judge scores each response for vulnerabilities
  4. Generate reports — Security scores, per-prompt verdicts, compliance analysis, and trend data

Three ways to run evaluations

Dashboard

Point-and-click evaluations with real-time console and visual results.

SDK

Programmatic evaluations for CI/CD pipelines and custom workflows.

CLI

Run evaluations from your terminal with a single command.

Evaluation modes

Know Your AI supports two evaluation modes depending on how your AI is deployed:

Model Evaluation (API Mode)

For AI models exposed via REST or streaming APIs. Know Your AI sends attack prompts directly to your API endpoint and collects responses.
  • High-throughput testing with large datasets
  • Supports REST API, streaming API, and custom schemas
  • Ideal for pre-deployment benchmarking

Chatbot Evaluation (Website Mode)

For AI chatbots deployed on websites. Know Your AI uses a browser control agent to interact with your chatbot like a real user.
  • Full end-to-end testing including UI behavior
  • Screenshot capture at every step for visual evidence
  • Live viewer to watch the evaluation in real time

Attack coverage

Know Your AI evaluates across 7 core attack categories using 15+ attack methods:
CategoryAttack methodsWhat it tests
JailbreakDAN, GCG, PAIR, GRANDMOTHER, DEEP_INCEPTIONCan the model be tricked into ignoring safety rules?
Prompt InjectionCIPHER, ARTPROMPT, ADAPTIVECan instructions be injected via user input?
Data ExtractionDRA, RENELLMCan the model be forced to leak system prompts or training data?
Harmful ContentPSYCHOLOGY, GPTFUZZERDoes the model generate dangerous or illegal content?
PII LeakageMULTILINGUAL, PAST_TENSEDoes the model expose personal information?
BiasADAPTIVE, MULTILINGUALDoes the model produce discriminatory outputs?
HallucinationDRA, PAIRDoes the model fabricate false information?

Benchmarking across dimensions

Run evaluations across multiple dimensions to build a complete picture of your model:
DimensionMetricsWhy it matters
Security score% of attack prompts blockedHow resistant is the model to adversarial attacks?
Safety score% of harmful outputs preventedDoes the model avoid generating dangerous content?
AccuracyGround-truth match rateDoes the model give correct answers?
RobustnessPerformance under adversarial variationsDoes the model hold up under unusual inputs?
ComplianceViolation count per regulationDoes the model meet regulatory requirements?
ConsistencyScore variance across runsAre results stable and reproducible?

Next steps

Dashboard evaluations

Run your first evaluation from the dashboard.

SDK evaluations

Automate evaluations programmatically.

CLI evaluations

Run evaluations from your terminal.

Attack datasets

Browse all available attack methods and categories.