Why Evaluation & Benchmarking?

Shipping an AI model without evaluation is like deploying code without tests. Know Your AI provides a complete evaluation framework that tests every dimension of your model’s capabilities — so you know exactly how it behaves before your users do.

What you can evaluate

Security

Resistance to jailbreaks, prompt injection, data extraction, and system prompt leakage.

Safety

Harmful content generation, toxicity, violence, child safety, and illegal activity.

Accuracy

Factual correctness, hallucination detection, and ground-truth comparison.

Robustness

Stability under adversarial inputs, edge cases, and multilingual attacks.

Compliance

CCPA/CPRA, EU AI Act, NIST AI RMF, and OWASP LLM Top 10 alignment.

Bias & Fairness

Discriminatory outputs, stereotyping, and fairness across demographics.

How it works

Datasets (Attack Prompts)
         │
         ▼
┌─────────────────────────┐
│   Your AI Model         │  ← API, Streaming API, or Website chatbot
│   (any provider)        │
└────────────┬────────────┘
             │  Responses
             ▼
┌─────────────────────────┐
│   LLM-as-Judge          │  ← gemini-2.0-flash or custom judge model
│   Scores each response  │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│   Results & Reports     │
│   Security score        │
│   Per-prompt verdicts   │
│   Compliance analysis   │
│   Benchmarking data     │
└─────────────────────────┘

Every evaluation follows the same pipeline:

Select datasets — Choose from 50+ attack datasets covering 15+ attack methods, or upload your own
Send prompts — Each prompt is sent to your AI model (via API or browser automation)
Judge responses — An LLM judge scores each response for vulnerabilities
Generate reports — Security scores, per-prompt verdicts, compliance analysis, and trend data

Three ways to run evaluations

Dashboard

Point-and-click evaluations with real-time console and visual results.

SDK

Programmatic evaluations for CI/CD pipelines and custom workflows.

CLI

Run evaluations from your terminal with a single command.

Evaluation modes

Know Your AI supports two evaluation modes depending on how your AI is deployed:

Model Evaluation (API Mode)

For AI models exposed via REST or streaming APIs. Know Your AI sends attack prompts directly to your API endpoint and collects responses.

High-throughput testing with large datasets
Supports REST API, streaming API, and custom schemas
Ideal for pre-deployment benchmarking

Chatbot Evaluation (Website Mode)

For AI chatbots deployed on websites. Know Your AI uses a browser control agent to interact with your chatbot like a real user.

Full end-to-end testing including UI behavior
Screenshot capture at every step for visual evidence
Live viewer to watch the evaluation in real time

Attack coverage

Know Your AI evaluates across 7 core attack categories using 15+ attack methods:

Category	Attack methods	What it tests
Jailbreak	DAN, GCG, PAIR, GRANDMOTHER, DEEP_INCEPTION	Can the model be tricked into ignoring safety rules?
Prompt Injection	CIPHER, ARTPROMPT, ADAPTIVE	Can instructions be injected via user input?
Data Extraction	DRA, RENELLM	Can the model be forced to leak system prompts or training data?
Harmful Content	PSYCHOLOGY, GPTFUZZER	Does the model generate dangerous or illegal content?
PII Leakage	MULTILINGUAL, PAST_TENSE	Does the model expose personal information?
Bias	ADAPTIVE, MULTILINGUAL	Does the model produce discriminatory outputs?
Hallucination	DRA, PAIR	Does the model fabricate false information?

Benchmarking across dimensions

Run evaluations across multiple dimensions to build a complete picture of your model:

Dimension	Metrics	Why it matters
Security score	% of attack prompts blocked	How resistant is the model to adversarial attacks?
Safety score	% of harmful outputs prevented	Does the model avoid generating dangerous content?
Accuracy	Ground-truth match rate	Does the model give correct answers?
Robustness	Performance under adversarial variations	Does the model hold up under unusual inputs?
Compliance	Violation count per regulation	Does the model meet regulatory requirements?
Consistency	Score variance across runs	Are results stable and reproducible?

Next steps

Dashboard evaluations

Run your first evaluation from the dashboard.

SDK evaluations

Automate evaluations programmatically.

CLI evaluations

Run evaluations from your terminal.

Attack datasets

Browse all available attack methods and categories.

Overview

Run Evaluations

Attack Datasets

AI Benchmarks

Why Evaluation & Benchmarking?

What you can evaluate

Security

Safety

Accuracy

Robustness

Compliance

Bias & Fairness

How it works

Three ways to run evaluations

Dashboard

SDK

CLI

Evaluation modes

Model Evaluation (API Mode)

Chatbot Evaluation (Website Mode)

Attack coverage

Benchmarking across dimensions

Next steps

Dashboard evaluations

SDK evaluations

CLI evaluations

Attack datasets

Overview

Run Evaluations

Attack Datasets

AI Benchmarks

Documentation Index

​What you can evaluate

Security

Safety

Accuracy

Robustness

Compliance

Bias & Fairness

​How it works

​Three ways to run evaluations

Dashboard

SDK

CLI

​Evaluation modes

​Model Evaluation (API Mode)

​Chatbot Evaluation (Website Mode)

​Attack coverage

​Benchmarking across dimensions

​Next steps

Dashboard evaluations

SDK evaluations

CLI evaluations

Attack datasets

What you can evaluate

How it works

Three ways to run evaluations

Evaluation modes

Model Evaluation (API Mode)

Chatbot Evaluation (Website Mode)

Attack coverage

Benchmarking across dimensions

Next steps