
Monitoring gives you complete visibility into your production AI system. Every model call, tool invocation, and agent step is captured and surfaced in real-time dashboards.

Setup (3 minutes)

1. Install

npm install @know-your-ai/node

2. Initialize

import * as KnowYourAI from '@know-your-ai/node';

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  environment: 'production',
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

3. Instrument your AI client

import { GoogleGenAI } from '@google/genai';

const genAI = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY! });
const client = KnowYourAI.instrumentGoogleGenAIClient(genAI);

// All calls through `client` are now automatically tracked

That’s it. Every AI call through the instrumented client is now captured.

What gets tracked

Request metrics

Every model call automatically captures:
  • Provider & model — Which AI provider and model was called
  • Operation — generateContent, sendMessage, generateContentStream, etc.
  • Duration — Total response time in milliseconds
  • Token usage — Input tokens, output tokens, cached tokens, reasoning tokens
  • Cost — Estimated cost per request based on model pricing
  • Streaming metrics — TTFB, tokens/sec, chunk count (for streaming calls)
  • Error details — Error type, status code, retryability
  • Tool calls — Function calls made by the model
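As a mental model only, the metrics above map onto a per-request payload. The field names below are inferred from the `onCapture` example later on this page; treat the exact shape as an assumption, not a published KnowYourAI type:

```typescript
// Hypothetical shape of a captured event, inferred from the onCapture
// fields shown on this page; not an official KnowYourAI type.
interface CapturedEvent {
  timestamp: number;                    // epoch milliseconds
  provider: string;                     // e.g. 'google-genai'
  model: string;                        // e.g. 'gemini-2.0-flash'
  operation: string;                    // e.g. 'sendMessage'
  duration?: number;                    // total response time in ms
  tokenUsage?: { inputTokens: number; outputTokens: number; totalTokens: number };
  cost?: { totalCost: number };         // estimated cost
  error?: { type: string; message?: string };
}

// Tiny helper: flatten an event into a one-line log entry.
function toLogLine(e: CapturedEvent): string {
  return [
    e.model,
    e.operation,
    `${e.duration ?? '?'}ms`,
    `${e.tokenUsage?.totalTokens ?? 0} tokens`,
  ].join(' | ');
}
```

A shape like this is what you would forward from `onCapture` into your own logging or analytics pipeline.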

Dashboard views

The monitoring dashboard provides:
  • Request volume — Total requests over time with error overlay
  • Token usage — Input/output/total tokens with daily/hourly granularity
  • Cost tracking — Running cost estimates by model and time period
  • Latency percentiles — p50, p95, p99 response times
  • Error rates — Error count and rate by type (rate limit, auth, timeout, etc.)
  • Provider & model split — Traffic distribution across providers and models
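The latency view is an aggregation over captured durations. As an illustrative sketch (not how the dashboard actually computes it), you could derive the same p50/p95/p99 percentiles yourself from durations collected in an `onCapture` hook:

```typescript
// Nearest-rank percentile over an already-sorted sample (illustrative only).
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Summarize a batch of request durations the way the latency view does.
function latencySummary(durationsMs: number[]) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}
```

To feed it, push `data.duration` into an array from `onCapture` and compute a summary per time window.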

Example: track a chatbot

import * as KnowYourAI from '@know-your-ai/node';
import { GoogleGenAI } from '@google/genai';

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  environment: 'production',
  release: '2.1.0',
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

const genAI = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY! });
const client = KnowYourAI.instrumentGoogleGenAIClient(genAI);

// Chat session — each message is tracked individually
const chat = client.chats.create({
  model: 'gemini-2.0-flash',
  history: [
    { role: 'user', parts: [{ text: 'You are a customer support agent.' }] },
  ],
});

// Each sendMessage is an individual tracked event
const response = await chat.sendMessage({ message: 'How do I reset my password?' });
console.log(response.text);

In the dashboard you’ll see:
  • Model: gemini-2.0-flash
  • Operation: sendMessage
  • Input/output tokens
  • Response time
  • Cost estimate

Example: track streaming responses

const stream = await client.models.generateContentStream({
  model: 'gemini-2.0-flash',
  contents: 'Write a detailed analysis of quantum computing trends.',
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? ''); // in @google/genai, `text` is a property, not a method
}

Streaming calls capture additional metrics:
  • TTFB — Time to first byte (how fast the first chunk arrives)
  • Throughput — Tokens per second
  • Chunk count — Number of streamed chunks
  • Avg chunk interval — Average time between chunks
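To make the definitions concrete, here is a small sketch of how these four metrics fall out of chunk arrival timestamps. This illustrates the arithmetic only; `computeStreamStats` and its fields are made up for this example and are not the SDK's internals:

```typescript
// Derived streaming metrics (illustrative; not the SDK's actual shape).
interface StreamStats {
  ttfbMs: number;            // time from request start to first chunk
  chunkCount: number;        // number of streamed chunks
  avgChunkIntervalMs: number; // average gap between consecutive chunks
  tokensPerSec: number;      // output throughput over the whole stream
}

function computeStreamStats(
  startMs: number,
  chunkTimesMs: number[],    // arrival time of each chunk, ascending
  totalTokens: number,
): StreamStats {
  const chunkCount = chunkTimesMs.length;
  const ttfbMs = chunkCount > 0 ? chunkTimesMs[0] - startMs : 0;
  const endMs = chunkCount > 0 ? chunkTimesMs[chunkCount - 1] : startMs;
  const totalMs = endMs - startMs;
  const avgChunkIntervalMs =
    chunkCount > 1 ? (endMs - chunkTimesMs[0]) / (chunkCount - 1) : 0;
  const tokensPerSec = totalMs > 0 ? (totalTokens / totalMs) * 1000 : 0;
  return { ttfbMs, chunkCount, avgChunkIntervalMs, tokensPerSec };
}
```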

Example: custom event callback

React to every captured event in your own code — log to your system, trigger alerts, or forward to your analytics pipeline:

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  onCapture: (data) => {
    // Log to your own system
    console.log(JSON.stringify({
      timestamp: data.timestamp,
      model: data.model,
      operation: data.operation,
      duration: data.duration,
      tokens: data.tokenUsage?.totalTokens,
      cost: data.cost?.totalCost,
      error: data.error?.type,
    }));

    // Alert on high latency
    if (data.duration && data.duration > 10000) {
      alertTeam(`Slow AI response: ${data.model} took ${data.duration}ms`);
    }

    // Alert on errors
    if (data.error) {
      alertTeam(`AI error: ${data.error.type}: ${data.error.message}`);
    }
  },
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

Tracing: visualize agent workflows

For multi-step AI workflows (agents, RAG pipelines, chains), tracing builds a span tree so you can see exactly what happened:

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  traceMode: true,
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

// Wrap your entire workflow in a trace
await KnowYourAI.withTrace(
  { name: 'customer-support', userId: 'user-123', sessionId: 'sess-456' },
  async () => {
    // Step 1: Classify the user intent
    const classification = await KnowYourAI.withGeneration('classify-intent', async (gen) => {
      gen.setModel('gemini-2.0-flash');
      const res = await client.models.generateContent({
        model: 'gemini-2.0-flash',
        contents: 'User says: I cannot log in',
      });
      gen.setUsage({ inputTokens: 50, outputTokens: 10 });
      return res.text;
    });

    // Step 2: Search knowledge base
    const retriever = KnowYourAI.startRetriever('kb-search', {});
    retriever.setQuery('login issues troubleshooting');
    const docs = await searchKnowledgeBase('login issues'); // your own retrieval function
    retriever.setDocuments(docs);
    retriever.end();

    // Step 3: Generate final response
    await KnowYourAI.withGeneration('generate-response', async (gen) => {
      gen.setModel('gemini-2.0-flash');
      const res = await client.models.generateContent({
        model: 'gemini-2.0-flash',
        contents: `Context: ${JSON.stringify(docs)}\nIntent: ${classification}\nGenerate helpful response.`,
      });
      gen.setUsage({ inputTokens: 500, outputTokens: 200 });
      return res.text;
    });
  }
);

In the tracing view you’ll see:
customer-support (trace)
├── classify-intent (generation) — 150ms, 60 tokens
├── kb-search (retriever) — 80ms, 3 documents
└── generate-response (generation) — 1200ms, 700 tokens
Each span shows inputs, outputs, duration, tokens, cost, and errors — making it easy to pinpoint bottlenecks or failures.

Privacy controls

Control what data is sent to the dashboard:

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  // Don't send actual message content
  recordInputs: false,
  recordOutputs: false,
  // Still capture performance metrics
  recordRequestParams: true,
  enableCostEstimation: true,
  // Sample 50% of traffic in high-volume environments
  sampleRate: 0.5,
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

Graceful shutdown

Flush all pending events before your process exits:

process.on('SIGTERM', async () => {
  await KnowYourAI.getClient()?.flush(5000); // 5s timeout
  process.exit(0);
});