
Monitoring gives you complete visibility into your production AI system. Every model call, tool invocation, and agent step is captured and surfaced in real-time dashboards.

Setup (3 minutes)

1. Install

npm install @know-your-ai/node

2. Initialize

import * as KnowYourAI from '@know-your-ai/node';

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  environment: 'production',
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

3. Instrument your AI client

import { GoogleGenAI } from '@google/genai';

const genAI = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY! });
const client = KnowYourAI.instrumentGoogleGenAIClient(genAI);

// All calls through `client` are now automatically tracked

That’s it. Every AI call through the instrumented client is now captured.

What gets tracked

Request metrics

Every model call automatically captures:
  • Provider & model — Which AI provider and model was called
  • Operation — generateContent, sendMessage, generateContentStream, etc.
  • Duration — Total response time in milliseconds
  • Token usage — Input tokens, output tokens, cached tokens, reasoning tokens
  • Cost — Estimated cost per request based on model pricing
  • Streaming metrics — TTFB, tokens/sec, chunk count (for streaming calls)
  • Error details — Error type, status code, retryability
  • Tool calls — Function calls made by the model
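As a mental model only, the metrics above map onto a per-request payload. The field names below are inferred from the `onCapture` example later on this page; treat the exact shape as an assumption, not a published KnowYourAI type:

```typescript
// Hypothetical shape of a captured event, inferred from the onCapture
// fields shown on this page; not an official KnowYourAI type.
interface CapturedEvent {
  timestamp: number;                    // epoch milliseconds
  provider: string;                     // e.g. 'google-genai'
  model: string;                        // e.g. 'gemini-2.0-flash'
  operation: string;                    // e.g. 'sendMessage'
  duration?: number;                    // total response time in ms
  tokenUsage?: { inputTokens: number; outputTokens: number; totalTokens: number };
  cost?: { totalCost: number };         // estimated cost
  error?: { type: string; message?: string };
}

// Tiny helper: flatten an event into a one-line log entry.
function toLogLine(e: CapturedEvent): string {
  return [
    e.model,
    e.operation,
    `${e.duration ?? '?'}ms`,
    `${e.tokenUsage?.totalTokens ?? 0} tokens`,
  ].join(' | ');
}
```

A shape like this is what you would forward from `onCapture` into your own logging or analytics pipeline.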

Dashboard views

The monitoring dashboard provides:
  • Request volume — Total requests over time with error overlay
  • Token usage — Input/output/total tokens with daily/hourly granularity
  • Cost tracking — Running cost estimates by model and time period
  • Latency percentiles — p50, p95, p99 response times
  • Error rates — Error count and rate by type (rate limit, auth, timeout, etc.)
  • Provider & model split — Traffic distribution across providers and models
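The latency view is an aggregation over captured durations. As an illustrative sketch (not how the dashboard actually computes it), you could derive the same p50/p95/p99 percentiles yourself from durations collected in an `onCapture` hook:

```typescript
// Nearest-rank percentile over an already-sorted sample (illustrative only).
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Summarize a batch of request durations the way the latency view does.
function latencySummary(durationsMs: number[]) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}
```

To feed it, push `data.duration` into an array from `onCapture` and compute a summary per time window.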

Example: track a chatbot

import * as KnowYourAI from '@know-your-ai/node';
import { GoogleGenAI } from '@google/genai';

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  environment: 'production',
  release: '2.1.0',
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

const genAI = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY! });
const client = KnowYourAI.instrumentGoogleGenAIClient(genAI);

// Chat session — each message is tracked individually
const chat = client.chats.create({
  model: 'gemini-2.0-flash',
  history: [
    { role: 'user', parts: [{ text: 'You are a customer support agent.' }] },
  ],
});

// Each sendMessage is an individual tracked event
const response = await chat.sendMessage({ message: 'How do I reset my password?' });
console.log(response.text);

In the dashboard you’ll see:
  • Model: gemini-2.0-flash
  • Operation: sendMessage
  • Input/output tokens
  • Response time
  • Cost estimate

Example: track streaming responses

const stream = await client.models.generateContentStream({
  model: 'gemini-2.0-flash',
  contents: 'Write a detailed analysis of quantum computing trends.',
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? ''); // in @google/genai, `text` is a property, not a method
}

Streaming calls capture additional metrics:
  • TTFB — Time to first byte (how fast the first chunk arrives)
  • Throughput — Tokens per second
  • Chunk count — Number of streamed chunks
  • Avg chunk interval — Average time between chunks
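To make the definitions concrete, here is a small sketch of how these four metrics fall out of chunk arrival timestamps. This illustrates the arithmetic only; `computeStreamStats` and its fields are made up for this example and are not the SDK's internals:

```typescript
// Derived streaming metrics (illustrative; not the SDK's actual shape).
interface StreamStats {
  ttfbMs: number;            // time from request start to first chunk
  chunkCount: number;        // number of streamed chunks
  avgChunkIntervalMs: number; // average gap between consecutive chunks
  tokensPerSec: number;      // output throughput over the whole stream
}

function computeStreamStats(
  startMs: number,
  chunkTimesMs: number[],    // arrival time of each chunk, ascending
  totalTokens: number,
): StreamStats {
  const chunkCount = chunkTimesMs.length;
  const ttfbMs = chunkCount > 0 ? chunkTimesMs[0] - startMs : 0;
  const endMs = chunkCount > 0 ? chunkTimesMs[chunkCount - 1] : startMs;
  const totalMs = endMs - startMs;
  const avgChunkIntervalMs =
    chunkCount > 1 ? (endMs - chunkTimesMs[0]) / (chunkCount - 1) : 0;
  const tokensPerSec = totalMs > 0 ? (totalTokens / totalMs) * 1000 : 0;
  return { ttfbMs, chunkCount, avgChunkIntervalMs, tokensPerSec };
}
```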

Example: custom event callback

React to every captured event in your own code — log to your system, trigger alerts, or forward to your analytics pipeline:

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  onCapture: (data) => {
    // Log to your own system
    console.log(JSON.stringify({
      timestamp: data.timestamp,
      model: data.model,
      operation: data.operation,
      duration: data.duration,
      tokens: data.tokenUsage?.totalTokens,
      cost: data.cost?.totalCost,
      error: data.error?.type,
    }));

    // Alert on high latency
    if (data.duration && data.duration > 10000) {
      alertTeam(`Slow AI response: ${data.model} took ${data.duration}ms`);
    }

    // Alert on errors
    if (data.error) {
      alertTeam(`AI error: ${data.error.type}: ${data.error.message}`);
    }
  },
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

Tracing: visualize agent workflows

For multi-step AI workflows (agents, RAG pipelines, chains), tracing builds a span tree so you can see exactly what happened:

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  traceMode: true,
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

// Wrap your entire workflow in a trace
await KnowYourAI.withTrace(
  { name: 'customer-support', userId: 'user-123', sessionId: 'sess-456' },
  async () => {
    // Step 1: Classify the user intent
    const classification = await KnowYourAI.withGeneration('classify-intent', async (gen) => {
      gen.setModel('gemini-2.0-flash');
      const res = await client.models.generateContent({
        model: 'gemini-2.0-flash',
        contents: 'User says: I cannot log in',
      });
      gen.setUsage({ inputTokens: 50, outputTokens: 10 });
      return res.text;
    });

    // Step 2: Search knowledge base
    const retriever = KnowYourAI.startRetriever('kb-search', {});
    retriever.setQuery('login issues troubleshooting');
    const docs = await searchKnowledgeBase('login issues'); // your own retrieval function
    retriever.setDocuments(docs);
    retriever.end();

    // Step 3: Generate final response
    await KnowYourAI.withGeneration('generate-response', async (gen) => {
      gen.setModel('gemini-2.0-flash');
      const res = await client.models.generateContent({
        model: 'gemini-2.0-flash',
        contents: `Context: ${JSON.stringify(docs)}\nIntent: ${classification}\nGenerate helpful response.`,
      });
      gen.setUsage({ inputTokens: 500, outputTokens: 200 });
      return res.text;
    });
  }
);

In the tracing view you’ll see:
customer-support (trace)
├── classify-intent (generation) — 150ms, 60 tokens
├── kb-search (retriever) — 80ms, 3 documents
└── generate-response (generation) — 1200ms, 700 tokens
Each span shows inputs, outputs, duration, tokens, cost, and errors — making it easy to pinpoint bottlenecks or failures.

Privacy controls

Control what data is sent to the dashboard:

KnowYourAI.init({
  dsn: process.env.KNOW_YOUR_AI_DSN!,
  // Don't send actual message content
  recordInputs: false,
  recordOutputs: false,
  // Still capture performance metrics
  recordRequestParams: true,
  enableCostEstimation: true,
  // Sample 50% of traffic in high-volume environments
  sampleRate: 0.5,
  integrations: [KnowYourAI.googleGenAIIntegration()],
});

Graceful shutdown

Flush all pending events before your process exits:

process.on('SIGTERM', async () => {
  await KnowYourAI.getClient()?.flush(5000); // 5s timeout
  process.exit(0);
});