Spans and Traces

OpenTelemetry spans for distributed tracing and observability in the Inkeep Agent Framework

Overview

The Inkeep Agent Framework uses OpenTelemetry for distributed tracing and observability. Spans provide detailed visibility into the execution flow of agents, context resolution, tool execution, and other framework operations.

Getting Started with Spans

1. Import Required Dependencies

import { SpanStatusCode, type Span } from "@opentelemetry/api";
import { getTracer, setSpanWithError, unwrapError } from "../tracer";

2. Get the Tracer

// Use the centralized tracer utility
const tracer = getTracer("your-service-name");

Creating and Using Spans

return tracer.startActiveSpan(
  "context.resolve",
  {
    attributes: {
      "context.config_id": contextConfig.id,
      "context.trigger_event": options.triggerEvent,
    },
  },
  async (span: Span) => {
    try {
      // Your operation logic here
      return result;
    } catch (error) {
      // Use setSpanWithError for consistent error handling
      setSpanWithError(span, error);
      throw error;
    } finally {
      // startActiveSpan does not end the span for you; end it so it is exported
      span.end();
    }
  }
);
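
A span started with startActiveSpan stays open until span.end() is called, so one option is to wrap the pattern in a small helper that ends the span on every code path. The sketch below is illustrative only: withSpan, SpanLike, and TracerLike are hypothetical names, and the two interfaces are local stand-ins so the snippet runs without dependencies. In real code you would use the Span and Tracer types from "@opentelemetry/api".

```typescript
// Local stand-ins for the OpenTelemetry types (assumption: only the members
// used below are modeled).
interface SpanLike {
  end(): void;
}

interface TracerLike {
  startActiveSpan<T>(name: string, fn: (span: SpanLike) => T): T;
}

// Hypothetical helper: runs `fn` inside an active span and always ends it.
async function withSpan<T>(
  tracer: TracerLike,
  name: string,
  fn: (span: SpanLike) => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      return await fn(span);
    } finally {
      // Runs after success and after a thrown error alike.
      span.end();
    }
  });
}
```

Errors still propagate to the caller. A catch block that calls setSpanWithError would sit before the finally, as in the example above.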

Setting Span Attributes

Basic Attributes

span.setAttributes({
  "user.id": userId,
  "request.method": "POST",
});
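
The attribute keys in this guide share a dotted prefix ("context.config_id", "agent.id"). If you build attributes from optional values, a small helper can apply the prefix and skip missing values. This is a sketch under assumptions: namespacedAttributes is a hypothetical name, not part of the framework, and the value type is narrowed to primitives for brevity.

```typescript
// Simplified attribute value type (assumption: primitives only).
type AttributeValue = string | number | boolean;

// Hypothetical helper: prefixes each key with a namespace and drops
// null/undefined values so they never reach the span.
function namespacedAttributes(
  namespace: string,
  values: Record<string, AttributeValue | null | undefined>,
): Record<string, AttributeValue> {
  const attributes: Record<string, AttributeValue> = {};
  for (const [key, value] of Object.entries(values)) {
    if (value === undefined || value === null) continue;
    attributes[`${namespace}.${key}`] = value;
  }
  return attributes;
}
```

The result can be passed straight to span.setAttributes(...). Falsy but meaningful values such as 0 or false are kept.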

Adding Events to Spans

Recording Important Milestones

// Add events for significant operations
span.addEvent("context.fetch_started", {
  definitionId: definition.id,
  url: definition.fetchConfig.url,
});

Error Events

span.addEvent("error.validation_failed", {
  definitionId: definition.id,
  error_type: "json_schema_validation",
  error_details: errorMessage,
});

Error Handling and Status

Using setSpanWithError Utility

The framework provides a convenient setSpanWithError utility function that handles error recording and status setting:

try {
  // Your operation
} catch (error) {
  // Use the setSpanWithError utility for consistent error handling
  setSpanWithError(span, error);
  throw error;
}
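
For orientation, "error recording and status setting" in OpenTelemetry usually means calling recordException and setStatus on the span. The sketch below shows that pair of calls. It is not the framework's implementation: markSpanFailed and SpanLike are hypothetical names, and SpanStatusCode is redefined locally (with the same numeric values as "@opentelemetry/api") so the snippet runs without dependencies.

```typescript
// Local stand-ins mirroring the parts of "@opentelemetry/api" used here.
const SpanStatusCode = { UNSET: 0, OK: 1, ERROR: 2 } as const;

interface SpanLike {
  recordException(exception: Error | string): void;
  setStatus(status: { code: number; message?: string }): void;
}

// Hypothetical helper; the real setSpanWithError may do more
// (for example, unwrapping error causes before recording).
function markSpanFailed(span: SpanLike, error: unknown): void {
  const exception = error instanceof Error ? error : String(error);
  const message = error instanceof Error ? error.message : String(error);
  // Attach the exception as a span event.
  span.recordException(exception);
  // Mark the span itself as failed so it shows up in error queries.
  span.setStatus({ code: SpanStatusCode.ERROR, message });
}
```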

Error Unwrapping

The setSpanWithError utility automatically unwraps nested error cause chains to surface root cause errors. This is particularly useful because Node.js fetch() wraps underlying errors (like undici's HeadersTimeoutError) in generic TypeError: fetch failed messages.

When you pass an error to setSpanWithError, it traverses the .cause chain to find and record the original root cause error, making debugging significantly easier.
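
To make the traversal concrete, here is a minimal sketch of walking a .cause chain. It is an assumption about the general technique, not the framework's unwrapError: findRootCause is a hypothetical name, and the cycle guard is an extra precaution that the real utility may or may not include.

```typescript
// Hypothetical stand-in for illustration; not the framework's unwrapError.
function findRootCause(error: unknown): unknown {
  const seen = new Set<unknown>(); // guards against circular cause chains
  let current: unknown = error;
  while (
    typeof current === "object" &&
    current !== null &&
    "cause" in current &&
    !seen.has(current)
  ) {
    seen.add(current);
    const next = (current as { cause?: unknown }).cause;
    if (next === undefined || next === null) break;
    current = next;
  }
  return current;
}

// Simulate what fetch() does: a generic TypeError wrapping the real failure.
const timeout = new Error("Headers Timeout Error");
timeout.name = "HeadersTimeoutError";
const wrapper: Error & { cause?: unknown } = new TypeError("fetch failed");
wrapper.cause = timeout;

const root = findRootCause(wrapper); // the HeadersTimeoutError instance
```

Values without a cause, including non-Error values, are returned unchanged.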

Using unwrapError Directly

For cases where you need to unwrap errors outside of span handling, use the unwrapError utility directly:

import { unwrapError } from "@inkeep/agents-core";

try {
  await fetch(url);
} catch (error) {
  const rootCause = unwrapError(error);
  // rootCause will be the underlying error (e.g., HeadersTimeoutError)
  // instead of the generic "fetch failed" wrapper
  console.error("Root cause:", rootCause.message);
  throw rootCause;
}

Best Practices

1. Consistent Naming Convention

The span naming convention follows a hierarchical structure that mirrors your code organization.

// Format: 'class.function'
// Use descriptive span names that follow a hierarchical structure

// Agent operations
"agent.generate";
"agent.tool_execute";
"agent.transfer";

// Context operations
"context.resolve";
"context.fetch";

Naming Rules

  1. Class First: Start with the class/module name (e.g., agent, context, tool)
  2. Function Second: Follow with the specific function/method (e.g., generate, resolve)
  3. Use Underscores: For multi-word functions, use underscores (e.g., tool_execute, cache_lookup)
  4. Consistent Casing: Use lowercase with underscores for consistency
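
These rules can be checked mechanically, for example in a unit test or lint step. The sketch below is one possible encoding: isValidSpanName and SPAN_NAME_PATTERN are hypothetical names, and allowing digits after the first letter of a segment is an assumption the rules above do not address.

```typescript
// Two lowercase segments separated by one dot; words within a segment are
// joined by single underscores. Digits are allowed after the first letter
// (assumption).
const SPAN_NAME_PATTERN =
  /^[a-z][a-z0-9]*(?:_[a-z0-9]+)*\.[a-z][a-z0-9]*(?:_[a-z0-9]+)*$/;

// Hypothetical helper: true when `name` follows the 'class.function' format.
function isValidSpanName(name: string): boolean {
  return SPAN_NAME_PATTERN.test(name);
}

const accepted = ["agent.generate", "agent.tool_execute", "context.resolve"];
const rejected = ["Agent.generate", "agent.toolExecute", "agent", "agent.tool-execute"];
```

Every name in accepted passes the check, and every name in rejected fails it.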

Configuration and Setup

Environment Variables

# OpenTelemetry configuration
# Port 4317 is the OTLP/gRPC default; OTLP/HTTP collectors usually listen on 4318
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4317

Instrumentation Setup

The framework automatically sets up OpenTelemetry instrumentation in src/instrumentation.ts.

Examples in the Codebase

Agent Operations

See src/agents/Agent.ts for span usage in agent generation and tool execution.

Example from Agent.ts:

// Class: Agent
// Function: generate
return tracer.startActiveSpan(
  "agent.generate",
  {
    attributes: {
      "agent.id": this.id,
      "agent.name": this.name,
    },
  },
  async (span: Span) => {
    // ... implementation
  }
);

Summary

Spans provide powerful observability into your Inkeep Agent Framework operations. Follow these patterns:

  1. Use getTracer() for consistent tracing
  2. Use consistent 'class.function' span names
  3. Set meaningful attributes for searchability
  4. Handle errors with setSpanWithError, which automatically unwraps error cause chains to surface root cause errors
  5. Use startActiveSpan so nested operations attach to the current span, and call span.end() when the work finishes

Together, these patterns give you comprehensive visibility into your agent operations, which makes debugging and performance optimization much easier.