ractogateway.telemetry.tracer

RactoTracer — OpenTelemetry integration for RactoGateway.

Pass a RactoTracer instance as tracer= to any developer kit to automatically emit OTEL spans for every LLM call.

Requires: pip install ractogateway[telemetry]

Example:

from ractogateway import openai_developer_kit as opd
from ractogateway.telemetry import RactoTracer

tracer = RactoTracer(
    otlp_endpoint="http://localhost:4317",
    console=True,
)
kit = opd.OpenAIDeveloperKit(
    model="gpt-4o",
    default_prompt=my_prompt,  # my_prompt: a prompt object defined elsewhere
    tracer=tracer,
)
response = kit.chat(opd.ChatConfig(user_message="Hello"))
# A span named "llm.chat" is now in your OTEL backend.

class ractogateway.telemetry.tracer.RactoTracer(*, service_name='ractogateway', otlp_endpoint=None, otlp_http_endpoint=None, console=False, in_memory=False, custom_exporter=None, price_table=None)

Bases: object

OpenTelemetry tracer — pass as tracer= to any developer kit.

Records one span per LLM call with attributes for latency, token usage, estimated cost, cache-hit type, and tool-call count.

Supports OTLP gRPC (Jaeger / Grafana Tempo), OTLP HTTP, console stdout, in-memory capture (for tests), and any custom opentelemetry.sdk.trace.export.SpanExporter.

Parameters:
  • service_name (str) – OTEL service.name resource attribute. Defaults to "ractogateway".

  • otlp_endpoint (str | None) – OTLP gRPC endpoint (e.g. "http://localhost:4317"). Requires pip install ractogateway[telemetry].

  • otlp_http_endpoint (str | None) – OTLP HTTP endpoint (e.g. "http://localhost:4318"). Requires pip install ractogateway[telemetry].

  • console (bool) – Also print spans to stdout — convenient during local development.

  • in_memory (bool) – Capture spans internally in a thread-safe list. Access recorded spans via the spans property. Useful for unit tests — no external backend required.

  • custom_exporter (Any | None) – Any opentelemetry.sdk.trace.export.SpanExporter instance.

  • price_table (dict[str, ModelPricing] | None) – Override or extend the built-in DEFAULT_COST_TABLE. Keys are model identifiers; values are ModelPricing objects.

All spans carry the following OTEL attributes:

  • llm.provider – "openai" / "google" / "anthropic"

  • llm.model – e.g. "gpt-4o"

  • llm.operation – "chat" / "stream" / "embed"

  • llm.latency_ms – wall-clock time in milliseconds

  • llm.input_tokens – prompt tokens consumed

  • llm.output_tokens – completion tokens produced

  • llm.cost_usd – estimated USD cost (8 decimal places)

  • llm.cache_hit – "exact" / "semantic" / "miss"

  • llm.tool_calls – number of tool calls in the response

  • llm.error_type – exception class name on error (omitted on success)
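
To make the attribute list above concrete, here is a minimal sketch of how such an attribute dictionary could be assembled for one span. It is not RactoTracer's actual implementation: the Pricing class is a hypothetical stand-in for ModelPricing, its per-million-token field names are assumptions, and the prices in the usage note are made-up numbers. The sketch only illustrates two documented behaviours: the cost is rounded to 8 decimal places, and llm.error_type is omitted on success.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Pricing:
    """Hypothetical stand-in for ModelPricing; the field names are assumptions."""
    input_per_mtok: float   # USD per 1,000,000 prompt tokens
    output_per_mtok: float  # USD per 1,000,000 completion tokens


def build_attributes(
    *,
    provider: str,
    model: str,
    operation: str,
    latency_ms: float,
    input_tokens: int = 0,
    output_tokens: int = 0,
    cache_hit: str = "miss",
    tool_calls: int = 0,
    error_type: Optional[str] = None,
    prices: Optional[Dict[str, Pricing]] = None,
) -> dict:
    """Assemble the llm.* attributes for one span (illustrative only)."""
    pricing = (prices or {}).get(model)
    cost = 0.0
    if pricing is not None:
        # Unknown models fall through with a cost of 0.0 in this sketch.
        cost = (
            input_tokens * pricing.input_per_mtok
            + output_tokens * pricing.output_per_mtok
        ) / 1_000_000
    attrs = {
        "llm.provider": provider,
        "llm.model": model,
        "llm.operation": operation,
        "llm.latency_ms": latency_ms,
        "llm.input_tokens": input_tokens,
        "llm.output_tokens": output_tokens,
        "llm.cost_usd": round(cost, 8),
        "llm.cache_hit": cache_hit,
        "llm.tool_calls": tool_calls,
    }
    # llm.error_type is only attached when the call failed.
    if error_type is not None:
        attrs["llm.error_type"] = error_type
    return attrs
```

For example, build_attributes(provider="openai", model="gpt-4o", operation="chat", latency_ms=120.5, input_tokens=1000, output_tokens=500, prices={"gpt-4o": Pricing(2.0, 8.0)}) produces a dictionary with no "llm.error_type" key and a cost computed from the illustrative prices.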

record_chat_span(*, provider, model, latency_ms, input_tokens=0, output_tokens=0, cache_hit='miss', tool_calls=0, status='ok', error_type=None)

Record a completed chat or stream span.

Parameters:
  • provider (str) – Provider string ("openai", "google", "anthropic").

  • model (str) – Model identifier (e.g. "gpt-4o").

  • latency_ms (float) – Total wall-clock latency of the LLM call in milliseconds.

  • input_tokens (int) – Number of prompt tokens consumed (0 for cache hits).

  • output_tokens (int) – Number of completion tokens produced (0 for cache hits).

  • cache_hit (str) – "exact", "semantic", or "miss".

  • tool_calls (int) – Number of tool calls in the response.

  • status (str) – "ok" or "error".

  • error_type (str | None) – Exception class name when status == "error", else None.

Return type:

None
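
The developer kits call this method for you, but if you wrap a custom LLM client, a helper along the following lines could time the call and report it. This is a sketch under stated assumptions: StubTracer is a stand-in that only mirrors the keyword signature documented above so the example runs without the library, timed_chat is a hypothetical helper, and the "input_tokens" / "output_tokens" keys of the result are invented for illustration.

```python
import time
from typing import Any, Callable, Dict


class StubTracer:
    """Stand-in mirroring the documented record_chat_span keywords.

    Swap in a real RactoTracer instance in application code.
    """

    def __init__(self) -> None:
        self.calls = []

    def record_chat_span(self, *, provider, model, latency_ms, input_tokens=0,
                         output_tokens=0, cache_hit="miss", tool_calls=0,
                         status="ok", error_type=None) -> None:
        self.calls.append({
            "provider": provider, "model": model, "latency_ms": latency_ms,
            "input_tokens": input_tokens, "output_tokens": output_tokens,
            "cache_hit": cache_hit, "tool_calls": tool_calls,
            "status": status, "error_type": error_type,
        })


def timed_chat(tracer: Any, *, provider: str, model: str,
               call: Callable[[], Dict[str, int]]) -> Dict[str, int]:
    """Run call() and record exactly one span, on success or on failure."""
    start = time.perf_counter()
    try:
        result = call()
    except Exception as exc:
        # Report the failure with the exception class name, then re-raise.
        tracer.record_chat_span(
            provider=provider,
            model=model,
            latency_ms=(time.perf_counter() - start) * 1000.0,
            status="error",
            error_type=type(exc).__name__,
        )
        raise
    tracer.record_chat_span(
        provider=provider,
        model=model,
        latency_ms=(time.perf_counter() - start) * 1000.0,
        input_tokens=result["input_tokens"],    # hypothetical result keys
        output_tokens=result["output_tokens"],
    )
    return result
```

Recording the span outside the try block keeps an exception raised by the tracer itself from being misreported as an LLM failure.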

record_embed_span(*, provider, model, latency_ms, input_tokens=0, status='ok', error_type=None)

Record a completed embedding span.

Parameters:
  • provider (str) – Provider string ("openai" or "google").

  • model (str) – Embedding model identifier.

  • latency_ms (float) – Total wall-clock latency in milliseconds.

  • input_tokens (int) – Number of tokens embedded.

  • status (str) – "ok" or "error".

  • error_type (str | None) – Exception class name when status == "error", else None.

Return type:

None

property spans: list[SpanRecord]

Return all captured in-memory spans.

Only populated when in_memory=True. Thread-safe.

Returns:

list[SpanRecord] – Snapshot of all recorded spans (newest last).

clear_spans()

Clear all in-memory spans.

Has an effect only when in_memory=True.

Return type:

None
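
The "thread-safe snapshot" behaviour described for spans and clear_spans() can be pictured with the following sketch. It is an illustration of the general pattern (a lock around a list, with a copy returned on read), not RactoTracer's actual code; SpanBuffer, append, and record_from_threads are hypothetical names, and plain dictionaries stand in for SpanRecord.

```python
import threading


class SpanBuffer:
    """Illustrative thread-safe span store (not the library's implementation)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._items = []

    def append(self, record) -> None:
        with self._lock:
            self._items.append(record)

    @property
    def spans(self) -> list:
        # Return a copy so the caller's snapshot is unaffected by later
        # appends or clears; insertion order is kept, newest last.
        with self._lock:
            return list(self._items)

    def clear_spans(self) -> None:
        with self._lock:
            self._items.clear()


def record_from_threads(buffer: SpanBuffer, n_threads: int = 4,
                        per_thread: int = 100) -> None:
    """Append records concurrently from several threads, then wait for them."""
    def worker(thread_id: int) -> None:
        for seq in range(per_thread):
            buffer.append({"thread": thread_id, "seq": seq})

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

In a test suite, calling clear_spans() between tests (for instance in a fixture teardown) keeps spans recorded by one test from leaking into the assertions of the next.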