ractogateway.telemetry._pricing

Built-in model pricing table (USD per 1 million tokens).

Prices are approximate and may lag behind provider announcements. To override or extend the table, pass price_table= to RactoTracer or GatewayMetricsMiddleware.

ractogateway.telemetry._pricing.DEFAULT_COST_TABLE: dict[str, ModelPricing] = {
    'claude-3-5-haiku-20241022': ModelPricing(input_per_million=0.8, output_per_million=4.0),
    'claude-3-5-sonnet-20241022': ModelPricing(input_per_million=3.0, output_per_million=15.0),
    'claude-3-haiku-20240307': ModelPricing(input_per_million=0.25, output_per_million=1.25),
    'claude-3-opus-20240229': ModelPricing(input_per_million=15.0, output_per_million=75.0),
    'claude-haiku-4-5-20251001': ModelPricing(input_per_million=0.8, output_per_million=4.0),
    'claude-opus-4-6': ModelPricing(input_per_million=15.0, output_per_million=75.0),
    'claude-sonnet-4-5-20250929': ModelPricing(input_per_million=3.0, output_per_million=15.0),
    'claude-sonnet-4-6': ModelPricing(input_per_million=3.0, output_per_million=15.0),
    'gemini-1.0-pro': ModelPricing(input_per_million=0.5, output_per_million=1.5),
    'gemini-1.5-flash': ModelPricing(input_per_million=0.075, output_per_million=0.3),
    'gemini-1.5-flash-8b': ModelPricing(input_per_million=0.0375, output_per_million=0.15),
    'gemini-1.5-pro': ModelPricing(input_per_million=1.25, output_per_million=5.0),
    'gemini-2.0-flash': ModelPricing(input_per_million=0.1, output_per_million=0.4),
    'gemini-2.0-flash-lite': ModelPricing(input_per_million=0.075, output_per_million=0.3),
    'gemini-2.5-pro': ModelPricing(input_per_million=1.25, output_per_million=10.0),
    'gpt-3.5-turbo': ModelPricing(input_per_million=0.5, output_per_million=1.5),
    'gpt-4': ModelPricing(input_per_million=30.0, output_per_million=60.0),
    'gpt-4-turbo': ModelPricing(input_per_million=10.0, output_per_million=30.0),
    'gpt-4o': ModelPricing(input_per_million=2.5, output_per_million=10.0),
    'gpt-4o-mini': ModelPricing(input_per_million=0.15, output_per_million=0.6),
    'o1': ModelPricing(input_per_million=15.0, output_per_million=60.0),
    'o1-mini': ModelPricing(input_per_million=3.0, output_per_million=12.0),
    'o3-mini': ModelPricing(input_per_million=1.1, output_per_million=4.4),
}

Default pricing table. Keys are model identifiers as returned by the provider.
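
As a rough illustration of what "USD per 1 million tokens" means in practice, the sketch below converts token counts into dollars using the two rates listed for 'gpt-4o' in the table above. The helper per_million_cost and the token counts are hypothetical and exist only for this example; the linear per-token arithmetic is an assumption about how such rates are normally applied, not a statement about the library's internals.

```python
# Rates copied from the 'gpt-4o' entry above (USD per 1 million tokens).
INPUT_PER_MILLION = 2.5
OUTPUT_PER_MILLION = 10.0


def per_million_cost(tokens: int, rate_per_million: float) -> float:
    """Hypothetical helper: convert a token count and a per-million rate to USD."""
    return tokens / 1_000_000 * rate_per_million


# Example call sizes chosen for illustration only.
input_cost = per_million_cost(1_200, INPUT_PER_MILLION)    # 1,200 prompt tokens
output_cost = per_million_cost(350, OUTPUT_PER_MILLION)    # 350 completion tokens
total = input_cost + output_cost

print(f"input=${input_cost:.6f} output=${output_cost:.6f} total=${total:.6f}")
```

Because the rates are quoted per million tokens, a single call of this size costs well under one cent, so estimates are usually worth accumulating over many calls before rounding.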

ractogateway.telemetry._pricing.compute_cost(model, input_tokens, output_tokens, extra_table=None)

Compute the estimated USD cost for a single LLM call.

Parameters:
  • model (str) – Model identifier (e.g. "gpt-4o"). If the model is not found in the combined table (DEFAULT_COST_TABLE plus extra_table), the function returns 0.0.

  • input_tokens (int) – Number of prompt tokens consumed.

  • output_tokens (int) – Number of completion tokens produced.

  • extra_table (dict[str, ModelPricing] | None) – Optional {model: ModelPricing} dict to override or extend DEFAULT_COST_TABLE. When a key appears in both tables, the entry in extra_table takes precedence.

Return type:

float

Returns:

float – Estimated cost in USD, or 0.0 when the model is unknown.
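
The documented behaviour (extra entries override defaults, unknown models return 0.0) can be sketched as follows. This is a minimal stand-alone approximation, not the library's source: the ModelPricing dataclass here is a stand-in with only the two fields shown in the table, compute_cost_sketch is a hypothetical name, the two-entry table is a trimmed copy for illustration, and the linear per-million formula is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Stand-in for the library's ModelPricing (assumed to hold just these fields)."""
    input_per_million: float
    output_per_million: float


# Two entries copied from DEFAULT_COST_TABLE; the real table has more.
DEFAULT_COST_TABLE: dict[str, ModelPricing] = {
    "gpt-4o": ModelPricing(input_per_million=2.5, output_per_million=10.0),
    "gpt-4o-mini": ModelPricing(input_per_million=0.15, output_per_million=0.6),
}


def compute_cost_sketch(
    model: str,
    input_tokens: int,
    output_tokens: int,
    extra_table: dict[str, ModelPricing] | None = None,
) -> float:
    """Approximate the documented contract of compute_cost (assumed formula)."""
    # Later keys win in a dict merge, so extra_table overrides the defaults.
    table = {**DEFAULT_COST_TABLE, **(extra_table or {})}
    pricing = table.get(model)
    if pricing is None:
        return 0.0  # documented behaviour for unknown models
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Usage: a default entry, an override, an extension, and an unknown model.
custom = {
    "gpt-4o": ModelPricing(input_per_million=5.0, output_per_million=20.0),
    "my-local-model": ModelPricing(input_per_million=0.0, output_per_million=0.0),
}
default_cost = compute_cost_sketch("gpt-4o", 1_000_000, 1_000_000)
override_cost = compute_cost_sketch("gpt-4o", 1_000_000, 1_000_000, extra_table=custom)
unknown_cost = compute_cost_sketch("not-a-model", 500, 500)
```

Note that a return value of 0.0 is ambiguous under this contract: it can mean either an unknown model or a model that is priced at zero (such as the hypothetical "my-local-model" entry), so callers who need to distinguish the two may want to check table membership themselves.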