ractogateway.telemetry._pricing
Built-in model pricing table (USD per 1 million tokens).
Prices are approximate and may lag behind provider announcements.
Override or extend via price_table= on RactoTracer
and GatewayMetricsMiddleware.
- ractogateway.telemetry._pricing.DEFAULT_COST_TABLE: dict[str, ModelPricing] = {
    'claude-3-5-haiku-20241022': ModelPricing(input_per_million=0.8, output_per_million=4.0),
    'claude-3-5-sonnet-20241022': ModelPricing(input_per_million=3.0, output_per_million=15.0),
    'claude-3-haiku-20240307': ModelPricing(input_per_million=0.25, output_per_million=1.25),
    'claude-3-opus-20240229': ModelPricing(input_per_million=15.0, output_per_million=75.0),
    'claude-haiku-4-5-20251001': ModelPricing(input_per_million=0.8, output_per_million=4.0),
    'claude-opus-4-6': ModelPricing(input_per_million=15.0, output_per_million=75.0),
    'claude-sonnet-4-5-20250929': ModelPricing(input_per_million=3.0, output_per_million=15.0),
    'claude-sonnet-4-6': ModelPricing(input_per_million=3.0, output_per_million=15.0),
    'gemini-1.0-pro': ModelPricing(input_per_million=0.5, output_per_million=1.5),
    'gemini-1.5-flash': ModelPricing(input_per_million=0.075, output_per_million=0.3),
    'gemini-1.5-flash-8b': ModelPricing(input_per_million=0.0375, output_per_million=0.15),
    'gemini-1.5-pro': ModelPricing(input_per_million=1.25, output_per_million=5.0),
    'gemini-2.0-flash': ModelPricing(input_per_million=0.1, output_per_million=0.4),
    'gemini-2.0-flash-lite': ModelPricing(input_per_million=0.075, output_per_million=0.3),
    'gemini-2.5-pro': ModelPricing(input_per_million=1.25, output_per_million=10.0),
    'gpt-3.5-turbo': ModelPricing(input_per_million=0.5, output_per_million=1.5),
    'gpt-4': ModelPricing(input_per_million=30.0, output_per_million=60.0),
    'gpt-4-turbo': ModelPricing(input_per_million=10.0, output_per_million=30.0),
    'gpt-4o': ModelPricing(input_per_million=2.5, output_per_million=10.0),
    'gpt-4o-mini': ModelPricing(input_per_million=0.15, output_per_million=0.6),
    'o1': ModelPricing(input_per_million=15.0, output_per_million=60.0),
    'o1-mini': ModelPricing(input_per_million=3.0, output_per_million=12.0),
    'o3-mini': ModelPricing(input_per_million=1.1, output_per_million=4.4)
  }
Default pricing table. Keys are model identifiers as returned by the provider.
- ractogateway.telemetry._pricing.compute_cost(model, input_tokens, output_tokens, extra_table=None)
Compute the estimated USD cost for a single LLM call.
- Parameters:
  - model (str) – Model identifier (e.g. "gpt-4o"). If not found in the combined table, the function returns 0.0.
  - input_tokens (int) – Number of prompt tokens consumed.
  - output_tokens (int) – Number of completion tokens produced.
  - extra_table (dict[str, ModelPricing] | None) – Optional {model: ModelPricing} dict to override or extend DEFAULT_COST_TABLE. Extra entries win over defaults.
- Return type: float
- Returns: Estimated cost in USD, or 0.0 when the model is unknown.
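
The behaviour documented for compute_cost can be illustrated with a minimal sketch. This is not the library's source: the ModelPricing dataclass below is a stand-in that only mirrors the two field names visible in DEFAULT_COST_TABLE, the table holds just two of the documented entries, and sketch_compute_cost is a hypothetical name. It assumes the cost is simply each token count multiplied by its per-million rate, divided by one million.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Stand-in for the library's ModelPricing (assumed shape, USD per 1M tokens)."""

    input_per_million: float
    output_per_million: float


# Two entries copied from the documented DEFAULT_COST_TABLE, for illustration only.
DEFAULT_COST_TABLE: dict[str, ModelPricing] = {
    "gpt-4o": ModelPricing(input_per_million=2.5, output_per_million=10.0),
    "gpt-4o-mini": ModelPricing(input_per_million=0.15, output_per_million=0.6),
}


def sketch_compute_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    extra_table: dict[str, ModelPricing] | None = None,
) -> float:
    """Hypothetical re-implementation of the documented behaviour."""
    # Later keys win in a dict merge, so extra entries override the defaults.
    table = {**DEFAULT_COST_TABLE, **(extra_table or {})}
    pricing = table.get(model)
    if pricing is None:
        # Unknown model: the docs say the result is 0.0 rather than an error.
        return 0.0
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


# Known model priced from the defaults.
default_cost = sketch_compute_cost("gpt-4o", 1_000, 500)

# Same call with an override entry and an extra (made-up) model name.
overrides = {
    "gpt-4o": ModelPricing(input_per_million=5.0, output_per_million=20.0),
    "my-local-model": ModelPricing(input_per_million=0.0, output_per_million=0.0),
}
overridden_cost = sketch_compute_cost("gpt-4o", 1_000, 500, extra_table=overrides)

# A model in neither table falls through to 0.0.
unknown_cost = sketch_compute_cost("not-a-real-model", 1_000, 500)
```

Because an unknown model yields 0.0 instead of raising, a misspelled model identifier will silently under-report spend. Callers that care about accuracy may want to check that the identifier is present in the combined table before trusting the figure.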