ractogateway.openai_developer_kit

OpenAI Developer Kit — from ractogateway import openai_developer_kit as gpt.

Short usage:

from ractogateway import openai_developer_kit as gpt

kit = gpt.Chat(model="gpt-4o")          # short alias
kit = gpt.OpenAIDeveloperKit(model="gpt-4o")  # full name (same class)
class ractogateway.openai_developer_kit.BatchItem(**data)[source]

Bases: BaseModel

A single request within a batch job.

Parameters:
  • custom_id (str) – User-supplied identifier used to correlate results. Must be unique within a batch.

  • user_message (str) – The end-user’s query string (equivalent to ChatConfig.user_message).

  • temperature (float) – Sampling temperature. Defaults to 0.0.

  • max_tokens (int) – Maximum tokens for the completion. Defaults to 4096.

  • extra (dict[str, Any]) – Provider-specific pass-through kwargs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

custom_id: str
user_message: str
temperature: float
max_tokens: int
extra: dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.BatchJobInfo(**data)[source]

Bases: BaseModel

Metadata about a submitted batch job.

Returned by submit_batch() and poll_status().

job_id: str
provider: str
status: BatchStatus
created_at: float
request_count: int
raw: Any
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.BatchResult(**data)[source]

Bases: BaseModel

The outcome of a single BatchItem.

A result is always present in the results list returned by get_results(); check error to detect failures.

custom_id: str
response: LLMResponse | None
error: str | None
raw: Any
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

property ok: bool

True when the request succeeded (no error, response present).
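
A minimal sketch of how the ok flag can be used to separate successes from failures. The Result dataclass is a hypothetical stand-in that mirrors only the custom_id, response, and error fields; it is not the library's class, and the rule behind ok is assumed from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical stand-in mirroring a few BatchResult fields."""
    custom_id: str
    response: Optional[str] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        # Assumed rule: success means no error and a response is present.
        return self.error is None and self.response is not None


def partition(results):
    """Split results into (succeeded, failed) lists, preserving order."""
    succeeded = [r for r in results if r.ok]
    failed = [r for r in results if not r.ok]
    return succeeded, failed


results = [
    Result("req-1", response="4"),
    Result("req-2", error="rate limit exceeded"),
]
succeeded, failed = partition(results)
```

Because every submitted item is described as having a result entry, matching on custom_id after partitioning should be enough to decide which requests to resubmit.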

class ractogateway.openai_developer_kit.BatchStatus(*values)[source]

Bases: str, Enum

Processing state of a batch job.

Maps to the union of OpenAI and Anthropic batch status strings.

PENDING = 'pending'
IN_PROGRESS = 'in_progress'
FINALIZING = 'finalizing'
COMPLETED = 'completed'
FAILED = 'failed'
EXPIRED = 'expired'
CANCELLING = 'cancelling'
CANCELLED = 'cancelled'
ractogateway.openai_developer_kit.Chat

Short alias — gpt.Chat(model="gpt-4o") is identical to gpt.OpenAIDeveloperKit(...).

class ractogateway.openai_developer_kit.ChatConfig(**data)[source]

Bases: BaseModel

Validated input for every chat / achat / stream / astream call.

Pass a single ChatConfig to any developer-kit method. Every field has a safe default so you only need to supply what you actually need.

Minimal example:

config = ChatConfig(user_message="Explain Python generators.")
response = kit.chat(config)

Vision / multimodal example:

from ractogateway.prompts.engine import RactoFile

config = ChatConfig(
    user_message="Describe this chart.",
    attachments=[RactoFile.from_path("sales_q4.png")],
)

Structured JSON output example:

from pydantic import BaseModel

class Sentiment(BaseModel):
    label: str
    score: float

config = ChatConfig(
    user_message="I love this library!",
    response_model=Sentiment,
)

user_message: str
prompt: RactoPrompt | None
temperature: float
max_tokens: int
tools: ToolRegistry | None
auto_execute_tools: bool
max_tool_turns: int
response_model: type[BaseModel] | None
max_validation_retries: int
history: list[Message]
attachments: list[RactoFile] | None
chain_of_thought: bool
native_thinking: bool
thinking_budget: int
extra: dict[str, Any]
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.CostAwareRouter(tiers)[source]

Bases: object

Routes LLM requests to the appropriate model tier based on message complexity — without making any extra API calls.

Parameters:

tiers (list[RoutingTier]) – Ordered list of RoutingTier objects, sorted ascending by max_score (cheapest first). The last tier’s max_score should be 100 to act as fallback.

Raises:

ValueError – If tiers is empty or not sorted ascending by max_score.

Example (3-tier OpenAI ladder):

from ractogateway.routing import CostAwareRouter, RoutingTier

router = CostAwareRouter([
    RoutingTier(model="gpt-4o-mini", max_score=30),
    RoutingTier(model="gpt-4o", max_score=70),
    RoutingTier(model="o3-mini", max_score=100),
])

model = router.route("What is 2+2?")  # → "gpt-4o-mini"
model = router.route(
    "Analyze the trade-offs between Redis Cluster and "
    "Cassandra for a write-heavy time-series workload …"
)  # → "o3-mini"

Example (binary routing, 2 tiers):

router = CostAwareRouter([
    RoutingTier(model="claude-haiku-4-5-20251001", max_score=40),
    RoutingTier(model="claude-opus-4-6", max_score=100),
])

score(text)[source]

Compute a complexity score in [0, 100] for text.

A higher score means a more complex task.

Return type:

int

Algorithm

token_pts = min(len(text)//4, SAT) * (MAX_TP / SAT)
kw_pts    = min(matches * PPK, MAX_KP)
score     = clamp(token_pts + kw_pts, 0, 100)
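
The formula above can be sketched in a few lines. The constant values and the keyword list here are illustrative assumptions, since the page names SAT, MAX_TP, PPK, and MAX_KP without giving their values; only the shape of the calculation follows the documented algorithm.

```python
# All constants and the keyword list are illustrative assumptions;
# the library's actual values are not documented on this page.
SAT = 500       # token estimate at which the length component saturates
MAX_TP = 60     # maximum points contributed by length
PPK = 10        # points per keyword match
MAX_KP = 40     # maximum points contributed by keywords
KEYWORDS = ("analyze", "trade-off", "compare", "prove", "design")


def clamp(value, low, high):
    return max(low, min(value, high))


def score(text: str) -> int:
    est_tokens = len(text) // 4                      # rough token estimate
    token_pts = min(est_tokens, SAT) * (MAX_TP / SAT)
    lowered = text.lower()
    matches = sum(1 for kw in KEYWORDS if kw in lowered)
    kw_pts = min(matches * PPK, MAX_KP)
    return round(clamp(token_pts + kw_pts, 0, 100))
```

Both components are capped, so a very long prompt or a keyword-heavy prompt cannot push the result above the cap for its own component. With these assumed caps the total never exceeds 100 either.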

route(text)[source]

Return the model identifier for text.

Walks tiers (cheapest first) and returns the first model whose max_score >= complexity_score. Always returns a model because the last tier has max_score == 100 (validated at construction).

Complexity: O(k) where k = number of tiers.

Return type:

str
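
The tier walk described for route() can be sketched as follows. Tier is a hypothetical stand-in for RoutingTier, and route_by_score takes a precomputed score rather than raw text, so this illustrates the selection rule only, not the library's code.

```python
from typing import NamedTuple, Sequence


class Tier(NamedTuple):
    """Hypothetical stand-in for RoutingTier."""
    model: str
    max_score: float


def route_by_score(tiers: Sequence[Tier], complexity_score: float) -> str:
    # Tiers are assumed sorted ascending by max_score, cheapest first.
    for tier in tiers:
        if tier.max_score >= complexity_score:
            return tier.model
    # Not reached when the last tier has max_score == 100; kept as a guard.
    return tiers[-1].model


tiers = [Tier("gpt-4o-mini", 30), Tier("gpt-4o", 70), Tier("o3-mini", 100)]
```

The bound is inclusive, so in this sketch a score of exactly 30 still selects the cheapest tier.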

property tiers: tuple[RoutingTier, ...]

Immutable view of the configured tiers.

class ractogateway.openai_developer_kit.EmbeddingConfig(**data)[source]

Bases: BaseModel

Validated input for embed / aembed calls.

Example:

config = EmbeddingConfig(texts=["Hello world", "Goodbye world"])
response = kit.embed(config)

texts: list[str]
model: str | None
dimensions: int | None
extra: dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.EmbeddingResponse(**data)[source]

Bases: BaseModel

Unified response from an embedding call.

vectors: list[EmbeddingVector]
model: str
usage: dict[str, int]
raw: Any
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.EmbeddingVector(**data)[source]

Bases: BaseModel

A single embedding result.

index: int
text: str
embedding: list[float]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.ExactMatchCache(max_size=1024, ttl_seconds=None)[source]

Bases: object

Ultra-low-latency key-value cache for identical LLM requests.

Parameters:
  • max_size (int) – LRU capacity. 0 = unlimited (no eviction).

  • ttl_seconds (float | None) – Entries older than ttl_seconds are treated as misses and transparently evicted. None disables expiry.

Example:

from ractogateway.cache import ExactMatchCache
from ractogateway.openai_developer_kit import OpenAIDeveloperKit

cache = ExactMatchCache(max_size=512, ttl_seconds=3600)

# Wire into a kit:
kit = OpenAIDeveloperKit(model="gpt-4o", exact_cache=cache)

get(user_message, system_prompt, model, temperature, max_tokens)[source]

Return a cached response or None on a miss.

O(1) — dictionary lookup + optional move-to-end.

Return type:

LLMResponse | None

put(user_message, system_prompt, model, temperature, max_tokens, response)[source]

Store a response. Evicts LRU entry when at capacity.

O(1) amortised — dictionary insert + optional popitem(last=False).

Return type:

None

invalidate(user_message, system_prompt, model, temperature, max_tokens)[source]

Remove a specific entry. Returns True if it was present.

Return type:

bool

clear()[source]

Evict all cached entries and reset counters.

Return type:

None

property stats: CacheStats

Return a snapshot of hit/miss/size counters.
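
The O(1) get/put behaviour described above (dictionary lookup, move-to-end, popitem(last=False)) matches the usual OrderedDict-based LRU pattern. The class below is a minimal sketch of that pattern with optional TTL. TinyLRUCache, its single-key interface, and the injectable clock are assumptions made for illustration and are not the library's implementation.

```python
import time
from collections import OrderedDict


class TinyLRUCache:
    """Illustrative LRU + TTL cache; not the library's implementation."""

    def __init__(self, max_size=1024, ttl_seconds=None, clock=time.monotonic):
        self._data = OrderedDict()   # key -> (stored_at, value)
        self._max_size = max_size    # 0 means unlimited
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._ttl is not None and self._clock() - stored_at > self._ttl:
            del self._data[key]          # expired: evict and report a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock(), value)
        self._data.move_to_end(key)
        if self._max_size and len(self._data) > self._max_size:
            self._data.popitem(last=False)   # drop least recently used
```

A key for an LLM request could be a tuple such as (user_message, system_prompt, model, temperature, max_tokens), which mirrors the argument list of get() and put() above.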

class ractogateway.openai_developer_kit.FinishReason(*values)[source]

Bases: str, Enum

Why the model stopped generating.

STOP = 'stop'
TOOL_CALL = 'tool_call'
LENGTH = 'length'
CONTENT_FILTER = 'content_filter'
ERROR = 'error'
class ractogateway.openai_developer_kit.LLMResponse(**data)[source]

Bases: BaseModel

Unified, provider-agnostic response envelope.

Every adapter’s run() method returns one of these, regardless of whether the underlying provider is OpenAI, Gemini, or Anthropic.

content: str | None
thinking: str | None
parsed: dict[str, Any] | list[Any] | None
tool_calls: list[ToolCallResult]
finish_reason: FinishReason
usage: dict[str, int]
raw: Any
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.Message(**data)[source]

Bases: BaseModel

A single conversation turn.

Used inside ChatConfig.history to provide prior conversation context to the model for multi-turn conversations.

role: MessageRole
content: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.MessageRole(*values)[source]

Bases: str, Enum

Role of a single message in a conversation.

SYSTEM = 'system'
USER = 'user'
ASSISTANT = 'assistant'
class ractogateway.openai_developer_kit.OpenAIBatchProcessor(model='gpt-4o-mini', *, api_key=None, base_url=None, default_prompt=None)[source]

Bases: object

Submit thousands of chat-completion requests to OpenAI’s Batch API at ~50 % of standard API cost.

Parameters:
  • model (str) – Chat model to use for all items in a batch (e.g. "gpt-4o-mini").

  • api_key (str | None) – OpenAI API key. Falls back to OPENAI_API_KEY env var.

  • base_url (str | None) – Custom base URL (Azure OpenAI / proxy).

  • default_prompt (RactoPrompt | None) – RACTO prompt used as the system message for every batch item.

submit_batch / asubmit_batch:

Upload JSONL and create batch job → returns BatchJobInfo.

poll_status / apoll_status:

Fetch current job state → returns updated BatchJobInfo.

get_results / aget_results:

Download and parse completed job results → list[BatchResult].

submit_and_wait / asubmit_and_wait:

Convenience: submit + poll until done + return results.

provider: str = 'openai'
submit_batch(items, *, prompt=None, completion_window='24h')[source]

Upload items as a JSONL file and create an OpenAI batch job.

Returns immediately with a BatchJobInfo (status = IN_PROGRESS).

Return type:

BatchJobInfo

poll_status(job_id)[source]

Fetch the current status of batch job job_id.

Return type:

BatchJobInfo

get_results(job_id)[source]

Download and parse results for a completed batch job.

Raises:

RuntimeError – If the job is not yet completed.

Return type:

list[BatchResult]

submit_and_wait(items, *, prompt=None, completion_window='24h', poll_interval_s=60.0, max_wait_s=86400.0)[source]

Submit a batch and block until it completes, then return results.

Parameters:
  • poll_interval_s (float) – Seconds between status-poll API calls. Default 60.0.

  • max_wait_s (float) – Maximum total seconds to wait. Default 86400 (24 h).

Raises:
  • TimeoutError – If the batch does not complete within max_wait_s.

  • RuntimeError – If the batch job fails or is cancelled.

Return type:

list[BatchResult]
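
The submit-then-poll behaviour can be pictured as a loop with a deadline. This is a sketch under stated assumptions: poll is a hypothetical zero-argument callable returning a status string, and treating "expired" as a failure alongside "failed" and "cancelled" is an assumption, since the text above only mentions failed and cancelled jobs.

```python
import time


def wait_until_done(poll, *, poll_interval_s=60.0, max_wait_s=86400.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until a terminal status is seen or max_wait_s elapses.

    `poll` is a hypothetical callable returning a status string.
    """
    deadline = clock() + max_wait_s
    while True:
        status = poll()
        if status == "completed":
            return status
        # Assumption: these three statuses are treated as terminal failures.
        if status in ("failed", "expired", "cancelled"):
            raise RuntimeError(f"batch ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"batch not complete after {max_wait_s} s")
        sleep(poll_interval_s)
```

Injecting sleep and clock keeps the loop testable without real waiting. A production caller would normally leave both at their defaults.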

async asubmit_batch(items, *, prompt=None, completion_window='24h')[source]

Async variant of submit_batch().

Return type:

BatchJobInfo

async apoll_status(job_id)[source]

Async variant of poll_status().

Return type:

BatchJobInfo

async aget_results(job_id)[source]

Async variant of get_results().

Return type:

list[BatchResult]

async asubmit_and_wait(items, *, prompt=None, completion_window='24h', poll_interval_s=60.0, max_wait_s=86400.0)[source]

Async variant of submit_and_wait().

Return type:

list[BatchResult]

class ractogateway.openai_developer_kit.OpenAIDeveloperKit(model='gpt-4o', *, api_key=None, base_url=None, embedding_model='text-embedding-3-small', default_prompt=None, exact_cache=None, semantic_cache=None, router=None, truncator=None, tracer=None, metrics=None)[source]

Bases: object

Complete OpenAI developer kit — chat, stream, embeddings, and optional performance/cost optimisation middleware.

Parameters:
  • model (str) – Chat model (e.g. "gpt-4o", "gpt-4o-mini"). Use "auto" when a CostAwareRouter is provided — the router will select the model per-request.

  • api_key (str | None) – OpenAI API key. Falls back to OPENAI_API_KEY env var.

  • base_url (str | None) – Custom base URL (Azure OpenAI or proxy).

  • embedding_model (str) – Default embedding model. Defaults to "text-embedding-3-small".

  • default_prompt (RactoPrompt | None) – RACTO prompt used when ChatConfig.prompt is None.

  • exact_cache (ExactMatchCache | None) – Optional ExactMatchCache. Serves byte-identical requests from memory at zero cost.

  • semantic_cache (SemanticCache | None) – Optional SemanticCache. Returns cached answers for semantically similar queries (similarity ≥ threshold).

  • router (CostAwareRouter | None) – Optional CostAwareRouter. Selects the cheapest model that can handle each request’s complexity. Required when model="auto".

  • truncator (TokenTruncator | None) – Optional TokenTruncator. Automatically trims conversation history to fit the model’s context window before each API call.

  • tracer (RactoTracer | None) – Optional RactoTracer. Emits OpenTelemetry spans for every chat, stream, and embed call. Requires pip install ractogateway[telemetry].

  • metrics (GatewayMetricsMiddleware | None) – Optional GatewayMetricsMiddleware. Records Prometheus metrics (latency, tokens, cost, cache hit/miss). Requires pip install ractogateway[prometheus].

provider: str = 'openai'
chat(config)[source]

Synchronous chat completion with optional middleware pipeline.

Middleware order: truncate → exact cache → semantic cache → route model → API call → write caches → record telemetry.

Return type:

LLMResponse
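
To make the middleware order concrete, here is a heavily simplified sketch that covers only the exact-cache, routing, API-call, and cache-write steps. All three collaborators are hypothetical stand-ins (a plain dict and two callables). Truncation, the semantic cache, and telemetry are omitted, and none of this is the library's code.

```python
def chat_once(message, *, exact_cache, pick_model, call_api):
    """Return (reply, source), where source is 'cache' or the model used.

    exact_cache is a plain dict; pick_model and call_api are callables.
    All three are hypothetical stand-ins for the real middleware.
    """
    cached = exact_cache.get(message)      # cache lookup comes first
    if cached is not None:
        return cached, "cache"
    model = pick_model(message)            # route only on a cache miss
    reply = call_api(model, message)       # the only step that costs money
    exact_cache[message] = reply           # write back for the next call
    return reply, model
```

Placing the cache lookup before routing means a repeated request skips both the scoring work and the API call.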

async achat(config)[source]

Async chat completion with optional middleware pipeline.

Return type:

LLMResponse

stream(config)[source]

Synchronous streaming — yields StreamChunk objects.

Example:

for chunk in kit.stream(config):
    print(chunk.delta.text, end="", flush=True)
    if chunk.is_final:
        print(f"\nTokens: {chunk.usage}")

Return type:

Iterator[StreamChunk]

async astream(config)[source]

Async streaming — yields StreamChunk objects.

Return type:

AsyncIterator[StreamChunk]

embed(config)[source]

Synchronous embedding.

Return type:

EmbeddingResponse

async aembed(config)[source]

Async embedding.

Return type:

EmbeddingResponse

class ractogateway.openai_developer_kit.RoutingTier(**data)[source]

Bases: BaseModel

One tier in the cost-aware routing ladder.

The router evaluates a complexity score (0-100) for each incoming message and selects the first tier whose max_score is >= that score. The last tier in the list always acts as the catch-all fallback.

Parameters:
  • model (str) – The LLM model identifier to use for requests that fall in this tier (e.g. "gpt-4o-mini", "gemini-2.0-flash", "claude-haiku-4-5-20251001").

  • max_score (float) – Inclusive upper bound on the complexity score that routes to this model. Range: 0-100. Set to 100 for the last (most powerful) tier so it catches everything.

Examples

tiers = [
    RoutingTier(model="gpt-4o-mini",  max_score=30),
    RoutingTier(model="gpt-4o",        max_score=70),
    RoutingTier(model="o3-mini",        max_score=100),
]

model: str
max_score: float
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.SemanticCache(embed_fn, similarity_threshold=0.95, max_size=512, ttl_seconds=None)[source]

Bases: object

Vector-similarity cache — returns cached answers for semantically similar queries, costing $0 in API calls.

Parameters:
  • embed_fn (Callable[[str], list[float]]) – Any callable (text: str) -> list[float]. Called once per new query (cache miss) and once at put() time.

  • similarity_threshold (float) – Minimum cosine similarity to declare a hit. Default 0.95 is intentionally strict to avoid incorrect responses.

  • max_size (int) – Maximum number of entries (LRU eviction). 0 = unlimited.

  • ttl_seconds (float | None) – Optional per-entry TTL. None disables expiry.

Examples

import openai
import ractogateway.openai_developer_kit as gpt

client = openai.OpenAI()

def embed(text: str) -> list[float]:
    # One embeddings API call per text.
    r = client.embeddings.create(
        model="text-embedding-3-small", input=text
    )
    return r.data[0].embedding

cache = gpt.SemanticCache(embed_fn=embed, similarity_threshold=0.95)
kit = gpt.OpenAIDeveloperKit(model="gpt-4o", semantic_cache=cache)

get(query)[source]

Embed query and return a cached response if cosine-sim ≥ threshold.

Returns None on a cache miss (caller should make the real API call and then invoke put()).

Complexity: O(n·d) where n = number of entries, d = embedding dim.

Return type:

LLMResponse | None
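
The O(n·d) cost comes from comparing the query embedding against every stored embedding. A minimal sketch of such a linear scan follows. The entries layout (a list of (vector, response) pairs) and the best_match helper are assumptions made for illustration, not the library's internals.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0                      # treat zero vectors as dissimilar
    return dot / (norm_a * norm_b)


def best_match(query_vec, entries, threshold=0.95):
    """Return the response of the most similar entry, or None on a miss.

    entries is a list of (vector, response) pairs (hypothetical layout).
    """
    best_response, best_sim = None, threshold
    for vec, response in entries:       # n entries, d multiplications each
        sim = cosine_similarity(query_vec, vec)
        if sim >= best_sim:
            best_response, best_sim = response, sim
    return best_response
```

With a strict threshold such as 0.95, only near-paraphrases of a stored query should count as hits, which is the trade-off the default is described as making.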

put(query, response)[source]

Embed query and store response for future similar queries.

Evicts LRU entry when at capacity.

Return type:

None

clear()[source]

Remove all entries and reset counters.

Return type:

None

property stats: CacheStats

Return a snapshot of hit/miss/size counters.

class ractogateway.openai_developer_kit.StreamChunk(**data)[source]

Bases: BaseModel

A single piece of a streaming response.

Consumers iterate over StreamChunk objects — they never touch raw provider events directly.

delta

The incremental content for this chunk.

accumulated_text

Running concatenation of all delta.text values so far.

finish_reason

None for intermediate chunks; set on the final chunk.

tool_calls

Empty until the final chunk (is_final=True).

usage

Token counts — populated on the final chunk only.

is_final

True only for the very last chunk in the stream.

raw

The underlying provider event (escape-hatch for advanced users).
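
The relationship between delta, accumulated_text, and is_final can be illustrated with a toy generator. Chunk is a hypothetical, flattened stand-in for StreamChunk (it stores the delta text directly), so this only shows the invariants described above.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical, flattened stand-in for StreamChunk."""
    delta_text: str
    accumulated_text: str
    is_final: bool


def to_chunks(pieces):
    """Yield one Chunk per text piece, tracking the running concatenation."""
    pieces = list(pieces)
    accumulated = ""
    for i, piece in enumerate(pieces):
        accumulated += piece
        yield Chunk(piece, accumulated, is_final=(i == len(pieces) - 1))


chunks = list(to_chunks(["Hel", "lo", "!"]))
```

On the final chunk, accumulated_text holds the complete reply, so a consumer that only needs the full text can ignore the intermediate deltas.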

delta: StreamDelta
accumulated_text: str
accumulated_thinking: str
is_thinking: bool
finish_reason: FinishReason | None
tool_calls: list[ToolCallResult]
usage: dict[str, int]
is_final: bool
parsed: dict[str, Any] | list[Any] | None
raw: Any
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.StreamDelta(**data)[source]

Bases: BaseModel

Incremental content produced by a single streaming event.

text: str
thinking: str
tool_call_id: str | None
tool_call_name: str | None
tool_call_args_fragment: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.ToolCallResult(**data)[source]

Bases: BaseModel

A single tool/function call returned by the model.

id: str
name: str
arguments: dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ractogateway.openai_developer_kit.TokenTruncator(config=None)[source]

Bases: object

Smart conversation-history trimmer.

Parameters:

config (TruncationConfig | None) – TruncationConfig instance. If omitted, a default config is used (approximate counter, 8 k limit).

Examples

import tiktoken

from ractogateway.openai_developer_kit import OpenAIDeveloperKit
from ractogateway.truncation import TokenTruncator, TruncationConfig

# Count tokens exactly with tiktoken instead of the approximate default.
enc = tiktoken.encoding_for_model("gpt-4o")
truncator = TokenTruncator(
    TruncationConfig(
        token_counter=lambda t: len(enc.encode(t)),
        keep_first_n=2,
        keep_last_n=8,
    )
)
kit = OpenAIDeveloperKit(model="gpt-4o", truncator=truncator)

truncate(chat_config, model)[source]

Return a copy of chat_config with trimmed history if necessary.

If the total estimated token count (system prompt + history + user_message) fits within the model’s context limit, the original ChatConfig is returned unchanged.

Parameters:
  • chat_config (ChatConfig) – The chat configuration to potentially truncate.

  • model (str) – The resolved model name used to look up the context-window limit.

Return type:

ChatConfig

Returns:

ChatConfig – A new ChatConfig instance with (possibly shorter) history. The user_message and all other fields are preserved verbatim.
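
One plausible way to trim history while honouring keep_first_n and keep_last_n is to drop messages from the middle, oldest first, until the total fits. The sketch below works on plain strings. The trim_history helper, its budget_tokens parameter, and the drop order are assumptions, since this page does not spell out the exact strategy.

```python
def trim_history(history, *, keep_first_n=2, keep_last_n=6,
                 budget_tokens, count=lambda text: len(text) // 4):
    """Drop middle messages (oldest first) until history fits the budget.

    The first keep_first_n and last keep_last_n messages are never dropped.
    The drop order is an assumption made for this sketch.
    """
    if keep_first_n + keep_last_n >= len(history):
        return list(history)            # nothing is eligible for removal
    split = len(history) - keep_last_n
    head = list(history[:keep_first_n])
    middle = list(history[keep_first_n:split])
    tail = list(history[split:])
    total = sum(count(m) for m in history)
    while middle and total > budget_tokens:
        total -= count(middle.pop(0))   # remove the oldest unprotected message
    return head + middle + tail
```

The input list is never mutated, in line with the description of truncate() returning a copy rather than editing the original configuration.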

estimate_tokens(text)[source]

Convenience wrapper around the configured token counter.

Return type:

int

class ractogateway.openai_developer_kit.TruncationConfig(**data)[source]

Bases: BaseModel

Configuration for TokenTruncator.

Parameters:
  • max_context_tokens (int | None) – Hard cap on total prompt tokens before calling the API. When None, the truncator looks up the model in MODEL_CONTEXT_LIMITS (falling back to 8 192).

  • keep_first_n (int) – Number of history messages to always preserve from the start of the conversation (anchors context). Defaults to 2.

  • keep_last_n (int) – Number of history messages to always preserve from the most recent end of the conversation. Defaults to 6.

  • token_counter (Callable[[str], int]) –

    Callable (text: str) -> int. Defaults to the built-in approximate counter (len // 4). Swap for tiktoken for exact OpenAI token counts:

    import tiktoken
    enc = tiktoken.encoding_for_model("gpt-4o")
    config = TruncationConfig(token_counter=lambda t: len(enc.encode(t)))
    

  • safety_margin (int) – Extra token budget reserved beyond the system prompt and user message. Defaults to 512.

max_context_tokens: int | None
keep_first_n: int
keep_last_n: int
token_counter: Callable[[str], int]
safety_margin: int
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resolve_limit(model)[source]

Return the effective token limit for model.

Priority: max_context_tokens → MODEL_CONTEXT_LIMITS lookup → _DEFAULT_CONTEXT.

Return type:

int
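
The priority chain can be sketched as a short function. The table contents are illustrative assumptions (only a single entry is shown), and the 8 192 fallback is taken from the max_context_tokens description above. The free-function form is a simplification of the method.

```python
# Illustrative table with one entry; the library's real table is not shown
# on this page.
MODEL_CONTEXT_LIMITS = {"gpt-4o": 128_000}
_DEFAULT_CONTEXT = 8_192


def resolve_limit(model, max_context_tokens=None):
    # 1. An explicit cap always wins.
    if max_context_tokens is not None:
        return max_context_tokens
    # 2. Otherwise use the per-model table, 3. then the default.
    return MODEL_CONTEXT_LIMITS.get(model, _DEFAULT_CONTEXT)
```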

model_post_init(_TruncationConfig__context)[source]

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

Return type:

None