Native Thinking API Reference
ChatConfig fields
- class ractogateway._models.chat.ChatConfig(**data)
Bases: `BaseModel`
Validated input for every `chat` / `achat` / `stream` / `astream` call.
Pass a single `ChatConfig` to any developer-kit method. Every field has a safe default so you only need to supply what you actually need.
Minimal example:
    config = ChatConfig(user_message="Explain Python generators.")
    response = kit.chat(config)
Vision / multimodal example:
    from ractogateway.prompts.engine import RactoFile

    config = ChatConfig(
        user_message="Describe this chart.",
        attachments=[RactoFile.from_path("sales_q4.png")],
    )
Structured JSON output example:
    from pydantic import BaseModel

    # Schema the response is validated against.
    class Sentiment(BaseModel):
        label: str
        score: float

    config = ChatConfig(
        user_message="I love this library!",
        response_model=Sentiment,
    )
Create a new model by parsing and validating input data from keyword arguments.
Raises `pydantic_core.ValidationError` if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- user_message: str
- prompt: RactoPrompt | None
- temperature: float
- max_tokens: int
- tools: ToolRegistry | None
- auto_execute_tools: bool
- max_tool_turns: int
- max_validation_retries: int
- history: list[Message]
- chain_of_thought: bool
- native_thinking: bool
- thinking_budget: int
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
Configuration for the model, should be a dictionary conforming to `pydantic.config.ConfigDict`.
StreamDelta
- class ractogateway._models.stream.StreamDelta(**data)
Bases: `BaseModel`
Incremental content produced by a single streaming event.
- text: str
- thinking: str
- model_config: ClassVar[ConfigDict] = {}
| Field | Type | Description |
|---|---|---|
| `text` | `str` | Answer token delta (same as always) |
| `thinking` | `str` | Reasoning token delta (non-empty only on thinking chunks) |
StreamChunk
- class ractogateway._models.stream.StreamChunk(**data)
Bases: `BaseModel`
A single piece of a streaming response.
Consumers iterate over `StreamChunk` objects; they never touch raw provider events directly.
- `delta`: The incremental content for this chunk.
- `accumulated_text`: Running concatenation of all `delta.text` values so far.
- `finish_reason`: `None` for intermediate chunks; set on the final chunk.
- `tool_calls`: Empty until the final chunk (`is_final=True`).
- `usage`: Token counts, populated on the final chunk only.
- `is_final`: `True` only for the very last chunk in the stream.
- `raw`: The underlying provider event (escape hatch for advanced users).
- delta: StreamDelta
- accumulated_text: str
- accumulated_thinking: str
- is_thinking: bool
- finish_reason: FinishReason | None
- tool_calls: list[ToolCallResult]
- is_final: bool
- raw: Any
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
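| Field | Type | Description |
|---|---|---|
| `delta` | `StreamDelta` | Incremental content for this chunk |
| `accumulated_text` | `str` | All answer text received so far |
| `accumulated_thinking` | `str` | All reasoning text received so far |
| `is_thinking` | `bool` | `True` on chunks that carry reasoning content |
| `usage` | | Token counts on the final chunk |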
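As a rough illustration of how the per-chunk fields relate to each other, the sketch below consumes a simulated stream and separates reasoning text from answer text. `Delta`, `Chunk`, `fake_stream`, and `collect` are hypothetical stand-ins written for this example; they only mirror the field names documented above and are not the library's own classes or streaming logic.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Delta:
    """Hypothetical stand-in for StreamDelta (text and thinking only)."""
    text: str = ""
    thinking: str = ""


@dataclass
class Chunk:
    """Hypothetical stand-in for StreamChunk, limited to the fields used here."""
    delta: Delta
    accumulated_text: str = ""
    accumulated_thinking: str = ""
    is_thinking: bool = False
    is_final: bool = False


def fake_stream(events: Iterable[tuple[str, str]]) -> Iterator[Chunk]:
    """Yield chunks from (kind, fragment) pairs; kind is "thinking" or "text"."""
    events = list(events)
    text, thinking = "", ""
    for index, (kind, fragment) in enumerate(events):
        if kind == "thinking":
            thinking += fragment
            delta = Delta(thinking=fragment)
        else:
            text += fragment
            delta = Delta(text=fragment)
        yield Chunk(
            delta=delta,
            accumulated_text=text,
            accumulated_thinking=thinking,
            is_thinking=(kind == "thinking"),
            is_final=(index == len(events) - 1),
        )


def collect(stream: Iterable[Chunk]) -> tuple[str, str]:
    """Return (reasoning, answer) by concatenating the per-chunk deltas."""
    reasoning_parts: list[str] = []
    answer_parts: list[str] = []
    for chunk in stream:
        if chunk.delta.thinking:
            reasoning_parts.append(chunk.delta.thinking)
        if chunk.delta.text:
            answer_parts.append(chunk.delta.text)
    return "".join(reasoning_parts), "".join(answer_parts)
```

With real chunks, the same loop shape should apply: the accumulated fields on the final chunk are expected to equal the concatenation of the deltas, so a consumer can rely on either one.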
LLMResponse
- class ractogateway.adapters.base.LLMResponse(**data)
Bases: `BaseModel`
Unified, provider-agnostic response envelope.
Every adapter’s `run()` method returns one of these, regardless of whether the underlying provider is OpenAI, Gemini, or Anthropic.
- tool_calls: list[ToolCallResult]
- finish_reason: FinishReason
- raw: Any
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
| Field | Type | Description |
|---|---|---|
| | | Final answer text |
| `thinking` | | Complete reasoning text (Anthropic / Google) |
| `usage` | | Token counts; OpenAI o-series adds `reasoning_tokens` |
Provider behaviour summary
Anthropic
- Supported models: `claude-3-7-sonnet-20250219` and later.
- API param injected: `thinking={"type": "enabled", "budget_tokens": N}`.
- Temperature is automatically forced to `1` (API requirement; user value is ignored).
- Stream events: `thinking_delta` events carry `delta.thinking`; `text_delta` events carry `delta.text`. Both can interleave in the same stream.
- Non-streaming: `LLMResponse.thinking` contains the complete reasoning block.
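The parameter injection and temperature override described for Anthropic can be pictured with a small helper. This is a minimal sketch under assumptions: `build_thinking_kwargs` is a hypothetical function written for illustration, and the adapter's real internals may be organised differently.

```python
from typing import Any


def build_thinking_kwargs(
    native_thinking: bool, thinking_budget: int, temperature: float
) -> dict[str, Any]:
    """Return request keyword arguments for a thinking-capable call.

    Hypothetical helper: it mirrors the documented behaviour only.
    """
    kwargs: dict[str, Any] = {"temperature": temperature}
    if native_thinking:
        kwargs["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
        # Documented API requirement: temperature is forced to 1, so the
        # caller's value is discarded.
        kwargs["temperature"] = 1
    return kwargs
```

For example, `build_thinking_kwargs(True, 2048, 0.2)` yields a `thinking` entry with `budget_tokens` set to 2048 and a temperature of 1, while the same call with `native_thinking=False` passes the temperature through untouched.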
Google
- Supported models: `gemini-2.5-pro`, `gemini-2.0-flash-thinking-exp`.
- API param injected: `ThinkingConfig(thinking_budget=N)` in `GenerateContentConfig`.
- Thought parts are identified by `part.thought == True` in the response candidates.
- Streaming: thought parts arrive as `is_thinking=True` chunks before answer parts.
- Non-streaming: `LLMResponse.thinking` contains all joined thought parts.
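Separating thought parts from answer parts comes down to filtering on the `thought` flag. The sketch below assumes a hypothetical `Part` stand-in with only `text` and `thought` attributes, and it joins parts with an empty separator; the separator the adapter actually uses is not stated here, so treat that choice as an assumption.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """Hypothetical stand-in for a response part carrying a `thought` flag."""
    text: str
    thought: bool = False


def split_parts(parts: list[Part]) -> tuple[str, str]:
    """Return (thinking, answer), joining thought and answer parts separately."""
    thinking = "".join(p.text for p in parts if p.thought)
    answer = "".join(p.text for p in parts if not p.thought)
    return thinking, answer
```

The first element of the returned pair corresponds to what would end up in `LLMResponse.thinking`, and the second to the final answer text.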
OpenAI
- Supported models: `o1`, `o3`, `o3-mini`.
- `native_thinking` / `thinking_budget` are silently ignored (OpenAI does not expose reasoning text in the Chat Completions API).
- Reasoning token count is exposed automatically in `response.usage["reasoning_tokens"]` whenever the model returns `completion_tokens_details.reasoning_tokens`.
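One way to picture the reasoning-token passthrough is a function that flattens a Chat Completions style usage payload. `extract_usage` is a hypothetical helper for illustration; the `prompt_tokens` and `completion_tokens` keys in its output are assumptions about the shape of `usage`, and only `reasoning_tokens` is named in this reference.

```python
from typing import Any


def extract_usage(raw_usage: dict[str, Any]) -> dict[str, int]:
    """Flatten a usage payload into simple token counts.

    `reasoning_tokens` is added only when the payload carries
    completion_tokens_details.reasoning_tokens.
    """
    usage = {
        "prompt_tokens": raw_usage.get("prompt_tokens", 0),
        "completion_tokens": raw_usage.get("completion_tokens", 0),
    }
    # The details object may be missing or null on non-reasoning models.
    details = raw_usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens")
    if reasoning is not None:
        usage["reasoning_tokens"] = reasoning
    return usage
```

Callers that want to stay provider-agnostic can read the count with `usage.get("reasoning_tokens")` rather than indexing, since the key is absent when the provider does not report it.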