ractogateway.truncation.truncator

Automated token truncation for long conversation histories.

When a conversation history is about to breach the model’s context-window limit, TokenTruncator trims middle turns while preserving:

  • The beginning of the conversation (keep_first_n messages) — provides original task context.

  • The most recent turns (keep_last_n messages) — preserves conversational continuity.

  • The current system prompt and user message — always present.
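The keep-first / keep-last rule in the list above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `trim_middle` is a hypothetical helper, the history is modelled as a plain list, and the source does not say whether TokenTruncator drops every middle turn at once or only as many as needed.

```python
from typing import List, TypeVar

T = TypeVar("T")


def trim_middle(history: List[T], keep_first_n: int, keep_last_n: int) -> List[T]:
    """Drop middle items, keeping the first and last few (hypothetical helper).

    Assumes keep_first_n and keep_last_n are non-negative.
    """
    # Nothing to drop when the two kept windows already cover the whole history.
    if keep_first_n + keep_last_n >= len(history):
        return list(history)
    head = history[:keep_first_n]
    # Slice from an explicit start index: history[-0:] would return everything.
    tail = history[len(history) - keep_last_n:]
    return head + tail


turns = [f"turn {i}" for i in range(1, 13)]
trimmed = trim_middle(turns, keep_first_n=2, keep_last_n=3)
print(trimmed)  # ['turn 1', 'turn 2', 'turn 10', 'turn 11', 'turn 12']
```

The system prompt and the current user message are not part of `history` in this sketch, which mirrors the statement that they are always present.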

The truncator takes a ChatConfig and never mutates it: when trimming is needed, it returns a new config object (via Pydantic model_copy).

No external dependencies are required for the default approximation mode (len(text) // 4). Swap in tiktoken for exact OpenAI token counting.
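The default approximation amounts to one token per four characters. A minimal sketch follows; the function name `approx_token_count` is an assumption for illustration, and only the `len(text) // 4` formula comes from the text above.

```python
def approx_token_count(text: str) -> int:
    """Approximate the token count as one token per four characters.

    Hypothetical stand-in for the default counter; uses only the standard library.
    """
    return len(text) // 4


sample = "Summarize the previous discussion in three bullet points."
print(approx_token_count(sample))
```

Because of the integer division, strings shorter than four characters count as zero tokens, so the estimate is coarse for very short messages. Any callable that maps a string to an integer can take its place, which is how the tiktoken-based counter in the class example plugs in.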

class ractogateway.truncation.truncator.TokenTruncator(config=None)

Bases: object

Smart conversation-history trimmer.

Parameters:

config (TruncationConfig | None) – TruncationConfig instance. If omitted, a default config is used (approximate counter, 8k limit).

Examples

from ractogateway.truncation import TokenTruncator, TruncationConfig
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
truncator = TokenTruncator(
    TruncationConfig(
        token_counter=lambda t: len(enc.encode(t)),
        keep_first_n=2,
        keep_last_n=8,
    )
)
# OpenAIDeveloperKit must already be imported; this example does not show that import.
kit = OpenAIDeveloperKit(model="gpt-4o", truncator=truncator)

truncate(chat_config, model)

Return a copy of chat_config with trimmed history if necessary.

If the total estimated token count (system prompt + history + user_message) fits within the model’s context limit, the original ChatConfig is returned unchanged.

Parameters:
  • chat_config (ChatConfig) – The chat configuration to potentially truncate.

  • model (str) – The resolved model name used to look up the context-window limit.

Return type:

ChatConfig

Returns:

ChatConfig – The original chat_config if the estimated token count fits within the limit; otherwise a new ChatConfig instance with a shorter history. The user_message and all other fields are preserved verbatim.
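The fit check described for truncate can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `fits_in_context` is a hypothetical function, the history is modelled as plain strings rather than message objects, the context limit is passed in directly instead of being looked up from the model name, and "fits within" is read as less than or equal to the limit.

```python
from typing import Callable, Sequence


def fits_in_context(
    system_prompt: str,
    history: Sequence[str],
    user_message: str,
    context_limit: int,
    count_tokens: Callable[[str], int] = lambda text: len(text) // 4,
) -> bool:
    """Return True when the estimated total stays within the limit (hypothetical)."""
    # Total estimate = system prompt + every history turn + current user message.
    total = (
        count_tokens(system_prompt)
        + sum(count_tokens(turn) for turn in history)
        + count_tokens(user_message)
    )
    return total <= context_limit


history = ["b" * 40, "b" * 40]
print(fits_in_context("a" * 40, history, "c" * 40, context_limit=40))  # True
print(fits_in_context("a" * 40, history, "c" * 40, context_limit=39))  # False
```

With the default counter each 40-character string counts as 10 tokens, so the total here is 40: it fits a limit of 40 and exceeds a limit of 39.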

estimate_tokens(text)

Convenience wrapper around the configured token counter.

Return type:

int