ractogateway.pipelines.list_classifier

List Classifier Pipeline — natural-language query → best-matching list item(s).

class ractogateway.pipelines.list_classifier.AsyncListClassifierPipeline(kit, *, options=None, selection_mode='single', output_format='pydantic', prompt=None, temperature=0.0, max_tokens=512, max_retries=2, include_confidence=True, include_reasoning=False, score_all=False, option_descriptions=None, fuzzy_fallback=True, uncertain_label=None, confidence_threshold=None, case_sensitive=False, safe_mode=False, exact_cache=None, semantic_cache=None, audit_logger=None, tracer=None, metrics=None, rate_limiter=None, memory=None, user_id=None)[source]

Bases: ListClassifierPipeline

Async-first variant of ListClassifierPipeline.

run() is a coroutine — await pipeline.run(...) directly. Designed for FastAPI, aiohttp, Starlette, and other async frameworks.

The constructor and all run() parameters are identical to those of ListClassifierPipeline.

Example

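from fastapi import FastAPI
from ractogateway.pipelines.list_classifier import AsyncListClassifierPipeline

app = FastAPI()

pipeline = AsyncListClassifierPipeline.from_provider(
    "openai", "gpt-4o-mini",
    options=["Billing", "Support", "Sales"],
    safe_mode=True,  # return ClassifierResult(error=...) instead of raising
)

# FastAPI handler:
@app.post("/classify")
async def classify(query: str):
    result = await pipeline.run(query)
    return result.as_dict()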

async run(user_query, **kwargs)[source]

Async run() — delegates to ListClassifierPipeline.arun().

Return type:: ClassifierResult | str | dict[str, Any]

class ractogateway.pipelines.list_classifier.AuditEntry(**data)[source]

Bases: BaseModel

Immutable audit record emitted to the audit_logger after every call.

Emitted regardless of whether the call was served from cache, hit an error, or was a live LLM classification. Provides a complete picture of every request for compliance, debugging, and analytics.

Fields

timestamp:: ISO 8601 UTC timestamp of when the call was made (e.g. "2026-02-26T14:23:01.456789Z").
user_query:: Original natural-language query.
options_provided:: Full candidate list shown to the LLM (including uncertain_label if one was configured).
selected:: Option(s) chosen by the LLM, or empty on error.
confidences:: Per-selection confidence scores, or None.
all_scores:: Score for every option (when score_all=True), or None.
reasoning:: LLM explanation (when include_reasoning=True), or None.
fuzzy_corrected:: True when the LLM returned a near-miss that was fuzzy-matched.
uncertain:: True when the LLM selected the uncertain_label option.
cache_hit:: "exact" or "semantic" when the result was served from cache; None when a live LLM call was made.
user_id:: User identifier passed to the pipeline (for rate limiting / audit).
session_id:: Conversation session identifier (for memory context).
latency_ms:: Wall-clock latency of the entire pipeline call in milliseconds (near-zero for cache hits).
usage:: Token usage dict — keys: input_tokens, output_tokens, total_tokens, retry_count. All zero on cache hits.
error:: Non-None when safe_mode=True and an exception occurred.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

timestamp: str

user_query: str

options_provided: list[str]

selected: list[str]

confidences: list[float] | None

all_scores: dict[str, float] | None

reasoning: str | None

fuzzy_corrected: bool

uncertain: bool

cache_hit: str | None

user_id: str | None

session_id: str | None

latency_ms: float

usage: dict[str, int]

error: str | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

exception ractogateway.pipelines.list_classifier.ClassifierRateLimitExceededError[source]

Bases: RuntimeError

Raised when the rate limiter denies a request for a given user.

class ractogateway.pipelines.list_classifier.ClassifierResult(**data)[source]

Bases: BaseModel

Result returned by ListClassifierPipeline.

All fields except user_query and options_provided have sensible defaults so that a partial result can be returned when safe_mode=True and an error occurs mid-pipeline.

Fields

user_query:: The original natural-language query passed to the pipeline.
options_provided:: The full list of candidate strings presented to the LLM (including the injected uncertain_label option if one was configured).
selected:: The option(s) chosen by the LLM, ordered by descending confidence. Empty when an error occurred and safe_mode=True.
confidences:: Per-selection confidence scores in [0.0, 1.0], aligned with selected. None when include_confidence=False.
all_scores:: Confidence score for every option in the list, keyed by option string. None when score_all=False (the default).
reasoning:: Brief natural-language explanation produced by the LLM. None when include_reasoning=False.
fuzzy_corrected:: True when the LLM returned a near-miss that was corrected by the built-in fuzzy matcher without consuming a retry.
uncertain:: True when the LLM selected the uncertain_label option, indicating no real option matched the query well enough.
cache_hit:: "exact" or "semantic" when served from cache; None for a live LLM call.
usage:: Aggregated token counts and retry statistics for this call.
error:: Non-None only when safe_mode=True and an exception occurred. When error is set, selected will be empty.

Examples

>>> result.first                         # "Billing"
>>> result.top_confidence                # 0.95
>>> result.as_string()                   # "Billing, Account"
>>> result.as_dict()                     # {"selected": ["Billing"], ...}
>>> result.as_enum()["Billing"].value    # "Billing"
>>> result.uncertain                     # False
>>> result.cache_hit                     # "exact" | "semantic" | None

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

user_query: str

options_provided: list[str]

selected: list[str]

confidences: list[float] | None

all_scores: dict[str, float] | None

reasoning: str | None

fuzzy_corrected: bool

uncertain: bool

cache_hit: str | None

usage: ClassifierUsage

error: str | None

property first: str | None: The first (highest-priority) selected option, or None if empty.

property top_confidence: float | None: Confidence score for the first selected option, or None.

property is_empty: bool: True when no options were selected (including error cases).

as_string(separator=', ')[source]

Return selected options as a single joined string.

Parameters:: separator (str) – Delimiter placed between items. Default: ", ".
Return type:: str
Returns:: str – E.g. "Billing, Account" for two selections.

as_dict()[source]

Return a plain dict with selected options and optional metadata.

Always contains "selected". "confidences", "all_scores", and "reasoning" are included only when they are non-None.

Return type:

dict[str, Any]

Returns:

dict[str, Any] –

Example:

{
    "selected":    ["Billing", "Account"],
    "confidences": [0.95, 0.82],
    "all_scores":  {"Billing": 0.95, "Account": 0.82, "Sales": 0.12},
    "reasoning":   "Both topics are mentioned explicitly.",
}

as_enum(name='SelectedOptions')[source]

Return a dynamic Python enum.Enum of the selected options.

Parameters:: name (str) – Class name for the generated Enum. Default: "SelectedOptions".
Return type:: type[Enum]
Returns:: type[Enum] – An Enum whose members have the option string as both name and value.

Example

>>> E = result.as_enum()
>>> E["Billing"].value   # "Billing"

top_n(n)[source]

Return the top-n selected options.

Parameters:: n (int) – Maximum number of options to return.
Return type:: list[str]

score_for(option)[source]

Return the confidence score for a specific option, or None.

Searches all_scores first (all options, when score_all=True), then confidences for selected items.

Parameters:: option (str) – The option string to look up.
Return type:: float | None
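
The lookup order described for score_for() can be sketched as a standalone function. This is an illustration only, not the library's source: the name lookup_score and its plain-argument signature are assumptions made for the example.

```python
from typing import Optional


def lookup_score(
    option: str,
    selected: list[str],
    confidences: Optional[list[float]],
    all_scores: Optional[dict[str, float]],
) -> Optional[float]:
    """Hypothetical stand-in that mirrors the documented lookup order."""
    # 1. Prefer all_scores, which covers every option when score_all=True.
    if all_scores is not None and option in all_scores:
        return all_scores[option]
    # 2. Fall back to confidences, which are aligned with selected.
    if confidences is not None and option in selected:
        index = selected.index(option)
        if index < len(confidences):
            return confidences[index]
    # 3. Nothing is known about this option.
    return None
```

With score_all=False, only selected options can have a score, so an unselected option resolves to None.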

to_audit_entry(*, timestamp, user_id=None, session_id=None, latency_ms=0.0)[source]

Build an AuditEntry from this result.

Called automatically by the pipeline — exposed here so that users can reconstruct audit entries from stored results if needed.

Return type:: AuditEntry
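
Because to_audit_entry() takes the timestamp from the caller, a helper that produces the ISO 8601 UTC form shown in the AuditEntry field description may be useful. A minimal sketch using only the standard library; the helper name utc_timestamp is hypothetical and not part of the package.

```python
from datetime import datetime, timezone
from typing import Optional


def utc_timestamp(now: Optional[datetime] = None) -> str:
    """Return an ISO 8601 UTC timestamp with microseconds and a 'Z' suffix."""
    # Use the supplied moment if given (handy for tests), else the current time.
    moment = now if now is not None else datetime.now(timezone.utc)
    # Normalise to UTC; a naive datetime is interpreted as local time here.
    moment = moment.astimezone(timezone.utc)
    # isoformat() renders UTC as "+00:00"; swap that for the "Z" suffix.
    return moment.isoformat(timespec="microseconds").replace("+00:00", "Z")
```

Assuming result is a stored ClassifierResult, the string could then be passed as result.to_audit_entry(timestamp=utc_timestamp()).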

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

class ractogateway.pipelines.list_classifier.ClassifierUsage(**data)[source]

Bases: BaseModel

Token usage and retry statistics for a single pipeline call.

Properties

total_tokens:: Sum of input_tokens and output_tokens across all LLM attempts, including automatic retries triggered by invalid LLM responses. Zero on a cache hit (no LLM call was made).

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

input_tokens: int

output_tokens: int

retry_count: int

property total_tokens: int
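
The aggregation described for total_tokens can be illustrated with a small stand-in. This is a sketch, not the actual model: UsageSketch and aggregate are hypothetical names, and counting every attempt after the first as a retry is an assumption.

```python
from dataclasses import dataclass


@dataclass
class UsageSketch:
    """Hypothetical stand-in with the same three stored fields."""

    input_tokens: int = 0
    output_tokens: int = 0
    retry_count: int = 0

    @property
    def total_tokens(self) -> int:
        # Derived value: input plus output across all attempts.
        return self.input_tokens + self.output_tokens


def aggregate(attempts: list[tuple[int, int]]) -> UsageSketch:
    """Sum (input, output) token pairs over every LLM attempt."""
    usage = UsageSketch()
    for index, (tokens_in, tokens_out) in enumerate(attempts):
        usage.input_tokens += tokens_in
        usage.output_tokens += tokens_out
        if index > 0:  # assumption: each attempt after the first is a retry
            usage.retry_count += 1
    return usage
```

An empty attempts list models a cache hit: no LLM call was made, so every counter stays at zero.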

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

class ractogateway.pipelines.list_classifier.ListClassifierPipeline(kit, *, options=None, selection_mode='single', output_format='pydantic', prompt=None, temperature=0.0, max_tokens=512, max_retries=2, include_confidence=True, include_reasoning=False, score_all=False, option_descriptions=None, fuzzy_fallback=True, uncertain_label=None, confidence_threshold=None, case_sensitive=False, safe_mode=False, exact_cache=None, semantic_cache=None, audit_logger=None, tracer=None, metrics=None, rate_limiter=None, memory=None, user_id=None)[source]

Bases: object

Map a natural-language query to one or more items from a candidate list.

Supports every RactoGateway provider via the kit parameter or the from_provider() class factory. Internally builds a dynamic Python enum.Enum from the options list and validates every LLM response against it: hallucinations and paraphrased answers are caught, fuzzy-corrected if close enough, and retried otherwise.
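
The validate-then-fuzzy-correct idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the pipeline's implementation: build_enum, resolve, the case-insensitive comparison, and the 0.8 cutoff are all choices made for the example.

```python
import difflib
from enum import Enum
from typing import Optional


def build_enum(options: list[str], name: str = "OptionsEnum") -> type[Enum]:
    # Functional Enum API: each option string is both member name and value.
    return Enum(name, {option: option for option in options})


def resolve(answer: str, options_enum: type[Enum], cutoff: float = 0.8) -> Optional[Enum]:
    """Map a raw LLM answer to an enum member, or None if nothing is close."""
    # Index the canonical values by their lower-cased form.
    lowered = {member.value.lower(): member.value for member in options_enum}
    key = answer.strip().lower()
    # Exact (case-insensitive) hit: the answer is a valid option.
    if key in lowered:
        return options_enum(lowered[key])
    # Near miss: let difflib pick the closest option above the cutoff.
    close = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    if close:
        return options_enum(lowered[close[0]])
    # No acceptable match; a caller would retry the LLM at this point.
    return None
```

A None result corresponds to the "retried otherwise" branch: the answer was neither valid nor close enough to correct.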

Two variants

ListClassifierPipeline — run() sync, arun() async.
AsyncListClassifierPipeline — run() is async only.

Parameters:

kit (Any) – Any RactoGateway developer kit (OpenAI, Anthropic, Google, Ollama, HuggingFace). Must expose .chat(ChatConfig) and .achat(ChatConfig) methods. Use from_provider() instead of constructing kits manually when you only need provider + model.
options (list[str] | None) – Default candidate strings. Can be overridden per-call. Must be non-empty and duplicate-free when provided.
selection_mode (Literal['single', 'multiple']) – "single" (default): exactly one option. "multiple": one or more options. Overridable per-call.
output_format (Literal['string', 'dict', 'pydantic']) – "pydantic" (default): ClassifierResult. "string": comma-joined string. "dict": plain dict. Overridable per-call.
prompt (RactoPrompt | None) – Custom RactoPrompt to replace the built-in system prompt.
temperature (float) – LLM temperature. Default 0.0 for deterministic output.
max_tokens (int) – Response token budget. Default 512.
max_retries (int) – Retry attempts when the LLM returns invalid JSON or an unknown option. Default 2.
include_confidence (bool) – Ask the LLM for per-selection confidence scores in [0.0, 1.0]. Default True.
include_reasoning (bool) – Ask the LLM for a one-sentence explanation. Default False.
score_all (bool) – Ask the LLM for a score for every option (not just selected ones). Stored in result.all_scores. Default False.
option_descriptions (dict[str, str] | None) – {option: description} mapping, shown inline next to each option in the prompt to help the LLM distinguish similar categories.
fuzzy_fallback (bool) – Use stdlib difflib to correct near-miss LLM responses before consuming a retry. Default True.
uncertain_label (str | None) – When set, this string is appended as an extra option that the LLM can pick when nothing matches (e.g. "Other / None of the above"). result.uncertain is True when this label is selected.
confidence_threshold (float | None) – Drop selections below this score. Keeps the highest-confidence match as a fallback. Default None (no filtering).
case_sensitive (bool) – Whether option matching is case-sensitive. Default False.
safe_mode (bool) – Return ClassifierResult(error=...) instead of raising. Default False.
tracer (Any | None) – Optional RactoTracer.
metrics (Any | None) – Optional GatewayMetricsMiddleware.
rate_limiter (Any | None) – Duck-typed: check_and_consume(user_id, tokens) -> bool and get_remaining(user_id) -> int.
memory (Any | None) – Duck-typed: get_history(session_id) -> list[dict] and append(session_id, role, content).
user_id (str | None) – Default user ID for rate limiting. Overridable per-call.
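
Since rate_limiter and memory are duck-typed, any object with the listed methods should be accepted. A minimal in-memory sketch of both follows; the class names, the fixed per-user budget, and the {"role": ..., "content": ...} record shape are assumptions for illustration, not types shipped with the package.

```python
from collections import defaultdict


class InMemoryRateLimiter:
    """Hypothetical fixed-budget limiter exposing the duck-typed methods."""

    def __init__(self, budget_per_user: int) -> None:
        self._budget = budget_per_user
        self._used: dict[str, int] = defaultdict(int)

    def check_and_consume(self, user_id: str, tokens: int) -> bool:
        # Deny the request if it would exceed the user's remaining budget.
        if self._used[user_id] + tokens > self._budget:
            return False
        self._used[user_id] += tokens
        return True

    def get_remaining(self, user_id: str) -> int:
        return self._budget - self._used[user_id]


class InMemoryHistory:
    """Hypothetical per-session message store exposing the duck-typed methods."""

    def __init__(self) -> None:
        self._sessions: dict[str, list[dict]] = defaultdict(list)

    def get_history(self, session_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._sessions[session_id])

    def append(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})
```

Instances like these could be passed as rate_limiter= and memory= when constructing the pipeline; a production setup would more likely back them with a shared store.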

Example

# Via kit directly
from ractogateway.openai_developer_kit import Chat
from ractogateway.pipelines import ListClassifierPipeline

pipeline = ListClassifierPipeline(
    kit=Chat(model="gpt-4o-mini"),
    options=["Billing", "Technical Support", "Sales"],
    include_confidence=True,
    include_reasoning=True,
)
result = pipeline.run("My invoice is wrong")
print(result.first, result.top_confidence)

# Via from_provider() — no manual kit import needed
pipeline = ListClassifierPipeline.from_provider(
    "anthropic", "claude-haiku-4-5-20251001",
    options=["Billing", "Technical Support", "Sales"],
)

classmethod from_provider(provider, model, *, api_key=None, base_url=None, options=None, **kwargs)[source]

Create a pipeline by specifying provider + model — no kit import needed.

Parameters:

provider (str) – One of "openai", "anthropic", "google", "ollama", "huggingface".
model (str) –
Model identifier string, e.g.:
- OpenAI: "gpt-4o-mini", "gpt-4o"
- Anthropic: "claude-haiku-4-5-20251001", "claude-sonnet-4-6"
- Google: "gemini-2.0-flash", "gemini-1.5-pro"
- Ollama: "llama3.2", "mistral"
- HuggingFace: "meta-llama/Llama-3.2-3B-Instruct"
api_key (str | None) – Provider API key. Falls back to the standard env var for each provider (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY).
base_url (str | None) – Custom endpoint — used for Ollama (http://localhost:11434) or OpenAI-compatible proxies.
options (list[str] | None) – Default candidate options list.
**kwargs (Any) – Any other ListClassifierPipeline constructor parameters (selection_mode, include_confidence, safe_mode, etc.).

Return type:

ListClassifierPipeline

Returns:

ListClassifierPipeline

Example

pipeline = ListClassifierPipeline.from_provider(
    "anthropic",
    "claude-haiku-4-5-20251001",
    options=["Billing", "Support", "Sales"],
    include_reasoning=True,
    safe_mode=True,
)

static make_enum(options, name='OptionsEnum')[source]

Build a standalone dynamic enum.Enum from an options list.

Useful when you want enum-typed values outside the pipeline.

Parameters:

options (list[str]) – List of option strings.
name (str) – Enum class name. Default "OptionsEnum".

Return type:

type[Enum]

Returns:

type[Enum]

Example

E = ListClassifierPipeline.make_enum(["Red", "Green", "Blue"])
E["Red"].value   # "Red"

get_options()[source]

Return the pipeline-level options list, or None if not set.

Return type:: list[str] | None

set_options(options)[source]

Replace the entire pipeline-level options list.

Thread-safe — safe to call while the pipeline is in use.

Parameters:: options (list[str]) – New options list. Must be non-empty and duplicate-free.
Return type:: None
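
The two constraints (non-empty, duplicate-free) and the thread-safety note can be illustrated together. This is a sketch of one way to get that behaviour, not the pipeline's code: validate_options and OptionsHolder are hypothetical names, and exact string comparison for duplicates is an assumption.

```python
import threading


def validate_options(options: list[str]) -> list[str]:
    """Reject empty lists and exact duplicates; return a defensive copy."""
    if not options:
        raise ValueError("options must be non-empty")
    if len(set(options)) != len(options):
        raise ValueError("options must not contain duplicates")
    return list(options)


class OptionsHolder:
    """Hypothetical holder that swaps the options list under a lock."""

    def __init__(self, options: list[str]) -> None:
        self._lock = threading.Lock()
        self._options = validate_options(options)

    def set_options(self, options: list[str]) -> None:
        # Validate first so a bad list never replaces a good one.
        validated = validate_options(options)
        with self._lock:
            self._options = validated

    def get_options(self) -> list[str]:
        with self._lock:
            return list(self._options)
```

Validating before taking the lock keeps the critical section to a single assignment, so concurrent readers always see either the old list or the new one.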

add_option(option, description=None)[source]

Append a new option to the pipeline-level list.

Parameters:

option (str) – The option string to add.
description (str | None) – Optional inline description for the option.

Return type:

None

remove_option(option)[source]

Remove an option from the pipeline-level list.

Parameters:: option (str) – The option string to remove. Raises ValueError if not found.
Return type:: None

run(user_query, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None)[source]

Classify user_query synchronously.

Parameters:

user_query (str) – Natural-language query to classify.
options (list[str] | None) – Per-call override for the candidate list. Omit to use the pipeline-level list. Passing an empty list [] raises ValueError.
selection_mode (Literal['single', 'multiple'] | None) – Per-call override — "single" or "multiple".
output_format (Literal['string', 'dict', 'pydantic'] | None) – Per-call override — "pydantic", "string", or "dict".
temperature (float | None), max_tokens (int | None) – Per-call overrides for the LLM settings.
confidence_threshold (float | None) – Per-call override. Pass None explicitly to disable filtering for this call even if a pipeline-level threshold is set.
session_id (str | None) – Conversation session ID for memory retrieval/storage.
user_id (str | None) – Per-call user ID for rate limiting and audit.

Return type:

ClassifierResult | str | dict[str, Any]

Returns:

ClassifierResult | str | dict – Type depends on output_format.
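
The <object object> defaults in the signature indicate a sentinel, which lets the pipeline tell an omitted argument apart from an explicit None. A minimal sketch of that pattern combined with the documented threshold behaviour (drop low scores, keep the best match as a fallback); the names _UNSET, resolve_threshold, and apply_threshold are hypothetical.

```python
from typing import Any, Optional

_UNSET: Any = object()  # sentinel: "argument omitted", distinct from None


def resolve_threshold(pipeline_default: Optional[float], per_call: Any = _UNSET) -> Optional[float]:
    # Omitted -> pipeline-level value; explicit None -> filtering disabled.
    return pipeline_default if per_call is _UNSET else per_call


def apply_threshold(
    selected: list[str],
    confidences: list[float],
    threshold: Optional[float],
) -> tuple[list[str], list[float]]:
    """Drop selections scoring below threshold, keeping the best as fallback."""
    if threshold is None or not selected:
        return selected, confidences
    pairs = list(zip(selected, confidences))
    kept = [pair for pair in pairs if pair[1] >= threshold]
    if not kept:
        # Everything fell below the threshold: keep the highest-confidence match.
        kept = [max(pairs, key=lambda pair: pair[1])]
    return [option for option, _ in kept], [score for _, score in kept]
```

The same sentinel idea applies to options, where None and "not passed" also need to be distinguished.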

batch_run(queries, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None)[source]

Classify multiple queries synchronously, one after another.

Shares all per-call overrides across every query in the batch. Use abatch_run() to run them concurrently in async contexts.

Parameters:: queries (list[str]) – List of natural-language queries to classify.
Return type:: list[ClassifierResult | str | dict[str, Any]]
Returns:: list – One result per query, in the same order.

async arun(user_query, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None)[source]

Async variant of run() — identical parameters.

Return type:: ClassifierResult | str | dict[str, Any]

async abatch_run(queries, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None, max_concurrency=None)[source]

Classify multiple queries concurrently with asyncio.gather.

Parameters:

queries (list[str]) – List of natural-language queries.
max_concurrency (int | None) – Cap the number of simultaneous LLM calls. None (default) runs all queries in parallel. Set to e.g. 5 to avoid rate-limit errors on large batches.

Return type:

list[ClassifierResult | str | dict[str, Any]]

Returns:

list – Results in the same order as queries.
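
One common way to cap concurrency around asyncio.gather is a semaphore. The sketch below shows that technique in isolation; it is an assumption about how such a cap can be built, not the library's source, and gather_bounded and fake_classify are hypothetical names (fake_classify stands in for a real LLM call).

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def gather_bounded(
    worker: Callable[[str], Awaitable[str]],
    queries: list[str],
    max_concurrency: Optional[int] = None,
) -> list[str]:
    """Run worker over queries concurrently, optionally capping parallelism."""
    if max_concurrency is None:
        # No cap: start every call at once.
        return list(await asyncio.gather(*(worker(q) for q in queries)))

    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query: str) -> str:
        # At most max_concurrency workers hold the semaphore at any time.
        async with semaphore:
            return await worker(query)

    # gather() returns results in argument order, regardless of finish order.
    return list(await asyncio.gather(*(guarded(q) for q in queries)))


async def fake_classify(query: str) -> str:
    # Stand-in for an LLM call: wait briefly, then return a label.
    await asyncio.sleep(0.01)
    return query.upper()
```

Because asyncio.gather preserves argument order, results line up with queries even when later queries finish first.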