ractogateway.pipelines.list_classifier
List Classifier Pipeline — natural-language query → best-matching list item(s).
- class ractogateway.pipelines.list_classifier.AsyncListClassifierPipeline(kit, *, options=None, selection_mode='single', output_format='pydantic', prompt=None, temperature=0.0, max_tokens=512, max_retries=2, include_confidence=True, include_reasoning=False, score_all=False, option_descriptions=None, fuzzy_fallback=True, uncertain_label=None, confidence_threshold=None, case_sensitive=False, safe_mode=False, exact_cache=None, semantic_cache=None, audit_logger=None, tracer=None, metrics=None, rate_limiter=None, memory=None, user_id=None)[source]
Bases: ListClassifierPipeline
Async-first variant of ListClassifierPipeline. run() is a coroutine: await pipeline.run(...) directly. Designed for FastAPI, aiohttp, Starlette, and other async frameworks.
Constructor and all run() parameters are identical to ListClassifierPipeline.
Example
from fastapi import FastAPI
from ractogateway.pipelines.list_classifier import AsyncListClassifierPipeline

app = FastAPI()

pipeline = AsyncListClassifierPipeline.from_provider(
    "openai", "gpt-4o-mini",
    options=["Billing", "Support", "Sales"],
    safe_mode=True,
)

# FastAPI handler:
@app.post("/classify")
async def classify(query: str):
    result = await pipeline.run(query)
    return result.as_dict()
- class ractogateway.pipelines.list_classifier.AuditEntry(**data)[source]
Bases: BaseModel
Immutable audit record emitted to the audit_logger after every call.
Emitted regardless of whether the call was served from cache, hit an error, or was a live LLM classification. Provides a complete picture of every request for compliance, debugging, and analytics.
Fields
- timestamp:
ISO 8601 UTC timestamp of when the call was made (e.g. "2026-02-26T14:23:01.456789Z").
- user_query:
Original natural-language query.
- options_provided:
Full candidate list shown to the LLM (including uncertain_label if one was configured).
- selected:
Option(s) chosen by the LLM, or empty on error.
- confidences:
Per-selection confidence scores, or None.
- all_scores:
Score for every option (when score_all=True), or None.
- reasoning:
LLM explanation (when include_reasoning=True), or None.
- fuzzy_corrected:
True when the LLM returned a near-miss that was fuzzy-matched.
- uncertain:
True when the LLM selected the uncertain_label option.
- cache_hit:
"exact" or "semantic" when the result was served from cache; None when a live LLM call was made.
- user_id:
User identifier passed to the pipeline (for rate limiting / audit).
- session_id:
Conversation session identifier (for memory context).
- latency_ms:
Wall-clock latency of the entire pipeline call in milliseconds (near-zero for cache hits).
- usage:
Token usage dict with keys input_tokens, output_tokens, total_tokens, retry_count. All zero on cache hits.
- error:
Non-None when safe_mode=True and an exception occurred.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- timestamp: str
- user_query: str
- fuzzy_corrected: bool
- uncertain: bool
- latency_ms: float
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
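The audit_logger interface itself is not specified in this reference, so the following is only a sketch of one way the documented fields could be persisted. AuditRecord is a hypothetical stand-in dataclass mirroring a subset of the AuditEntry fields (it is not the library's class), and write_jsonl is an illustrative helper that stores each record as one JSON line.

```python
import io
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical stand-in mirroring a subset of the documented AuditEntry fields."""
    timestamp: str
    user_query: str
    options_provided: List[str]
    selected: List[str] = field(default_factory=list)
    cache_hit: Optional[str] = None  # "exact", "semantic", or None for a live call
    latency_ms: float = 0.0
    error: Optional[str] = None


def utc_timestamp() -> str:
    """ISO 8601 UTC timestamp with a trailing 'Z', as in the documented example."""
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")


def write_jsonl(record: AuditRecord, stream) -> None:
    """Append one record to a text stream as a single JSON line."""
    stream.write(json.dumps(asdict(record)) + "\n")


buffer = io.StringIO()
record = AuditRecord(
    timestamp=utc_timestamp(),
    user_query="My invoice is wrong",
    options_provided=["Billing", "Support", "Sales"],
    selected=["Billing"],
    latency_ms=412.7,
)
write_jsonl(record, buffer)

# Reading the line back recovers the same fields.
restored = json.loads(buffer.getvalue())
```

One record per line keeps the log appendable and easy to filter by user_id or cache_hit with ordinary line-oriented tools.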
- exception ractogateway.pipelines.list_classifier.ClassifierRateLimitExceededError[source]
Bases: RuntimeError
Raised when the rate limiter denies a request for a given user.
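The rate_limiter constructor parameter is documented as duck-typed, with check_and_consume(user_id, tokens) -> bool and get_remaining(user_id) -> int. A minimal sketch of an object with that shape follows. FixedBudgetLimiter, its never-refilled budget, and the meaning given to tokens are assumptions made for illustration; RateLimitDenied is a local stand-in for ClassifierRateLimitExceededError so the snippet runs without the library.

```python
from collections import defaultdict


class FixedBudgetLimiter:
    """Hypothetical limiter: each user gets a fixed token budget, never refilled."""

    def __init__(self, budget_per_user: int) -> None:
        self._budget = budget_per_user
        self._used = defaultdict(int)

    def check_and_consume(self, user_id: str, tokens: int) -> bool:
        """Consume `tokens` and return True, or return False if over budget."""
        if self._used[user_id] + tokens > self._budget:
            return False
        self._used[user_id] += tokens
        return True

    def get_remaining(self, user_id: str) -> int:
        """Tokens still available to this user."""
        return self._budget - self._used[user_id]


class RateLimitDenied(RuntimeError):
    """Local stand-in for ClassifierRateLimitExceededError."""


def guarded_call(limiter, user_id: str, tokens: int) -> str:
    """Raise when the limiter denies the request, mirroring the documented behavior."""
    if not limiter.check_and_consume(user_id, tokens):
        raise RateLimitDenied(f"{user_id}: {limiter.get_remaining(user_id)} tokens left")
    return "allowed"


limiter = FixedBudgetLimiter(budget_per_user=1000)
first = guarded_call(limiter, "alice", 600)
try:
    guarded_call(limiter, "alice", 600)
    second = "allowed"
except RateLimitDenied:
    second = "denied"
```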
- class ractogateway.pipelines.list_classifier.ClassifierResult(**data)[source]
Bases: BaseModel
Result returned by ListClassifierPipeline.
All fields except user_query and options_provided have sensible defaults so that a partial result can be returned when safe_mode=True and an error occurs mid-pipeline.
Fields
- user_query:
The original natural-language query passed to the pipeline.
- options_provided:
The full list of candidate strings presented to the LLM (including the injected uncertain_label option if one was configured).
- selected:
The option(s) chosen by the LLM, ordered by descending confidence. Empty when an error occurred and safe_mode=True.
- confidences:
Per-selection confidence scores in [0.0, 1.0], aligned with selected. None when include_confidence=False.
- all_scores:
Confidence score for every option in the list, keyed by option string. None when score_all=False (the default).
- reasoning:
Brief natural-language explanation produced by the LLM. None when include_reasoning=False.
- fuzzy_corrected:
True when the LLM returned a near-miss that was corrected by the built-in fuzzy matcher without consuming a retry.
- uncertain:
True when the LLM selected the uncertain_label option, indicating no real option matched the query well enough.
- cache_hit:
"exact" or "semantic" when served from cache; None for a live LLM call.
- usage:
Aggregated token counts and retry statistics for this call.
- error:
Non-None only when safe_mode=True and an exception occurred. When error is set, selected will be empty.
Examples
>>> result.first                        # "Billing"
>>> result.top_confidence               # 0.95
>>> result.as_string()                  # "Billing, Account"
>>> result.as_dict()                    # {"selected": ["Billing"], ...}
>>> result.as_enum()["Billing"].value   # "Billing"
>>> result.uncertain                    # False
>>> result.cache_hit                    # "exact" | "semantic" | None
- user_query: str
- fuzzy_corrected: bool
- uncertain: bool
- usage: ClassifierUsage
- property is_empty: bool
True when no options were selected (including error cases).
- as_string(separator=', ')[source]
Return selected options as a single joined string.
- as_dict()[source]
Return a plain dict with selected options and optional metadata.
Always contains "selected". "confidences", "all_scores", and "reasoning" are included only when they are non-None.
- as_enum(name='SelectedOptions')[source]
Return a dynamic Python enum.Enum of the selected options.
- Parameters:
name (str) – Class name for the generated Enum. Default: "SelectedOptions".
- Return type:
type[Enum]
- Returns:
An Enum whose members have the option string as both name and value.
Example
>>> E = result.as_enum()
>>> E["Billing"].value   # "Billing"
- top_n(n)[source]
Return the top-n selected options.
- score_for(option)[source]
Return the confidence score for a specific option, or None.
Searches all_scores first (all options, when score_all=True), then confidences for selected items.
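The lookup order described for score_for() can be sketched as a free function over the three relevant fields. This is an illustration of the documented order (all_scores first, then confidences aligned with selected), not the library's source; the function name and signature here are chosen for the example.

```python
from typing import Dict, List, Optional


def score_for(option: str,
              selected: List[str],
              confidences: Optional[List[float]],
              all_scores: Optional[Dict[str, float]]) -> Optional[float]:
    """Return the score for `option`, or None when no score is known."""
    # 1. all_scores covers every option when score_all=True.
    if all_scores is not None and option in all_scores:
        return all_scores[option]
    # 2. Otherwise fall back to confidences, which are aligned with `selected`.
    if confidences is not None and option in selected:
        return confidences[selected.index(option)]
    return None


selected = ["Billing", "Account"]
confidences = [0.95, 0.6]

from_selected = score_for("Account", selected, confidences, None)
not_selected = score_for("Sales", selected, confidences, None)
from_all = score_for(
    "Sales", selected, confidences,
    {"Billing": 0.95, "Account": 0.6, "Sales": 0.05},
)
```

Without all_scores, an option that was not selected has no score at all, which is why the second call yields None rather than 0.0.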
- to_audit_entry(*, timestamp, user_id=None, session_id=None, latency_ms=0.0)[source]
Build an AuditEntry from this result.
Called automatically by the pipeline; exposed here so that users can reconstruct audit entries from stored results if needed.
- Return type:
AuditEntry
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class ractogateway.pipelines.list_classifier.ClassifierUsage(**data)[source]
Bases: BaseModel
Token usage and retry statistics for a single pipeline call.
Properties
- total_tokens:
Sum of input_tokens and output_tokens across all LLM attempts, including automatic retries triggered by invalid LLM responses. Zero on a cache hit (no LLM call was made).
- input_tokens: int
- output_tokens: int
- retry_count: int
- property total_tokens: int
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
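How total_tokens aggregates across attempts can be illustrated with a small stand-in. UsageSketch and its add_attempt helper are hypothetical (the real ClassifierUsage is a pydantic model whose accumulation logic is not shown in this reference); only the three integer fields and the derived total_tokens property follow the documentation above.

```python
from dataclasses import dataclass


@dataclass
class UsageSketch:
    """Hypothetical stand-in for ClassifierUsage."""
    input_tokens: int = 0
    output_tokens: int = 0
    retry_count: int = 0

    @property
    def total_tokens(self) -> int:
        # Derived value: always the sum of the two accumulated counters.
        return self.input_tokens + self.output_tokens

    def add_attempt(self, input_tokens: int, output_tokens: int, is_retry: bool) -> None:
        """Accumulate one LLM attempt; retries also bump retry_count."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        if is_retry:
            self.retry_count += 1


usage = UsageSketch()
usage.add_attempt(120, 15, is_retry=False)  # first attempt, response rejected
usage.add_attempt(150, 12, is_retry=True)   # retry, response accepted

cache_hit_usage = UsageSketch()  # no LLM call was made, so everything stays zero
```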
- class ractogateway.pipelines.list_classifier.ListClassifierPipeline(kit, *, options=None, selection_mode='single', output_format='pydantic', prompt=None, temperature=0.0, max_tokens=512, max_retries=2, include_confidence=True, include_reasoning=False, score_all=False, option_descriptions=None, fuzzy_fallback=True, uncertain_label=None, confidence_threshold=None, case_sensitive=False, safe_mode=False, exact_cache=None, semantic_cache=None, audit_logger=None, tracer=None, metrics=None, rate_limiter=None, memory=None, user_id=None)[source]
Bases: object
Map a natural-language query to one or more items from a candidate list.
Supports every RactoGateway provider via the kit parameter or the from_provider() class factory. Internally builds a dynamic Python enum.Enum from the options list and validates every LLM response against it: hallucinations and paraphrased answers are caught, fuzzy-corrected if close enough, and retried otherwise.
Two variants
- ListClassifierPipeline: run() sync, arun() async.
- AsyncListClassifierPipeline: run() is async only.
- Parameters:
kit – Any RactoGateway developer kit (OpenAI, Anthropic, Google, Ollama, HuggingFace). Must expose .chat(ChatConfig) and .achat(ChatConfig) methods. Use from_provider() instead of constructing kits manually when you only need provider + model.
options – Default candidate strings. Can be overridden per-call. Must be non-empty and duplicate-free when provided.
selection_mode (Literal['single', 'multiple']) – "single" (default): exactly one option. "multiple": one or more options. Overridable per-call.
output_format (Literal['string', 'dict', 'pydantic']) – "pydantic" (default): ClassifierResult. "string": comma-joined string. "dict": plain dict. Overridable per-call.
prompt – Custom RactoPrompt to replace the built-in system prompt.
temperature – LLM temperature. Default 0.0 for deterministic output.
max_tokens – Response token budget. Default 512.
max_retries – Retry attempts when the LLM returns invalid JSON or an unknown option. Default 2.
include_confidence – Ask the LLM for per-selection confidence scores [0.0–1.0]. Default True.
include_reasoning – Ask the LLM for a one-sentence explanation. Default False.
score_all – Ask the LLM for a score for every option (not just selected ones). Stored in result.all_scores. Default False.
option_descriptions – {option: description}, shown inline next to each option in the prompt to help the LLM distinguish similar categories.
fuzzy_fallback – Use stdlib difflib to correct near-miss LLM responses before consuming a retry. Default True.
uncertain_label – When set, this string is appended as an extra option that the LLM can pick when nothing matches (e.g. "Other / None of the above"). result.uncertain is True when this label is selected.
confidence_threshold – Drop selections below this score. Keeps the highest-confidence match as a fallback. Default None (no filtering).
case_sensitive – Whether option matching is case-sensitive. Default False.
safe_mode – Return ClassifierResult(error=...) instead of raising. Default False.
tracer – Optional RactoTracer.
metrics – Optional GatewayMetricsMiddleware.
rate_limiter – Duck-typed: check_and_consume(user_id, tokens) -> bool + get_remaining(user_id) -> int.
memory – Duck-typed: get_history(session_id) -> list[dict] + append(session_id, role, content).
user_id – Default user ID for rate limiting. Overridable per-call.
Example
# Via kit directly
from ractogateway.openai_developer_kit import Chat
from ractogateway.pipelines import ListClassifierPipeline

pipeline = ListClassifierPipeline(
    kit=Chat(model="gpt-4o-mini"),
    options=["Billing", "Technical Support", "Sales"],
    include_confidence=True,
    include_reasoning=True,
)
result = pipeline.run("My invoice is wrong")
print(result.first, result.top_confidence)

# Via from_provider(): no manual kit import needed
pipeline = ListClassifierPipeline.from_provider(
    "anthropic", "claude-haiku-4-5-20251001",
    options=["Billing", "Technical Support", "Sales"],
)
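The validate-then-fuzzy-correct step described for fuzzy_fallback and case_sensitive can be approximated with the standard library alone. The sketch below is not the pipeline's implementation: the reference only says that difflib is used, so the choice of difflib.get_close_matches, the 0.8 cutoff, and the match_option helper are all assumptions.

```python
import difflib
from typing import List, Optional, Tuple


def match_option(raw: str, options: List[str], *,
                 case_sensitive: bool = False,
                 fuzzy_fallback: bool = True,
                 cutoff: float = 0.8) -> Tuple[Optional[str], bool]:
    """Return (matched_option, fuzzy_corrected); matched_option is None on a miss."""

    def normalize(text: str) -> str:
        text = text.strip()
        return text if case_sensitive else text.lower()

    # Map normalized form -> canonical option so the original spelling is returned.
    lookup = {normalize(opt): opt for opt in options}
    key = normalize(raw)

    if key in lookup:
        return lookup[key], False  # exact (possibly case-insensitive) match

    if fuzzy_fallback:
        close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
        if close:
            return lookup[close[0]], True  # near-miss corrected without a retry

    return None, False  # a caller would spend a retry here


options = ["Billing", "Technical Support", "Sales"]
exact = match_option("billing", options)
near = match_option("Technical Suport", options)
miss = match_option("Refunds", options)
```

A high cutoff keeps the correction conservative: a misspelled option is repaired, while an unrelated answer such as "Refunds" is still treated as invalid.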
- classmethod from_provider(provider, model, *, api_key=None, base_url=None, options=None, **kwargs)[source]
Create a pipeline by specifying provider + model — no kit import needed.
- Parameters:
provider (str) – One of "openai", "anthropic", "google", "ollama", "huggingface".
model (str) – Model identifier string, e.g.:
    OpenAI: "gpt-4o-mini", "gpt-4o"
    Anthropic: "claude-haiku-4-5-20251001", "claude-sonnet-4-6"
    Google: "gemini-2.0-flash", "gemini-1.5-pro"
    Ollama: "llama3.2", "mistral"
    HuggingFace: "meta-llama/Llama-3.2-3B-Instruct"
api_key (str | None) – Provider API key. Falls back to the standard env var for each provider (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY).
base_url (str | None) – Custom endpoint, used for Ollama (http://localhost:11434) or OpenAI-compatible proxies.
options (list[str] | None) – Default candidate options list.
**kwargs (Any) – Any other ListClassifierPipeline constructor parameters (selection_mode, include_confidence, safe_mode, etc.).
- Return type:
ListClassifierPipeline
- Returns:
ListClassifierPipeline
Example
pipeline = ListClassifierPipeline.from_provider(
    "anthropic", "claude-haiku-4-5-20251001",
    options=["Billing", "Support", "Sales"],
    include_reasoning=True,
    safe_mode=True,
)
- static make_enum(options, name='OptionsEnum')[source]
Build a standalone dynamic enum.Enum from an options list.
Useful when you want enum-typed values outside the pipeline.
- Returns:
type[Enum]
Example
E = ListClassifierPipeline.make_enum(["Red", "Green", "Blue"])
E["Red"].value   # "Red"
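For readers who want to see how such a dynamic enum can be built, here is a minimal sketch using the functional enum.Enum API. It is an approximation, not the library's code: whether make_enum itself rejects empty or duplicate lists is not stated here, so the two checks simply echo the constraint documented for the options parameter.

```python
from enum import Enum
from typing import List, Type


def make_enum(options: List[str], name: str = "OptionsEnum") -> Type[Enum]:
    """Build an Enum whose members use the option string as both name and value."""
    if not options:
        raise ValueError("options must be non-empty")
    if len(set(options)) != len(options):
        raise ValueError("options must be duplicate-free")
    # Functional API: a mapping of member name -> member value.
    return Enum(name, {opt: opt for opt in options})


Color = make_enum(["Red", "Green", "Blue"])
Queue = make_enum(["Billing", "Technical Support"], name="Queue")

red_value = Color["Red"].value                  # lookup by name
green_member = Color("Green")                   # lookup by value
support_value = Queue["Technical Support"].value  # names may contain spaces
```

Because names such as "Technical Support" are not valid identifiers, members are reached with subscript or call syntax rather than attribute access.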
- get_options()[source]
Return the pipeline-level options list, or None if not set.
- set_options(options)[source]
Replace the entire pipeline-level options list.
Thread-safe — safe to call while the pipeline is in use.
- add_option(option, description=None)[source]
Append a new option to the pipeline-level list.
- remove_option(option)[source]
Remove an option from the pipeline-level list.
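One common way to make option mutation safe while other threads are reading is to guard the list with a lock and hand out copies. The OptionStore class below is a hypothetical illustration of that pattern; the reference states only that set_options() is thread-safe, so the locking strategy and the ValueError behavior for duplicate or unknown options are assumptions.

```python
import threading
from typing import List, Optional


class OptionStore:
    """Hypothetical lock-guarded holder for a pipeline-level options list."""

    def __init__(self, options: Optional[List[str]] = None) -> None:
        self._lock = threading.Lock()
        self._options = list(options) if options is not None else None

    def get_options(self) -> Optional[List[str]]:
        with self._lock:
            # Return a copy so callers cannot mutate shared state.
            return None if self._options is None else list(self._options)

    def set_options(self, options: List[str]) -> None:
        with self._lock:
            self._options = list(options)

    def add_option(self, option: str) -> None:
        with self._lock:
            current = self._options or []
            if option in current:
                raise ValueError(f"duplicate option: {option!r}")
            # Rebind to a new list instead of mutating in place.
            self._options = current + [option]

    def remove_option(self, option: str) -> None:
        with self._lock:
            if not self._options or option not in self._options:
                raise ValueError(f"unknown option: {option!r}")
            self._options = [o for o in self._options if o != option]


store = OptionStore(["Billing", "Sales"])
threads = [
    threading.Thread(target=store.add_option, args=(f"Queue {i}",))
    for i in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
store.remove_option("Sales")
final_options = store.get_options()
```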
- run(user_query, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None)[source]
Classify user_query synchronously.
- Parameters:
user_query (str) – Natural-language query to classify.
options (list[str] | None) – Per-call override for the candidate list. Omit to use the pipeline-level list. Passing [] raises a ValueError.
selection_mode (Literal['single', 'multiple'] | None) – Per-call override: "single" or "multiple".
output_format (Literal['string', 'dict', 'pydantic'] | None) – Per-call override: "pydantic", "string", or "dict".
confidence_threshold (float | None) – Per-call override. Pass None explicitly to disable filtering for this call even if a pipeline-level threshold is set.
session_id (str | None) – Conversation session ID for memory retrieval/storage.
user_id (str | None) – Per-call user ID for rate limiting and audit.
- Return type:
ClassifierResult | str | dict
- Returns:
Type depends on output_format.
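The `<object object>` defaults in the signature indicate a sentinel, which is what lets an explicit None mean "disable filtering" rather than "use the pipeline default". The sketch below shows that pattern together with the documented threshold rule (drop low scores, keep the highest-confidence match as a fallback). The names _UNSET, resolve_threshold and apply_threshold are hypothetical, and the use of >= for the comparison is an assumption.

```python
from typing import List, Optional, Tuple

_UNSET = object()  # sentinel: distinguishes "argument omitted" from an explicit None


def resolve_threshold(pipeline_level: Optional[float], per_call=_UNSET) -> Optional[float]:
    """Omitted -> pipeline-level value; None -> filtering disabled; float -> override."""
    return pipeline_level if per_call is _UNSET else per_call


def apply_threshold(selected: List[str], confidences: List[float],
                    threshold: Optional[float]) -> Tuple[List[str], List[float]]:
    """Drop selections scoring below `threshold`, never returning an empty result."""
    if threshold is None or not selected:
        return list(selected), list(confidences)
    kept = [(s, c) for s, c in zip(selected, confidences) if c >= threshold]
    if not kept:
        # Keep the single highest-confidence match as a fallback.
        kept = [max(zip(selected, confidences), key=lambda pair: pair[1])]
    names, scores = zip(*kept)
    return list(names), list(scores)


selected, confidences = ["Billing", "Account"], [0.55, 0.40]

default = apply_threshold(selected, confidences, resolve_threshold(0.5))
disabled = apply_threshold(selected, confidences, resolve_threshold(0.5, None))
strict = apply_threshold(selected, confidences, resolve_threshold(0.5, 0.9))
```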
- batch_run(queries, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None)[source]
Classify multiple queries synchronously, one after another.
Shares all per-call overrides across every query in the batch. Use abatch_run() to run them concurrently in async contexts.
- async arun(user_query, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None)[source]
Async variant of run(), with identical parameters.
- async abatch_run(queries, *, options=<object object>, selection_mode=None, output_format=None, temperature=None, max_tokens=None, confidence_threshold=<object object>, session_id=None, user_id=None, max_concurrency=None)[source]
Classify multiple queries concurrently with asyncio.gather.
- Return type:
list
- Returns:
Results in the same order as queries.
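The ordering guarantee follows from asyncio.gather, which returns results in argument order regardless of completion order. The sketch below demonstrates that with a stub classifier. How max_concurrency is enforced is not documented here; bounding the fan-out with an asyncio.Semaphore is an assumption, and gather_bounded and fake_classify are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def gather_bounded(queries: List[str],
                         classify: Callable[[str], Awaitable[str]],
                         max_concurrency: Optional[int] = None) -> List[str]:
    """Run `classify` over all queries, optionally limiting in-flight calls."""
    semaphore = asyncio.Semaphore(max_concurrency) if max_concurrency else None

    async def one(query: str) -> str:
        if semaphore is None:
            return await classify(query)
        async with semaphore:
            return await classify(query)

    # gather preserves argument order even when tasks finish out of order.
    return list(await asyncio.gather(*(one(q) for q in queries)))


async def fake_classify(query: str) -> str:
    """Hypothetical stand-in for one classification call."""
    await asyncio.sleep(0.01 * (5 - len(query) % 5))  # vary the completion time
    return "Billing" if "invoice" in query else "Support"


queries = ["invoice is wrong", "app crashes", "refund my invoice"]
results = asyncio.run(gather_bounded(queries, fake_classify, max_concurrency=2))
```

A concurrency cap is mainly useful for staying under provider rate limits when a batch is large.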