ractogateway.rag

RactoGateway RAG — production-grade Retrieval-Augmented Generation.

Quick start:

from ractogateway.rag.pipeline import RactoRAG
from ractogateway.rag.embedders.openai_embedder import OpenAIEmbedder
from ractogateway.rag.stores.chroma_store import ChromaStore

rag = RactoRAG(
    vector_store=ChromaStore(collection="my_docs"),
    embedder=OpenAIEmbedder(),
    llm_kit=my_kit,  # any developer kit instance created beforehand, e.g. an OpenAIDeveloperKit
)
rag.ingest("docs/report.pdf")
response = rag.query("What was the Q3 revenue?")

Components

  • Readers: ractogateway.rag.readers

  • Chunkers: ractogateway.rag.chunkers

  • Processors: ractogateway.rag.processors

  • Embedders: ractogateway.rag.embedders

  • Stores: ractogateway.rag.stores

  • Pipeline: RactoRAG

class ractogateway.rag.RactoRAG(vector_store=None, embedder=None, *, store=None, chunker=None, processors=None, llm_kit=None, context_template=None, reader_registry=None, default_prompt=None)[source]

Bases: object

Production-grade RAG pipeline for RactoGateway.

Parameters:
  • vector_store (BaseVectorStore | None) – Any BaseVectorStore instance.

  • embedder (BaseEmbedder | None) – Any BaseEmbedder instance.

  • chunker (BaseChunker | None) – How to split documents. Defaults to RecursiveChunker with chunk_size=512, overlap=50.

  • processors (list[BaseProcessor] | None) – List of text processors applied to each chunk before embedding. Defaults to [TextCleaner()].

  • llm_kit (Any | None) – Any developer kit (OpenAIDeveloperKit, GoogleDeveloperKit, or AnthropicDeveloperKit). Required for query() / aquery().

  • context_template (str | None) – Template string for injecting retrieved context into the LLM prompt. Must contain {context} and {question} placeholders.

  • reader_registry (FileReaderRegistry | None) – Custom FileReaderRegistry. Defaults to a registry with all built-in readers.

  • default_prompt (RactoPrompt | None) – RACTO prompt used for generation. Falls back to a built-in RAG prompt.
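
The following is a minimal sketch of how a context_template with {context} and {question} placeholders could be filled in. It assumes plain str.format substitution; the template text and the render_prompt helper are hypothetical and are not part of RactoGateway.

```python
# Hypothetical template; the pipeline's built-in default may differ.
TEMPLATE = (
    "Answer using only the context below.\n\n"
    "--- CONTEXT ---\n{context}\n--- END CONTEXT ---\n\n"
    "Question: {question}"
)


def render_prompt(template: str, chunks: list, question: str) -> str:
    """Join retrieved chunk texts and substitute both placeholders."""
    for placeholder in ("{context}", "{question}"):
        if placeholder not in template:
            raise ValueError(f"template is missing {placeholder}")
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


prompt = render_prompt(
    TEMPLATE,
    ["First retrieved chunk.", "Second retrieved chunk."],
    "What was the Q3 revenue?",
)
print(prompt)
```

Checking for both placeholders up front turns a silently incomplete prompt into an immediate error.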

ingest(path, **metadata)[source]

Read, chunk, embed, and store a single file.

Parameters:
  • path (str | Path) – Path to the file to ingest.

  • **metadata (Any) – Extra metadata merged into each chunk’s ChunkMetadata.extra.

Return type:

list[Chunk]

Returns:

list[Chunk] – The chunks that were added to the vector store.

ingest_dir(directory, pattern='**/*', **metadata)[source]

Recursively ingest all supported files in a directory.

Parameters:
  • directory (str | Path) – Root directory to scan.

  • pattern (str) – Glob pattern relative to directory.

  • **metadata (Any) – Extra metadata merged into every chunk.

Return type:

list[Chunk]

Returns:

list[Chunk] – All chunks added across all ingested files.
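
To see which files a given pattern would select, the glob can be tried with pathlib alone. This is a sketch that assumes the pipeline resolves pattern the way Path.glob does; list_files is a hypothetical helper, and filtering by supported extension is not shown.

```python
import tempfile
from pathlib import Path


def list_files(directory: str, pattern: str = "**/*") -> list:
    """Return files under `directory` matching `pattern`, sorted for a stable order."""
    root = Path(directory)
    return sorted(p for p in root.glob(pattern) if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.pdf").write_text("x")
    (root / "sub" / "b.pdf").write_text("x")
    (root / "sub" / "notes.txt").write_text("x")

    # "**/*" walks the whole tree; "**/*.pdf" keeps only PDFs at any depth;
    # "*.pdf" stays in the top-level directory.
    everything = [p.relative_to(root).as_posix() for p in list_files(tmp)]
    pdfs_only = [p.relative_to(root).as_posix() for p in list_files(tmp, "**/*.pdf")]
    top_level = [p.relative_to(root).as_posix() for p in list_files(tmp, "*.pdf")]
```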

ingest_text(text, source='manual', **metadata)[source]

Ingest a raw text string directly (no file needed).

Parameters:
  • text (str) – The text content to ingest.

  • source (str) – A label identifying the source (stored in metadata).

  • **metadata (Any) – Extra metadata merged into each chunk.

Return type:

list[Chunk]

async aingest(path, **metadata)[source]

Async variant of ingest().

Return type:

list[Chunk]

async aingest_dir(directory, pattern='**/*', **metadata)[source]

Async variant of ingest_dir().

Return type:

list[Chunk]

async aingest_text(text, source='manual', **metadata)[source]

Async variant of ingest_text().

Return type:

list[Chunk]

retrieve(query, top_k=5, filters=None)[source]

Embed query and retrieve the top-k most relevant chunks.

Parameters:
  • query (str) – Natural-language question or search phrase.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked results (rank 1 = most relevant).

async aretrieve(query, top_k=5, filters=None)[source]

Async variant of retrieve().

Return type:

list[RetrievalResult]

query(question, top_k=5, filters=None, prompt=None, temperature=0.0, max_tokens=2048)[source]

Retrieve relevant chunks and generate an answer.

Parameters:
  • question (str) – The user’s question.

  • top_k (int) – Number of context chunks to retrieve.

  • filters (dict[str, Any] | None) – Optional metadata filters for retrieval.

  • prompt (RactoPrompt | None) – Override the default RACTO prompt for generation.

  • temperature (float) – LLM temperature (default 0.0 for factual answers).

  • max_tokens (int) – Maximum tokens in the generated answer.

Return type:

RAGResponse

Returns:

RAGResponse – Contains the generated answer plus the retrieved source chunks.

Raises:

RuntimeError – If no llm_kit was provided.

async aquery(question, top_k=5, filters=None, prompt=None, temperature=0.0, max_tokens=2048)[source]

Async variant of query().

Return type:

RAGResponse

property store: BaseVectorStore

The underlying vector store.

property embedder: BaseEmbedder

The underlying embedder.

count()[source]

Return the total number of indexed chunks.

Return type:

int

clear()[source]

Remove all indexed chunks from the vector store.

Return type:

None

class ractogateway.rag.PageIndexRAG(llm_kit=None, *, processors=None, reader_registry=None, context_template="Use the following retrieved page excerpts to answer the user's question.\\nIf the excerpts do not contain enough information, say so clearly.\\n\\n--- CONTEXT ---\\n{context}\\n--- END CONTEXT ---\\n\\nQuestion: {question}", default_prompt=None, page_size=1000, page_overlap=100, k1=1.5, b=0.75, top_keywords=20, ocr_backend=None, ocr_fallback=True, min_ocr_confidence=0.0)[source]

Bases: object

Vectorless RAG pipeline that indexes documents at the page level.

Parameters:
  • llm_kit (Any) – Any RactoGateway developer kit (OpenAI, Anthropic, Google, Ollama, HuggingFace). Required only for query() / aquery(). Pass None to use the pipeline in retrieve-only mode.

  • processors (Sequence[BaseProcessor] | None) – Text processors applied to each page before indexing. Defaults to [TextCleaner()].

  • reader_registry (FileReaderRegistry | None) – File reader registry used to load non-PDF documents. Defaults to a FileReaderRegistry with all built-in readers registered.

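  • context_template (str) – Template string with {context} and {question} placeholders, used when building the LLM prompt. The placeholders use single braces, as in the default shown in the signature, not Jinja's double-brace syntax.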

  • default_prompt (RactoPrompt | None) – RactoPrompt used for generation. Defaults to a built-in factual Q&A prompt.

  • page_size (int) – Maximum character length of each page window for non-PDF files (default 1000).

  • page_overlap (int) – Character overlap between consecutive windows (default 100).

  • k1 (float) – BM25 term-frequency saturation parameter (default 1.5).

  • b (float) – BM25 length-normalisation parameter (default 0.75).

  • top_keywords (int) – Number of top TF-weighted keywords to extract per page for the decision index (default 20).
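
The roles of k1 and b are easiest to see in the scoring formula itself. The sketch below implements a common BM25 variant (non-negative IDF) over pre-tokenised documents; it is an illustration under that assumption, and PageIndexRAG's exact IDF formula and tokenisation may differ.

```python
import math
from collections import Counter


def bm25_scores(query: list, docs: list, k1: float = 1.5, b: float = 0.75) -> list:
    """Score every tokenised document in `docs` against the query terms."""
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # k1 caps the gain from repeated terms; b scales the length penalty.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    ["q3", "revenue", "grew"],
    ["hiring", "plan"],
    ["revenue", "revenue", "forecast", "q4"],
]
scores = bm25_scores(["q3", "revenue"], docs)
```

Raising k1 lets repeated terms keep adding to the score for longer; setting b to 0 disables length normalisation entirely.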

retrieve(query, top_k=5)[source]

Retrieve the most relevant pages for query.

Uses two-stage retrieval: decision index (candidate selection) → BM25 scoring (ranking).

Parameters:
  • query (str) – Natural-language question or keyword string.

  • top_k (int) – Maximum number of results to return.

Return type:

list[PageIndexResult]

Returns:

list[PageIndexResult] – Pages ranked by BM25 score (most relevant first).
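
One plausible reading of the first stage is an inverted index from each page's top keywords to page IDs, so that only pages sharing a term with the query are scored. The sketch below shows that idea; the data layout and function names are assumptions, not the library's internals.

```python
from collections import Counter, defaultdict


def top_keywords(tokens: list, n: int = 20) -> list:
    """Return the n most frequent tokens of a page."""
    return [term for term, _ in Counter(tokens).most_common(n)]


def build_decision_index(pages: dict, n: int = 20) -> dict:
    """Map each top keyword to the set of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, tokens in pages.items():
        for term in top_keywords(tokens, n):
            index[term].add(page_id)
    return index


def candidates(index: dict, query_terms: list) -> set:
    """Stage 1: every page that shares at least one keyword with the query."""
    found = set()
    for term in query_terms:
        found |= index.get(term, set())
    return found


pages = {
    "p1": ["revenue", "q3", "revenue"],
    "p2": ["hiring", "plan"],
    "p3": ["q3", "roadmap"],
}
index = build_decision_index(pages)
shortlist = candidates(index, ["q3"])  # stage 2 would rank these with BM25
```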

async aretrieve(query, top_k=5)[source]

Async variant of retrieve().

Return type:

list[PageIndexResult]

ingest(path, **metadata)[source]

Read a file and add its pages to the index.

PDFs are split page-by-page; all other file types are split into fixed-size character windows.

Parameters:
  • path (str) – Absolute or relative path to the file.

  • **metadata (Any) – Arbitrary key/value pairs stored in PageEntry.extra.

Return type:

list[PageEntry]

Returns:

list[PageEntry] – All page entries created from this file.

async aingest(path, **metadata)[source]

Async variant of ingest().

Return type:

list[PageEntry]

ingest_text(text, source='manual', **metadata)[source]

Index raw text directly (no file I/O).

Parameters:
  • text (str) – Plain text to index.

  • source (str) – Descriptive label stored in PageEntry.source.

  • **metadata (Any) – Arbitrary key/value pairs stored in PageEntry.extra.

Return type:

list[PageEntry]

async aingest_text(text, source='manual', **metadata)[source]

Async variant of ingest_text().

Return type:

list[PageEntry]

ingest_dir(directory, pattern='**/*', *, on_progress=None, **metadata)[source]

Ingest all files matching pattern inside directory.

Files that cannot be read are logged and skipped; the rest are indexed normally.

Parameters:
  • directory (str) – Root directory to search.

  • pattern (str) – Glob pattern relative to directory (default "**/*").

  • on_progress (Callable[[int, int], None] | None) – Optional callback (done, total) -> None called after each file is processed (or skipped). Useful for progress bars.

  • **metadata (Any) – Forwarded to every ingest() call.

Return type:

list[PageEntry]

async aingest_dir(directory, pattern='**/*', *, max_concurrent=4, on_progress=None, **metadata)[source]

Async parallel variant of ingest_dir().

Parameters:
  • directory (str) – Root directory to search.

  • pattern (str) – Glob pattern relative to directory (default "**/*").

  • max_concurrent (int) – Maximum number of files ingested concurrently (default 4).

  • on_progress (Callable[[int, int], None] | None) – Optional callback (done, total) -> None called after each file finishes (thread-safe; called from the event loop).

  • **metadata (Any) – Forwarded to every aingest() call.

Return type:

list[PageEntry]
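
The interplay of max_concurrent and on_progress can be sketched with an asyncio.Semaphore. This is a generic pattern under stated assumptions, not the library's implementation: process_all and fake_ingest are hypothetical, and skipping unreadable files is left out.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def process_all(
    items: list,
    worker: Callable[[str], Awaitable[str]],
    max_concurrent: int = 4,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> list:
    """Run `worker` over `items` with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    total = len(items)
    done = 0

    async def run_one(item: str) -> str:
        nonlocal done
        async with semaphore:
            result = await worker(item)
        done += 1
        if on_progress is not None:
            on_progress(done, total)  # runs on the event loop thread
        return result

    # gather() returns results in input order, whatever the completion order.
    return await asyncio.gather(*(run_one(item) for item in items))


stats = {"active": 0, "peak": 0}


async def fake_ingest(name: str) -> str:
    """Stand-in for a real ingestion coroutine."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)
    stats["active"] -= 1
    return f"{name}: ok"


progress = []
results = asyncio.run(
    process_all(
        [f"file{i}.txt" for i in range(10)],
        fake_ingest,
        max_concurrent=4,
        on_progress=lambda d, t: progress.append((d, t)),
    )
)
```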

add_document(path, **metadata)[source]

Alias for ingest().

Return type:

list[PageEntry]

add_texts(texts, source='manual', **metadata)[source]

Ingest a list of text strings.

Return type:

list[PageEntry]

search(query, *, top_k=5, prompt=None, temperature=0.0, max_tokens=2048)[source]

Alias for query().

Return type:

PageIndexResponse

query(question, *, top_k=5, prompt=None, temperature=0.0, max_tokens=2048)[source]

Retrieve relevant pages and generate an answer with the LLM kit.

Parameters:
  • question (str) – Natural-language question to answer.

  • top_k (int) – Number of pages to retrieve.

  • prompt (RactoPrompt | None) – Override the pipeline’s default prompt for this call.

  • temperature (float) – Sampling temperature for generation.

  • max_tokens (int) – Maximum generation tokens.

Return type:

PageIndexResponse

Returns:

PageIndexResponse – Contains the generated answer, ranked sources, and the context string that was supplied to the model.

Raises:

ValueError – If no llm_kit was provided and generation is requested.

async aquery(question, *, top_k=5, prompt=None, temperature=0.0, max_tokens=2048)[source]

Async variant of query().

Return type:

PageIndexResponse

remove_document(doc_id)[source]

Remove all pages belonging to doc_id from the index.

Parameters:

doc_id (str) – The doc_id value from any PageEntry returned during ingestion.

Return type:

int

Returns:

int – Number of page entries removed.

clear()[source]

Remove all indexed entries and reset the pipeline to empty state.

Return type:

None

save(path)[source]

Serialise the full index to a JSON file.

The saved file contains all PageEntry records, BM25 term weights, and deduplication hashes. Reload with load().

Parameters:

path (str) – Destination file path (will be created or overwritten).

Return type:

None

classmethod load(path, **kwargs)[source]

Load a previously saved index from path.

Parameters:
  • path (str) – JSON file written by save().

  • **kwargs (Any) – Forwarded to the constructor (e.g. llm_kit=kit).

Return type:

PageIndexRAG

Returns:

PageIndexRAG – A new instance with the index fully restored.

property entry_count: int

Total number of indexed page entries.

property document_count: int

Number of distinct documents ingested.

class ractogateway.rag.PageEntry(**data)[source]

Bases: BaseModel

A single page (or fixed-size window) extracted from a document.

Produced by PageIndexRAG during ingestion and stored in the in-process index.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

entry_id: str
page_number: int | None
content: str
source: str
section_title: str | None
keywords: list[str]
doc_id: str
char_count: int
extra: dict[str, Any]
ocr_applied: bool
ocr_confidence: float | None
content_hash: str | None
property text: str

Alias for content.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class ractogateway.rag.PageIndexResult(**data)[source]

Bases: BaseModel

A single retrieved page together with its BM25 relevance score.

entry: PageEntry
score: float
rank: int
matched_terms: list[str]
property content: str

Alias for entry.content.

property text: str

Alias for entry.content.

model_config: ClassVar[ConfigDict] = {}

class ractogateway.rag.PageIndexResponse(**data)[source]

Bases: BaseModel

Full response from PageIndexRAG.query() / PageIndexRAG.aquery().

answer: LLMResponse | None
sources: list[PageIndexResult]
query: str
context_used: str
property results: list[PageIndexResult]

Alias for sources.

property pages: list[PageIndexResult]

Alias for sources.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

class ractogateway.rag.Chunk(**data)[source]

Bases: BaseModel

A single embeddable slice of a document.

Produced by a BaseChunker and enriched with an embedding vector by a BaseEmbedder.

chunk_id: str
doc_id: str
content: str
embedding: list[float] | None
metadata: ChunkMetadata
model_config: ClassVar[ConfigDict] = {}

class ractogateway.rag.ChunkMetadata(**data)[source]

Bases: BaseModel

Provenance and positional data attached to every chunk.

source: str
page: int | None
chunk_index: int
total_chunks: int
start_char: int
end_char: int
doc_id: str
extra: dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

class ractogateway.rag.Document(**data)[source]

Bases: BaseModel

A raw document loaded from a file or supplied as plain text.

Parameters:
  • content (str) – The full extracted text of the document.

  • source (str) – Absolute file path, URL, or a descriptive label (e.g. "manual").

  • metadata (dict[str, Any]) – Free-form key/value pairs (file size, author, MIME type, …).

  • doc_id (str) – Auto-generated UUID; override only when you need stable IDs.

doc_id: str
content: str
source: str
metadata: dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

class ractogateway.rag.RAGResponse(**data)[source]

Bases: BaseModel

Combined output from a RAG query (retrieval + generation).

answer: LLMResponse
sources: list[RetrievalResult]
query: str
context_used: str
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

class ractogateway.rag.RetrievalConfig(**data)[source]

Bases: BaseModel

Input parameters for a vector-store search.

query: str
top_k: int
filters: dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

class ractogateway.rag.RetrievalResult(**data)[source]

Bases: BaseModel

A single retrieved chunk together with its relevance score.

chunk: Chunk
score: float
rank: int
model_config: ClassVar[ConfigDict] = {}

class ractogateway.rag.FixedChunker(chunk_size=512, overlap=50)[source]

Bases: BaseChunker

Split text into fixed-size character windows with overlap.

Parameters:
  • chunk_size (int) – Maximum number of characters per chunk.

  • overlap (int) – Number of characters to repeat at the start of the next chunk. Must be less than chunk_size.
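
The relationship between chunk_size and overlap can be sketched as a sliding window that advances by chunk_size - overlap characters. fixed_windows is a hypothetical helper written for illustration; it returns plain tuples rather than Chunk objects.

```python
def fixed_windows(text: str, chunk_size: int = 512, overlap: int = 50) -> list:
    """Return (start_char, end_char, content) windows over `text`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and less than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    windows = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        windows.append((start, end, text[start:end]))
        if end == len(text):
            break  # the last window already reaches the end of the text
        start += step
    return windows


windows = fixed_windows("abcdefghij", chunk_size=4, overlap=1)
contents = [content for _, _, content in windows]
```

The overlap < chunk_size requirement is what keeps step positive; without it the window would never advance.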

chunk(document)[source]

Split document into chunks.

Parameters:

document (Document) – The fully-loaded document to split.

Return type:

list[Chunk]

Returns:

list[Chunk] – Ordered list of non-overlapping (or slightly overlapping) chunks.

class ractogateway.rag.RecursiveChunker(chunk_size=512, overlap=50, separators=None)[source]

Bases: BaseChunker

Split text recursively using a priority list of separators.

Parameters:
  • chunk_size (int) – Maximum number of characters per chunk.

  • overlap (int) – Number of characters of overlap between consecutive chunks.

  • separators (list[str] | None) – Ordered list of separator strings to try. The first separator that produces pieces within chunk_size is used.
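
A simplified sketch of separator-priority splitting follows. It assumes a default separator list of paragraph break, newline, sentence end, and space, and it omits overlap; the real RecursiveChunker defaults and merging rules may differ.

```python
def recursive_split(text: str, chunk_size: int = 512, separators=None) -> list:
    """Split on the coarsest separator first, recursing into oversized pieces."""
    if separators is None:
        separators = ["\n\n", "\n", ". ", " "]  # assumed defaults
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # Nothing left to split on: fall back to hard character cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    if sep not in text:
        return recursive_split(text, chunk_size, finer)
    pieces = []
    for part in text.split(sep):
        pieces.extend(recursive_split(part, chunk_size, finer))
    # Greedily merge neighbouring pieces back together while they still fit.
    merged = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                merged.append(current)
            current = piece
    if current:
        merged.append(current)
    return merged


text = (
    "Intro paragraph.\n\n"
    "Second paragraph is a bit longer than the first one.\n\n"
    "End."
)
chunks = recursive_split(text, chunk_size=40)
```

Only the middle paragraph exceeds 40 characters, so only it is split further, on spaces; the short final paragraph is merged with the leftover words.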

chunk(document)[source]

Split document into chunks.

Parameters:

document (Document) – The fully-loaded document to split.

Return type:

list[Chunk]

Returns:

list[Chunk] – Ordered list of non-overlapping (or slightly overlapping) chunks.

class ractogateway.rag.SemanticChunker(embedder, threshold=0.5, min_chunk_size=2, language='english')[source]

Bases: BaseChunker

Split documents where the semantic similarity between adjacent sentences drops below a threshold.

Parameters:
  • embedder (BaseEmbedder) – Any BaseEmbedder instance.

  • threshold (float) – Cosine similarity below which a split is inserted (default: 0.5).

  • min_chunk_size (int) – Minimum number of sentences per chunk (prevents ultra-fine splits).

  • language (str) – NLTK sentence tokenizer language.
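
The effect of threshold and min_chunk_size can be sketched with hand-made vectors standing in for real embeddings. semantic_groups is a hypothetical helper that compares each sentence with its predecessor; the actual chunker may measure similarity differently.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_groups(sentences, embeddings, threshold=0.5, min_chunk_size=2):
    """Start a new group where adjacent-sentence similarity drops below threshold."""
    if len(sentences) != len(embeddings):
        raise ValueError("need exactly one embedding per sentence")
    groups = []
    current = []
    for i, sentence in enumerate(sentences):
        big_enough = len(current) >= min_chunk_size
        if current and big_enough and cosine(embeddings[i - 1], embeddings[i]) < threshold:
            groups.append(current)
            current = []
        current.append(sentence)
    if current:
        groups.append(current)
    return groups


sentences = ["Cats purr.", "Cats nap a lot.", "Stocks fell.", "Bonds rallied."]
# Toy 2-D "embeddings": the first axis means pets, the second means finance.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

groups = semantic_groups(sentences, embeddings, threshold=0.5, min_chunk_size=2)
one_group = semantic_groups(sentences, embeddings, threshold=0.5, min_chunk_size=3)
```

With min_chunk_size=3 the topic change arrives before the current group is large enough, so no split is made there and all four sentences stay together.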

chunk(document)[source]

Split document into chunks.

Parameters:

document (Document) – The fully-loaded document to split.

Return type:

list[Chunk]

Returns:

list[Chunk] – Ordered list of non-overlapping (or slightly overlapping) chunks.

class ractogateway.rag.SentenceChunker(sentences_per_chunk=5, overlap_sentences=1, language='english')[source]

Bases: BaseChunker

Split text into groups of sentences using NLTK.

Parameters:
  • sentences_per_chunk (int) – Number of sentences per chunk.

  • overlap_sentences (int) – Number of sentences to repeat at the start of the next chunk.

  • language (str) – Language for the NLTK sentence tokenizer (default: "english").
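
Grouping with sentence overlap can be sketched as below. A naive regular expression stands in for the NLTK tokenizer so the example has no dependencies; both helpers are hypothetical.

```python
import re


def split_sentences(text: str) -> list:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def group_sentences(sentences: list, sentences_per_chunk: int = 5,
                    overlap_sentences: int = 1) -> list:
    """Join sentences into chunks that share `overlap_sentences` with the previous one."""
    if sentences_per_chunk <= 0:
        raise ValueError("sentences_per_chunk must be positive")
    if not 0 <= overlap_sentences < sentences_per_chunk:
        raise ValueError("overlap_sentences must be less than sentences_per_chunk")
    step = sentences_per_chunk - overlap_sentences
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + sentences_per_chunk]))
        if start + sentences_per_chunk >= len(sentences):
            break  # this chunk already contains the final sentence
    return chunks


sentences = split_sentences("One. Two. Three. Four. Five.")
chunks = group_sentences(sentences, sentences_per_chunk=3, overlap_sentences=1)
```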

chunk(document)[source]

Split document into chunks.

Parameters:

document (Document) – The fully-loaded document to split.

Return type:

list[Chunk]

Returns:

list[Chunk] – Ordered list of non-overlapping (or slightly overlapping) chunks.

class ractogateway.rag.GoogleEmbedder(model='text-embedding-004', *, api_key=None, task_type=None, batch_size=100)[source]

Bases: BaseEmbedder

Embed texts using the Google Gemini Embeddings API.

Parameters:
  • model (str) – Gemini embedding model (default "text-embedding-004").

  • api_key (str | None) – Gemini API key. Falls back to GEMINI_API_KEY env var.

  • task_type (str | None) – Gemini task type hint (e.g. "RETRIEVAL_DOCUMENT", "RETRIEVAL_QUERY"). None lets the API decide.

  • batch_size (int) – Maximum number of texts per API call.
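
The batch_size parameter implies that long input lists are sent in slices. A minimal sketch of that pattern follows, with fake_embed standing in for a real API call; none of these helper names come from the library.

```python
from typing import Callable, Iterator


def batched(texts: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `batch_size` texts."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_in_batches(texts: list, embed_batch: Callable[[list], list],
                     batch_size: int = 100) -> list:
    """Call `embed_batch` once per slice and keep the vectors in input order."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


calls = []


def fake_embed(batch: list) -> list:
    """Pretend embedding: a one-dimensional vector holding the text length."""
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]


vectors = embed_in_batches([f"text {i}" for i in range(250)], fake_embed, batch_size=100)
```

With 250 inputs and batch_size=100, the fake backend is called three times, the last time with the remaining 50 texts.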

property dimension: int

Dimensionality of the embedding vectors.

Returns -1 while the dimension is still unknown, which can be the case until the first embedding call.

embed(texts)[source]

Embed texts synchronously.

Parameters:

texts (list[str]) – Non-empty list of strings to embed.

Return type:

list[list[float]]

Returns:

list[list[float]] – One embedding vector per input text, in the same order.

async aembed(texts)[source]

Async variant of embed().

Return type:

list[list[float]]

class ractogateway.rag.OpenAIEmbedder(model='text-embedding-3-small', *, api_key=None, base_url=None, dimensions=None, batch_size=256)[source]

Bases: BaseEmbedder

Embed texts using the OpenAI Embeddings API.

Parameters:
  • model (str) – OpenAI embedding model (default "text-embedding-3-small").

  • api_key (str | None) – OpenAI API key. Falls back to OPENAI_API_KEY env var.

  • base_url (str | None) – Custom base URL (Azure OpenAI or proxy).

  • dimensions (int | None) – Override output dimensionality (supported for text-embedding-3-*).

  • batch_size (int) – Maximum number of texts per API call.

property dimension: int

Dimensionality of the embedding vectors.

Returns -1 while the dimension is still unknown, which can be the case until the first embedding call.

embed(texts)[source]

Embed texts synchronously.

Parameters:

texts (list[str]) – Non-empty list of strings to embed.

Return type:

list[list[float]]

Returns:

list[list[float]] – One embedding vector per input text, in the same order.

async aembed(texts)[source]

Async variant of embed().

Return type:

list[list[float]]

class ractogateway.rag.VoyageEmbedder(model='voyage-3', *, api_key=None, input_type='document', batch_size=128)[source]

Bases: BaseEmbedder

Embed texts using the Voyage AI API.

Voyage AI embeddings are the recommended choice when using Claude as the generation LLM; Anthropic's documentation points to Voyage AI as an embeddings provider.

Parameters:
  • model (str) – Voyage model name (default "voyage-3").

  • api_key (str | None) – Voyage API key. Falls back to VOYAGE_API_KEY env var.

  • input_type (str | None) – "query" for queries, "document" for documents to index. Using the correct type improves retrieval quality.

  • batch_size (int) – Maximum texts per API call.

property dimension: int

Dimensionality of the embedding vectors.

Returns -1 while the dimension is still unknown, which can be the case until the first embedding call.

embed(texts)[source]

Embed texts synchronously.

Parameters:

texts (list[str]) – Non-empty list of strings to embed.

Return type:

list[list[float]]

Returns:

list[list[float]] – One embedding vector per input text, in the same order.

async aembed(texts)[source]

Async variant of embed().

Return type:

list[list[float]]

class ractogateway.rag.Lemmatizer(use_pos_tagging=True)[source]

Bases: BaseProcessor

Reduce words to their base (lemma) form using NLTK WordNet.

Parameters:

use_pos_tagging (bool) – If True, use POS tagging to improve lemmatization accuracy. Slightly slower but produces better results.

process(text)[source]

Process text and return the transformed string.

Parameters:

text (str) – Input text (chunk content or raw document content).

Return type:

str

Returns:

str – Processed text. Must be a non-empty string when input is non-empty.

class ractogateway.rag.ProcessingPipeline(processors)[source]

Bases: BaseProcessor

Apply a sequence of BaseProcessor objects to text.

Example:

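from ractogateway.rag import Lemmatizer, ProcessingPipeline, TextCleaner

# Clean the text first, then lemmatize the cleaned output.
pipeline = ProcessingPipeline([TextCleaner(), Lemmatizer()])
processed = pipeline.process("  Hello,   worlds!  ")

Parameters: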

processors (list[BaseProcessor]) – Ordered list of processors to apply. Each processor receives the output of the previous one.

process(text)[source]

Process text and return the transformed string.

Parameters:

text (str) – Input text (chunk content or raw document content).

Return type:

str

Returns:

str – Processed text. Must be a non-empty string when input is non-empty.

class ractogateway.rag.TextCleaner(normalize_unicode=True, strip_html=True, strip_control_chars=True, collapse_whitespace=True, collapse_blank_lines=True)[source]

Bases: BaseProcessor

Normalise text for embedding and retrieval.

Steps applied (all optional via constructor flags):

  1. Unicode normalisation (NFC)

  2. Strip residual HTML tags

  3. Remove control characters

  4. Collapse multiple spaces to one

  5. Collapse runs of blank lines to at most two newlines

  6. Strip leading/trailing whitespace

Parameters:
  • normalize_unicode (bool) – Apply unicodedata.normalize("NFC", text).

  • strip_html (bool) – Remove <tag> patterns.

  • strip_control_chars (bool) – Remove non-printable control characters.

  • collapse_whitespace (bool) – Collapse sequences of spaces/tabs to a single space.

  • collapse_blank_lines (bool) – Collapse 3+ consecutive newlines to 2.
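
The six steps listed above can be approximated with the standard library. This sketch applies all of them unconditionally and uses simple regular expressions, so the real TextCleaner may behave differently on edge cases; clean_text is a hypothetical function.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Apply the six cleaning steps in order."""
    text = unicodedata.normalize("NFC", text)                      # 1. Unicode NFC
    text = re.sub(r"<[^>]+>", "", text)                            # 2. strip HTML tags
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text)   # 3. control chars
    text = re.sub(r"[ \t]+", " ", text)                            # 4. collapse spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)                         # 5. collapse blank lines
    return text.strip()                                            # 6. trim the ends


# "e" + combining acute accent, extra spaces, a NUL byte, and four newlines.
raw = "  <p>Cafe\u0301   menu</p>\x00\n\n\n\nOpen   daily  "
cleaned = clean_text(raw)
```

The control-character pattern deliberately leaves tab, newline, and carriage return alone so that steps 4 and 5 can still see them.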

process(text)[source]

Process text and return the transformed string.

Parameters:

text (str) – Input text (chunk content or raw document content).

Return type:

str

Returns:

str – Processed text. Must be a non-empty string when input is non-empty.

class ractogateway.rag.FileReaderRegistry(readers=None)[source]

Bases: object

Registry that maps file extensions to BaseReader instances.

By default all built-in readers are registered. You can add custom readers with register().

Example:

from ractogateway.rag import FileReaderRegistry

registry = FileReaderRegistry()
doc = registry.read("report.pdf")  # picks the reader from the ".pdf" extension

register(reader)[source]

Add reader to the registry for all its supported extensions.

Return type:

None

get_reader(path)[source]

Return the reader for path’s extension.

Raises:

ValueError – If no reader supports the file’s extension.

Return type:

BaseReader

read(path)[source]

Convenience method: detect reader and return a Document.

Return type:

Document

property supported_extensions: frozenset[str]

All extensions currently registered.
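
The extension-to-reader lookup can be sketched in a few lines. ExtensionRegistry is a hypothetical stand-in: unlike FileReaderRegistry.register(reader), it takes the extensions explicitly and treats a reader as a plain callable.

```python
from pathlib import Path
from typing import Callable, Iterable


class ExtensionRegistry:
    """Map lower-cased file extensions to reader callables."""

    def __init__(self) -> None:
        self._readers = {}

    def register(self, extensions: Iterable[str], reader: Callable[[str], str]) -> None:
        for ext in extensions:
            self._readers[ext.lower()] = reader

    def get_reader(self, path: str) -> Callable[[str], str]:
        ext = Path(path).suffix.lower()
        try:
            return self._readers[ext]
        except KeyError:
            raise ValueError(f"no reader registered for {ext!r}") from None

    @property
    def supported_extensions(self) -> frozenset:
        return frozenset(self._readers)


registry = ExtensionRegistry()
registry.register([".txt", ".md"], lambda path: f"text from {path}")
content = registry.get_reader("notes.MD")("notes.MD")
```

Lower-casing the suffix on both registration and lookup makes notes.MD and notes.md resolve to the same reader.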

class ractogateway.rag.ChromaStore(collection='ractogateway', *, path=None, host=None, port=8000, distance_function='cosine')[source]

Bases: BaseVectorStore

Vector store backed by ChromaDB.

Supports in-process mode (persistent when path is set, ephemeral when path is None) and HTTP-client mode (host + port).

Parameters:
  • collection (str) – Name of the ChromaDB collection.

  • path (str | None) – Persist directory for a local persistent client. None = ephemeral.

  • host (str | None) – ChromaDB server host (enables HTTP client mode).

  • port (int) – ChromaDB server port (default 8000).

  • distance_function (str) – "cosine", "l2", or "ip" (inner product).

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

class ractogateway.rag.FAISSStore(dimension=None, index_type='flat_ip')[source]

Bases: BaseVectorStore

Vector store backed by Facebook AI Similarity Search (FAISS).

Stores embeddings in a flat L2 or inner-product index (inner product equals cosine similarity when the vectors are normalised). All data is held in memory; call save() to persist it and load() to restore it.

Parameters:
  • dimension (int | None) – Embedding dimension. Inferred from the first add() call if None.

  • index_type (str) – "flat_l2" or "flat_ip" (inner product / cosine when normalised).
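
The note that inner product behaves as cosine "when normalised" can be checked numerically. The sketch below uses plain Python lists and does not touch FAISS; whether FAISSStore normalises vectors for you is not stated here, so treat normalisation as something to verify or do yourself.

```python
import math


def dot(a: list, b: list) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalise(v: list) -> list:
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalise the zero vector")
    return [x / norm for x in v]


def cosine(a: list, b: list) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

raw_ip = dot(a, b)                                      # depends on vector length
cos = cosine(a, b)                                      # length-independent
normalised_ip = dot(l2_normalise(a), l2_normalise(b))   # matches the cosine
```

On unnormalised vectors the inner product rewards long vectors, which is why the two index types can rank results differently.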

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

save(path)[source]

Persist the FAISS index to path.index and chunks to path.chunks.

Return type:

None

load(path)[source]

Load a previously saved index from path.

Return type:

None

class ractogateway.rag.InMemoryVectorStore(similarity='cosine')[source]

Bases: BaseVectorStore

Pure-Python brute-force vector store — no extra dependencies.

This store keeps all chunks and their embeddings in memory. It is not suitable for production-scale corpora but requires no installation.

Parameters:

similarity (str) – Similarity function to use. Currently only "cosine" is supported.
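
Brute-force search means scoring the query against every stored vector and sorting. The sketch below shows the idea on a plain dictionary; brute_force_search is hypothetical and returns tuples rather than RetrievalResult objects.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def brute_force_search(query: list, vectors: dict, top_k: int = 5) -> list:
    """Return (rank, chunk_id, score) tuples, rank 1 being the most similar."""
    scored = [(chunk_id, cosine(query, vec)) for chunk_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [
        (rank, chunk_id, score)
        for rank, (chunk_id, score) in enumerate(scored[:top_k], start=1)
    ]


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = brute_force_search([1.0, 0.1], vectors, top_k=2)
```

Every query costs one similarity computation per stored chunk, which is why this approach suits small corpora and tests rather than large indexes.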

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

class ractogateway.rag.MilvusStore(collection='ractogateway', *, host='localhost', port=19530, uri=None, token=None, dimension=None, metric_type='IP', batch_size=100)[source]

Bases: BaseVectorStore

Vector store backed by Milvus or Zilliz Cloud.

Parameters:
  • collection (str) – Milvus collection name.

  • host (str) – Milvus server host (default "localhost").

  • port (int) – Milvus server port (default 19530).

  • uri (str | None) – Zilliz Cloud URI (overrides host/port when set).

  • token (str | None) – Zilliz Cloud API token.

  • dimension (int | None) – Embedding dimension. Inferred on first add.

  • metric_type (str) – "IP" (inner product; equal to cosine similarity when vectors are unit-normalized) or "L2" (Euclidean distance).

  • batch_size (int) – Vectors per insert batch.
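Inner product and cosine similarity agree only for unit-length vectors. The sketch below checks that relationship numerically; the helper names are illustrative, and whether the store or the embedder normalizes vectors is not stated here, so treat normalization as something to verify for your setup.

```python
from __future__ import annotations

import math


def inner_product(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector cannot be normalized."""
    norm = math.sqrt(inner_product(vec, vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    return inner_product(a, b) / math.sqrt(inner_product(a, a) * inner_product(b, b))


a = [3.0, 4.0]
b = [1.0, 2.0]
raw_ip = inner_product(a, b)                               # depends on vector lengths
normalized_ip = inner_product(normalize(a), normalize(b))  # depends only on the angle
cos = cosine(a, b)
```

With unnormalized embeddings, "IP" ranks longer vectors higher regardless of direction, which can differ from a cosine ranking.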

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

class ractogateway.rag.PGVectorStore(dsn, *, table='rag_chunks', dimension=None, distance='cosine', batch_size=100)[source]

Bases: BaseVectorStore

Vector store backed by PostgreSQL with the pgvector extension.

Parameters:
  • dsn (str) – PostgreSQL connection string (e.g. "postgresql://user:pass@localhost/mydb").

  • table (str) – Table name (default "rag_chunks").

  • dimension (int | None) – Embedding dimension. Inferred on first add.

  • distance (str) – "cosine", "l2", or "inner".

  • batch_size (int) – Rows per INSERT batch.
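The three distance names correspond naturally to pgvector's distance operators. The sketch below shows one way such a mapping could be used to build a query; the column names (id, content, embedding) and the helper build_search_sql are assumptions, not the schema or code PGVectorStore actually uses.

```python
from __future__ import annotations

# pgvector distance operators (a smaller value means a closer match).
DISTANCE_OPERATORS = {
    "cosine": "<=>",  # cosine distance
    "l2": "<->",      # Euclidean distance
    "inner": "<#>",   # negative inner product
}


def build_search_sql(table: str, distance: str = "cosine") -> str:
    """Build a parameterized nearest-neighbour query for a trusted table name."""
    try:
        operator = DISTANCE_OPERATORS[distance]
    except KeyError:
        raise ValueError(f"unsupported distance: {distance!r}") from None
    # The table name is interpolated, so it must come from trusted configuration.
    # The query vector and the limit stay as driver placeholders.
    return (
        f"SELECT id, content, embedding {operator} %s::vector AS distance "
        f"FROM {table} ORDER BY distance LIMIT %s"
    )


sql = build_search_sql("rag_chunks", "cosine")
```

Because `<#>` returns the negative inner product, ordering ascending by the operator result still puts the best match first for all three options.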

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

class ractogateway.rag.PineconeStore(index_name, *, api_key=None, namespace='', batch_size=100)[source]

Bases: BaseVectorStore

Vector store backed by Pinecone cloud.

Parameters:
  • index_name (str) – Name of the Pinecone index (must already exist).

  • api_key (str | None) – Pinecone API key. Falls back to PINECONE_API_KEY env var.

  • namespace (str) – Pinecone namespace for logical data isolation.

  • environment – Deprecated Pinecone environment string (for legacy pod-based indexes).

  • batch_size (int) – Number of vectors per upsert batch.
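The api_key fallback can be pictured as follows. This is a sketch of the documented behaviour (explicit argument first, then the PINECONE_API_KEY environment variable); the helper name resolve_api_key and the error raised when neither is set are assumptions.

```python
from __future__ import annotations

import os


def resolve_api_key(api_key: str | None = None, env_var: str = "PINECONE_API_KEY") -> str:
    """Prefer an explicit key; otherwise read the environment variable."""
    key = api_key or os.environ.get(env_var)
    if not key:
        # Assumed behaviour: fail early with a clear message instead of at first request.
        raise ValueError(f"no API key given and {env_var} is not set")
    return key
```

Reading the key from the environment keeps credentials out of source code, while the explicit argument remains useful for tests and multi-tenant setups.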

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

class ractogateway.rag.QdrantStore(collection='ractogateway', *, url=None, api_key=None, distance='cosine', dimension=None, batch_size=100)[source]

Bases: BaseVectorStore

Vector store backed by Qdrant.

Parameters:
  • collection (str) – Qdrant collection name.

  • url (str | None) – Qdrant server URL. None = in-memory.

  • api_key (str | None) – Qdrant cloud API key (optional).

  • distance (str) – "cosine", "euclid", or "dot".

  • dimension (int | None) – Vector dimension. Inferred on first add if None.

  • batch_size (int) – Points per upsert batch.
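"Inferred on first add" suggests that the dimension is taken from the first embedding seen and then enforced for later ones. The class below is a hypothetical illustration of that idea, not QdrantStore's internals; in particular, raising ValueError on a mismatch is an assumption.

```python
from __future__ import annotations


class DimensionTracker:
    """Remember the embedding dimension once known and reject mismatches."""

    def __init__(self, dimension: int | None = None) -> None:
        self.dimension = dimension

    def check(self, embeddings: list[list[float]]) -> int | None:
        """Infer the dimension from the first vector, then validate the rest."""
        for vec in embeddings:
            if self.dimension is None:
                self.dimension = len(vec)  # first vector fixes the dimension
            elif len(vec) != self.dimension:
                raise ValueError(
                    f"expected dimension {self.dimension}, got {len(vec)}"
                )
        return self.dimension


tracker = DimensionTracker()
inferred = tracker.check([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
```

Passing dimension explicitly is the safer choice when the collection must exist before any data is added, or when switching embedding models should fail loudly.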

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int

class ractogateway.rag.WeaviateStore(class_name='RactoChunk', *, url=None, api_key=None, additional_headers=None, distance_metric='cosine', batch_size=100)[source]

Bases: BaseVectorStore

Vector store backed by Weaviate.

Supports embedded (local, no server needed), local server, and Weaviate Cloud (WCS) connections.

Parameters:
  • class_name (str) – Weaviate class (collection) name.

  • url (str | None) – Weaviate server URL. None = use embedded Weaviate.

  • api_key (str | None) – Weaviate Cloud API key.

  • additional_headers (dict[str, str] | None) – Extra HTTP headers (e.g. for OpenAI API key pass-through to Weaviate).

  • distance_metric (str) – "cosine" or "l2-squared".

  • batch_size (int) – Objects per batch import.
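The batch_size parameter shared by the remote stores amounts to splitting the input into fixed-size slices before each network call. A minimal sketch, with batched as an illustrative helper rather than a library function:

```python
from __future__ import annotations

from collections.abc import Iterator, Sequence
from typing import TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 100) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 250 placeholder items split with the default-sized batches.
sizes = [len(batch) for batch in batched(list(range(250)), batch_size=100)]
```

Smaller batches reduce the size of each request and the cost of a retry; larger batches reduce the number of round trips.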

add(chunks)[source]

Add chunks (with embeddings) to the store.

Parameters:

chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.

Raises:

ValueError – If any chunk has embedding=None.

Return type:

None

search(embedding, top_k=5, filters=None)[source]

Search for the top_k most similar chunks.

Parameters:
  • embedding (list[float]) – Query embedding vector.

  • top_k (int) – Number of results to return.

  • filters (dict[str, Any] | None) – Optional metadata filters (store-specific format).

Return type:

list[RetrievalResult]

Returns:

list[RetrievalResult] – Ranked list of results (rank 1 = most similar).

delete(chunk_ids)[source]

Remove chunks with the given IDs from the store.

Return type:

None

clear()[source]

Remove all chunks from the store.

Return type:

None

count()[source]

Return the total number of indexed chunks.

Return type:

int