RAG Pipeline
Pipeline
- class ractogateway.rag.pipeline.RactoRAG(vector_store=None, embedder=None, *, store=None, chunker=None, processors=None, llm_kit=None, context_template=None, reader_registry=None, default_prompt=None)[source]
Bases: object
Production-grade RAG pipeline for RactoGateway.
- Parameters:
vector_store (BaseVectorStore | None) – Any BaseVectorStore instance.
embedder (BaseEmbedder | None) – Any BaseEmbedder instance.
chunker (BaseChunker | None) – How to split documents. Defaults to RecursiveChunker with chunk_size=512, overlap=50.
processors (list[BaseProcessor] | None) – List of text processors applied to each chunk before embedding. Defaults to [TextCleaner()].
llm_kit (Any | None) – Any developer kit (OpenAIDeveloperKit, GoogleDeveloperKit, or AnthropicDeveloperKit). Required for query() / aquery().
context_template (str | None) – Template string for injecting retrieved context into the LLM prompt. Must contain {context} and {question} placeholders.
reader_registry (FileReaderRegistry | None) – Custom FileReaderRegistry. Defaults to a registry with all built-in readers.
default_prompt (RactoPrompt | None) – RACTO prompt used for generation. Falls back to a built-in RAG prompt.
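The context_template requirement above can be illustrated with plain str.format. This is a minimal sketch, not the library's implementation: CONTEXT_TEMPLATE, validate_template, and build_prompt are hypothetical names, and the built-in default template may be worded differently.

```python
# Hypothetical template; the library's built-in default may differ.
CONTEXT_TEMPLATE = (
    "Use the context below to answer the question.\n\n"
    "--- CONTEXT ---\n{context}\n--- END CONTEXT ---\n\n"
    "Question: {question}"
)


def validate_template(template: str) -> None:
    """Raise ValueError unless both required placeholders are present."""
    for placeholder in ("{context}", "{question}"):
        if placeholder not in template:
            raise ValueError(f"context_template is missing {placeholder}")


def build_prompt(template: str, chunks: list[str], question: str) -> str:
    """Join retrieved chunk texts and substitute them into the template."""
    validate_template(template)
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)
```

Validating the placeholders up front turns a silently incomplete prompt into an early, explicit error.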
- ingest_dir(directory, pattern='**/*', **metadata)[source]
Recursively ingest all supported files in a directory.
- ingest_text(text, source='manual', **metadata)[source]
Ingest a raw text string directly (no file needed).
- async aingest_dir(directory, pattern='**/*', **metadata)[source]
Async variant of ingest_dir().
- async aingest_text(text, source='manual', **metadata)[source]
Async variant of ingest_text().
- retrieve(query, top_k=5, filters=None)[source]
Embed query and retrieve the top-k most relevant chunks.
- Return type:
list[RetrievalResult]
- Returns:
Ranked results (rank 1 = most relevant).
- async aretrieve(query, top_k=5, filters=None)[source]
Async variant of retrieve().
- query(question, top_k=5, filters=None, prompt=None, temperature=0.0, max_tokens=2048)[source]
Retrieve relevant chunks and generate an answer.
- Parameters:
question (str) – The user’s question.
top_k (int) – Number of context chunks to retrieve.
filters (dict[str, Any] | None) – Optional metadata filters for retrieval.
prompt (RactoPrompt | None) – Override the default RACTO prompt for generation.
temperature (float) – LLM temperature (default 0.0 for factual answers).
max_tokens (int) – Maximum tokens in the generated answer.
- Return type:
RAGResponse
- Returns:
Contains the generated answer plus the retrieved source chunks.
- Raises:
RuntimeError – If no llm_kit was provided.
- async aquery(question, top_k=5, filters=None, prompt=None, temperature=0.0, max_tokens=2048)[source]
Async variant of query().
- property store: BaseVectorStore
The underlying vector store.
- property embedder: BaseEmbedder
The underlying embedder.
Models
Core document and chunk models for RAG.
Every piece of content in the RAG pipeline is represented as a Document
(raw, as loaded from a file) or a Chunk (a processed, embeddable slice of
a document). Both are strict Pydantic models with no unvalidated fields.
- class ractogateway.rag._models.document.ChunkMetadata(**data)[source]
Bases: BaseModel
Provenance and positional data attached to every chunk.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.rag._models.document.Document(**data)[source]
Bases: BaseModel
A raw document loaded from a file or supplied as plain text.
- Parameters:
content (str) – The full extracted text of the document.
source (str) – Absolute file path, URL, or a descriptive label (e.g. "manual").
metadata (dict[str, Any]) – Free-form key/value pairs (file size, author, MIME type, …).
doc_id (str) – Auto-generated UUID; override only when you need stable IDs.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.rag._models.document.Chunk(**data)[source]
Bases: BaseModel
A single embeddable slice of a document.
Produced by a BaseChunker and enriched with an embedding vector by a BaseEmbedder.
- metadata: ChunkMetadata
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Retrieval and RAG response models.
- class ractogateway.rag._models.retrieval.RetrievalConfig(**data)[source]
Bases: BaseModel
Input parameters for a vector-store search.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.rag._models.retrieval.RetrievalResult(**data)[source]
Bases: BaseModel
A single retrieved chunk together with its relevance score.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.rag._models.retrieval.RAGResponse(**data)[source]
Bases: BaseModel
Combined output from a RAG query (retrieval + generation).
- answer: LLMResponse
- sources: list[RetrievalResult]
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Readers
Abstract base class for all file readers.
- class ractogateway.rag.readers.base.BaseReader[source]
Bases: ABC
Read content from a file path, raw bytes, or a binary buffer.
Concrete subclasses must implement _read_path() and may override _read_bytes() to support bytes/buffer input. The public read() method handles all type coercion automatically.
- abstract property supported_extensions: frozenset[str]
Lower-case extensions (with dot) this reader handles, e.g. {".pdf"}.
- read(source)[source]
Load source and return its content as a Document.
- Parameters:
source (str | Path | bytes | BinaryIO) – One of:
str or Path: File path read from disk. Both absolute and relative paths are accepted.
bytes: Raw file bytes. Document.source is set to "<bytes>".
binary file-like object: Any object with a .read() -> bytes method, e.g. io.BytesIO, an open binary file handle, a network stream. Document.source is set to "<buffer>".
- Return type:
Document
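The type coercion that read() is described as performing can be pictured as follows. This is only a sketch under assumptions: coerce_source is a hypothetical helper, and the real method dispatches to _read_path() / _read_bytes() rather than returning a tuple. The "<bytes>" and "<buffer>" labels are taken from the description above.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union

Source = Union[str, Path, bytes, BinaryIO]


def coerce_source(source: Source) -> tuple[bytes, str]:
    """Return (raw_bytes, source_label) for any supported input type."""
    if isinstance(source, (str, Path)):
        # Paths are read from disk; the label is the path itself.
        path = Path(source)
        return path.read_bytes(), str(path)
    if isinstance(source, (bytes, bytearray)):
        return bytes(source), "<bytes>"
    if hasattr(source, "read"):
        # Any binary file-like object: BytesIO, open file handle, stream.
        data = source.read()
        if not isinstance(data, bytes):
            raise TypeError("buffer must be opened in binary mode")
        return data, "<buffer>"
    raise TypeError(f"unsupported source type: {type(source).__name__}")


# Usage: an in-memory buffer is labelled "<buffer>".
payload, label = coerce_source(io.BytesIO(b"hello"))
```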
FileReaderRegistry — auto-detects the right reader for any file extension.
- class ractogateway.rag.readers.registry.FileReaderRegistry(readers=None)[source]
Bases: object
Registry that maps file extensions to BaseReader instances.
By default all built-in readers are registered. You can add custom readers with register().
Example:
    registry = FileReaderRegistry()
    doc = registry.read("report.pdf")
- get_reader(path)[source]
Return the reader for path’s extension.
- Raises:
ValueError – If no reader supports the file’s extension.
- Return type:
BaseReader
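Extension-based dispatch of this kind can be sketched with a dictionary keyed by lower-cased suffix. ExtensionRegistry and the signature of its register() method are hypothetical, chosen for illustration; the actual FileReaderRegistry.register() signature is not shown in this reference.

```python
from pathlib import Path


class ExtensionRegistry:
    """Map lower-case file extensions (with dot) to reader objects."""

    def __init__(self) -> None:
        self._readers: dict[str, object] = {}

    def register(self, extensions, reader) -> None:
        # Later registrations override earlier ones for the same extension.
        for ext in extensions:
            self._readers[ext.lower()] = reader

    def get_reader(self, path):
        ext = Path(path).suffix.lower()
        try:
            return self._readers[ext]
        except KeyError:
            raise ValueError(f"no reader registered for {ext!r}") from None
```

Lower-casing the suffix on both registration and lookup makes "README.MD" and "readme.md" resolve to the same reader.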
Plain-text reader — handles .txt, .md, .rst, .log and similar files.
- class ractogateway.rag.readers.text_reader.TextReader(encoding='utf-8')[source]
Bases: BaseReader
Read any UTF-8 (or latin-1 fallback) plain-text file.
No external dependencies required.
Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method.
- Parameters:
encoding (str) – Primary encoding to try. Falls back to "latin-1" on error.
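The encoding fallback can be sketched in a few lines. decode_text is a hypothetical helper, not the reader's actual code. Because latin-1 maps every byte to a code point, the fallback never raises, although it may produce mojibake for text that was really in another encoding.

```python
def decode_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode with the primary encoding, falling back to latin-1."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # latin-1 accepts any byte sequence, so this cannot fail.
        return raw.decode("latin-1")
```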
PDF reader — uses pypdf (lazy import).
Install with: pip install ractogateway[rag-pdf]
- class ractogateway.rag.readers.pdf_reader.PdfReader(extract_images=False)[source]
Bases: BaseReader
Extract text from PDF files using pypdf.
Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method.
- Parameters:
extract_images (bool) – Reserved for future use; image extraction is not yet supported.
Word document reader — uses python-docx (lazy import).
Install with: pip install ractogateway[rag-word]
- class ractogateway.rag.readers.word_reader.WordReader[source]
Bases: BaseReader
Extract text from Microsoft Word (.docx) files using python-docx.
Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method.
Spreadsheet reader — handles CSV (stdlib) and XLSX (openpyxl, lazy).
Install xlsx support with: pip install ractogateway[rag-excel]
- class ractogateway.rag.readers.spreadsheet_reader.SpreadsheetReader(max_rows=None, include_header=True)[source]
Bases: BaseReader
Read CSV and Excel spreadsheets into plain text.
Each row is rendered as a tab-separated line; an optional header row is prepended. Multiple sheets in an XLSX workbook are separated by a --- Sheet: <name> --- divider.
Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method. When bytes/buffer are provided, XLSX format is detected via the ZIP magic header (PK\x03\x04); everything else is treated as CSV/TSV.
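The magic-header check and the tab-separated rendering can be sketched with the standard library alone. detect_format and csv_to_text are hypothetical names. Treating the first row as the header and applying max_rows to data rows only are assumptions made for this example, and XLSX parsing itself (which needs openpyxl) is left out.

```python
import csv
import io
from typing import Optional

XLSX_MAGIC = b"PK\x03\x04"  # ZIP local-file header; XLSX files are ZIP archives


def detect_format(raw: bytes) -> str:
    """Classify raw bytes as 'xlsx' (ZIP container) or 'csv' (everything else)."""
    return "xlsx" if raw.startswith(XLSX_MAGIC) else "csv"


def csv_to_text(raw: bytes, include_header: bool = True,
                max_rows: Optional[int] = None) -> str:
    """Render CSV bytes as one tab-separated line per row."""
    rows = list(csv.reader(io.StringIO(raw.decode("utf-8"))))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    if max_rows is not None:
        body = body[:max_rows]
    selected = ([header] if include_header else []) + body
    return "\n".join("\t".join(row) for row in selected)
```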
Image reader — uses Pillow (lazy import) to extract metadata.
Images are represented as a textual description of their EXIF/metadata,
plus an optional prompt to an LLM for visual description. Pixel data is
not stored in the Document; use RactoFile
for multimodal vision calls.
Install with: pip install ractogateway[rag-image]
- class ractogateway.rag.readers.image_reader.ImageReader(include_exif=True)[source]
Bases: BaseReader
Extract metadata from image files and represent them as text Documents.
The resulting Document.content is a human-readable summary of image properties (size, mode, format, EXIF tags). Pass the image to a vision LLM separately using RactoFile for actual visual understanding.
Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method.
- Parameters:
include_exif (bool) – Whether to extract and include EXIF metadata in the content.
HTML reader — uses stdlib html.parser (no extra deps).
- class ractogateway.rag.readers.html_reader.HtmlReader[source]
Bases: BaseReader
Extract visible text from HTML files using the stdlib HTML parser.
No external dependencies required.
Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method.
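Visible-text extraction with the stdlib parser can be sketched as below. VisibleTextParser and html_to_text are hypothetical names. Skipping the contents of script and style elements is an assumption about what counts as "visible"; the library's reader may treat more or fewer tags specially.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping <script> and <style> contents."""

    _SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if not self._skip_depth and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return " ".join(self._parts)


def html_to_text(html: str) -> str:
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```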
Chunkers
Abstract base class for text chunkers.
- class ractogateway.rag.chunkers.base.BaseChunker[source]
Bases: ABC
Split a Document into a list of Chunk objects.
Each chunk preserves provenance (doc_id, chunk_index, start_char, end_char) in its ChunkMetadata.
Fixed-size character chunker with configurable overlap.
- class ractogateway.rag.chunkers.fixed_chunker.FixedChunker(chunk_size=512, overlap=50)[source]
Bases: BaseChunker
Split text into fixed-size character windows with overlap.
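Fixed-size windowing with overlap can be sketched as a sliding window whose step is chunk_size minus overlap. fixed_windows is a hypothetical function that returns (start_char, end_char, text) tuples instead of Chunk objects; how the library handles the final short window is not documented here, so that detail is an assumption.

```python
def fixed_windows(text: str, chunk_size: int = 512,
                  overlap: int = 50) -> list[tuple[int, int, str]]:
    """Return (start_char, end_char, slice) for each overlapping window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    windows = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        windows.append((start, end, text[start:end]))
        if end == len(text):
            break  # the last window already reaches the end of the text
        start += step
    return windows
```

Requiring overlap to be smaller than chunk_size guarantees a positive step, so the loop always terminates.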
Recursive character text splitter (LangChain-style).
Tries progressively finer separators ("\n\n", "\n", ". ",
" " and finally character-by-character) until every piece fits within
chunk_size.
- class ractogateway.rag.chunkers.recursive_chunker.RecursiveChunker(chunk_size=512, overlap=50, separators=None)[source]
Bases: BaseChunker
Split text recursively using a priority list of separators.
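The separator cascade described above can be sketched as a recursive function. This is an illustration of the general technique rather than the library's code: recursive_split is a hypothetical name, the separator list mirrors the one in the module description, and overlap handling is deliberately omitted to keep the example short.

```python
from typing import Optional

# Coarse to fine; the empty string means "slice character by character".
DEFAULT_SEPARATORS = ["\n\n", "\n", ". ", " ", ""]


def recursive_split(text: str, chunk_size: int = 512,
                    separators: Optional[list[str]] = None) -> list[str]:
    """Split text so that every returned piece has len <= chunk_size."""
    seps = DEFAULT_SEPARATORS if separators is None else separators
    if len(text) <= chunk_size:
        return [text] if text else []
    if not seps or seps[0] == "":
        # Last resort: hard character slicing.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = seps[0], seps[1:]
    chunks: list[str] = []
    buffer = ""
    for piece in text.split(sep):
        candidate = piece if not buffer else buffer + sep + piece
        if len(candidate) <= chunk_size:
            buffer = candidate  # greedily merge small pieces
            continue
        if buffer:
            chunks.append(buffer)
        if len(piece) <= chunk_size:
            buffer = piece
        else:
            # Piece is still too large: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            buffer = ""
    if buffer:
        chunks.append(buffer)
    return chunks
```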
Sentence-aware chunker — uses NLTK sent_tokenize (lazy import).
Install with: pip install ractogateway[rag-nlp]
- class ractogateway.rag.chunkers.sentence_chunker.SentenceChunker(sentences_per_chunk=5, overlap_sentences=1, language='english')[source]
Bases: BaseChunker
Split text into groups of sentences using NLTK.
Semantic chunker — splits at embedding-space boundaries.
Uses cosine similarity between adjacent sentence embeddings to detect
topic shifts. Requires a BaseEmbedder and NLTK sent_tokenize.
Install with: pip install ractogateway[rag-nlp]
- class ractogateway.rag.chunkers.semantic_chunker.SemanticChunker(embedder, threshold=0.5, min_chunk_size=2, language='english')[source]
Bases: BaseChunker
Split documents where the semantic similarity between adjacent sentences drops below a threshold.
- Parameters:
embedder (BaseEmbedder) – Any BaseEmbedder instance.
threshold (float) – Cosine similarity below which a split is inserted (default: 0.5).
min_chunk_size (int) – Minimum number of sentences per chunk (prevents ultra-fine splits).
language (str) – NLTK sentence tokenizer language.
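The threshold rule can be sketched with precomputed sentence vectors. semantic_groups is a hypothetical function, the vectors stand in for embedder output, and sentence tokenisation is assumed to have happened already. The exact interaction between threshold and min_chunk_size in the library may differ from this reading.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_groups(sentences: list[str], vectors: list[list[float]],
                    threshold: float = 0.5,
                    min_chunk_size: int = 2) -> list[list[str]]:
    """Start a new group when adjacent-sentence similarity drops below threshold."""
    if len(sentences) != len(vectors):
        raise ValueError("need exactly one vector per sentence")
    groups: list[list[str]] = []
    current: list[str] = []
    for i, sentence in enumerate(sentences):
        big_enough = len(current) >= min_chunk_size
        if current and big_enough and cosine(vectors[i - 1], vectors[i]) < threshold:
            groups.append(current)
            current = []
        current.append(sentence)
    if current:
        groups.append(current)
    return groups
```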
Processors
Abstract base class for text processors.
- class ractogateway.rag.processors.base.BaseProcessor[source]
Bases: ABC
Transform a text string and return the processed result.
Processors are applied to chunk content before embedding. They can normalise whitespace, lemmatize tokens, remove stop words, etc.
Chain multiple processors with ProcessingPipeline.
Text cleaning processor — no extra dependencies.
- class ractogateway.rag.processors.cleaner.TextCleaner(normalize_unicode=True, strip_html=True, strip_control_chars=True, collapse_whitespace=True, collapse_blank_lines=True)[source]
Bases: BaseProcessor
Normalise text for embedding and retrieval.
Steps applied (all optional via constructor flags):
1. Unicode normalisation (NFC)
2. Strip residual HTML tags
3. Remove control characters
4. Collapse multiple spaces to one
5. Collapse runs of blank lines to at most two newlines
6. Strip leading/trailing whitespace
- Parameters:
normalize_unicode (bool) – Apply unicodedata.normalize("NFC", text).
strip_html (bool) – Remove <tag> patterns.
strip_control_chars (bool) – Remove non-printable control characters.
collapse_whitespace (bool) – Collapse sequences of spaces/tabs to a single space.
collapse_blank_lines (bool) – Collapse 3+ consecutive newlines to 2.
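The cleaning steps listed above can be approximated with unicodedata and re. clean_text is a hypothetical stand-in for the processor. The regular expressions are assumptions, in particular which characters count as control characters (newline, tab, and carriage return are kept here).

```python
import re
import unicodedata


def clean_text(text: str, normalize_unicode: bool = True, strip_html: bool = True,
               strip_control_chars: bool = True, collapse_whitespace: bool = True,
               collapse_blank_lines: bool = True) -> str:
    """Apply the cleaning steps in order; each flag disables one step."""
    if normalize_unicode:
        text = unicodedata.normalize("NFC", text)
    if strip_html:
        text = re.sub(r"<[^>]+>", "", text)
    if strip_control_chars:
        # Drop C0 controls and DEL, but keep \t, \n and \r.
        text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text)
    if collapse_whitespace:
        text = re.sub(r"[ \t]+", " ", text)
    if collapse_blank_lines:
        text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```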
Lemmatization processor — uses NLTK WordNetLemmatizer (lazy import).
Install with: pip install ractogateway[rag-nlp]
Note: Lemmatization changes the surface form of text and can degrade embedding quality for neural models (which were trained on unmodified text). Use this processor only when building keyword-index pipelines or when explicitly required for your retrieval strategy.
- class ractogateway.rag.processors.lemmatizer.Lemmatizer(use_pos_tagging=True)[source]
Bases: BaseProcessor
Reduce words to their base (lemma) form using NLTK WordNet.
- Parameters:
use_pos_tagging (bool) – If True, use POS tagging to improve lemmatization accuracy. Slightly slower but produces better results.
ProcessingPipeline — chain multiple processors sequentially.
- class ractogateway.rag.processors.pipeline.ProcessingPipeline(processors)[source]
Bases: BaseProcessor
Apply a sequence of BaseProcessor objects to text.
Example:
    pipeline = ProcessingPipeline([TextCleaner(), Lemmatizer()])
    processed = pipeline.process(" Hello, worlds! ")
- Parameters:
processors (list[BaseProcessor]) – Ordered list of processors to apply. Each processor receives the output of the previous one.
Embedders
Abstract base class for embedding providers.
- class ractogateway.rag.embedders.base.BaseEmbedder[source]
Bases: ABC
Embed a list of texts into dense float vectors.
All embedders implement both sync embed() and async aembed() variants. The dimension of returned vectors is declared via the dimension property (-1 if unknown until the first call).
- property dimension: int
Dimensionality of the embedding vectors.
Returns -1 if not known until after the first call.
OpenAI embedding provider.
Install with: pip install ractogateway[openai]
- class ractogateway.rag.embedders.openai_embedder.OpenAIEmbedder(model='text-embedding-3-small', *, api_key=None, base_url=None, dimensions=None, batch_size=256)[source]
Bases: BaseEmbedder
Embed texts using the OpenAI Embeddings API.
- Parameters:
model (str) – OpenAI embedding model (default "text-embedding-3-small").
api_key (str | None) – OpenAI API key. Falls back to OPENAI_API_KEY env var.
base_url (str | None) – Custom base URL (Azure OpenAI or proxy).
dimensions (int | None) – Override output dimensionality (supported for text-embedding-3-*).
batch_size (int) – Maximum number of texts per API call.
- property dimension: int
Dimensionality of the embedding vectors.
Returns -1 if not known until after the first call.
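The role of batch_size can be illustrated without any network calls. In this sketch, batched and embed_in_batches are hypothetical helpers, and embed_batch stands for whatever function performs one provider API call; none of these names come from the library.

```python
from typing import Callable, Iterator, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts: Sequence[str],
                     embed_batch: Callable[[Sequence[str]], list[list[float]]],
                     batch_size: int = 256) -> list[list[float]]:
    """Call embed_batch once per batch and concatenate results in input order."""
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

Five texts with batch_size=2 result in three calls of sizes 2, 2, and 1, with output order matching input order.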
Google Gemini embedding provider.
Install with: pip install ractogateway[google]
- class ractogateway.rag.embedders.google_embedder.GoogleEmbedder(model='text-embedding-004', *, api_key=None, task_type=None, batch_size=100)[source]
Bases: BaseEmbedder
Embed texts using the Google Gemini Embeddings API.
- Parameters:
model (str) – Gemini embedding model (default "text-embedding-004").
api_key (str | None) – Gemini API key. Falls back to GEMINI_API_KEY env var.
task_type (str | None) – Gemini task type hint (e.g. "RETRIEVAL_DOCUMENT", "RETRIEVAL_QUERY"). None lets the API decide.
batch_size (int) – Maximum number of texts per API call.
- property dimension: int
Dimensionality of the embedding vectors.
Returns -1 if not known until after the first call.
Voyage AI embedding provider (Anthropic-aligned, best for Claude RAG).
Install with: pip install ractogateway[rag-voyage]
- class ractogateway.rag.embedders.voyage_embedder.VoyageEmbedder(model='voyage-3', *, api_key=None, input_type='document', batch_size=128)[source]
Bases: BaseEmbedder
Embed texts using the Voyage AI API.
Voyage AI embeddings are optimised for Anthropic Claude RAG pipelines and are the recommended choice when using Claude as the generation LLM.
- Parameters:
model (str) – Voyage model name (default "voyage-3").
api_key (str | None) – Voyage API key. Falls back to VOYAGE_API_KEY env var.
input_type (str | None) – "query" for queries, "document" for documents to index. Using the correct type improves retrieval quality.
batch_size (int) – Maximum texts per API call.
- property dimension: int
Dimensionality of the embedding vectors.
Returns -1 if not known until after the first call.
PageIndexRAG — Vectorless BM25 Pipeline
- class ractogateway.rag.page_index.pipeline.PageIndexRAG(llm_kit=None, *, processors=None, reader_registry=None, context_template="Use the following retrieved page excerpts to answer the user's question.\\nIf the excerpts do not contain enough information, say so clearly.\\n\\n--- CONTEXT ---\\n{context}\\n--- END CONTEXT ---\\n\\nQuestion: {question}", default_prompt=None, page_size=1000, page_overlap=100, k1=1.5, b=0.75, top_keywords=20, ocr_backend=None, ocr_fallback=True, min_ocr_confidence=0.0)[source]
Bases: object
Vectorless RAG pipeline that indexes documents at the page level.
- Parameters:
llm_kit (Any) – Any RactoGateway developer kit (OpenAI, Anthropic, Google, Ollama, HuggingFace). Required only for query() / aquery(). Pass None to use the pipeline in retrieve-only mode.
processors (Sequence[BaseProcessor] | None) – Text processors applied to each page before indexing. Defaults to [TextCleaner()].
reader_registry (FileReaderRegistry | None) – File reader registry used to load non-PDF documents. Defaults to a FileReaderRegistry with all built-in readers registered.
context_template (str) – Jinja-style template with {context} and {question} placeholders used when building the LLM prompt.
default_prompt (RactoPrompt | None) – RactoPrompt used for generation. Defaults to a built-in factual Q&A prompt.
page_size (int) – Maximum character length of each page window for non-PDF files (default 1000).
page_overlap (int) – Character overlap between consecutive windows (default 100).
k1 (float) – BM25 term-frequency saturation parameter (default 1.5).
b (float) – BM25 length-normalisation parameter (default 0.75).
top_keywords (int) – Number of top TF-weighted keywords to extract per page for the decision index (default 20).
- retrieve(query, top_k=5)[source]
Retrieve the most relevant pages for query.
Uses two-stage retrieval: decision index (candidate selection) → BM25 scoring (ranking).
- Return type:
list[PageIndexResult]
- Returns:
Pages ranked by BM25 score (most relevant first).
- async aretrieve(query, top_k=5)[source]
Async variant of retrieve().
- ingest(path, **metadata)[source]
Read a file and add its pages to the index.
PDFs are split page-by-page; all other file types are split into fixed-size character windows.
- async aingest_text(text, source='manual', **metadata)[source]
Async variant of ingest_text().
- ingest_dir(directory, pattern='**/*', *, on_progress=None, **metadata)[source]
Ingest all files matching pattern inside directory.
Files that cannot be read are logged and skipped; the rest are indexed normally.
- async aingest_dir(directory, pattern='**/*', *, max_concurrent=4, on_progress=None, **metadata)[source]
Async parallel variant of ingest_dir().
- Parameters:
directory (str) – Root directory to search.
pattern (str) – Glob pattern relative to directory (default "**/*").
max_concurrent (int) – Maximum number of files ingested concurrently (default 4).
on_progress (Callable[[int, int], None] | None) – Optional callback (done, total) -> None called after each file finishes (thread-safe; called from the event loop).
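Bounded-concurrency ingestion with a progress callback can be sketched with asyncio.Semaphore. ingest_all and ingest_one are hypothetical names, and this is not the pipeline's implementation. Skipping files that raise is assumed to mirror the behaviour documented for ingest_dir().

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence


async def ingest_all(paths: Sequence[str],
                     ingest_one: Callable[[str], Awaitable[None]],
                     max_concurrent: int = 4,
                     on_progress: Optional[Callable[[int, int], None]] = None) -> int:
    """Ingest paths with at most max_concurrent in flight; return the success count."""
    semaphore = asyncio.Semaphore(max_concurrent)
    total = len(paths)
    done = 0

    async def worker(path: str) -> bool:
        nonlocal done
        async with semaphore:
            try:
                await ingest_one(path)
                ok = True
            except Exception:
                ok = False  # unreadable file: skip it and carry on
        # Runs on the event loop thread, so no lock is needed for the counter.
        done += 1
        if on_progress is not None:
            on_progress(done, total)
        return ok

    results = await asyncio.gather(*(worker(p) for p in paths))
    return sum(results)
```

The semaphore caps concurrent work while gather still schedules every file up front, so slow files do not block the queue behind them.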
- search(query, *, top_k=5, prompt=None, temperature=0.0, max_tokens=2048)[source]
Alias for query().
- query(question, *, top_k=5, prompt=None, temperature=0.0, max_tokens=2048)[source]
Retrieve relevant pages and generate an answer with the LLM kit.
- Return type:
PageIndexResponse
- Returns:
Contains the generated answer, ranked sources, and the context string that was supplied to the model.
- Raises:
ValueError – If no llm_kit was provided and generation is requested.
- async aquery(question, *, top_k=5, prompt=None, temperature=0.0, max_tokens=2048)[source]
Async variant of query().
- save(path)[source]
Serialise the full index to a JSON file.
The saved file contains all PageEntry records, BM25 term weights, and deduplication hashes. Reload with load().
PageIndex Models
Pydantic models for the PageIndexRAG pipeline.
- class ractogateway.rag.page_index._models.PageEntry(**data)[source]
Bases: BaseModel
A single page (or fixed-size window) extracted from a document.
Produced by PageIndexRAG during ingestion and stored in the in-process index.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.rag.page_index._models.PageIndexResult(**data)[source]
Bases: BaseModel
A single retrieved page together with its BM25 relevance score.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class ractogateway.rag.page_index._models.PageIndexResponse(**data)[source]
Bases: BaseModel
Full response from PageIndexRAG.query() / PageIndexRAG.aquery().
- answer: LLMResponse | None
- sources: list[PageIndexResult]
- property results: list[PageIndexResult]
Alias for sources.
- property pages: list[PageIndexResult]
Alias for sources.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
PageIndex BM25 Engine
Pure-Python BM25 index and decision-tree inverted index.
No external dependencies required — everything is implemented with the Python standard library.
Two components work together for two-stage retrieval:
_DecisionIndex: an inverted keyword index that maps content terms to page entry IDs. Given a tokenised query it returns the union of candidate entry IDs in O(|query terms|) time. This is the “decision tree” routing layer.
BM25Index: Okapi BM25 (k1=1.5, b=0.75) that scores the candidates returned by the decision index. Only candidates are scored, so the full corpus is never re-ranked on every query.
- ractogateway.rag.page_index._bm25.extract_keywords(text, top_n=20)[source]
Return the top-n most frequent content tokens from text.
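Frequency-based keyword extraction can be sketched with collections.Counter. top_keywords_sketch, the tokenising regex, and the tiny stop-word list are assumptions made for illustration; the tokenisation and stop-word rules that extract_keywords actually uses are not documented here.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stop-word list.
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "is", "it", "of", "on", "or", "that", "the", "to", "with",
})


def tokenize(text: str) -> list[str]:
    """Lower-case and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_keywords_sketch(text: str, top_n: int = 20) -> list[str]:
    """Return up to top_n content tokens, most frequent first."""
    counts = Counter(
        token for token in tokenize(text)
        if token not in STOP_WORDS and len(token) > 1
    )
    return [term for term, _ in counts.most_common(top_n)]
```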
- class ractogateway.rag.page_index._bm25.BM25Index(k1=1.5, b=0.75)[source]
Bases: object
Okapi BM25 scorer over a corpus of PageEntry texts.
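The two-stage retrieval described in this section (inverted-index candidate selection, then BM25 scoring of the candidates only) can be sketched in pure Python. TinyBM25 is a hypothetical class, not the library's BM25Index. The IDF formula used here, log(1 + (N - df + 0.5) / (df + 0.5)), is one common non-negative variant and is an assumption about the library's choice.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyBM25:
    """Inverted index for candidate selection plus Okapi BM25 scoring."""

    def __init__(self, k1: float = 1.5, b: float = 0.75) -> None:
        self.k1, self.b = k1, b
        self.term_freqs: dict[str, Counter] = {}
        self.postings: dict[str, set[str]] = defaultdict(set)

    def add(self, entry_id: str, text: str) -> None:
        tf = Counter(tokenize(text))
        self.term_freqs[entry_id] = tf
        for term in tf:
            self.postings[term].add(entry_id)

    def candidates(self, terms: list[str]) -> set[str]:
        # Stage 1: union of posting lists, one lookup per query term.
        ids: set[str] = set()
        for term in terms:
            ids |= self.postings.get(term, set())
        return ids

    def score(self, terms: list[str], entry_id: str) -> float:
        n = len(self.term_freqs)
        avg_len = sum(sum(tf.values()) for tf in self.term_freqs.values()) / n
        tf = self.term_freqs[entry_id]
        doc_len = sum(tf.values())
        total = 0.0
        for term in terms:
            f = tf.get(term, 0)
            if f == 0:
                continue
            df = len(self.postings[term])
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = f + self.k1 * (1 - self.b + self.b * doc_len / avg_len)
            total += idf * f * (self.k1 + 1) / norm
        return total

    def search(self, query: str, top_k: int = 5) -> list[tuple[str, float]]:
        # Stage 2: score only the candidates, highest score first.
        terms = tokenize(query)
        scored = [(eid, self.score(terms, eid)) for eid in self.candidates(terms)]
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:top_k]
```

A page that shares no term with the query never reaches the scoring stage, which is the point of the candidate filter.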
Stores
Abstract base class for vector stores.
- class ractogateway.rag.stores.base.BaseVectorStore[source]
Bases: ABC
Persist and search embedding vectors.
All vector stores share the same interface: add(), search(), delete(), clear(), and count(). The underlying storage backend is determined by the concrete subclass.
- abstractmethod add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
- abstractmethod search(embedding, top_k=5, filters=None)[source]
Search for the top_k most similar chunks.
- Return type:
list[RetrievalResult]
- Returns:
Ranked list of results (rank 1 = most similar).
In-memory vector store — pure Python, zero extra dependencies.
Uses brute-force cosine similarity over a list of stored vectors. Suitable for development, testing, and small corpora (< 10k chunks).
- class ractogateway.rag.stores.in_memory_store.InMemoryVectorStore(similarity='cosine')[source]
Bases: BaseVectorStore
Pure-Python brute-force vector store with no extra dependencies.
This store keeps all chunks and their embeddings in memory. It is not suitable for production-scale corpora but requires no installation.
- Parameters:
similarity (str) – Similarity function to use. Currently only "cosine" is supported.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
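Brute-force cosine search over stored vectors can be sketched as a full scan followed by a sort. cosine_similarity and brute_force_search are hypothetical functions operating on a plain dict of chunk ID to vector, rather than on Chunk objects.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def brute_force_search(query: list[float], vectors: dict[str, list[float]],
                       top_k: int = 5) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the top_k."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Every query costs one similarity computation per stored vector, which is why this approach is suited to small corpora only.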
ChromaDB vector store (lazy import).
Install with: pip install ractogateway[rag-chroma]
- class ractogateway.rag.stores.chroma_store.ChromaStore(collection='ractogateway', *, path=None, host=None, port=8000, distance_function='cosine')[source]
Bases: BaseVectorStore
Vector store backed by ChromaDB.
Supports both in-process (path or None for ephemeral) and HTTP-client modes (host + port).
- Parameters:
collection (str) – Name of the ChromaDB collection.
path (str | None) – Persist directory for a local persistent client. None = ephemeral.
host (str | None) – ChromaDB server host (enables HTTP client mode).
port (int) – ChromaDB server port (default 8000).
distance_function (str) – "cosine", "l2", or "ip" (inner product).
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
FAISS vector store (lazy import).
Install with: pip install ractogateway[rag-faiss]
- class ractogateway.rag.stores.faiss_store.FAISSStore(dimension=None, index_type='flat_ip')[source]
Bases: BaseVectorStore
Vector store backed by Facebook AI Similarity Search (FAISS).
Stores embeddings in a flat L2 or cosine (Inner Product) index. All data is in-memory; call save() / load() to persist.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
- search(embedding, top_k=5, filters=None)[source]
Search for the top_k most similar chunks.
- Return type:
list[RetrievalResult]
- Returns:
Ranked list of results (rank 1 = most similar).
Pinecone vector store (lazy import).
Install with: pip install ractogateway[rag-pinecone]
- class ractogateway.rag.stores.pinecone_store.PineconeStore(index_name, *, api_key=None, namespace='', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Pinecone cloud.
- Parameters:
index_name (str) – Name of the Pinecone index (must already exist).
api_key (str | None) – Pinecone API key. Falls back to PINECONE_API_KEY env var.
namespace (str) – Pinecone namespace for logical data isolation.
environment – Deprecated Pinecone environment string (for legacy pod-based indexes).
batch_size (int) – Number of vectors per upsert batch.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Qdrant vector store (lazy import).
Install with: pip install ractogateway[rag-qdrant]
- class ractogateway.rag.stores.qdrant_store.QdrantStore(collection='ractogateway', *, url=None, api_key=None, distance='cosine', dimension=None, batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Qdrant.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Weaviate vector store (lazy import).
Install with: pip install ractogateway[rag-weaviate]
- class ractogateway.rag.stores.weaviate_store.WeaviateStore(class_name='RactoChunk', *, url=None, api_key=None, additional_headers=None, distance_metric='cosine', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Weaviate.
Supports embedded (local, no server needed), local server, and Weaviate Cloud (WCS) connections.
- Parameters:
class_name (str) – Weaviate class (collection) name.
url (str | None) – Weaviate server URL. None = use embedded Weaviate.
additional_headers (dict[str, str] | None) – Extra HTTP headers (e.g. for OpenAI API key pass-through to Weaviate).
distance_metric (str) – "cosine" or "l2-squared".
batch_size (int) – Objects per batch import.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Milvus / Zilliz vector store (lazy import).
Install with: pip install ractogateway[rag-milvus]
- class ractogateway.rag.stores.milvus_store.MilvusStore(collection='ractogateway', *, host='localhost', port=19530, uri=None, token=None, dimension=None, metric_type='IP', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Milvus or Zilliz Cloud.
- Parameters:
collection (str) – Milvus collection name.
host (str) – Milvus server host (default "localhost").
port (int) – Milvus server port (default 19530).
uri (str | None) – Zilliz Cloud URI (overrides host/port when set).
dimension (int | None) – Embedding dimension. Inferred on first add.
metric_type (str) – "IP" (inner product / cosine) or "L2".
batch_size (int) – Vectors per insert batch.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
PostgreSQL + pgvector store (lazy import).
Install with: pip install ractogateway[rag-pgvector]
- class ractogateway.rag.stores.pgvector_store.PGVectorStore(dsn, *, table='rag_chunks', dimension=None, distance='cosine', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by PostgreSQL with the pgvector extension.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.