RAG Pipeline
Pipeline
- class ractogateway.rag.pipeline.RactoRAG(vector_store, embedder, *, chunker=None, processors=None, llm_kit=None, context_template=None, reader_registry=None, default_prompt=None)[source]
Bases: object
Production-grade RAG pipeline for RactoGateway.
- Parameters:
vector_store (BaseVectorStore) – Any BaseVectorStore instance.
embedder (BaseEmbedder) – Any BaseEmbedder instance.
chunker (BaseChunker | None) – How to split documents. Defaults to RecursiveChunker with chunk_size=512, overlap=50.
processors (list[BaseProcessor] | None) – List of text processors applied to each chunk before embedding. Defaults to [TextCleaner()].
llm_kit (Any | None) – Any developer kit (OpenAIDeveloperKit, GoogleDeveloperKit, or AnthropicDeveloperKit). Required for query() / aquery().
context_template (str | None) – Template string for injecting retrieved context into the LLM prompt. Must contain {context} and {question} placeholders.
reader_registry (FileReaderRegistry | None) – Custom FileReaderRegistry. Defaults to a registry with all built-in readers.
default_prompt (RactoPrompt | None) – RACTO prompt used for generation. Falls back to a built-in RAG prompt.
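The context_template requirement can be illustrated with a small standalone sketch. It does not use RactoGateway; render_context and the sample template are hypothetical names, and the choice to join chunks with a blank line is an assumption rather than documented behaviour.

```python
def render_context(template: str, chunks: list[str], question: str) -> str:
    """Fill a context template with retrieved chunk texts and the question."""
    # Fail early if the template is missing a required placeholder.
    for placeholder in ("{context}", "{question}"):
        if placeholder not in template:
            raise ValueError(f"template is missing {placeholder}")
    # Assumption: chunks are separated by a blank line inside the prompt.
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


template = "Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"
prompt_text = render_context(template, ["Paris is in France."], "Where is Paris?")
```

Validating the placeholders up front gives a clearer error than letting a malformed template silently drop the retrieved context.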
- async aingest_dir(directory, pattern='**/*', **metadata)[source]
Async variant of ingest_dir().
- async aingest_text(text, source='manual', **metadata)[source]
Async variant of ingest_text().
- async aquery(question, top_k=5, filters=None, prompt=None, temperature=0.0, max_tokens=2048)[source]
Async variant of query().
- Return type:
RAGResponse
- async aretrieve(query, top_k=5, filters=None)[source]
Async variant of retrieve().
- Return type:
list[RetrievalResult]
- property embedder: BaseEmbedder
The underlying embedder.
- ingest_dir(directory, pattern='**/*', **metadata)[source]
Recursively ingest all supported files in a directory.
- ingest_text(text, source='manual', **metadata)[source]
Ingest a raw text string directly (no file needed).
- query(question, top_k=5, filters=None, prompt=None, temperature=0.0, max_tokens=2048)[source]
Retrieve relevant chunks and generate an answer.
- Parameters:
question (str) – The user’s question.
top_k (int) – Number of context chunks to retrieve.
filters (dict[str, Any] | None) – Optional metadata filters for retrieval.
prompt (RactoPrompt | None) – Override the default RACTO prompt for generation.
temperature (float) – LLM temperature (default 0.0 for factual answers).
max_tokens (int) – Maximum tokens in the generated answer.
- Return type:
RAGResponse
- Returns:
RAGResponse – Contains the generated answer plus the retrieved source chunks.
- Raises:
RuntimeError – If no llm_kit was provided.
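The retrieve-then-generate flow described for query() can be sketched without the library. Everything below is a hypothetical stand-in: answer_question, fake_retrieve and fake_generate are illustrative names, and the returned dict only loosely mirrors the answer/sources split of RAGResponse.

```python
from typing import Callable, Optional


def answer_question(
    question: str,
    retrieve: Callable[[str, int], list[str]],
    generate: Optional[Callable[[str], str]],
    top_k: int = 5,
) -> dict:
    """Retrieve context for a question, then ask a generator for an answer."""
    # Generation needs an LLM; fail loudly when none was configured.
    if generate is None:
        raise RuntimeError("an LLM is required to generate an answer")
    sources = retrieve(question, top_k)
    prompt = "Context:\n" + "\n\n".join(sources) + f"\n\nQuestion: {question}"
    return {"answer": generate(prompt), "sources": sources}


def fake_retrieve(question: str, top_k: int) -> list[str]:
    # Stand-in for a vector search: returns canned chunks.
    return ["Paris is the capital of France.", "France is in Europe."][:top_k]


def fake_generate(prompt: str) -> str:
    # Stand-in for an LLM call: reports how much prompt it received.
    return f"answered from {len(prompt)} characters of prompt"


result = answer_question("What is the capital of France?", fake_retrieve, fake_generate, top_k=1)
```

Returning the sources alongside the answer lets a caller show citations or check which chunks the answer was based on.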
- retrieve(query, top_k=5, filters=None)[source]
Embed query and retrieve the top-k most relevant chunks.
- Return type:
list[RetrievalResult]
- Returns:
list[RetrievalResult] – Ranked results (rank 1 = most relevant).
- property store: BaseVectorStore
The underlying vector store.
Models
Core document and chunk models for RAG.
Every piece of content in the RAG pipeline is represented as a Document
(raw, as loaded from a file) or a Chunk (a processed, embeddable slice of
a document). Both are strict Pydantic models with no unvalidated fields.
- class ractogateway.rag._models.document.Chunk(**data)[source]
Bases: BaseModel
A single embeddable slice of a document.
Produced by a BaseChunker and enriched with an embedding vector by a BaseEmbedder.
- metadata: ChunkMetadata
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class ractogateway.rag._models.document.ChunkMetadata(**data)[source]
Bases: BaseModel
Provenance and positional data attached to every chunk.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class ractogateway.rag._models.document.Document(**data)[source]
Bases: BaseModel
A raw document loaded from a file or supplied as plain text.
- Parameters:
content (str) – The full extracted text of the document.
source (str) – Absolute file path, URL, or a descriptive label (e.g. "manual").
metadata (dict[str, Any]) – Free-form key/value pairs (file size, author, MIME type, …).
doc_id (str) – Auto-generated UUID; override only when you need stable IDs.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Retrieval and RAG response models.
- class ractogateway.rag._models.retrieval.RAGResponse(**data)[source]
Bases: BaseModel
Combined output from a RAG query (retrieval + generation).
- answer: LLMResponse
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- sources: list[RetrievalResult]
- class ractogateway.rag._models.retrieval.RetrievalConfig(**data)[source]
Bases: BaseModel
Input parameters for a vector-store search.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Readers
Abstract base class for all file readers.
- class ractogateway.rag.readers.base.BaseReader[source]
Bases: ABC
Read a file from disk and return a Document.
Concrete subclasses implement read() and declare which file extensions they handle via supported_extensions.
FileReaderRegistry — auto-detects the right reader for any file extension.
- class ractogateway.rag.readers.registry.FileReaderRegistry(readers=None)[source]
Bases: object
Registry that maps file extensions to BaseReader instances.
By default all built-in readers are registered. You can add custom readers with register().
Example:
    registry = FileReaderRegistry()
    doc = registry.read("report.pdf")
- get_reader(path)[source]
Return the reader for path’s extension.
- Raises:
ValueError – If no reader supports the file’s extension.
- Return type:
BaseReader
Plain-text reader — handles .txt, .md, .rst, .log and similar files.
- class ractogateway.rag.readers.text_reader.TextReader(encoding='utf-8')[source]
Bases: BaseReader
Read any UTF-8 (or latin-1 fallback) plain-text file.
No external dependencies required.
- Parameters:
encoding (str) – Primary encoding to try. Falls back to "latin-1" on error.
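The encoding fallback can be pictured with a short standalone sketch. read_text_with_fallback is a hypothetical helper, not TextReader itself, and it assumes the fallback is triggered by a UnicodeDecodeError on the primary encoding.

```python
from pathlib import Path


def read_text_with_fallback(path: str, encoding: str = "utf-8") -> str:
    """Decode a file with the primary encoding, falling back to latin-1."""
    raw = Path(path).read_bytes()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this decode cannot fail.
        return raw.decode("latin-1")
```

Because latin-1 accepts any byte sequence, the fallback always returns text, although characters may be wrong if the file was actually written in a third encoding.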
PDF reader — uses pypdf (lazy import).
Install with: pip install ractogateway[rag-pdf]
- class ractogateway.rag.readers.pdf_reader.PdfReader(extract_images=False)[source]
Bases: BaseReader
Extract text from PDF files using pypdf.
- Parameters:
extract_images (bool) – Reserved for future use; image extraction is not yet supported.
Word document reader — uses python-docx (lazy import).
Install with: pip install ractogateway[rag-word]
- class ractogateway.rag.readers.word_reader.WordReader[source]
Bases: BaseReader
Extract text from Microsoft Word (.docx) files using python-docx.
Spreadsheet reader — handles CSV (stdlib) and XLSX (openpyxl, lazy).
Install xlsx support with: pip install ractogateway[rag-excel]
- class ractogateway.rag.readers.spreadsheet_reader.SpreadsheetReader(max_rows=None, include_header=True)[source]
Bases: BaseReader
Read CSV and Excel spreadsheets into plain text.
Each row is rendered as a tab-separated line; an optional header row is prepended. Multiple sheets in an XLSX workbook are separated by a --- Sheet: <name> --- divider.
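The row rendering for CSV input might look roughly like the sketch below, which uses only the standard library. csv_to_text is a hypothetical function; treating the first row as the header and applying max_rows to data rows only are assumptions, not documented behaviour.

```python
import csv
import io
from typing import Optional


def csv_to_text(csv_text: str, max_rows: Optional[int] = None,
                include_header: bool = True) -> str:
    """Render CSV content as tab-separated lines."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    # Assumption: the first row is the header, the rest are data rows.
    header, body = rows[0], rows[1:]
    if max_rows is not None:
        body = body[:max_rows]
    selected = ([header] if include_header else []) + body
    return "\n".join("\t".join(row) for row in selected)


text = csv_to_text("name,age\nAda,36\nAlan,41\n", max_rows=1)
```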
Image reader — uses Pillow (lazy import) to extract metadata.
Images are represented as a textual description of their EXIF/metadata,
plus an optional prompt to an LLM for visual description. Pixel data is
not stored in the Document; use RactoFile
for multimodal vision calls.
Install with: pip install ractogateway[rag-image]
- class ractogateway.rag.readers.image_reader.ImageReader(include_exif=True)[source]
Bases: BaseReader
Extract metadata from image files and represent them as text Documents.
The resulting Document.content is a human-readable summary of image properties (size, mode, format, EXIF tags). Pass the image to a vision LLM separately using RactoFile for actual visual understanding.
- Parameters:
include_exif (bool) – Whether to extract and include EXIF metadata in the content.
HTML reader — uses stdlib html.parser (no extra deps).
- class ractogateway.rag.readers.html_reader.HtmlReader[source]
Bases: BaseReader
Extract visible text from HTML files using the stdlib HTML parser.
No external dependencies required.
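A minimal sketch of visible-text extraction with the standard library's html.parser follows. The class and function names are hypothetical, and skipping script and style content is an assumption about what counts as visible; HtmlReader may handle more cases.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.parts)


page = ("<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
        "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>")
visible = html_to_text(page)
```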
Chunkers
Abstract base class for text chunkers.
- class ractogateway.rag.chunkers.base.BaseChunker[source]
Bases: ABC
Split a Document into a list of Chunk objects.
Each chunk preserves provenance (doc_id, chunk_index, start_char, end_char) in its ChunkMetadata.
Fixed-size character chunker with configurable overlap.
- class ractogateway.rag.chunkers.fixed_chunker.FixedChunker(chunk_size=512, overlap=50)[source]
Bases: BaseChunker
Split text into fixed-size character windows with overlap.
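Fixed-size windowing with overlap can be sketched in a few lines. fixed_windows is a hypothetical function, not FixedChunker; it returns (start_char, end_char, text) tuples to mirror the provenance fields mentioned for ChunkMetadata, and its validation rules are assumptions.

```python
def fixed_windows(text: str, chunk_size: int = 512,
                  overlap: int = 50) -> list[tuple[int, int, str]]:
    """Return (start_char, end_char, text) windows of at most chunk_size chars."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    windows = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        windows.append((start, end, text[start:end]))
        if end == len(text):
            break  # the last window reached the end of the text
        start += step
    return windows


windows = fixed_windows("abcdefghij", chunk_size=4, overlap=1)
```

Each window begins `chunk_size - overlap` characters after the previous one, so neighbouring chunks share `overlap` characters.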
Recursive character text splitter (LangChain-style).
Tries progressively finer separators ("\n\n", "\n", ". ",
" " and finally character-by-character) until every piece fits within
chunk_size.
- class ractogateway.rag.chunkers.recursive_chunker.RecursiveChunker(chunk_size=512, overlap=50, separators=None)[source]
Bases: BaseChunker
Split text recursively using a priority list of separators.
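The separator-priority idea can be sketched as follows. This is a simplified illustration, not RecursiveChunker: recursive_split is a hypothetical function, it omits overlap entirely, and it keeps separators attached to the preceding piece so that the chunks join back into the original text.

```python
def recursive_split(text: str, chunk_size: int = 512,
                    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " ")) -> list[str]:
    """Split text so that every chunk has at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # Last resort: cut character by character into fixed windows.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    parts = text.split(sep)
    # Re-attach the separator to every part but the last so no text is lost.
    pieces = [part + sep for part in parts[:-1]] + [parts[-1]]
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        if not piece:
            continue
        if len(current) + len(piece) <= chunk_size:
            current += piece  # greedily merge small neighbouring pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece is still too large: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa bbbb cccc\n\ndddd eeee"
chunks = recursive_split(sample, chunk_size=10)
```

A fuller chunker would probably also strip whitespace-only chunks and add overlap between neighbours; both are left out here to keep the recursion visible.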
Sentence-aware chunker — uses NLTK sent_tokenize (lazy import).
Install with: pip install ractogateway[rag-nlp]
- class ractogateway.rag.chunkers.sentence_chunker.SentenceChunker(sentences_per_chunk=5, overlap_sentences=1, language='english')[source]
Bases: BaseChunker
Split text into groups of sentences using NLTK.
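Grouping sentences with overlap can be sketched independently of NLTK. The hypothetical group_sentences function below takes sentences that are already tokenized (in the real chunker that step is done by sent_tokenize) and assumes groups are joined with a single space.

```python
def group_sentences(sentences: list[str], sentences_per_chunk: int = 5,
                    overlap_sentences: int = 1) -> list[str]:
    """Join consecutive sentences into groups that share overlap_sentences."""
    if sentences_per_chunk <= 0:
        raise ValueError("sentences_per_chunk must be positive")
    if not 0 <= overlap_sentences < sentences_per_chunk:
        raise ValueError("overlap_sentences must be in [0, sentences_per_chunk)")
    step = sentences_per_chunk - overlap_sentences
    groups = []
    for start in range(0, len(sentences), step):
        group = sentences[start:start + sentences_per_chunk]
        groups.append(" ".join(group))
        if start + sentences_per_chunk >= len(sentences):
            break  # this group already covers the final sentence
    return groups


groups = group_sentences(["A.", "B.", "C.", "D.", "E."],
                         sentences_per_chunk=3, overlap_sentences=1)
```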
Semantic chunker — splits at embedding-space boundaries.
Uses cosine similarity between adjacent sentence embeddings to detect
topic shifts. Requires a BaseEmbedder
and NLTK sent_tokenize.
Install with: pip install ractogateway[rag-nlp]
- class ractogateway.rag.chunkers.semantic_chunker.SemanticChunker(embedder, threshold=0.5, min_chunk_size=2, language='english')[source]
Bases: BaseChunker
Split documents where the semantic similarity between adjacent sentences drops below a threshold.
- Parameters:
embedder (BaseEmbedder) – Any BaseEmbedder instance.
threshold (float) – Cosine similarity below which a split is inserted (default: 0.5).
min_chunk_size (int) – Minimum number of sentences per chunk (prevents ultra-fine splits).
language (str) – NLTK sentence tokenizer language.
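The threshold rule can be illustrated with precomputed sentence embeddings. The sketch below is not SemanticChunker: cosine and semantic_groups are hypothetical helpers, the embeddings are toy 2-D vectors, and the way min_chunk_size suppresses early splits is an assumption.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_groups(sentences: list[str], embeddings: list[list[float]],
                    threshold: float = 0.5, min_chunk_size: int = 2) -> list[list[str]]:
    """Start a new group where adjacent-sentence similarity drops below threshold."""
    if len(sentences) != len(embeddings):
        raise ValueError("need exactly one embedding per sentence")
    groups: list[list[str]] = []
    current: list[str] = []
    for i, sentence in enumerate(sentences):
        # Only split once the current group has reached the minimum size.
        if (current and len(current) >= min_chunk_size
                and cosine(embeddings[i - 1], embeddings[i]) < threshold):
            groups.append(current)
            current = []
        current.append(sentence)
    if current:
        groups.append(current)
    return groups


groups = semantic_groups(
    ["Cats purr.", "Cats nap.", "Stocks fell.", "Bonds rose."],
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
)
```

In this sketch the final group can still end up shorter than min_chunk_size, because the minimum is only checked before a split is made.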
Processors
Abstract base class for text processors.
- class ractogateway.rag.processors.base.BaseProcessor[source]
Bases: ABC
Transform a text string and return the processed result.
Processors are applied to chunk content before embedding. They can normalise whitespace, lemmatize tokens, remove stop words, etc.
Chain multiple processors with ProcessingPipeline.
Text cleaning processor — no extra dependencies.
- class ractogateway.rag.processors.cleaner.TextCleaner(normalize_unicode=True, strip_html=True, strip_control_chars=True, collapse_whitespace=True, collapse_blank_lines=True)[source]
Bases: BaseProcessor
Normalise text for embedding and retrieval.
Steps applied (all optional via constructor flags):
Unicode normalisation (NFC)
Strip residual HTML tags
Remove control characters
Collapse multiple spaces to one
Collapse runs of blank lines to at most two newlines
Strip leading/trailing whitespace
- Parameters:
normalize_unicode (bool) – Apply unicodedata.normalize("NFC", text).
strip_html (bool) – Remove <tag> patterns.
strip_control_chars (bool) – Remove non-printable control characters.
collapse_whitespace (bool) – Collapse sequences of spaces/tabs to a single space.
collapse_blank_lines (bool) – Collapse 3+ consecutive newlines to 2.
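The cleaning steps listed above can be sketched with the standard library alone. clean_text is a hypothetical function that always applies every step; the regular expressions, the choice to replace tags with a space, and keeping newlines and tabs during control-character removal are assumptions, not TextCleaner's actual rules.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Apply the cleaning steps in order and return the normalised text."""
    text = unicodedata.normalize("NFC", text)
    # Replace tags with a space so neighbouring words do not fuse together.
    text = re.sub(r"<[^>]+>", " ", text)
    # Drop control characters (category Cc) but keep newlines and tabs.
    text = "".join(ch for ch in text
                   if ch in "\n\t" or unicodedata.category(ch) != "Cc")
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)  # at most one blank line in a row
    return text.strip()


cleaned = clean_text("  <p>Hello</p>\t\tworld\x00\n\n\n\nBye  ")
```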
Lemmatization processor — uses NLTK WordNetLemmatizer (lazy import).
Install with: pip install ractogateway[rag-nlp]
Note: Lemmatization changes the surface form of text and can degrade embedding quality for neural models (which were trained on unmodified text). Use this processor only when building keyword-index pipelines or when explicitly required for your retrieval strategy.
- class ractogateway.rag.processors.lemmatizer.Lemmatizer(use_pos_tagging=True)[source]
Bases: BaseProcessor
Reduce words to their base (lemma) form using NLTK WordNet.
- Parameters:
use_pos_tagging (bool) – If True, use POS tagging to improve lemmatization accuracy. Slightly slower but produces better results.
ProcessingPipeline — chain multiple processors sequentially.
- class ractogateway.rag.processors.pipeline.ProcessingPipeline(processors)[source]
Bases: BaseProcessor
Apply a sequence of BaseProcessor objects to text.
Example:
    pipeline = ProcessingPipeline([TextCleaner(), Lemmatizer()])
    processed = pipeline.process(" Hello, worlds! ")
- Parameters:
processors (list[BaseProcessor]) – Ordered list of processors to apply. Each processor receives the output of the previous one.
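Sequential chaining, where each processor receives the previous one's output, can be sketched with a small abstract base class. Processor, StripEdges, Lowercase and Pipeline are hypothetical stand-ins for illustration; only the process() method name is taken from the example above.

```python
from abc import ABC, abstractmethod


class Processor(ABC):
    """Minimal stand-in for a text processor interface."""

    @abstractmethod
    def process(self, text: str) -> str:
        ...


class StripEdges(Processor):
    def process(self, text: str) -> str:
        return text.strip()


class Lowercase(Processor):
    def process(self, text: str) -> str:
        return text.lower()


class Pipeline(Processor):
    """Apply processors in order; each one receives the previous output."""

    def __init__(self, processors: list[Processor]) -> None:
        self._processors = list(processors)

    def process(self, text: str) -> str:
        for processor in self._processors:
            text = processor.process(text)
        return text


processed = Pipeline([StripEdges(), Lowercase()]).process("  Hello, Worlds!  ")
```

Because the pipeline is itself a processor, pipelines can be nested inside other pipelines.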
Embedders
Abstract base class for embedding providers.
- class ractogateway.rag.embedders.base.BaseEmbedder[source]
Bases: ABC
Embed a list of texts into dense float vectors.
All embedders implement both sync embed() and async aembed() variants. The dimension of returned vectors is declared via the dimension property (-1 if unknown until the first call).
OpenAI embedding provider.
Install with: pip install ractogateway[openai]
- class ractogateway.rag.embedders.openai_embedder.OpenAIEmbedder(model='text-embedding-3-small', *, api_key=None, base_url=None, dimensions=None, batch_size=256)[source]
Bases: BaseEmbedder
Embed texts using the OpenAI Embeddings API.
- Parameters:
model (str) – OpenAI embedding model (default "text-embedding-3-small").
api_key (str | None) – OpenAI API key. Falls back to OPENAI_API_KEY env var.
base_url (str | None) – Custom base URL (Azure OpenAI or proxy).
dimensions (int | None) – Override output dimensionality (supported for text-embedding-3-*).
batch_size (int) – Maximum number of texts per API call.
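The effect of batch_size can be sketched without calling any API. batched, embed_in_batches and fake_embed_batch are hypothetical names; the fake embedder stands in for one API call per batch and simply records how many texts it received.

```python
from typing import Callable, Iterator


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts: list[str],
                     embed_batch: Callable[[list[str]], list[list[float]]],
                     batch_size: int = 256) -> list[list[float]]:
    """Call embed_batch once per batch and keep vectors in input order."""
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


calls: list[int] = []


def fake_embed_batch(batch: list[str]) -> list[list[float]]:
    # Stand-in for one API call: a 1-D "embedding" holding the text length.
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]


vectors = embed_in_batches(["a", "bb", "ccc", "dddd", "eeeee"],
                           fake_embed_batch, batch_size=2)
```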
Google Gemini embedding provider.
Install with: pip install ractogateway[google]
- class ractogateway.rag.embedders.google_embedder.GoogleEmbedder(model='text-embedding-004', *, api_key=None, task_type=None, batch_size=100)[source]
Bases: BaseEmbedder
Embed texts using the Google Gemini Embeddings API.
- Parameters:
model (str) – Gemini embedding model (default "text-embedding-004").
api_key (str | None) – Gemini API key. Falls back to GEMINI_API_KEY env var.
task_type (str | None) – Gemini task type hint (e.g. "RETRIEVAL_DOCUMENT", "RETRIEVAL_QUERY"). None lets the API decide.
batch_size (int) – Maximum number of texts per API call.
Voyage AI embedding provider (Anthropic-aligned, best for Claude RAG).
Install with: pip install ractogateway[rag-voyage]
- class ractogateway.rag.embedders.voyage_embedder.VoyageEmbedder(model='voyage-3', *, api_key=None, input_type='document', batch_size=128)[source]
Bases: BaseEmbedder
Embed texts using the Voyage AI API.
Voyage AI embeddings are optimised for Anthropic Claude RAG pipelines and are the recommended choice when using Claude as the generation LLM.
- Parameters:
model (str) – Voyage model name (default "voyage-3").
api_key (str | None) – Voyage API key. Falls back to VOYAGE_API_KEY env var.
input_type (str | None) – "query" for queries, "document" for documents to index. Using the correct type improves retrieval quality.
batch_size (int) – Maximum texts per API call.
Stores
Abstract base class for vector stores.
- class ractogateway.rag.stores.base.BaseVectorStore[source]
Bases: ABC
Persist and search embedding vectors.
All vector stores share the same interface: add(), search(), delete(), clear(), and count(). The underlying storage backend is determined by the concrete subclass.
- abstractmethod add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
- abstractmethod delete(chunk_ids)[source]
Remove chunks with the given IDs from the store.
In-memory vector store — pure Python, zero extra dependencies.
Uses brute-force cosine similarity over a list of stored vectors. Suitable for development, testing, and small corpora (< 10k chunks).
- class ractogateway.rag.stores.in_memory_store.InMemoryVectorStore(similarity='cosine')[source]
Bases: BaseVectorStore
Pure-Python brute-force vector store with no extra dependencies.
This store keeps all chunks and their embeddings in memory. It is not suitable for production-scale corpora but requires no installation.
- Parameters:
similarity (str) – Similarity function to use. Currently only "cosine" is supported.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
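Brute-force cosine search over stored vectors can be sketched in pure Python. TinyVectorStore is a hypothetical class, not InMemoryVectorStore: it stores plain (id, vector, text) tuples instead of Chunk objects, and its method signatures are illustrative only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Keep vectors in a list and score every one of them on each search."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float], str]] = []

    def add(self, chunk_id: str, vector, text: str) -> None:
        if vector is None:
            raise ValueError("every chunk needs an embedding")
        self._items.append((chunk_id, list(vector), text))

    def search(self, query_vector: list[float], top_k: int = 5):
        # Score all stored vectors, then keep the top_k highest similarities.
        scored = [(cosine(query_vector, vec), cid, text)
                  for cid, vec, text in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(cid, text, score) for score, cid, text in scored[:top_k]]

    def count(self) -> int:
        return len(self._items)


store = TinyVectorStore()
store.add("a", [1.0, 0.0], "about cats")
store.add("b", [0.0, 1.0], "about stocks")
store.add("c", [1.0, 1.0], "about cat stocks")
hits = store.search([1.0, 0.0], top_k=2)
```

Every search scans all stored vectors, so cost grows linearly with the corpus, which is why this approach is usually reserved for small collections.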
ChromaDB vector store (lazy import).
Install with: pip install ractogateway[rag-chroma]
- class ractogateway.rag.stores.chroma_store.ChromaStore(collection='ractogateway', *, path=None, host=None, port=8000, distance_function='cosine')[source]
Bases: BaseVectorStore
Vector store backed by ChromaDB.
Supports both in-process mode (path, or None for ephemeral) and HTTP-client mode (host + port).
- Parameters:
collection (str) – Name of the ChromaDB collection.
path (str | None) – Persist directory for a local persistent client. None = ephemeral.
host (str | None) – ChromaDB server host (enables HTTP client mode).
port (int) – ChromaDB server port (default 8000).
distance_function (str) – "cosine", "l2", or "ip" (inner product).
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
FAISS vector store (lazy import).
Install with: pip install ractogateway[rag-faiss]
- class ractogateway.rag.stores.faiss_store.FAISSStore(dimension=None, index_type='flat_ip')[source]
Bases: BaseVectorStore
Vector store backed by Facebook AI Similarity Search (FAISS).
Stores embeddings in a flat L2 or cosine (Inner Product) index. All data is in-memory; call save() / load() to persist.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Pinecone vector store (lazy import).
Install with: pip install ractogateway[rag-pinecone]
- class ractogateway.rag.stores.pinecone_store.PineconeStore(index_name, *, api_key=None, namespace='', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Pinecone cloud.
- Parameters:
index_name (str) – Name of the Pinecone index (must already exist).
api_key (str | None) – Pinecone API key. Falls back to PINECONE_API_KEY env var.
namespace (str) – Pinecone namespace for logical data isolation.
environment – Deprecated Pinecone environment string (for legacy pod-based indexes).
batch_size (int) – Number of vectors per upsert batch.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Qdrant vector store (lazy import).
Install with: pip install ractogateway[rag-qdrant]
- class ractogateway.rag.stores.qdrant_store.QdrantStore(collection='ractogateway', *, url=None, api_key=None, distance='cosine', dimension=None, batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Qdrant.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Weaviate vector store (lazy import).
Install with: pip install ractogateway[rag-weaviate]
- class ractogateway.rag.stores.weaviate_store.WeaviateStore(class_name='RactoChunk', *, url=None, api_key=None, additional_headers=None, distance_metric='cosine', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Weaviate.
Supports embedded (local, no server needed), local server, and Weaviate Cloud (WCS) connections.
- Parameters:
class_name (str) – Weaviate class (collection) name.
url (str | None) – Weaviate server URL. None = use embedded Weaviate.
additional_headers (dict[str, str] | None) – Extra HTTP headers (e.g. for OpenAI API key pass-through to Weaviate).
distance_metric (str) – "cosine" or "l2-squared".
batch_size (int) – Objects per batch import.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
Milvus / Zilliz vector store (lazy import).
Install with: pip install ractogateway[rag-milvus]
- class ractogateway.rag.stores.milvus_store.MilvusStore(collection='ractogateway', *, host='localhost', port=19530, uri=None, token=None, dimension=None, metric_type='IP', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by Milvus or Zilliz Cloud.
- Parameters:
collection (str) – Milvus collection name.
host (str) – Milvus server host (default "localhost").
port (int) – Milvus server port (default 19530).
uri (str | None) – Zilliz Cloud URI (overrides host/port when set).
dimension (int | None) – Embedding dimension. Inferred on first add.
metric_type (str) – "IP" (inner product / cosine) or "L2".
batch_size (int) – Vectors per insert batch.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.
PostgreSQL + pgvector store (lazy import).
Install with: pip install ractogateway[rag-pgvector]
- class ractogateway.rag.stores.pgvector_store.PGVectorStore(dsn, *, table='rag_chunks', dimension=None, distance='cosine', batch_size=100)[source]
Bases: BaseVectorStore
Vector store backed by PostgreSQL with the pgvector extension.
- add(chunks)[source]
Add chunks (with embeddings) to the store.
- Parameters:
chunks (list[Chunk]) – Chunks to index. Each chunk must have a non-None embedding.
- Raises:
ValueError – If any chunk has embedding=None.