ractogateway.rag._models.document
Core document and chunk models for RAG.
Every piece of content in the RAG pipeline is represented as a Document
(raw, as loaded from a file) or a Chunk (a processed, embeddable slice of
a document). Both are strict Pydantic models with no unvalidated fields.
- class ractogateway.rag._models.document.ChunkMetadata(**data)[source]
Bases: BaseModel
Provenance and positional data attached to every chunk.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- source: str
- chunk_index: int
- total_chunks: int
- start_char: int
- end_char: int
- doc_id: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
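A minimal sketch of how the fields above validate. So that it runs on its own, it defines a local stand-in class that mirrors only the documented field names and types; the real ChunkMetadata may add defaults or constraints not shown here. Treating every field as required is an assumption, and the values are illustrative.

```python
from pydantic import BaseModel, ValidationError


class ChunkMetadata(BaseModel):
    """Local stand-in mirroring the documented fields (not the library class)."""

    source: str
    chunk_index: int
    total_chunks: int
    start_char: int
    end_char: int
    doc_id: str


# Valid input: every field is supplied with the documented type.
meta = ChunkMetadata(
    source="manual",
    chunk_index=0,
    total_chunks=3,
    start_char=0,
    end_char=120,
    doc_id="doc-001",
)

# Invalid input: a non-numeric string cannot be coerced to int,
# so construction raises ValidationError instead of storing bad data.
rejected = False
try:
    ChunkMetadata(
        source="manual",
        chunk_index="not-a-number",
        total_chunks=3,
        start_char=0,
        end_char=120,
        doc_id="doc-001",
    )
except ValidationError:
    rejected = True
```

With the real class, the import would presumably be from ractogateway.rag._models.document rather than a local definition.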
- class ractogateway.rag._models.document.Document(**data)[source]
Bases: BaseModel
A raw document loaded from a file or supplied as plain text.
- Parameters:
content (str) – The full extracted text of the document.
source (str) – Absolute file path, URL, or a descriptive label (e.g. "manual").
metadata (dict[str, Any]) – Free-form key/value pairs (file size, author, MIME type, …).
doc_id (str) – Auto-generated UUID; override only when you need stable IDs.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- doc_id: str
- content: str
- source: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
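The doc_id behaviour described in the parameters (auto-generated UUID, overridable for stable IDs) can be sketched as follows. The class is a local stand-in, not the library's implementation. Using uuid4 as the generator and an empty dict as the metadata default are both assumptions.

```python
from typing import Any, Dict
from uuid import UUID, uuid4

from pydantic import BaseModel, Field


class Document(BaseModel):
    """Local stand-in mirroring the documented parameters (not the library class)."""

    # Assumption: a fresh UUID4 string is generated when doc_id is omitted.
    doc_id: str = Field(default_factory=lambda: str(uuid4()))
    content: str
    source: str
    # Assumption: metadata defaults to an empty dict.
    metadata: Dict[str, Any] = Field(default_factory=dict)


# doc_id omitted: an ID is generated automatically.
auto = Document(content="Installation steps for the device.", source="manual")

# doc_id supplied: useful when re-ingesting the same file must yield the same ID.
pinned = Document(
    doc_id="manual-v1",
    content="Installation steps for the device.",
    source="manual",
    metadata={"author": "docs-team"},
)

# The generated ID parses as a UUID; the pinned one is kept verbatim.
parsed = UUID(auto.doc_id)
```

Pinning doc_id is only worthwhile when downstream stores key on it, for example to replace a document's chunks on re-ingestion.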
- class ractogateway.rag._models.document.Chunk(**data)[source]
Bases: BaseModel
A single embeddable slice of a document.
Produced by a BaseChunker and enriched with an embedding vector by a BaseEmbedder.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- chunk_id: str
- doc_id: str
- content: str
- metadata: ChunkMetadata
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
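To illustrate how Chunk and ChunkMetadata fit together, the sketch below slices a text into fixed-size pieces and fills in the positional fields. Both classes are local stand-ins for the documented models. The helper split_fixed is hypothetical and is not the algorithm of any BaseChunker. The generated chunk_id and the end-exclusive reading of end_char are also assumptions.

```python
from typing import List
from uuid import uuid4

from pydantic import BaseModel, Field


class ChunkMetadata(BaseModel):
    """Local stand-in for the documented ChunkMetadata fields."""

    source: str
    chunk_index: int
    total_chunks: int
    start_char: int
    end_char: int
    doc_id: str


class Chunk(BaseModel):
    """Local stand-in for the documented Chunk fields."""

    # Assumption: chunk IDs are generated UUID4 strings.
    chunk_id: str = Field(default_factory=lambda: str(uuid4()))
    doc_id: str
    content: str
    metadata: ChunkMetadata


def split_fixed(text: str, doc_id: str, source: str, size: int) -> List[Chunk]:
    """Hypothetical helper: cut text into consecutive slices of at most `size` chars."""
    starts = range(0, len(text), size)
    total = len(starts)
    chunks = []
    for index, start in enumerate(starts):
        end = min(start + size, len(text))  # end offset treated as exclusive
        chunks.append(
            Chunk(
                doc_id=doc_id,
                content=text[start:end],
                metadata=ChunkMetadata(
                    source=source,
                    chunk_index=index,
                    total_chunks=total,
                    start_char=start,
                    end_char=end,
                    doc_id=doc_id,
                ),
            )
        )
    return chunks


text = "Retrieval augmented generation splits documents into chunks."
chunks = split_fixed(text, doc_id="doc-001", source="manual", size=25)
```

Under these assumptions, each chunk's content equals text[start_char:end_char], so the original text can be rebuilt by concatenating chunks in chunk_index order.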