ractogateway.rag.chunkers.base

Abstract base class for text chunkers.

class ractogateway.rag.chunkers.base.BaseChunker

Bases: ABC

Split a Document into a list of Chunk objects.

Each chunk preserves provenance (doc_id, chunk_index, start_char, end_char) in its ChunkMetadata.
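As an illustration of what this provenance makes possible, the sketch below recovers a chunk's source span from its offsets. The dataclass is a hypothetical stand-in, not the library's real ChunkMetadata definition: only the four field names come from the description above, and the sketch assumes start_char and end_char form a half-open range into the document text, like a Python slice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkMetadata:
    """Stand-in for illustration; only the field names come from the docs."""
    doc_id: str
    chunk_index: int
    start_char: int
    end_char: int  # assumed exclusive, like a Python slice bound


def source_span(document_text: str, meta: ChunkMetadata) -> str:
    """Recover the text a chunk was cut from, using its provenance offsets."""
    return document_text[meta.start_char:meta.end_char]


text = "Chunkers split long documents into smaller pieces."
meta = ChunkMetadata(doc_id="doc-1", chunk_index=1, start_char=9, end_char=14)
print(source_span(text, meta))
```

If the real offsets are inclusive or counted in a different unit, the slice bounds would need adjusting accordingly.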

abstractmethod chunk(document)

Split document into chunks.

Parameters:

document (Document) – The fully loaded document to split.

Return type:

list[Chunk]

Returns:

list[Chunk] – Chunks in document order. Depending on the concrete chunker, consecutive chunks are either disjoint or overlap slightly.
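A minimal sketch of how a concrete subclass might satisfy this contract follows. Everything in it is an assumption made for illustration: Document, Chunk, ChunkMetadata and BaseChunker are local stand-ins so the example runs on its own, the text attribute name and the exclusive end_char are guesses, and FixedSizeChunker is a hypothetical class, not one the library is known to ship.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


# Hypothetical stand-ins so the sketch is self-contained; the real
# Document, Chunk and ChunkMetadata classes may have different fields.
@dataclass
class Document:
    doc_id: str
    text: str  # assumed attribute name


@dataclass
class ChunkMetadata:
    doc_id: str
    chunk_index: int
    start_char: int
    end_char: int  # assumed exclusive


@dataclass
class Chunk:
    text: str
    metadata: ChunkMetadata


class BaseChunker(ABC):
    """Stand-in mirroring the documented interface."""

    @abstractmethod
    def chunk(self, document: Document) -> list[Chunk]:
        ...


class FixedSizeChunker(BaseChunker):
    """Hypothetical chunker: fixed-width windows with optional overlap."""

    def __init__(self, size: int = 20, overlap: int = 0) -> None:
        if size <= 0 or not 0 <= overlap < size:
            raise ValueError("need size > 0 and 0 <= overlap < size")
        self.size = size
        self.overlap = overlap

    def chunk(self, document: Document) -> list[Chunk]:
        chunks: list[Chunk] = []
        step = self.size - self.overlap  # how far each window advances
        total = len(document.text)
        for index, start in enumerate(range(0, total, step)):
            end = min(start + self.size, total)
            meta = ChunkMetadata(document.doc_id, index, start, end)
            chunks.append(Chunk(document.text[start:end], meta))
            if end == total:  # last window reached the end of the text
                break
        return chunks


doc = Document(doc_id="doc-1", text="abcdefghijklmnopqrstuvwxyz")
for c in FixedSizeChunker(size=10, overlap=2).chunk(doc):
    print(c.metadata.chunk_index, c.metadata.start_char, c.metadata.end_char, c.text)
```

With overlap=0 the windows are disjoint; a positive overlap makes each chunk repeat the tail of the previous one, which matches the two cases the return description allows.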