ractogateway.rag.chunkers.recursive_chunker
Recursive character text splitter (LangChain-style).
Tries progressively finer separators ("\n\n", "\n", ". ",
" " and finally character-by-character) until every piece fits within
chunk_size.
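The fallback strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the RecursiveChunker implementation: recursive_split and DEFAULT_SEPARATORS are hypothetical names, the default separator list is assumed from the description above, separators are dropped rather than preserved, and neither merging of small pieces nor overlap is applied.

```python
from typing import List, Optional, Sequence

# Assumed priority list, taken from the module description; the empty
# string stands for the final character-by-character fallback.
DEFAULT_SEPARATORS = ["\n\n", "\n", ". ", " ", ""]


def recursive_split(text: str, chunk_size: int,
                    separators: Optional[Sequence[str]] = None) -> List[str]:
    """Hypothetical helper: return pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    seps = list(DEFAULT_SEPARATORS if separators is None else separators)

    # Base case: the text already fits (empty text yields no pieces).
    if len(text) <= chunk_size:
        return [text] if text else []

    # Finest level: hard cut every chunk_size characters.
    if not seps or seps[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = seps[0], seps[1:]
    pieces: List[str] = []
    for part in text.split(sep):
        if not part:
            continue  # skip empty fragments from adjacent separators
        if len(part) <= chunk_size:
            pieces.append(part)
        else:
            # Still too long: retry this part with the next, finer separator.
            pieces.extend(recursive_split(part, chunk_size, finer))
    return pieces


if __name__ == "__main__":
    print(recursive_split("alpha beta\n\ngamma delta epsilon", 12))
```

With chunk_size=12, the first paragraph fits after the "\n\n" split, while the second is too long and is split again on " ".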
class ractogateway.rag.chunkers.recursive_chunker.RecursiveChunker(chunk_size=512, overlap=50, separators=None)
Bases: BaseChunker
Split text recursively using a priority list of separators.
- Parameters:
chunk_size (int) – Maximum number of characters per chunk.
overlap (int) – Number of characters of overlap between consecutive chunks.
separators (list[str] | None) – Ordered list of separator strings to try, from coarsest to finest. The first separator that
produces pieces within chunk_size is used.
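To make the relationship between chunk_size and overlap concrete, the sketch below cuts fixed-size character windows whose starts advance by chunk_size - overlap. It is only an illustration of what overlap means: sliding_windows is a hypothetical helper, it ignores separators entirely, and RecursiveChunker may apply overlap differently.

```python
from typing import List


def sliding_windows(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Hypothetical helper: fixed-size windows sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window start advances
    windows: List[str] = []
    start = 0
    while start < len(text):
        windows.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return windows


if __name__ == "__main__":
    print(sliding_windows("abcdefghij", chunk_size=4, overlap=1))
```

With chunk_size=4 and overlap=1, each window repeats the last character of the previous one. With overlap=0 the windows are disjoint.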
chunk(document)
Split document into chunks.
- Parameters:
document (Document) – The fully-loaded document to split.
- Return type:
list[Chunk]
- Returns:
list[Chunk] – Ordered list of chunks. Consecutive chunks are disjoint when overlap is 0 and otherwise share up to overlap characters.