ractogateway.rag.chunkers.recursive_chunker

Recursive character text splitter (LangChain-style).

Tries progressively finer separators ("\n\n", "\n", ". ", " " and finally character-by-character) until every piece fits within chunk_size.

class ractogateway.rag.chunkers.recursive_chunker.RecursiveChunker(chunk_size=512, overlap=50, separators=None)

Bases: BaseChunker

Split text recursively using a priority list of separators.

Parameters:
  • chunk_size (int) – Maximum number of characters per chunk.

  • overlap (int) – Number of characters of overlap between consecutive chunks.

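  • separators (list[str] | None) – Ordered list of separator strings to try, from coarsest to finest. The first separator that produces pieces within chunk_size is used; pieces that are still too long are split again with the next separator. If None, the default progression given in the module description is used.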
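
The separator fallback described for this class can be illustrated with a minimal standalone sketch. It is not the RecursiveChunker source: the function name recursive_split and the DEFAULT_SEPARATORS constant are hypothetical, the default list is assumed from the module description, and the sketch works on plain strings rather than Document objects. For brevity it discards the separators and does not merge small neighbouring pieces back together, which a real chunker would likely do.

```python
from typing import List, Optional

# Assumed defaults, copied from the module description; not read from the library.
DEFAULT_SEPARATORS = ["\n\n", "\n", ". ", " "]


def recursive_split(
    text: str,
    chunk_size: int = 512,
    separators: Optional[List[str]] = None,
) -> List[str]:
    """Return pieces of `text`, each at most `chunk_size` characters long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if separators is None:
        separators = DEFAULT_SEPARATORS
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # Last resort: no separators left, so cut at fixed character offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    pieces: List[str] = []
    for part in text.split(sep):
        if len(part) <= chunk_size:
            if part:  # skip empty strings produced by adjacent separators
                pieces.append(part)
        else:
            # This part is still too long: retry it with the finer separators.
            pieces.extend(recursive_split(part, chunk_size, finer))
    return pieces
```

Calling recursive_split("aaaa bbbb cccc", chunk_size=5) falls through the paragraph, line and sentence separators before splitting on spaces, while a string with no separators at all is cut at fixed offsets.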

chunk(document)

Split document into chunks.

Parameters:

document (Document) – The fully-loaded document to split.

Return type:

list[Chunk]

Returns:

list[Chunk] – Chunks in document order. Consecutive chunks share up to overlap characters; with overlap=0 they do not overlap.
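
How RecursiveChunker applies overlap internally is not documented here. One common approach, shown below as a sketch under that assumption, is to prefix each piece with the tail of the piece before it. The helper name add_overlap is hypothetical and operates on plain strings, not Chunk objects.

```python
from typing import List


def add_overlap(pieces: List[str], overlap: int = 50) -> List[str]:
    """Prefix every piece after the first with the last `overlap`
    characters of the preceding piece."""
    if overlap < 0:
        raise ValueError("overlap must be non-negative")
    chunks: List[str] = []
    previous = ""
    for piece in pieces:
        # The first piece has no predecessor, so its tail is empty.
        tail = previous[-overlap:] if overlap > 0 else ""
        chunks.append(tail + piece)
        previous = piece
    return chunks
```

With this scheme a chunk can grow to chunk_size + overlap characters, so an implementation that treats chunk_size as a hard limit would need to split pieces to chunk_size - overlap first.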