Abstract base class for text chunkers.
Bases: ABC
Split a Document into a list of Chunk objects.
Each chunk preserves provenance (doc_id, chunk_index, start_char, end_char) in its ChunkMetadata.
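The provenance fields can be pictured with a minimal sketch. The field types, the `text` and `metadata` attribute names, and the convention that `end_char` is exclusive are assumptions made for illustration; the library's actual definitions may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkMetadata:
    """Hypothetical stand-in: where a chunk came from."""
    doc_id: str        # identifier of the source document
    chunk_index: int   # position of the chunk within the document
    start_char: int    # offset of the first character (inclusive)
    end_char: int      # offset just past the last character (assumed exclusive)


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in: chunk text plus its provenance."""
    text: str
    metadata: ChunkMetadata


source_text = "Chunkers split documents into pieces."
chunk = Chunk(
    text=source_text[9:14],
    metadata=ChunkMetadata(doc_id="doc-1", chunk_index=1, start_char=9, end_char=14),
)

# With exclusive end offsets, slicing the source text recovers the chunk text.
recovered = source_text[chunk.metadata.start_char:chunk.metadata.end_char]
```

Under these assumptions, `recovered` equals `chunk.text`, which is what makes it possible to map a chunk back to its exact location in the original document.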
Split document into chunks.
document (Document) – The fully-loaded document to split.
list[Chunk] – Chunks in document order. Adjacent chunks either do not overlap or overlap slightly, depending on the chunker.
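As an illustration of the contract described above, here is a sketch of a concrete subclass. Everything named here is hypothetical: the class names `BaseChunker` and `FixedSizeChunker`, the method name `chunk`, the `size` and `overlap` parameters, and the stand-in `Document`, `Chunk`, and `ChunkMetadata` dataclasses are assumptions, not the library's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Hypothetical stand-ins for the library's data types.
@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


@dataclass(frozen=True)
class ChunkMetadata:
    doc_id: str
    chunk_index: int
    start_char: int
    end_char: int  # assumed exclusive


@dataclass(frozen=True)
class Chunk:
    text: str
    metadata: ChunkMetadata


class BaseChunker(ABC):
    """Assumed shape of the abstract base class."""

    @abstractmethod
    def chunk(self, document: Document) -> list[Chunk]:
        """Split document into chunks."""


class FixedSizeChunker(BaseChunker):
    """Illustrative chunker: fixed-width character windows with optional overlap."""

    def __init__(self, size: int = 20, overlap: int = 0) -> None:
        if size <= 0 or not 0 <= overlap < size:
            raise ValueError("require size > 0 and 0 <= overlap < size")
        self.size = size
        self.overlap = overlap

    def chunk(self, document: Document) -> list[Chunk]:
        text = document.text
        step = self.size - self.overlap  # distance between window starts
        chunks: list[Chunk] = []
        for index, start in enumerate(range(0, len(text), step)):
            end = min(start + self.size, len(text))
            metadata = ChunkMetadata(document.doc_id, index, start, end)
            chunks.append(Chunk(text=text[start:end], metadata=metadata))
            if end == len(text):
                break  # the last window reached the end of the document
        return chunks


doc = Document(doc_id="doc-1", text="abcdefghij")
chunks = FixedSizeChunker(size=4, overlap=1).chunk(doc)
texts = [c.text for c in chunks]
```

With `size=4` and `overlap=1`, each window starts three characters after the previous one, so adjacent chunks share exactly one character. Setting `overlap=0` yields strictly non-overlapping chunks.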