base

Classes

base.Language(value[, names, module, ...])

Enum of supported programming languages.

base.TextSplitter(chunk_size, chunk_overlap, ...)

Interface for splitting text into chunks.

base.TokenTextSplitter([encoding_name, ...])

Split text into tokens using a model tokenizer.

base.Tokenizer(chunk_overlap, ...)

Tokenizer data class.

Functions

base.split_text_on_tokens(*, text, tokenizer)

Split the incoming text and return chunks using the given tokenizer.
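
The following is a minimal, standalone sketch of how token-based chunking with overlap can work. It does not import the library. `SketchTokenizer` and `sketch_split_on_tokens` are hypothetical names, and every field other than `chunk_overlap` (which appears in the `base.Tokenizer` signature above) is an assumption made for illustration. A whitespace "tokenizer" stands in for a real model tokenizer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SketchTokenizer:
    """Hypothetical stand-in for base.Tokenizer.

    Only chunk_overlap is taken from the signature listed above;
    the remaining fields are assumptions for this sketch.
    """

    chunk_overlap: int                      # tokens shared by consecutive chunks
    tokens_per_chunk: int                   # assumed: maximum tokens per chunk
    encode: Callable[[str], List[str]]      # assumed: text -> tokens
    decode: Callable[[List[str]], str]      # assumed: tokens -> text


def sketch_split_on_tokens(*, text: str, tokenizer: SketchTokenizer) -> List[str]:
    """Slide a fixed-size token window over the text and decode each window."""
    if tokenizer.chunk_overlap >= tokenizer.tokens_per_chunk:
        raise ValueError("chunk_overlap must be smaller than tokens_per_chunk")

    tokens = tokenizer.encode(text)
    # Each new window starts this many tokens after the previous one.
    step = tokenizer.tokens_per_chunk - tokenizer.chunk_overlap

    chunks: List[str] = []
    start = 0
    while start < len(tokens):
        end = min(start + tokenizer.tokens_per_chunk, len(tokens))
        chunks.append(tokenizer.decode(tokens[start:end]))
        if end == len(tokens):
            break  # the last window reached the end of the input
        start += step
    return chunks


# Toy tokenizer: whitespace-separated words act as tokens.
word_tokenizer = SketchTokenizer(
    chunk_overlap=1,
    tokens_per_chunk=3,
    encode=str.split,
    decode=" ".join,
)

chunks = sketch_split_on_tokens(text="a b c d e f g", tokenizer=word_tokenizer)
print(chunks)  # ['a b c', 'c d e', 'e f g']
```

With an overlap of 1 and a window of 3, each chunk repeats the last token of the previous chunk, which preserves some context across chunk boundaries at the cost of duplicated tokens.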