Recursive Splitter Processor

The Recursive Splitter processor takes a text document as input and progressively splits the text into smaller chunks using a list of separators.

This approach allows handling various levels of granularity, ensuring the resulting chunks fit within size constraints while preserving context.
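The recursive idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the processor's actual implementation: the function name `recursive_split` and its parameters are hypothetical, and it is simplified in that it does not merge small pieces back together or apply overlap.

```python
def recursive_split(text, separators, chunk_size):
    """Split text into pieces of at most chunk_size characters.

    Separators are tried in order, coarsest first (hypothetical helper).
    """
    if len(text) <= chunk_size:
        # Already small enough; drop empty pieces.
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut by character count.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        # Pieces that are still too long are split again with the next separator.
        chunks.extend(recursive_split(piece, rest, chunk_size))
    return chunks


chunks = recursive_split("aaa bbb\n\nccc ddd eee", ["\n\n", " "], 7)
```

In this sketch the first paragraph fits within the limit and is kept whole, while the second is too long and is split again on spaces.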

Below are the configuration details of the processor.

Drop Non-Chunked Columns

When this option is enabled, all columns except the chunked ones are dropped from the output.

Input Column

Select the column containing the text you want to split.

Output Column

Specify the name of the column where the split text chunks will be stored.

Separator

Type

Choose whether to define a separator as a string or regex.
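The difference between the two separator types can be illustrated with Python's `re` module. This is a sketch under the assumption that a string separator is matched literally and a regex separator is interpreted as a pattern; the helper `split_on` is hypothetical and not part of the processor.

```python
import re


def split_on(text, separator, is_regex):
    """Split text on a literal string or on a regular expression (hypothetical helper)."""
    # A literal separator is escaped so characters like "." lose their regex meaning.
    pattern = separator if is_regex else re.escape(separator)
    return re.split(pattern, text)


literal_parts = split_on("a.b.c", ".", is_regex=False)
regex_parts = split_on("a1b22c", r"\d+", is_regex=True)
```

A character such as `.` splits only on full stops when treated as a string, but matches any character when treated as a regex, so the chosen type matters for separators that contain special characters.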

Value

Enter your custom separator value.

Include separator in chunks

Check the box if you want to include the separators in the output chunks.
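One way to picture this option is shown below. It is a sketch only: the helper `split_keep_separator` is hypothetical, and attaching each separator to the end of the preceding chunk is an assumption, since the processor may place it differently.

```python
import re


def split_keep_separator(text, separator):
    """Split on a literal separator and keep it at the end of each chunk (hypothetical helper)."""
    # A capturing group makes re.split return the separators as well:
    # [text, sep, text, sep, ..., text]
    parts = re.split(f"({re.escape(separator)})", text)
    chunks = []
    for i in range(0, len(parts), 2):
        sep = parts[i + 1] if i + 1 < len(parts) else ""
        chunks.append(parts[i] + sep)
    # Drop empty chunks, e.g. when the text ends with the separator.
    return [c for c in chunks if c]


with_separators = split_keep_separator("one. two. three", ". ")
```

With the box unchecked, the equivalent result would contain only the text between separators.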

Chunk

Size

Set the maximum number of characters for each chunk.

Overlap

Define the number of characters to repeat between consecutive chunks.
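The interaction between Size and Overlap can be sketched as a sliding window over the text. This is an illustration under simplifying assumptions: the helper `chunk_with_overlap` is hypothetical, it cuts purely by character count, and it ignores separators, which the actual processor takes into account.

```python
def chunk_with_overlap(text, size, overlap):
    """Cut text into windows of `size` characters that share `overlap` characters (hypothetical helper)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            # The window already reaches the end of the text.
            break
    return chunks


windows = chunk_with_overlap("abcdefghij", size=4, overlap=2)
```

Each chunk in this sketch begins with the last `overlap` characters of the previous one, which is how overlap helps carry context across chunk boundaries. A larger overlap produces more chunks for the same text.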

ADD CONFIGURATION

Use this option to add further configurations as key-value pairs.
