Recursive Splitter Processor
The Recursive Splitter processor takes a text document as input and splits the text progressively into smaller chunks using a list of separators.
This approach allows handling various levels of granularity, ensuring the resulting chunks fit within size constraints while preserving context.
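The progressive splitting described above can be sketched in Python as follows. This is only an illustration of the general technique, not Gathr's implementation: the function name `recursive_split`, its parameters, and the greedy merging of small pieces are all assumptions made for the example.

```python
def recursive_split(text, separators, chunk_size):
    """Split text into chunks of at most chunk_size characters.

    Hypothetical sketch: separators are tried in order (coarsest first,
    each assumed non-empty). Adjacent pieces are merged back together
    while they still fit; a piece that is too long on its own is split
    again with the remaining separators.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        if not piece:
            continue
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # piece still fits in the open chunk
            continue
        if current:
            chunks.append(current)  # close the open chunk
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is too long: recurse with the finer separators.
            chunks.extend(recursive_split(piece, rest, chunk_size))
    if current:
        chunks.append(current)
    return chunks


text = "Para one.\n\nPara two is longer here."
chunks = recursive_split(text, ["\n\n", " "], chunk_size=12)
print(chunks)  # ['Para one.', 'Para two is', 'longer here.']
```

In this sketch the first paragraph fits as a whole, while the second exceeds the limit and is split again on spaces, which is how coarser separators preserve more context when the size constraint allows it.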
Below are the configuration details of the processor.
Drop Non-Chunked Columns
When this option is enabled, all columns except the chunked ones are dropped from the output.
Input Column
Select the column containing the text you want to split.
Output Column
Specify the name of the column where the split text chunks will be stored.
Separator
Type
Choose whether to define the separator as a string or as a regex.
Value
Enter your custom separator value.
Include separator in chunks
Check the box if you want to include the separators in the output chunks.
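The difference between a string separator, a regex separator, and the include-separator option can be illustrated with a short Python sketch. The helper `split_on_separator` and its parameters are hypothetical, and attaching the separator to the end of the preceding chunk is an assumption; the processor may place it differently.

```python
import re


def split_on_separator(text, value, is_regex=False, include_separator=False):
    """Split text on a separator given as a literal string or a regex.

    Hypothetical helper: when include_separator is True, each separator
    is kept at the end of the chunk that precedes it (an assumption).
    """
    # A literal string is escaped so characters like "." match only themselves.
    pattern = re.compile(value if is_regex else re.escape(value))
    chunks, start = [], 0
    for match in pattern.finditer(text):
        if match.end() == match.start():
            continue  # ignore empty matches
        end = match.end() if include_separator else match.start()
        if end > start:
            chunks.append(text[start:end])
        start = match.end()
    if start < len(text):
        chunks.append(text[start:])
    return chunks


sample = "a. b! c"
print(split_on_separator(sample, ". "))                       # ['a', 'b! c']
print(split_on_separator(sample, r"[.!]\s+", is_regex=True))  # ['a', 'b', 'c']
print(split_on_separator(sample, r"[.!]\s+", is_regex=True,
                         include_separator=True))             # ['a. ', 'b! ', 'c']
```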
Chunk
Size
Set the maximum number of characters for each chunk.
Overlap
Define the number of characters to repeat between consecutive chunks.
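To show how Size and Overlap interact, here is a minimal Python sketch of a fixed-width sliding window. It ignores separators entirely and uses a hypothetical function name, so it only illustrates the arithmetic: each chunk starts `size - overlap` characters after the previous one.

```python
def chunk_with_overlap(text, size, overlap):
    """Cut text into chunks of at most `size` characters, where each chunk
    repeats the last `overlap` characters of the previous one.

    Hypothetical sketch; a real splitter would also respect separators.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_with_overlap("abcdefghij", size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Here "d" and "g" appear in two consecutive chunks each, which is the repeated context that the Overlap setting controls.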
ADD CONFIGURATION
Use this option to add further configurations as key-value pairs.