Character Splitter Processor
The Character Splitter processor takes a text document as input and breaks the text down into smaller chunks based on the parameters you configure.
Below are the configuration details of the processor.
Drop Non-Chunked Columns
When this option is enabled, all columns except the chunked ones are dropped from the output.
Input Column
Select the column containing the text you want to split.
Output Column
Specify the name of the column where the split text chunks will be stored.
Separator
Type
Choose whether the separator is defined as a plain string or as a regular expression (regex).
Value
Enter your custom separator value.
Include separator in chunks
Check the box if you want to include the separators in the output chunks.
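As a rough illustration of how the separator settings interact, the sketch below splits text on either a literal string or a regex and optionally keeps the separator. It is not Gathr's implementation: the function name split_on_separator and its parameters are hypothetical, and attaching a kept separator to the end of the preceding chunk is an assumption.

```python
import re


def split_on_separator(text, separator, is_regex=False, keep_separator=False):
    """Split text on a separator (hypothetical helper, not Gathr's code).

    is_regex       -- treat `separator` as a regex instead of a literal string
    keep_separator -- keep each separator at the end of the preceding piece
                      (this placement is an assumption)
    """
    # A literal separator is escaped so characters like "." match themselves.
    pattern = re.compile(separator if is_regex else re.escape(separator))
    pieces = []
    start = 0
    for match in pattern.finditer(text):
        if match.end() == match.start():
            continue  # ignore zero-width matches
        end = match.end() if keep_separator else match.start()
        piece = text[start:end]
        if piece:
            pieces.append(piece)
        start = match.end()
    # Whatever follows the last separator forms the final piece.
    if start < len(text):
        pieces.append(text[start:])
    return pieces


plain = split_on_separator("a. b. c", ". ")
kept = split_on_separator("a. b. c", ". ", keep_separator=True)
by_regex = split_on_separator("one\n\ntwo\n\n\nthree", r"\n{2,}", is_regex=True)
```

Here plain holds the pieces without separators, kept holds the same pieces with ". " retained, and by_regex treats any run of two or more newlines as a single separator.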
Chunk
Size
Set the maximum number of characters for each chunk.
Overlap
Define the number of characters to repeat between consecutive chunks.
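To make the relationship between Size and Overlap concrete, here is a minimal sketch of fixed-size chunking with overlap. It ignores separators entirely, so it only approximates what the processor does; the function name chunk_text and its parameters are hypothetical.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Cut text into chunks of at most `chunk_size` characters, where each
    chunk repeats the last `overlap` characters of the previous one.

    Hypothetical helper for illustration; separators are not considered.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks


with_overlap = chunk_text("abcdefghij", chunk_size=4, overlap=1)
without_overlap = chunk_text("abcdefghij", chunk_size=4)
```

With a size of 4 and an overlap of 1, each chunk starts with the last character of the previous chunk. A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks.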
ADD CONFIGURATION
Use this option to add further configurations as key-value pairs.