Kudu ETL Source
Apache Kudu is a column-oriented data store in the Apache Hadoop ecosystem. It enables fast analytics on fast (rapidly changing) data. Kudu is engineered to take advantage of hardware and in-memory processing, and it lowers query latency significantly compared with similar tools.
Schema Type
See the topic Provide Schema for ETL Source → to learn how to provide schema details for data sources.
After providing schema type details, the next step is to configure the data source.
Data Source Configuration
Each configuration property available in the Kudu data source is explained below.
Connection Name
Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Kudu. Otherwise, create one as explained in the topic Kudu Connection →
Table Name
Name of the table from which data is to be fetched.
ADD CONFIGURATION: Use this option to add custom Kudu properties as key-value pairs.
Metadata
Enter the schema and select a table to view its metadata.
| Field | Description | 
|---|---|
| Table | Select the table whose metadata you want to view. | 
| Column Name | Name of the column generated from the table. | 
| Column Type | Type of the column, for example, Text or Int. | 
| Nullable | Whether the column can hold null values. | 
Detect Schema
Check the populated schema details. For more details, see Schema Preview →
Incremental Read
| Field | Description | 
|---|---|
| Enable Incremental Read | Select this checkbox to enable incremental read support. | 
| Column to Check | Select the column on which incremental read will work. The list displays the columns that have integer, long, date, timestamp, or decimal values. | 
| Start Value | Enter a value of the reference column. Only the records whose reference column value is greater than this value will be read. | 
| Read Control Type | Provides three options to control the data to be fetched: None, Limit By Count, and Maximum Value. None: All records whose reference column value is greater than the offset are read. Limit By Count: The specified number of records whose reference column value is greater than the offset are read. Maximum Value: All records whose reference column value is greater than the offset and less than the Column Value field are read. For None and Limit By Count, it is recommended that the table data be sequential and sorted in increasing order. | 
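The three Read Control Type options can be easier to compare with a small example. The following Python sketch only illustrates the selection rules described in the table; it is not Gathr's implementation. The function name `incremental_read`, its parameters, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Optional


def incremental_read(
    rows: List[Dict[str, Any]],
    column: str,
    offset: Any,
    control: str = "None",
    limit: Optional[int] = None,
    max_value: Optional[Any] = None,
) -> List[Dict[str, Any]]:
    """Illustrative only: pick rows whose reference column exceeds the offset."""
    # Every option starts from the same rule: reference column > offset.
    selected = [r for r in rows if r[column] > offset]

    if control == "None":
        return selected
    if control == "Limit By Count":
        if limit is None:
            raise ValueError("Limit By Count needs a record count")
        # Takes the first `limit` matches in table order, which is why
        # sequential, sorted data is recommended for this option.
        return selected[:limit]
    if control == "Maximum Value":
        if max_value is None:
            raise ValueError("Maximum Value needs an upper bound")
        # Upper bound is exclusive: reference column < max_value.
        return [r for r in selected if r[column] < max_value]
    raise ValueError(f"Unknown read control type: {control}")


# Hypothetical table with an increasing integer reference column "id".
rows = [{"id": i, "value": f"row-{i}"} for i in range(1, 11)]

none_ids = [r["id"] for r in incremental_read(rows, "id", 4)]
limit_ids = [r["id"] for r in incremental_read(rows, "id", 4, "Limit By Count", limit=3)]
max_ids = [r["id"] for r in incremental_read(rows, "id", 4, "Maximum Value", max_value=8)]

print(none_ids)   # [5, 6, 7, 8, 9, 10]
print(limit_ids)  # [5, 6, 7]
print(max_ids)    # [5, 6, 7]
```

With a start value of 4, None returns every later record, Limit By Count stops after three records, and Maximum Value stops before the record whose reference value reaches 8.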
Notes
Optionally, enter notes in the Notes → tab and save the configuration.