Cassandra Emitter
The Cassandra emitter allows you to store data in a Cassandra table.
Cassandra Emitter Configuration
To add a Cassandra emitter into your pipeline, drag the emitter to the canvas and connect it to a Data Source or processor.
The configuration settings are as follows:
| Field | Description | 
|---|---|
| Connection Name | All Cassandra connections are listed here. Select a connection for connecting to Cassandra. | 
| KeySpace | Cassandra keyspace name. If the keyspace does not exist in Cassandra, a new keyspace is created. | 
| Output Fields | Fields of the output message to be written to the table. | 
| Key Columns | A single or compound primary key, consisting of the partition key and one or more additional columns that determine clustering. | 
| Table Name Expression | Cassandra table name. If the table does not exist in the keyspace, a new table is created. Tables can also be created dynamically, based on the field name provided in the table name expression. | 
| Consistency Level | Consistency level refers to how up-to-date and synchronized a row of Cassandra data is on all of its replicas. The levels are: ONE: only a single replica must respond. TWO: two replicas must respond. THREE: three replicas must respond. QUORUM: a majority (n/2 + 1) of the replicas must respond. ALL: all of the replicas must respond. LOCAL_QUORUM: a majority of the replicas in the local data center (whichever data center the coordinator is in) must respond. EACH_QUORUM: a majority of the replicas in each data center must respond. LOCAL_ONE: only a single replica must respond; in a multi-data-center cluster, this also guarantees that read requests are not sent to replicas in a remote data center. | 
| Replication Strategy | A replication strategy specifies the implementation class for determining the nodes where replicas are placed. Possible strategies are SimpleStrategy and NetworkTopologyStrategy. | 
| Replication Factor | The number of replicas (copies) of each row maintained across the cluster. | 
| Enable TTL | Select the checkbox to enable TTL (Time to Live), so that records persist only for the specified duration. | 
| TTL Value | Appears only when the Enable TTL checkbox is selected. Provide the TTL value in seconds. | 
| Checkpoint Storage Location | Select the checkpointing storage location. Available options are HDFS, S3, and EFS. | 
| Checkpoint Connections | Select the connection. Connections are listed corresponding to the selected storage location. | 
| Checkpoint Directory | The path where the Spark application stores checkpointing data. For HDFS and EFS, enter a relative path such as /user/hadoop/checkpointingDir; the system adds a suitable prefix by itself. For S3, enter an absolute path such as: S3://BucketName/checkpointingDir | 
| Time-Based Check Point | Select the checkbox to enable a time-based checkpoint on each pipeline run, i.e., on each run the checkpoint location provided above is appended with the current time in milliseconds. | 
| Batch Size | Number of records to be picked for inserting into Cassandra. | 
| Output Mode | Output mode to be used while writing the data to the streaming sink. Select one of the three options: Append: only the new rows in the streaming data are written to the sink. Complete: all the rows in the streaming data are written to the sink every time there are updates. Update: only the rows that were updated in the streaming data are written to the sink every time there are updates. | 
| Save Mode | Save Mode specifies the expected behavior of saving data to a data sink. ErrorIfExists: when persisting data, if the data already exists, an exception is thrown. Append: when persisting data, if the data/table already exists, the contents of the data are appended to the existing data. Overwrite: when persisting data, if the data/table already exists, the existing data is overwritten by the contents of the data. Ignore: when persisting data, if the data/table already exists, the save operation neither saves the contents of the data nor changes the existing data. This is similar to a CREATE TABLE IF NOT EXISTS in SQL. | 
| Enable Trigger | Trigger defines how frequently a streaming query should be executed. | 
| Processing Time | Appears only when the Enable Trigger checkbox is selected. Processing Time is the trigger interval, in minutes or seconds. | 
| Add Configuration | Enables you to configure additional Cassandra properties. | 
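
The keyspace, key column, replication, and TTL settings above map directly onto CQL. A minimal sketch of what the emitter's configuration corresponds to on the Cassandra side (the keyspace, table, and column names here are illustrative, not taken from any specific pipeline):

```sql
-- Keyspace with SimpleStrategy and a replication factor of 3
-- (the Replication Strategy and Replication Factor fields).
CREATE KEYSPACE IF NOT EXISTS demo_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Compound primary key (the Key Columns field): sensor_id is the
-- partition key; event_time is a clustering column.
CREATE TABLE IF NOT EXISTS demo_ks.readings (
  sensor_id  text,
  event_time timestamp,
  value      double,
  PRIMARY KEY (sensor_id, event_time)
);

-- A row that expires after 3600 seconds (the Enable TTL and
-- TTL Value fields).
INSERT INTO demo_ks.readings (sensor_id, event_time, value)
VALUES ('s1', toTimestamp(now()), 21.5)
USING TTL 3600;
```

Note that the Consistency Level field has no CQL DDL equivalent: consistency is applied per request by the client or connector, not stored in the schema.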
If you have any feedback on Gathr documentation, please email us!