Default
Note: Some of the properties listed here are not applicable to the Multi-Cloud version of Gathr. These properties are marked with **.
All default or shared configuration properties come under this category. The category is further divided into the sub-categories below.
Platform
| Field | Description | 
|---|---|
| Application Logging Level | The logging level to be used for Gathr logs. | 
| Gathr HTTPs Enabled | Whether the Gathr application supports the HTTPs protocol or not. | 
| Spark HTTPs Enabled | Whether the Spark server supports the HTTPs protocol or not. | 
| Test Connection Time Out | Timeout for test connection (in ms). | 
| Java Temp Directory | The temp directory location. | 
| Gathr Reporting Period | Whether to enable View Data link in application or not. | 
| View Data Enabled | Whether to enable View Data link in application or not. | 
| TraceMessage Compression | The type of compression used on emitted TraceMessage from any component. | 
| Message Compression | The type of compression used on emitted object from any component. | 
| Enable Gathr Monitoring Flag | Flag to tell if monitoring is enabled or not. | 
| CEP Type | Defines the name of the CEP engine used. Currently the only possible value is esper. | 
| Enable Esper HA Global | To enable or disable HA. | 
| CepHA Wait Interval | The wait interval of primary CEP task node. | 
| Gathr Scheduler Interval | The topology stopped alert scheduler’s time interval in seconds. | 
| Enable Gathr Scheduler | Flag to enable or disable the topology stopped alert. | 
| Gathr Session Timeout | The timeout for a login session in gathr. | 
| Enable dashboard | Defines whether the dashboard is enabled or disabled. | 
| Enable Log Agent | Defines if Agent Configuration option should be visible on gathr GUI or not. | 
| Enable Storm Error Search | Enable showing pipeline Application Errors tab using LogMonitoring search page. | 
| Gathr Pipeline Error Search Tenant Token | Tenant token for Pipeline Error Search. | 
| Gathr Storm Error Search Index Expression | Pipeline application error index expression (time based js expression to create indexes in ES or Solr, that is also used during retrieval). | 
| Kafka Spout Connection Retry Sleep Time | Time between consecutive Kafka spout connection retries. | 
| Cluster Manager Home URL | The URL of the Gathr Cluster Manager. | 
| Gathr Pipeline Log Location | gathr Pipeline Log Location. | 
| HDFS Location for Pipeline Jars | HDFS Location for Pipeline Jars. | 
| Scheduler Table Prefix | Prefix for the names of the tables that store the scheduler’s state. | 
| Scheduler Thread Pool Class | Class used to implement thread pool for the scheduler. | 
| Scheduler Thread Pool Thread Count | The number of threads available for concurrent execution of jobs. The count can be any positive integer, although only numbers between 1 and 100 are practical. If only a few jobs run a few times a day, 1 thread is plenty. If there are many jobs, most of them running every minute, a thread count like 50 or 100 is probably appropriate (this depends on the nature of the jobs and the available resources). | 
| Scheduler Datasource Max Connections | The maximum number of connections that the scheduler datasource can create in its pool of connections. | 
| Scheduler Misfire Threshold Time | Milliseconds the scheduler will tolerate a trigger to pass its next-fire-time by, before being considered misfired. | 
| HDP Version | Version of HDP ecosystem. | 
| CDH Version | Version of CDH ecosystem. | 
| Audit Targets | Defines the audit logging implementation to be used in the application. Default is file. | 
| Enable Audit | Defines the value (true/false) for enabling audit in application. | 
| Persistence Encryption Key | Specifies the encryption key used to encrypt data in persistence. | 
| Ambari HTTPs Enabled | Whether the Ambari server supports the HTTPs protocol or not. | 
| Graphite HTTPs Enabled | Whether the Graphite server supports the HTTPs protocol or not. | 
| Elastic Search HTTPs Enabled | Whether the Elasticsearch engine supports the HTTPs protocol or not. | 
| SQL Query Execution Log File Path | File location for logging gathr SQL query execution statistics. | 
| SQL Query Execution Threshold Time (in ms) | Defines the max limit of execution time for sql queries after which event will be logged (in ms). | 
| Lineage Persistence Store | The data store that will be used by data lineage feature. | 
| Aspectjweaver jar location | The absolute path of the aspectjweaver jar required for inspect pipeline or data lineage. | 
| Is Apache Environment | Default value is false. Set it to “true” for all Apache environments. | 
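
The Scheduler Misfire Threshold Time row above describes a tolerance window around a trigger's next-fire-time. As a rough illustration only (this is not Gathr's scheduler code; the helper name `is_misfired` and the 60000 ms threshold are hypothetical), the check could be sketched in Python like this:

```python
from datetime import datetime, timedelta


def is_misfired(next_fire_time: datetime, now: datetime, threshold_ms: int) -> bool:
    """Return True when a trigger is later than the tolerated threshold.

    Hypothetical helper that mirrors the table's wording: a trigger counts
    as misfired once it has passed its next-fire-time by more than
    threshold_ms milliseconds.
    """
    lateness = now - next_fire_time
    return lateness > timedelta(milliseconds=threshold_ms)


# Example values (assumed for illustration, not Gathr defaults).
scheduled = datetime(2024, 1, 1, 12, 0, 0)
late_30s = is_misfired(scheduled, scheduled + timedelta(seconds=30), 60000)
late_90s = is_misfired(scheduled, scheduled + timedelta(seconds=90), 60000)
print(late_30s, late_90s)
```

With a 60000 ms threshold, a trigger that is 30 seconds late is still tolerated, while one that is 90 seconds late is treated as misfired.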
Zookeeper
| Field | Description | 
|---|---|
| Zookeeper Retry Count | Zookeeper connection retry count. | 
| Zookeeper Retry Delay Interval | Defines the retry interval for the zookeeper connection. | 
| Zookeeper Session Timeout | Zookeeper’s session timeout time. | 
Spark
| Field | Description | 
|---|---|
| Model Registration Validation Timeout(in seconds) | The time, in seconds, after which the MLlib, ML or H2O model registration and validation process fails if it has not completed. | 
| Spark Fetch Schema Timeout(in seconds) | The time, in seconds, after which the fetch schema process of register table fails if it has not completed. | 
| Spark Failover Scheduler Period(in ms) | Regular intervals to run scheduler tasks. Only applicable for testing connection of Data Sources in running pipeline. | 
| Spark Failover Scheduler Delay(in ms) | Delay after which a scheduler task can run once it is ready. Only applicable for testing connection of Data Sources in running pipeline. | 
| Refresh Superuser Pipelines and Connections | Whether to refresh Superuser Pipelines and Default Connections in the database when web studio restarts. | 
| Gathr SparkErrorSearchPipeline Index Expression ** | Pipeline application error index expression (time based js expression to create indexes in ES or Solr, that is used during retrieval). | 
| Enable Spark Error Search ** | Enables indexing and searching of Spark pipeline errors in LogMonitoring. | 
| Register Model Minimum Memory | Minimum memory required for web studio to register tables, MLlib, ML or H2O models. Example -Xms512m. | 
| Register Model Maximum Memory | Maximum memory required for web studio to register tables, MLlib, ML or H2O models. Example -Xmx2048m. | 
| H2O Jar Location | Local file system’s directory location at which H2O model jar will be placed after model registration. | 
| H2O Model HDFS Jar Location | HDFS path location at which H2O model jar will be placed after model registration. | 
| Spark Monitoring Scheduler Delay(in ms) ** | Specifies the Spark monitoring scheduler delay in milliseconds. | 
| Spark Monitoring Scheduler Period(in ms) ** | Specifies the Spark monitoring scheduler period in milliseconds. | 
| Spark Monitoring Enable ** | Specifies the flag to enable the spark monitoring. | 
| Spark Executor Java Agent Config | Spark Executor Java Agent configuration to monitor the executor process. The command includes the jar path, the configuration file path and the name of the process. | 
| Spark JVM Monitoring Enable ** | Specifies the flag to enable Spark JVM monitoring. | 
| ES query monitoring index name | Provide the ES query monitoring index name which is required for indexing the data of query streaming. | 
| Scheduler period for es monitoring purging | Scheduler period for es monitoring purging in seconds. | 
| Rotation policy of ES monitoring graph | Specify the rotation policy for index creation for ES monitoring graph (daily for a period of one day and weekly for 7 days). | 
| Purging duration of ES monitoring index | Purge duration for ES in seconds for es monitoring graph index. Index created before this duration will be deleted. | 
| Enable purging scheduler for ES Graph monitoring | Check the checkbox to enable purging scheduler for ES Graph monitoring. | 
| Spark Version ** | By default the version is set to 2.3. Note: Set the Spark version to 2.2 for HDP 2.6.3. | 
| Livy Supported JARs Location ** | HDFS location where livy related jar file and application streaming jar file have been kept. | 
| Livy Session Driver Memory ** | Minimum memory that will be allocated to driver while creating livy session. | 
| Livy Session Driver Vcores ** | Minimum virtual cores that will be allocated to driver while creating Livy session. | 
| Livy Session Executor Memory ** | Minimum memory that will be allocated to executor while creating Livy session. | 
| Livy Session Executor Vcores ** | Minimum virtual cores that will be allocated to executor while creating Livy session. | 
| Livy Session Executor Instances ** | Minimum executor instances that will be allocated while creating Livy session. | 
| Livy Custom Jar HDFS Path ** | The fully qualified HDFS path where the uploaded custom jar is kept while creating a pipeline. | 
| Livy Data Fetch Timeout ** | The query time interval, in seconds, for fetching data during data inspection. | 
| isMonitoringGraphsEnabled | Whether monitoring graph is enabled or not. | 
| ES query monitoring index name | The monitoring data is stored in this index of the default ES connection. | 
| Scheduler period for ES monitoring purging | The interval, in seconds, at which the purging scheduler is invoked to check whether the above index is eligible for purging. Requires a Tomcat restart. | 
| Rotation policy of ES monitoring graph | Can have two values: daily or weekly. With daily, the index is rotated every day, so each index stores a single day of data. With weekly, each index stores a week of data. | 
| Purging duration of ES monitoring index | The duration after which an index is deleted. Default is 604800 seconds, which means an index is deleted after 1 week. Requires a Tomcat restart. | 
| Enable purging scheduler for ES Graph monitoring | Controls whether indexes are purged. Purging does not take place if the flag is disabled. Requires a restart of the Tomcat server. | 
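
The purging rows above say that an index created before the purging duration (604800 seconds by default) is deleted. A minimal Python sketch of that eligibility check follows; it is an illustration, not Gathr's implementation, and the function name `indexes_to_purge` and the index names are hypothetical.

```python
from datetime import datetime, timedelta

PURGE_DURATION_SECONDS = 604800  # default from the table: one week


def indexes_to_purge(created_at: dict, now: datetime,
                     purge_seconds: int = PURGE_DURATION_SECONDS) -> list:
    """Return the names of indexes created before (now - purge_seconds).

    created_at maps an index name to its creation time. The names used
    below are made up for the example.
    """
    cutoff = now - timedelta(seconds=purge_seconds)
    return sorted(name for name, created in created_at.items() if created < cutoff)


now = datetime(2024, 1, 15)
indexes = {
    "monitoring_2024_01_01": datetime(2024, 1, 1),   # 14 days old
    "monitoring_2024_01_10": datetime(2024, 1, 10),  # 5 days old
}
eligible = indexes_to_purge(indexes, now)
print(eligible)
```

Only the 14-day-old index falls before the one-week cutoff, so it is the only one selected for deletion in this example.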
RabbitMQ
| Field | Description | 
|---|---|
| RabbitMQ Max Retries | Defines maximum number of retries for the RabbitMQ connection. | 
| RabbitMQ Retry Delay Interval | Defines the retry delay intervals for RabbitMQ connection. | 
| RabbitMQ Session Timeout | Defines session timeout for the RabbitMQ connection. | 
| Real-time Alerts Exchange Name | Defines the RabbitMQ exchange name for real time alert data. | 
Kafka
| Field | Description | 
|---|---|
| Kafka Message Fetch Size Bytes | The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request. | 
| Kafka Producer Type | Defines whether Kafka produces data in async or sync mode. | 
| Kafka Zookeeper Session Timeout(in ms) | The Kafka Zookeeper Connection timeout. | 
| Kafka Producer Serializer Class | The class name of the Kafka producer serializer used. | 
| Kafka Producer Partitioner Class | The class name of the Kafka producer partitioner used. | 
| Kafka Key Serializer Class | The class name of the Kafka producer key serializer used. | 
| Kafka 0.9 Producer Serializer Class | The class name of the Kafka 0.9 producer serializer used. | 
| Kafka 0.9 Producer Partitioner Class | The class name of the Kafka 0.9 producer partitioner used. | 
| Kafka 0.9 Key Serializer Class | The class name of the Kafka 0.9 producer key serializer used. | 
| Kafka Producer Batch Size | The batch size of data produced at Kafka from log agent. | 
| Kafka Producer Topic Metadata Refresh Interval(in ms) | The metadata refresh time taken by Kafka when there is a failure. | 
| Kafka Producer Retry Backoff(in ms) | The amount of time that the Kafka producer waits before refreshing the metadata. | 
| Kafka Producer Message Send Max Retry Count | The number of times the producer will automatically retry a failed send request. | 
| Kafka Producer Request Required Acks | The acknowledgment of when a produce request is considered completed. | 
Security
| Field | Description | 
|---|---|
| Kerberos Sections | Section names in keytab_login.conf for which keytabs must be extracted from pipeline if krb.config.override is set to true. | 
| Hadoop Security Enabled | Set to true if Hadoop in use is secured with Kerberos Authentication. | 
| Kafka Security Enabled | Set to true if Kafka in use is secured with Kerberos Authentication. | 
| Solr Security Enabled | Set to true if Solr in use is secured with Kerberos Authentication. | 
| Keytab login conf file Path | Specify path for keytab_login.conf file. | 
CloudTrial
| Field | Description | 
|---|---|
| Cloud Trial | The flag for Cloud Trial. Possible values are True/False. | 
| Cloud Trial Max Datausage Monitoring Size (in bytes) | The maximum data usage limit for cloud trial. | 
| Cloud Trial Day Data Usage Monitoring Size (in bytes) | The maximum data usage for FTP User. | 
| Cloud Trial Data Usage Monitoring From Time | The time from where to enable the data usage monitoring. | 
| Cloud Trial Workers Limit | The maximum number of workers for FTP user. | 
| FTP Service URL | The URL of FTP service to create the FTP directory for logged in user (required only for cloud trial). | 
| FTP Disk Usage Limit | The disk usage limit for FTP users. | 
| FTP Base Path | The base path for the FTP location. | 
Monitoring
| Field | Description | 
|---|---|
| Enable Monitoring Graphs | Set to True to enable Monitoring and to view monitoring graphs. | 
| QueryServer Monitoring Flag | Defines the flag value (true/false) for enabling the query monitoring. | 
| QueryServer Monitoring Reporters Supported | Defines the comma-separated list of appenders where metrics will be published. Valid values are graphite, console, logger. | 
| QueryServer Metrics Conversion Rate Unit | Specifies the unit of rates for calculating the queryserver metrics. | 
| QueryServer Metrics Duration Rate Unit | Specifies the unit of duration for the queryserver metrics. | 
| QueryServer Metrics Report Duration | Time period after which query server metrics should be published. | 
| Query Retries | Specifies the number of retries to make a query in indexing. | 
| Query Retry Interval (in ms) | Defines query retry interval in milliseconds. | 
| Error Search Scroll Size | Number of records to fetch in each page scroll. Default value is 10. | 
| Error Search Scroll Expiry Time (in secs) | Time after which search results will expire. Default value is 300 seconds. | 
| Index Name Prefix | Prefix to use for error search system index creation. The prefix will be used to evaluate exact index name with partitioning. Default value is sax_error_. | 
| Index number of shards | Number of shards to create in the error search index. Default value is 5. | 
| Index Replication Factor | Number of replica copies to maintain for each index shard. Default value is 0. | 
| Index Scheduler Frequency (in secs) | Interval (in secs) after which scheduler will collect error data and index in index store. | 
| Index Partitioning Duration (in hours) | Time duration after which a new index will be created using partitioning. Default value is 24 hours. | 
| Data Retention Time (in days) | Time duration for retaining old data. Data above this threshold will be deleted by scheduler. Default value is 60 days. | 
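
The Index Name Prefix and Index Partitioning Duration rows above describe how the exact index name is derived from a prefix plus partitioning. The exact formula is not given here, so the Python sketch below rests on a stated assumption: the partition number is the count of whole partition windows since the Unix epoch. The function name `partitioned_index_name` is hypothetical.

```python
from datetime import datetime, timezone


def partitioned_index_name(prefix: str, event_time: datetime, partition_hours: int) -> str:
    """Build an index name as prefix + partition number.

    Assumption (not confirmed by this page): the partition number is the
    number of whole partition windows elapsed since the Unix epoch.
    """
    epoch_hours = int(event_time.timestamp()) // 3600
    return f"{prefix}{epoch_hours // partition_hours}"


# Defaults from the table: prefix sax_error_, partitioning every 24 hours.
morning = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
evening = datetime(2024, 1, 1, 22, 0, tzinfo=timezone.utc)
next_day = datetime(2024, 1, 2, 1, 0, tzinfo=timezone.utc)

for moment in (morning, evening, next_day):
    print(moment.isoformat(), partitioned_index_name("sax_error_", moment, 24))
```

Under this assumption, two errors recorded on the same UTC day land in the same index, and an error recorded the next day lands in a new one.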
Audit
| Field | Description | Default Value | 
|---|---|---|
| Enable Event Auditing | Defines the value for enabling events auditing in the application. | true | 
| Events Collection Frequency (in secs) | Time interval (in seconds) in which batch of captured events will be processed for indexing. | 10 | 
| Events Search Scroll size | Number of records to fetch in each page scroll on result table. | 100 | 
| Events Search Scroll Expiry (in secs) | Time duration (in seconds) for search scroll window to expire. | 300 | 
| Events Index Name Prefix | Prefix string for events index name. The prefix will be used to evaluate exact target index name while data partitioning process. | sax_audit_ | 
| Events Index Number of Shards | Number of shards to create for events index. | 5 | 
| Events Index Replication Factor | Number of replica copies to maintain for each index shard. | 0 | 
| Index Partitioning Duration (in hours) | Time duration (in hours) after which a new index will be created for events data. A partition number will be calculated based on this property. This calculated partition number prefixed with Events Index Name Prefix value will make target index name. | 24 | 
| Events Retention Time (in days) | Retention time (in days) of data after which it will be auto deleted. | 60 | 
| Events Indexing Retries | Number of retries to index events data before sending it to a WAL file. | 5 | 
| Events Indexing Retries Interval (in milliseconds) | It defines the retries interval (in milliseconds) to perform subsequent retries. | 3000 | 
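
The last two rows above describe retrying the indexing of events before falling back to a WAL file. The following Python sketch illustrates that pattern under stated assumptions: one initial attempt is followed by up to `retries` retries, a plain list stands in for the WAL file, and the names `index_with_retries`, `index_fn` and `wal` are hypothetical rather than part of Gathr.

```python
import time
from typing import Callable, List


def index_with_retries(event: dict,
                       index_fn: Callable[[dict], None],
                       wal: List[dict],
                       retries: int = 5,          # default from the table
                       interval_ms: int = 3000,   # default from the table
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to index an event; if every attempt fails, append it to the WAL.

    Assumption: one initial attempt plus `retries` retries, waiting
    interval_ms milliseconds between consecutive attempts.
    """
    for attempt in range(retries + 1):
        try:
            index_fn(event)
            return True
        except Exception:
            if attempt < retries:
                sleep(interval_ms / 1000.0)
    wal.append(event)  # stand-in for writing to a WAL file
    return False


def always_failing(event: dict) -> None:
    """Simulates an index store that is unavailable."""
    raise ConnectionError("index store unavailable")


wal_buffer: List[dict] = []
# A no-op sleep keeps the example fast; real code would use time.sleep.
ok = index_with_retries({"id": 1}, always_failing, wal_buffer, sleep=lambda s: None)
print(ok, wal_buffer)
```

Passing the sleep function as a parameter is only a convenience for trying the sketch without waiting for the real retry interval.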
Query Server
| Field | Description | 
|---|---|
| QueryServer Monitoring Flag | The flag value (true/false) for enabling the query monitoring. | 
| QueryServer Monitoring Reporters Supported | The comma-separated list of appenders where metrics will be published. Valid values are graphite, console, logger. | 
| QueryServer Metrics Conversion Rate Unit | Specifies the unit of rates for calculating the queryserver metrics. | 
| QueryServer Metrics Duration Rate Unit | Specifies the unit of duration for the queryserver metrics. | 
| QueryServer Metrics Report Duration | Time after which query server metrics should be published. | 
| QueryServer Metrics Report Duration Unit | The units for reporting query server metrics. | 
| Query Retries | The number of retries to make a query in indexing. | 
| Query Retry Interval (in ms) | Defines query retry interval in milliseconds. | 
Others
| Field | Description | 
|---|---|
| Audit Targets | Defines the audit logging implementation to be used in the application. Default is file. | 
| ActiveMQ Connection Timeout(in ms) | Defines the ActiveMQ connection timeout interval in ms. | 
| MQTT Max Retries | Max retries of MQTT server. | 
| MQTT Retry Delay Interval | Retry interval, in milliseconds, for MQTT retry mechanism. | 
| JMS Max Retries | Max retries of JMS server. | 
| JMS Retry Delay Interval | Retry interval, in milliseconds, for JMS retry mechanism. | 
| Metrics Conversion Rate Unit | Specifies the unit of rates for calculating the queryserver metrics. | 
| Metrics Duration Rate Unit | Specifies the unit of duration for the metrics. | 
| Metrics Report Duration | Specifies the duration at interval of which reporting of metrics will be done. | 
| Metrics Report Duration Unit | Specifies the unit of the duration at which queryserver metrics will be reported. | 
| Gathr Default Tenant Token | Token of user for HTTP calls to LogMonitoring for adding/modifying system info. | 
| LogMonitoring Dashboard Interval(in min) | Log monitoring application refresh interval. | 
| Logmonitoring Supervisors Servers | Servers dedicated to run LogMonitoring pipeline. | 
| Export Search Raw Field | Comma separated fields to export LogMonitoring search result. | 
| Elasticsearch Keystore download path prefix | Elasticsearch keystore download path prefix in case of uploading keystore. | 
| Tail Logs Server Port | Port number on which the tail command listens for incoming streams of logs. Default is 9001. | 
| Tail Logs Max Buffer Size | Maximum number of lines, that can be stored on browser, default is 1000. | 
| sax.datasets.profile.frequency.distribution.count.limit | Defines the number of distinct values to be shown in the frequency distribution graph of a column in a Dataset. | 
| sax.datasets.profile.generator.json.template | Template of the Spark job used to generate the profile of a Dataset (common/templates/DatasetProfileGenerator.json). | 
| Pipeline Error Notification Email IDs | Provide comma separated email IDs for pipeline error notification. | 
| Pipeline Test Connection Enabled | Check mark the checkbox to enable the email notification when a pipeline component is down. | 
| Maintenance mode enabled | Provide true or false value for enabling the email notification in case pipeline component stops working. | 
| Contextual Logs | Detailed contextual information (e.g. userName, roles, projectName) will be appended in the logs once this option is enabled. | 
| Enable Event Notifier | Check this option to enable event notification based on the provided event notifier type. For example: SNS. | 
| Event Notifier Type | Provide the event notifier type, for example SNS. | 
| SNS Authentication Type | Select the AWS Authentication Type from the available options: AWS Keys, Instance Profile, or Role ARN. | 
| AWS Key ID | Provide the AWS account access key. | 
| AWS Secret Key | Provide the AWS account secret key. | 
| SNS Topic Region | Provide the AWS SNS topic region. | 
| SNS Topic Type | Select the SNS topic type from the available options: Standard or FIFO. | 
| SNS Topic ARN | Provide the SNS topic ARN where you want to publish alert data, for example arn:aws:sns:us-east-1:123456789012:Test. | 
| Role ARN | Provide the AWS account Role ARN. | 
| Message Group ID | Provide message group ID for the FIFO SNS topic. | 
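
The SNS rows above have to agree with each other: the region appears inside the topic ARN, and a FIFO topic expects a message group ID. The Python sketch below shows one way to sanity-check such values before saving them. It is an illustration only; the function names `parse_sns_topic_arn` and `check_sns_settings` are hypothetical, and the checks rely on the general AWS conventions that ARNs have six colon-separated parts and that FIFO topic names end with `.fifo`.

```python
def parse_sns_topic_arn(arn: str) -> dict:
    """Split an SNS topic ARN of the form arn:partition:sns:region:account:topic."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sns":
        raise ValueError(f"not an SNS topic ARN: {arn!r}")
    return {"region": parts[3], "account_id": parts[4], "topic": parts[5]}


def check_sns_settings(arn: str, topic_type: str, region: str,
                       message_group_id: str = "") -> list:
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    parsed = parse_sns_topic_arn(arn)
    if parsed["region"] != region:
        problems.append("SNS Topic Region does not match the region in the ARN")
    is_fifo_name = parsed["topic"].endswith(".fifo")
    if topic_type == "FIFO" and not is_fifo_name:
        problems.append("FIFO topic names end with .fifo")
    if topic_type == "Standard" and is_fifo_name:
        problems.append("topic name ends with .fifo but the type is Standard")
    if topic_type == "FIFO" and not message_group_id:
        problems.append("Message Group ID is expected for a FIFO topic")
    return problems


# The ARN is the example value from the table above.
example_arn = "arn:aws:sns:us-east-1:123456789012:Test"
standard_problems = check_sns_settings(example_arn, "Standard", "us-east-1")
fifo_problems = check_sns_settings(example_arn, "FIFO", "us-east-1")
print(standard_problems)
print(fifo_problems)
```

The example ARN passes as a Standard topic, while declaring it as FIFO reports both the missing `.fifo` suffix and the missing message group ID.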