Monitoring and Alerts
Overview of Pipeline Monitoring and Alerts
When you run a pipeline, you can view real-time statistics about the pipeline, examine data being processed, and create rules and alerts.
The Monitoring tab allows you to:
• Monitor the performance of pipelines
• Configure alerts for each metric.
There are two types of Monitoring in StreamAnalytix:
• Superuser Monitoring, which monitors only the superuser pipelines through views such as Summary View, Jobs, Stages, and Executors.
• Monitoring of pipelines, with two variants:
• Monitoring view (View Summary) of the pipeline
• Monitoring of the workspace pipeline through views such as Summary View, Jobs, Stages, and Executors.
On the Monitoring page, you can choose any pipeline (superuser and workspace) to view data as it passes through the pipeline.
You can monitor the pipelines and their metrics.
To monitor a pipeline from the data pipeline page itself, go to the pipeline tile and, once the pipeline is in active mode, click the three dots and choose View Summary:
NOTE: To enable real-time monitoring graphs, go to the Superuser view, navigate to Configuration > Defaults > Spark > Enable Monitoring Graphs, and set its value to true.
To view the real-time monitoring graphs, click View Summary.
The View Summary option provides the following tab:
Monitoring
Under the Monitoring tab, you will find the following graphs, as shown below:
Input vs. Processing Rate
The input rate is the rate at which data is arriving from a source (channel), e.g. Kafka.
The processing rate is the rate at which data from this source is being processed by Spark.
The number of input rows is the number of records processed in a trigger execution.
Batch Duration
The approximate time, in seconds, taken to process each micro-batch (trigger).
Aggregation State
Shows the aggregation state of the current streaming query, i.e., information about the query operators that store aggregation state. This graph appears only if the pipeline performs some aggregation.
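These graphs correspond to the progress metrics that Spark Structured Streaming reports for every micro-batch. The sketch below runs outside StreamAnalytix and is only illustrative: it shows where the same numbers can be read from a query's lastProgress. The rate source and the modulo bucketing are stand-ins for a real channel and aggregation, and field names such as inputRowsPerSecond may vary slightly between Spark versions.

```python
# Minimal PySpark sketch (independent of StreamAnalytix) showing where the
# numbers behind these graphs come from: Spark's StreamingQueryProgress.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("monitoring-graphs-demo").getOrCreate()

# Built-in "rate" source generating rows continuously, standing in for Kafka.
events = spark.readStream.format("rate").option("rowsPerSecond", 100).load()

# A simple aggregation so the query also maintains aggregation state.
counts = events.groupBy((events.value % 10).alias("bucket")).count()

query = (counts.writeStream
         .outputMode("update")
         .format("console")
         .trigger(processingTime="5 seconds")
         .start())

time.sleep(20)                       # let a few micro-batches (triggers) run

p = query.lastProgress               # metrics of the most recent micro-batch, or None
if p:
    print("input rate      :", p["inputRowsPerSecond"])      # Input Rate graph
    print("processing rate :", p["processedRowsPerSecond"])   # Processing Rate graph
    print("input rows      :", p["numInputRows"])             # records in this trigger
    print("durations (ms)  :", p["durationMs"])               # per-phase batch duration
    print("state operators :", p["stateOperators"])           # Aggregation State graph
query.stop()
```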
By default, the “Enable Monitoring Graphs” property in the Spark configuration tab is disabled for monitoring.
To enable this page, go to the Configuration page > Default > Spark.
Select the Spark Monitoring Enable check box as shown below:
Click Monitoring on the home page of StreamAnalytix.
You will be directed to the Monitoring page. Under Spark, you will see four tabs: Summary View, Jobs, Stages and Executors.
StreamAnalytix enables you to monitor the performance of active Spark pipelines through graphs.
Summary View provides a basic summary of the Spark components (Jobs, Stages and Executors) and their historical data with different kinds of views.
The Summary View has the following widgets:
1. Jobs history: Shows all the past active, failed and completed jobs of a pipeline for the time selected.
2. Latest jobs: Provides an overview of the Active/Failed/Completed jobs of the currently executing pipeline.
3. Input records/sec: Number of records received per second as input for processing at a given time.
Job History shows all the past jobs of the pipeline for the selected time duration.
The X-axis displays the time and the Y-axis displays the number of jobs. For example, at 12:56 the number of completed jobs is 8000+.
Field | Description
---|---
Active Jobs | Number of active jobs in a pipeline.
Failed Jobs | Number of pipeline jobs failed for a given period.
Completed Jobs | Count of jobs that completed successfully in a pipeline.
Note: Count shows the number of alerts configured in the widget.
Latest Jobs:
The Latest Jobs widget displays Active, Failed and Completed jobs.
Field | Description
---|---
Active Jobs | Number of currently active jobs of the pipeline. For example, the above screenshot shows 1 Active Job (shown on top of the circle).
Failed Jobs | Total number of pipeline jobs failed for a given period.
Completed Jobs | Total count of jobs that completed successfully in a pipeline. For example, the above screenshot shows 8663 completed jobs.
Input Records per Sec: Number of records received per second as input for processing at a given time.
The X-axis shows the time whereas Y-axis shows input records received per second.
A job is a parallel computation consisting of multiple tasks that is spawned in response to a Spark action, for example, save or collect.
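As a minimal illustration (independent of StreamAnalytix; the output path is hypothetical), the PySpark snippet below shows that transformations alone do not start a job, while each action does:

```python
# Transformations are lazy; only actions such as collect() or a write submit a job.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jobs-demo").getOrCreate()

df = spark.range(1_000_000)               # transformation only: no job yet
doubled = df.selectExpr("id * 2 AS v")    # still no job

rows = doubled.collect()                  # action: submits a job made of tasks
doubled.write.mode("overwrite").parquet("/tmp/jobs-demo")  # another action, another job
```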
Jobs: The Jobs widget provides information about the Active, Completed and Skipped stages.
Field |
Description |
---|---|
Active Batches |
Total number of currently executing stages of the jobs. For example, the above screenshot shows 1 Active Stage at a given period. |
Completed Stages |
Total number of completed stages of the jobs. |
Skipped Stages |
Total number of skipped stages of the jobs when the output of a stage has already been computed. |
A stage is a set of independent tasks, all computing the same function, that run as part of a Spark job and share the same shuffle dependencies. The Stages page shows all the active stages of the running pipeline.
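For illustration only (not StreamAnalytix-specific), the sketch below uses a wide transformation to create a shuffle dependency, splitting the job into two stages; the data moved across that boundary is what the shuffle read/write widgets below measure.

```python
# A wide transformation (here reduceByKey) introduces a shuffle dependency,
# so the job splits into two stages: the bytes/records moved between them show
# up as "shuffle write" in the first stage and "shuffle read" in the next.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stages-demo").getOrCreate()
sc = spark.sparkContext

pairs = sc.parallelize(range(100_000)).map(lambda n: (n % 10, 1))  # stage 1: map side
totals = pairs.reduceByKey(lambda a, b: a + b)                     # shuffle boundary
print(totals.collect())                                            # stage 2: reduce side
```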
Field | Description
---|---
Bytes In/Out for all Active Stages | This shows the time progressive graph of the number of bytes entering (bytes in) and the number of bytes leaving (bytes out) a stage.
Records In/Out for All Active Stages | This shows the time progressive graph of the number of records entering (records in) and the number of records leaving (records out) a stage.
Shuffle Bytes Read/Write for all Active Stages | This shows the time progressive graph of the number of bytes read (shuffle bytes read) and the number of bytes written (shuffle bytes write) in a shuffle stage. This is done after a reduce-by-key operation.
Shuffle Records Read/Write for all Active Stages | This shows the time progressive graph of the number of records read (shuffle records read) and the number of records written (shuffle records write) in a shuffle stage, as done after a reduce-by-key operation.
Bytes In/Out for all Active Stages:
This shows the time progressive graph of number of bytes entering (bytes in) and number of bytes leaving a stage (bytes out).
The X-axis shows the Time whereas Y-axis shows the Bytes In/Out for all Active Stages.
Field | Description
---|---
Input Bytes | Bytes read from Hadoop or from Spark storage.
Output Bytes | Bytes written to Hadoop.
Stages - Records In/Out for All Active Stages: This shows the time progressive graph of the number of records entering (records in) and the number of records leaving (records out) a stage.
The X-axis shows the date whereas the Y-axis shows the Records In/Out for all Active Stages.
Field | Description
---|---
Input Records | Records read from Hadoop or from Spark storage.
Output Records | Records written to Hadoop.
Stages-Shuffle Bytes Read/Write for all Active Stages: This shows the time progressive graph of number of bytes read (shuffle bytes read) and number of bytes written (shuffle bytes write) in a shuffle stage.
The X-axis shows the Date whereas Y-axis shows Shuffle Bytes Read/Write for all Active Stages.
Field | Description
---|---
Shuffle Read Bytes | Shows the number of bytes read (shuffle bytes read) in a shuffle stage.
Shuffle Write Bytes | Shows the number of bytes written (shuffle bytes write) in a shuffle stage.
Shuffle Records Read/Write for all Active Stages: Shuffle Records Read/Write widget shows the time progressive graph of number of records read (shuffle records read) and number of records written (shuffle records write) in a shuffle stage.
The X-axis shows the date whereas the Y-axis shows Shuffle Records Read/Write for all Active Stages.
Field | Description
---|---
Shuffle Read Records | Shows the number of records read (shuffle records read) in a shuffle stage.
Shuffle Write Records | Shows the number of records written (shuffle records write) in a shuffle stage.
Executors are worker-node processes in charge of running individual tasks in a given Spark job. They are launched at the beginning of a Spark application and run for the lifetime of the application. The Executors page shows all the executor details of the currently running pipeline.
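Executor resources are fixed when the application is launched. As a rough, illustrative sketch (the values below are placeholders, not recommendations), the PySpark configuration shown here controls the executors whose memory, tasks, and shuffle totals appear on this page:

```python
# Illustrative executor sizing; the values are placeholders only.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("executors-demo")
         .config("spark.executor.instances", "4")  # number of executors to launch
         .config("spark.executor.cores", "2")      # tasks each executor runs in parallel
         .config("spark.executor.memory", "2g")    # heap size per executor
         .getOrCreate())
```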
Field | Description
---|---
Max Memory | Maximum memory consumed by the executors.
Disk Used | Disk used by the executors.
RDD Blocks | Number of RDD blocks of the executors.
Total Duration | Total time taken by the executors.
Total Input Bytes | Total input bytes processed by the executors.
There are two types of graphs available for each pipeline.
1. Total Shuffle Read/Write
2. Tasks
Total Shuffle Read/Write: Shuffling means the reallocation of data between multiple Spark stages. Total Shuffle Read/Write widget provides information about the Total Shuffle Read Data and Total Shuffle Write Data.
The X-axis shows the date and the Y-axis shows the Total Shuffle Read/Write data.
Field | Description
---|---
Total Shuffle Read | The total size of data in bytes read from other executors in the shuffle phases.
Total Shuffle Write | The total size of data in bytes written to other executors in the shuffle phases.
Tasks provides information about the total number of Completed Tasks, Active Tasks and Failed Tasks.
Field | Description
---|---
Completed Tasks | Total number of tasks successfully completed by the executors.
Active Tasks | Number of currently executing tasks of the executors.
Failed Tasks | Number of failed tasks of the executors.
StreamAnalytix allows you to create alerts to be notified on System or Application Events. Alerts notify you every time the given criteria for an alert are satisfied.
Alerts are divided in two categories:
• Application Alerts: Shows the alerts applied on the data pipeline using the Alert processor.
• System Alerts: Shows the alerts applied on data pipeline monitoring for its components, system health metrics such as CPU, and system errors.
For a pipeline, you can configure alerts on the Alert Processor (Application Alerts) or on monitoring widgets via Manage Alerts (Monitoring alerts/System Alerts).
You can dynamically enable or disable alerts of a running pipeline. If you disable a currently enabled alert, no further alerts are generated.
If you enable a currently disabled alert, alerts start being generated based on the specified criteria.
To create an Application Alert, add the Alert Processor in a pipeline and configure the alert.
View Alerts:
You can view the generated alerts on the Alerts page, Information tab, as shown below:
The configured alerts are displayed under Configured List tab.
To create a system alert, go to the Monitoring section. Select Manage Alerts under Actions of Processing Time. It allows you to create alerts on system monitoring parameters such as CPU idle state, memory usage, etc.
To generate system alerts for Spark, the following pipeline should be running:
• SparkMonitorMetricPipeline
Manage Alerts opens a new window, as shown below:
Click the Create New Alert link. The system displays the Alert screen:
Field | Description
---|---
Alert Name | Provide a name for the alert.
Alert Type | Depending upon the severity of the alert, you can provide an Alert Type. The categories are: General: a general alert is an alert with low severity. Critical: a critical alert is an alert with high severity.
Alert Action Type | The alert uses the default action mode and is generated when the criteria defined by the alert are met.
Enable | To enable or disable the alert at the time of configuring data.
Aggregation | The StreamAnalytix framework provides the functionality to aggregate similar kinds of alerts into one alert. This allows the user to generate alerts on a moving time window, in seconds, with a user-provided frequency.
Criteria | Mean: performs a mean operation on the selected widget parameters. Min: performs a min operation on the selected widget parameters. Max: performs a max operation on the selected widget parameters. The available criteria depend on the selected widgets (see the sketch after this table).
Send Alert | Enables you to automate sending email alerts to multiple users at the same time. You can add fields like Cc and Bcc.
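The Aggregation and Criteria options amount to evaluating an aggregate (mean, min, or max) of a monitored metric over a moving time window. The sketch below is only an illustration of that idea in Python, not StreamAnalytix internals; the window size, threshold, and metric are hypothetical.

```python
# Illustrative sketch only (not StreamAnalytix internals): evaluating a
# "mean over a moving time window" criterion such as the one configured above.
from collections import deque
import time

WINDOW_SECONDS = 60       # hypothetical moving window
THRESHOLD = 90.0          # hypothetical criterion: mean CPU usage > 90%

samples = deque()         # (timestamp, value) pairs for the metric being watched

def record_and_check(value, now=None):
    """Add a metric sample and return True if the aggregated criterion is met."""
    now = now if now is not None else time.time()
    samples.append((now, value))
    # Drop samples that fall outside the moving window.
    while samples and now - samples[0][0] > WINDOW_SECONDS:
        samples.popleft()
    mean = sum(v for _, v in samples) / len(samples)
    return mean > THRESHOLD

# Example: a burst of high CPU readings satisfies the criterion.
for reading in (95.0, 97.5, 92.0):
    fired = record_and_check(reading)
print("alert criterion met:", fired)
```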
View Alerts
StreamAnalytix enables you to check the generated alerts on the user interface along with email notifications.
Configured List
On Alerts page, you can view the list of all configured alerts for all pipelines in a workspace. (Go to Alerts --> Configured List)
All configured alerts of pipelines are listed here.
You can search the alerts on the basis of keywords and sort the list column-wise by clicking on the column headers.
The configured list provides the following information about the alerts:
• Name: Alert Name.
• Pipeline Name: Pipeline name on which alert is configured.
• Type: Alert type, whether Monitoring or Application.
• Actions: The following two types of actions can be performed on alerts:
• Enable/Disable
• View Criteria
Enable/Disable: Click on the toggle icon to enable or disable the alert. If disabled, the alert will not be generated and you will get the message “Alert disabled successfully”.
If alert is enabled, you will get the message “Alert enabled successfully”.
View Criteria: Click on the View Criteria button to view the rules specified for generating the alert.
For example, if active jobs are equal to 23, an alert will be generated.
Count Monitoring Alerts:
You can count the monitoring alerts for different widgets.
Go to Monitoring > Summary View
You can view the count of configured monitoring alerts on each widget.
For example, count(2) in Job History widget specifies two alerts are configured in this widget.
Note: If you delete a configured alert, then all related alerts are also deleted.
If you delete a pipeline, then all associated configured alerts are also deleted.