Enable HA on Gathr Webstudio
This topic describes how to set up High Availability (HA) for Gathr Webstudio, which runs on Tomcat.
Enabling HA improves the availability of Gathr Webstudio and is therefore recommended.
Prerequisites
At least two nodes are required for this deployment. Before you deploy Gathr, verify that the Gathr prerequisites are met.
A load balancer is required, preferably haproxy.
An NFS mount point or other common shared location must be accessible from both Gathr Webstudio machines.
The haproxy configuration must be done beforehand. (See the haproxy.cfg settings at the end of this topic.)
Tomcat should be deployed on the first machine. To know more, see Embedded Gathr →
Steps to Enable HA
- In config.properties, set deployment.mode to Cluster instead of standalone on the first machine.
Update the following configuration parameters in env-config.yaml:
sax.web.url: http://<HA proxy host>:8090/Gathr
sax.ui.host: <HA-proxy host>
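The properties change can also be scripted. The following is a minimal bash sketch, assuming config.properties uses plain `key=value` lines; `set_property` is a hypothetical helper name, not a Gathr tool, and the exact key layout in your file may differ.

```shell
# Hypothetical helper: set key=value in a properties file.
# Assumes one "key=value" entry per line with no spaces around "=".
set_property() {
    local file="$1" key="$2" value="$3" tmp
    tmp=$(mktemp) || return 1
    # Keep every line except an existing definition of this key.
    awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
    # Append the new definition (the key moves to the end of the file).
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file to preserve its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, `set_property conf/config.properties deployment.mode Cluster`, where the path to config.properties is an assumption about your installation layout.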
Take a backup of the existing Gathr folder.
Create an NFS mount point directory.
Create one common folder, e.g. “gathrfiles”.
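The backup and the shared folder can be prepared as sketched below. This is an illustration only: `backup_install_dir` and `prepare_shared_dir` are hypothetical helper names, the timestamped backup name is an assumption, and the sketch assumes the common folder is created under an NFS export that is already mounted.

```shell
# Hypothetical helper: copy the installation directory to a timestamped
# sibling directory and print the backup path.
backup_install_dir() {
    local install_dir="${1%/}" backup
    backup="${install_dir}.bak.$(date +%Y%m%d%H%M%S)"
    cp -a "$install_dir" "$backup" || return 1
    printf '%s\n' "$backup"
}

# Hypothetical helper: create the common folder under the mount point
# and confirm that it is writable from this machine.
prepare_shared_dir() {
    local mount_dir="$1" name="${2:-gathrfiles}" shared
    shared="$mount_dir/$name"
    mkdir -p "$shared" || return 1
    # A write probe catches read-only or misconfigured exports early.
    if ! touch "$shared/.write_test"; then
        return 1
    fi
    rm -f "$shared/.write_test"
    printf '%s\n' "$shared"
}
```

Running `prepare_shared_dir <mountdir>` on each Webstudio machine is a quick way to confirm that both machines can write to the shared location.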
Move the following folders from the Gathr installation directory to the NFS mount point on the first machine:
- lib
- conf
- pipelineData
- uploadjar
- workflowData
- pythonVirtualEnvironments
- work
- udfjar
- external (If using external templates)
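The move can be scripted as a loop over the folder names listed above. This is a rough bash sketch; `move_to_shared` and `GATHR_SHARED_DIRS` are hypothetical names, and the sketch skips folders that are absent (for example, external when external templates are not used).

```shell
# Folder names taken from the list above.
GATHR_SHARED_DIRS="lib conf pipelineData uploadjar workflowData pythonVirtualEnvironments work udfjar external"

# Hypothetical helper: move each listed folder from the installation
# directory to the shared directory.
move_to_shared() {
    local install_dir="$1" shared_dir="$2" name
    for name in $GATHR_SHARED_DIRS; do
        # Skip folders that do not exist or that are already symlinks.
        if [ -d "$install_dir/$name" ] && [ ! -L "$install_dir/$name" ]; then
            # Refuse to overwrite or nest into an existing shared folder.
            if [ -e "$shared_dir/$name" ]; then
                echo "already present: $shared_dir/$name" >&2
                return 1
            fi
            mv "$install_dir/$name" "$shared_dir/$name" || return 1
        fi
    done
}
```

Run it on the first machine only, for example `move_to_shared <gathr install dir> <mountdir>`.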
Copy the Gathr installation folder from the first machine to the other machines. The Gathr folder structure must be the same on every machine.
Create soft links from the mount point into the Gathr installation directory (on both machines).
For example, run the following commands from the Gathr installation directory:
```
ln -s <mountdir>/workflowData workflowData
ln -s <mountdir>/lib lib
ln -s <mountdir>/conf conf
ln -s <mountdir>/pipelinedata pipelinedata
ln -s <mountdir>/uploadjar uploadjar
ln -s <mountdir>/pythonVirtualEnvironments pythonVirtualEnvironments
ln -s <mountdir>/work work
ln -s <mountdir>/udfjar udfjar
ln -s <mountdir>/external external
```
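The same links can be created in a loop that first checks each target, which helps when the commands are repeated on several machines. This is a sketch under stated assumptions: `link_shared` is a hypothetical helper, and the folder names are passed as arguments so that they match the names that actually exist on your mount point.

```shell
# Hypothetical helper: link each named folder on the shared mount into
# the installation directory.
link_shared() {
    local install_dir="$1" mount_dir="$2" name
    shift 2
    for name in "$@"; do
        # Refuse to link to a folder that is missing on the shared mount.
        if [ ! -d "$mount_dir/$name" ]; then
            echo "missing on mount: $mount_dir/$name" >&2
            return 1
        fi
        # Refuse to shadow a real folder that has not been moved yet.
        if [ -e "$install_dir/$name" ] && [ ! -L "$install_dir/$name" ]; then
            echo "not a symlink: $install_dir/$name" >&2
            return 1
        fi
        # -n replaces an existing symlink instead of descending into it.
        ln -sfn "$mount_dir/$name" "$install_dir/$name" || return 1
    done
}
```

For example, `link_shared <gathr install dir> <mountdir> lib conf work udfjar`. Afterwards, `readlink <gathr install dir>/lib` should print the path on the mount point.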
Start Gathr on the first node with config.reload=true. Once it is up and running and its URL is accessible, start Gathr on the other nodes without config.reload=true.
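The "wait until the first node is reachable" part of this step can be automated with a small retry loop. The sketch below is generic: `wait_until` is a hypothetical helper that retries any command, and the Gathr start command itself is intentionally not shown because it depends on your installation.

```shell
# Hypothetical helper: run a command until it succeeds.
# Usage: wait_until <tries> <delay-seconds> <command> [args...]
wait_until() {
    local tries="$1" delay="$2" i=0
    shift 2
    while [ "$i" -lt "$tries" ]; do
        # Stop as soon as the probe command succeeds.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_until 30 10 curl -fsS -o /dev/null "http://<first node>:8090/Gathr"` polls the first node for up to about five minutes. The port 8090 and the /Gathr path are taken from the URLs used elsewhere in this topic and may differ in your deployment.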
- Both URLs should now be accessible separately.
Validation:
If you create a pipeline, the same pipeline should be available on the other Tomcat.
You should be able to inspect pipelines on both Tomcats.
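The URL check can be scripted before you validate pipelines manually. This is a minimal sketch, assuming each node serves Webstudio at http://<host>:8090/Gathr (port and path inferred from the URLs in this topic); `node_url` and `check_nodes` are hypothetical helper names, and the check only covers reachability, not pipeline visibility.

```shell
# Hypothetical helper: build the Webstudio URL for one node.
# The default port 8090 and the /Gathr path are assumptions.
node_url() {
    printf 'http://%s:%s/Gathr\n' "$1" "${2:-8090}"
}

# Hypothetical helper: probe every given host directly, bypassing the
# load balancer, and report which ones respond.
check_nodes() {
    local host status=0
    for host in "$@"; do
        # -f makes curl fail on HTTP errors (status 400 and above).
        if curl -fsS -o /dev/null --max-time 10 "$(node_url "$host")"; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_nodes <first node> <second node>` returns a non-zero status if either node does not respond.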
- Make the following changes in the haproxy.cfg file:
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.* /var/log/haproxy.log
    #
    log 127.0.0.1 local2 debug
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 0
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 0
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

#---------------------------------------------------------------------
# main frontend which proxies to the backends
#---------------------------------------------------------------------
frontend web_server
    bind 0.0.0.0:8090
    mode http
    #http-request redirect scheme https code 301 unless { ssl_fc }
    stats enable
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    option http-server-close
    option forwardfor
    stats auth admin:admin
    option httplog
    option logasap
    # route requests to the backend of the same name, as web_server2 does
    use_backend web_server

frontend web_server2
    bind 0.0.0.0:9595
    #log 127.0.0.1:514 local0 debug
    mode http
    #http-request redirect scheme https code 301 unless { ssl_fc }
    stats enable
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    option http-server-close
    option forwardfor
    use_backend web_server2

backend web_server
    mode http
    balance roundrobin
    cookie WEBSTUDIOID insert indirect nocache
    server g1 10.80.72.187:8090 cookie g1
    server g2 10.80.72.204:8090 cookie g2

backend web_server2
    mode http
    cookie WEBSTUDIOID insert indirect nocache
    server g1 10.80.72.187:9595 cookie g1
    server g2 10.80.72.204:9595 cookie g2
- Restart the haproxy server, then open Gathr using the haproxy URL.
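Before restarting, it can help to let haproxy check the edited file so that a typo does not take the load balancer down. This is a sketch under assumptions: `reload_haproxy` is a hypothetical helper, the default path /etc/haproxy/haproxy.cfg is the common package location and may differ on your system, and the restart command assumes haproxy runs as a systemd service.

```shell
# Hypothetical helper: validate the configuration, then restart haproxy.
reload_haproxy() {
    local cfg="${1:-/etc/haproxy/haproxy.cfg}"   # assumed default path
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 1
    fi
    # -c checks the configuration file and exits without starting the proxy.
    haproxy -c -f "$cfg" || return 1
    # Assumes a systemd-managed service named "haproxy".
    sudo systemctl restart haproxy
}
```

After a successful restart, Gathr should open at the address configured in sax.web.url, http://<HA proxy host>:8090/Gathr.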