Data Intelligence and MCP Setup
This document provides step-by-step instructions to set up Logstash 8.11.0, the Data Intelligence (DI) service, and the MCP service.
Logstash 8.11.0 Setup
Prerequisites
- JDK 17 installed
- Elasticsearch 8.11.0 running and accessible
Steps to set up Logstash 8.11.0
- Import the Elastic GPG key using the command below
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
- Create YUM Repo
sudo tee /etc/yum.repos.d/elastic.repo > /dev/null <<EOF
[elastic-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
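Optionally, confirm the repository is registered before installing (a quick sanity check; the repo id elastic-8.x comes from the file above):
# List configured repositories and confirm elastic-8.x appears
sudo dnf repolist | grep elastic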
- Install Logstash
sudo dnf install logstash-8.11.0 -y
- Verify Installation
/usr/share/logstash/bin/logstash --version
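The command should report the installed version, for example (JVM or bundled-JDK notices may precede it):
logstash 8.11.0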
Steps to Configure Logstash
- Create custom config directory:
mkdir -p /usr/share/logstash/config
cp -R /etc/logstash/* /usr/share/logstash/config/
cd /usr/share/logstash/config/
mkdir gathr_config
- Create pipeline config file
vi /usr/share/logstash/config/gathr_config/logstash_di.conf
input {
file {
path => "<MOUNT_LOCATION>/gathr-mount/di/logs/genai-ama-logs/genai-ama-conversations/*"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
json { source => "message" }
mutate { copy => [ "appName", "[@metadata][appNameIndex]" ] }
mutate { lowercase => ["[@metadata][appNameIndex]"] }
mutate { add_field => { "indexName" => "genai_ama_conversations_%{[@metadata][appNameIndex]}" } }
mutate { remove_field => [ "@timestamp", "@version", "path", "host", "message"] }
}
output {
elasticsearch {
hosts => ["https://<ES_IP>:9200/"]
ssl => true
user => "elastic"
password => "<ES_PASSWORD>"
cacert => "/etc/ssl/certs/es-ca.crt"
index => "%{indexName}"
}
stdout {}
}
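Before wiring the pipeline into systemd, you can validate the config syntax with Logstash's built-in config test:
# Parse the pipeline config and exit without starting Logstash
/usr/share/logstash/bin/logstash --path.settings /usr/share/logstash/config --config.test_and_exit -f /usr/share/logstash/config/gathr_config/logstash_di.conf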
- Update pipelines.yml using the command below
vi /usr/share/logstash/config/pipelines.yml
- pipeline.id: di_pipeline
path.config: "/usr/share/logstash/config/gathr_config/logstash_di.conf"
- Create Systemd Service
vi /etc/systemd/system/logstash.service
[Unit]
Description=logstash
[Service]
Type=simple
User=root
Group=root
Environment="LS_JAVA_HOME=/usr/share/openjdk"
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/usr/share/logstash/config"
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384
TimeoutStopSec=infinity
[Install]
WantedBy=multi-user.target
- Enable & Start Service
sudo systemctl daemon-reload
sudo systemctl enable --now logstash
- Check Logs
sudo systemctl status logstash
sudo journalctl -u logstash -f
tail -f /var/log/logstash/logstash-plain.log
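To confirm documents are reaching Elasticsearch, list the indices created by the pipeline (the genai_ama_conversations_* pattern comes from the filter section above; substitute your own credentials and CA settings):
# List DI conversation indices and their document counts
curl -k -u elastic:<ES_PASSWORD> "https://<ES_IP>:9200/_cat/indices/genai_ama_conversations_*?v"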
Data Intelligence (DI) Setup
Prerequisites
- DI Docker tar provided by Gathr team
- Docker & Docker Compose installed
Steps for DI Setup
- Update .env File
# Data Intelligence
ENABLE_DATA_INTELL=true
DATA_INTELL_HOST=<DI_HOST>
DATA_INTELL_PORT=5001
DI_CPU=2
DI_RAM=4g
DI_MOUNT_PATH=/opt/ul-merge/Gathr/mount
# Data storage path
GATHR_DATA_VOLUME_PATH=<GATHR_MOUNT_LOCATION>
ZK_CONNECTION_STRING=<ZKIP>:2181
- Load DI Docker Image
docker load -i diservice.tar
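Confirm the image loaded successfully (the image name diservice matches the compose file below):
docker images diservice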
- Create di-compose.yml
version: '3.8'
services:
di-service:
restart: always
deploy:
resources:
limits:
cpus: "${DI_CPU}"
memory: "${DI_RAM}"
image: diservice
ports:
- "5001:5001"
volumes:
- ${GATHR_DATA_VOLUME_PATH}/gathr-mount/di:/opt/ul-merge/Gathr/mount/di
- ${GATHR_DATA_VOLUME_PATH}/gathr-mount/di/genai-logs:/opt/python_projects/di/genai-logs/
environment:
DI_SERVICE_USER: ${GATHR_SERVICE_USER}
DI_SERVICE_UID: ${GATHR_SERVICE_UID}
DI_SERVICE_GROUP: ${GATHR_SERVICE_GROUP}
DI_SERVICE_GID: ${GATHR_SERVICE_GID}
ZK_CONNECTION_STRING: ${ZK_CONNECTION_STRING}
GATHR_DATA_VOLUME_PATH: ${GATHR_DATA_VOLUME_PATH}
DI_MOUNT_PATH: ${DI_MOUNT_PATH}
networks:
- spark
networks:
spark:
- Deploy DI Service
docker-compose -f di-compose.yml up -d
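Verify that the container is up and inspect its logs (the service name di-service comes from the compose file):
docker-compose -f di-compose.yml ps
docker-compose -f di-compose.yml logs -f di-service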
At this point:
- Logstash is installed and configured to push DI logs into Elasticsearch.
- The DI service is running inside Docker with the configured mounts and environment variables.
- Configure paths in Gathr env-config.yaml
sax-config:
  env-config:
    genaiapp:
      ama.base.url: "http://<IP>:5001/"
      log.folder.base.path: "/opt/ul-merge/Gathr/mount/di/logs/genai-ama-logs"
      service.type: "RestAPI"
      data.folder.base.path: "/opt/ul-merge/Gathr/mount/di/genAIAppData"
      metering.json.folder.path: "/opt/ul-merge/Gathr/mount/di/genAIAppMeteringInformation/"
      genericjdbc.lib.base.path: "/opt/ul-merge/Gathr/mount/di/genAIAppCData/"
      yaml.folder.base.path: "/opt/ul-merge/Gathr/mount/di/genAIAppYAML"
- Configure the following in Gathr common.yaml
genai.di.component.metadata.json.path: "/opt/ul-merge/Gathr/mount/di/componentMetadata/"
MCP Setup
Prerequisites
- DI service must be up and accessible
- MCP tarball provided by Gathr team
- Docker & Docker Compose installed
Steps for MCP Setup
- Update .env File
Add the following variables:
# MCP SERVER
GATHR_URL="<GATHR_URL>"
MCP_CPU=1
MCP_RAM=2g
# Data storage path (Path where Gathr will store the data externally)
GATHR_DATA_VOLUME_PATH=<GATHR_MOUNT_LOCATION>
- Load MCP Docker Image
docker load -i mcpservice.tar
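As with the DI image, confirm the load succeeded:
docker images mcpservice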
- Create mcp-compose.yml
version: '3.9'
services:
mcp-service:
restart: always
deploy:
resources:
limits:
cpus: "${MCP_CPU}"
memory: "${MCP_RAM}"
image: mcpservice:latest
ports:
- "8001:8001"
volumes:
- ${GATHR_DATA_VOLUME_PATH}/gathr-mount/di/mcplogs:/opt/python_project/mcp/logs
environment:
MCP_SERVICE_USER: ${GATHR_SERVICE_USER}
MCP_SERVICE_UID: ${GATHR_SERVICE_UID}
MCP_SERVICE_GROUP: ${GATHR_SERVICE_GROUP}
MCP_SERVICE_GID: ${GATHR_SERVICE_GID}
GATHR_DATA_VOLUME_PATH: ${GATHR_DATA_VOLUME_PATH}
GATHR_URL: ${GATHR_URL}
networks:
- spark
networks:
spark:
- Deploy MCP Service
docker-compose -f mcp-compose.yml up -d
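A quick sanity check that the container is running and the port responds (no health endpoint is documented here, so this only verifies that port 8001 accepts connections; adjust as needed):
docker-compose -f mcp-compose.yml ps
# Check that the MCP port is accepting HTTP connections
curl -sv http://<MCP_SERVER_IP>:8001/ -o /dev/null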
At this point, the MCP service is deployed and integrated with DI.
- Update the following in Gathr common.yaml
genai.di.mcp.server.url: "http://<MCP_SERVER_IP>:8001"