If the Sensor Gateway goes down, the sensors cannot communicate with the Carbon Black Cloud. For High Availability (HA), Carbon Black recommends setting up at least two Sensor Gateway servers in Active/Passive mode, so that if one Sensor Gateway goes down, the other is available to serve requests.
Although you can view the health of your Sensor Gateway servers in the Cloud, Carbon Black recommends also monitoring the Sensor Gateway in-house with any available monitoring tool. Doing so helps keep the Sensor Gateway servers healthy and avoids a situation where both the Active and Passive Sensor Gateway servers are down.
To set up the Sensor Gateway in Active/Passive mode, follow these steps.
Note: Consider the following when configuring the Keepalived settings.
- The priority value must be higher on the primary server. The configured state does not decide the outcome: if a server has the state MASTER but a lower priority than a server with the state BACKUP, it loses its MASTER state.
- virtual_router_id must be the same on the Sensor Gateway Server 1 and the Sensor Gateway Server 2.
- By default, a single vrrp_instance supports up to 20 virtual_ipaddress entries. To add more addresses, you must add more instances.
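The first two considerations can be checked mechanically once both configuration files from the procedure exist. The following is a minimal sketch, not part of the product: the helper names conf_value and check_pair are hypothetical, and it assumes you have copied each server's keepalived.conf to one machine and that each file contains a single vrrp_instance.

```shell
#!/bin/sh
# Hypothetical helper: print the first value of a keyword (for example
# "priority" or "virtual_router_id") found in a keepalived.conf file.
conf_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# check_pair PRIMARY_CONF SECONDARY_CONF
# Succeeds when virtual_router_id matches in both files and the priority
# in the primary file is higher than in the secondary file.
check_pair() {
    primary_conf=$1
    secondary_conf=$2
    [ "$(conf_value virtual_router_id "$primary_conf")" = \
      "$(conf_value virtual_router_id "$secondary_conf")" ] || {
        echo "virtual_router_id differs"; return 1; }
    [ "$(conf_value priority "$primary_conf")" -gt \
      "$(conf_value priority "$secondary_conf")" ] || {
        echo "primary priority is not higher"; return 1; }
    echo "OK"
}

# Example call (file names are placeholders for the two copied files):
#   check_pair sgw1-keepalived.conf sgw2-keepalived.conf
```

The sketch reads only the first match of each keyword, so it would need adjusting for files that define several instances.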
Prerequisites
Ensure you have a network setup. The IP addresses in the network scenario below are for illustration purposes only. The Virtual IP is the most important address, because you use it when configuring the sensors. You can also map the Virtual IP to a DNS name, such as sensor-gateway.somecompany.com, and use the FQDN when configuring the sensors.
Table 1. Network scenario
Sensor Gateway Server 1 | 192.168.10.111 (eth0)
Sensor Gateway Server 2 | 192.168.10.112 (eth0)
Virtual IP | 192.168.10.121
Procedure
- Install the required packages for configuring Keepalived on each of the Sensor Gateway servers by running the commands:
sudo apt-get update
sudo apt-get install linux-headers-$(uname -r)
- Install Keepalived on both Sensor Gateway servers by running the command:
sudo apt-get install keepalived
Keepalived packages are available in the default apt repositories.
- Set up Keepalived on the Sensor Gateway Server 1.
- Create the Keepalived configuration file /etc/keepalived/keepalived.conf, or open it in an editor if it already exists:
sudo vim /etc/keepalived/keepalived.conf
- Add the following settings. Update all values with your network and system configuration.
! Configuration File for keepalived
global_defs {
    notification_email {
        [email protected]
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server localhost
    smtp_connect_timeout 30
}
vrrp_script chk_sgw_service_status {
    script "/bin/sh /usr/local/bin/sgw-service-check.sh"
    interval 30
    fall 3
    rise 3
    timeout 2
    weight 5
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 101
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.10.121
    }
    track_script {
        chk_sgw_service_status
    }
}
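The interval, fall, and rise settings in the vrrp_script block above determine how quickly Keepalived reacts to a failing or recovering Sensor Gateway. The following sketch works through the arithmetic with the values from the example configuration. The function name reaction_time is hypothetical, and the results are approximations, because the exact delay depends on when the failure occurs relative to the check schedule.

```shell
#!/bin/sh
# Values copied from the vrrp_script block in the example configuration.
interval=30   # seconds between health-check runs
fall=3        # consecutive failures before the check counts as failed
rise=3        # consecutive successes before the check counts as healthy

# reaction_time CHECKS INTERVAL
# Approximate seconds until Keepalived reacts: checks needed x interval.
reaction_time() {
    echo $(( $1 * $2 ))
}

echo "Failure acted on after about $(reaction_time "$fall" "$interval") seconds"
echo "Recovery acted on after about $(reaction_time "$rise" "$interval") seconds"
```

With these values, both figures come to about 90 seconds. Lowering interval shortens the reaction time at the cost of running the health check more often.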
- Set up Keepalived on the Sensor Gateway Server 2.
- Create the Keepalived configuration file /etc/keepalived/keepalived.conf, or open it in an editor if it already exists:
sudo vim /etc/keepalived/keepalived.conf
- Add the following settings. Update all values with your network and system configuration.
Note: Set the priority value lower than the one on the Sensor Gateway Server 1. For example, the configuration below sets the priority to 100, while the Sensor Gateway Server 1 has a priority of 101.
! Configuration File for keepalived
global_defs {
    notification_email {
        [email protected]
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server localhost
    smtp_connect_timeout 30
}
vrrp_script chk_sgw_service_status {
    script "/bin/sh /usr/local/bin/sgw-service-check.sh"
    interval 30
    fall 3
    rise 3
    timeout 2
    weight 5
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 101
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.10.121
    }
    track_script {
        chk_sgw_service_status
    }
}
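The weight setting in the vrrp_script block is what lets a failed health check trigger a failover even though the primary server has the higher configured priority. The sketch below illustrates the arithmetic for the two example configurations. It models only the positive-weight case, in which Keepalived adds the weight to the priority while the tracked script succeeds, and the function name effective_priority is hypothetical.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT STATUS
# STATUS is "ok" when the tracked script succeeds, anything else otherwise.
# Positive weight only: the weight is added while the script succeeds.
effective_priority() {
    if [ "$3" = "ok" ]; then
        echo $(( $1 + $2 ))
    else
        echo "$1"
    fi
}

echo "Server 1, check passing: $(effective_priority 101 5 ok)"
echo "Server 2, check passing: $(effective_priority 100 5 ok)"
echo "Server 1, check failing: $(effective_priority 101 5 fail)"
```

While both checks pass, Server 1 (106) outranks Server 2 (105). When the check fails on Server 1, its priority falls back to 101, which is below 105, so Server 2 takes over the virtual IP. This works only because the weight (5) is larger than the gap between the two configured priorities (1).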
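- Create /usr/local/bin/sgw-service-check.sh on both Sensor Gateway servers with the following content. The script requires curl and jq to be installed.
#!/bin/sh
# Query the local Sensor Gateway health endpoint (-s: silent, -k: skip certificate verification).
curl -s -k https://localhost/sgw/health_check > /tmp/health_response.json
# Extract the ErrorCode field; jq prints null when the field is absent.
RESPONSE=$(jq '.ErrorCode' /tmp/health_response.json)
# Exit 0 (healthy) when no error code is reported; otherwise exit 1 so that Keepalived counts a failure.
if [ "$RESPONSE" = "null" ]
then
    exit 0
else
    exit 1
fi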
- Start the Keepalived service and configure it to start automatically on system boot.
sudo service keepalived start
sudo systemctl enable keepalived
- Check for assigned Virtual IP on the interface.
ip addr show eth0
By default, the virtual IP is assigned to the primary server. If the primary server goes down, Keepalived automatically assigns the virtual IP to the secondary server.
Sample output:
2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:b9:b0:de brd ff:ff:ff:ff:ff:ff
inet 192.168.10.111/24 brd 192.168.10.255 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.10.121/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::11ab:eb3b:dbce:a119/64 scope link
valid_lft forever preferred_lft forever
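To check for the virtual IP without reading the output by eye, the result of ip addr show can be filtered. The following is a small sketch, assuming the interface and address from Table 1. The function name has_vip is hypothetical.

```shell
#!/bin/sh
# has_vip VIP
# Read `ip addr show` output on stdin and succeed when the given address
# appears as an IPv4 (inet) entry. The trailing "/" prevents a shorter
# address from matching as a prefix of a longer one.
has_vip() {
    grep -qF "inet $1/"
}

# Example call on a Sensor Gateway server:
#   ip addr show eth0 | has_vip 192.168.10.121 && echo "VIP is on this server"
```

Running the example call on both servers should report the virtual IP on exactly one of them at any time.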
- Verify IP Failover.
- Shut down the primary server (Sensor Gateway Server 1) and check that the virtual IP is automatically assigned to the secondary server.
- Start the Sensor Gateway Server 1 and stop the Sensor Gateway Server 2.
The virtual IP is automatically assigned back to the primary server.
- Check the log files to confirm that the IP failover works.
Sample output:
Feb 7 17:24:51 tecadmin Keepalived_healthcheckers[23177]: Registering Kernel netlink reflector
Feb 7 17:24:51 tecadmin Keepalived_healthcheckers[23177]: Registering Kernel netlink command channel
Feb 7 17:24:51 tecadmin Keepalived_healthcheckers[23177]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 7 17:24:51 tecadmin Keepalived_healthcheckers[23177]: Configuration is using : 11104 Bytes
Feb 7 17:24:51 tecadmin Keepalived_healthcheckers[23177]: Using LinkWatch kernel netlink reflector...
Feb 7 17:24:52 tecadmin Keepalived_vrrp[23178]: VRRP_Instance(VI_1) Transition to MASTER STATE
Feb 7 17:24:53 tecadmin Keepalived_vrrp[23178]: VRRP_Instance(VI_1) Entering MASTER STATE
Feb 7 17:24:53 tecadmin avahi-daemon[562]: Registering new address record for 192.168.10.121 on eth0.IPv4.
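The state transitions in the log can be extracted with a short filter. The sketch below assumes the message format shown in the sample output above; other Keepalived versions may word these messages differently. The function name last_vrrp_state is hypothetical, and the log path /var/log/syslog in the example call is an assumption that depends on your distribution and logging setup.

```shell
#!/bin/sh
# last_vrrp_state INSTANCE
# Read log lines on stdin and print the most recent state that the given
# VRRP instance entered (for example MASTER or BACKUP).
last_vrrp_state() {
    sed -n "s/.*VRRP_Instance($1) Entering \([A-Z]*\) STATE.*/\1/p" | tail -n 1
}

# Example call (log path is an assumption):
#   grep Keepalived_vrrp /var/log/syslog | last_vrrp_state VI_1
```

After a successful failover test, the server that currently holds the virtual IP should report MASTER.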