It is a best practice to set up scheduled backups or replication for vRealize Log Insight nodes and clusters.
- Verify that no configuration problems exist on source and target sites before performing the backup or replication operations.
- Verify that cluster resource allocation is not at capacity.
In configurations with reasonable ingestion and query loads, memory and swap usage can reach almost 100% of capacity during backup and replication operations. In a live environment where memory is already near capacity, part of the memory spike is due to normal vRealize Log Insight cluster usage. The scheduled backup and replication operations can also contribute significantly to the spike.
In some cases, worker nodes are disconnected for 1 to 3 minutes before rejoining master nodes, possibly because of high memory usage.
- Reduce the memory throttling on vRealize Log Insight nodes by doing one or both of the following:
  - Allocate additional memory over the vRealize Log Insight recommended configurations.
  - Schedule the recurring backups during off-peak hours.
- Enable regular backup or replication of vRealize Log Insight forwarders by using the same procedures that you use for the vRealize Log Insight server.
- Verify that the backup frequency and backup types are appropriately selected based on the available resources and customer-specific requirements.
- If resources are not a constraint and the backup tool supports it, enable concurrent cluster node backups to speed up the backup process.
- Back up all the nodes at the same time.
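The checks in the list above can be sketched as a small pre-flight helper. This is only an illustration and not part of vRealize Log Insight or of any backup tool: the function names, the off-peak window, and the `backup_one` callable are hypothetical, and the memory check assumes Linux `/proc/meminfo` text that includes a `MemAvailable` field.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import time


def in_off_peak_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` is in [start, end); the window may cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def memory_used_fraction(meminfo_text: str) -> float:
    """Compute the used-memory fraction from text in /proc/meminfo format."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])  # values are reported in kB
    total = values["MemTotal"]
    available = values["MemAvailable"]  # assumption: field is present
    return (total - available) / total


def backup_nodes_concurrently(nodes, backup_one, max_workers=None):
    """Run the hypothetical backup_one(node) for all nodes in parallel.

    Returns a dict that maps each node to its result, or to the exception
    it raised, so that one failed node does not hide the others.
    """
    results = {}
    workers = max_workers or max(1, len(nodes))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {node: pool.submit(backup_one, node) for node in nodes}
        for node, future in futures.items():
            try:
                results[node] = future.result()
            except Exception as exc:
                results[node] = exc
    return results
```

A scheduler could call `in_off_peak_window` and `memory_used_fraction` first and start `backup_nodes_concurrently` only when both checks pass. The acceptable memory threshold depends on the environment and is left to the caller.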
What to do next
Monitoring: While the backup is in progress, check for any environment or performance problems in the vRealize Log Insight setup. Most backup, restore, and disaster recovery tools provide monitoring capabilities.
During the backup process, check all the relevant logs on the production system because the user interface might not display all problems.
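As a rough illustration of the log check, the following sketch counts log lines that contain a severity keyword. The keywords and the function name are assumptions made for this example. They do not describe the actual vRealize Log Insight log format, so adjust the pattern to the logs in your environment.

```python
import re
from collections import Counter

# Assumed severity keywords; adapt to the log format actually in use.
SEVERITY_PATTERN = re.compile(r"\b(ERROR|FATAL|WARN(?:ING)?)\b")


def summarize_log_problems(lines):
    """Count lines per severity keyword and collect the matching lines."""
    counts = Counter()
    matches = []
    for line in lines:
        found = SEVERITY_PATTERN.search(line)
        if found:
            keyword = found.group(1)
            # Treat WARN and WARNING as the same severity.
            severity = "WARN" if keyword.startswith("WARN") else keyword
            counts[severity] += 1
            matches.append(line.rstrip("\n"))
    return counts, matches
```

The function accepts any iterable of lines, so an open file object for a log file can be passed directly.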