Use these best practices when performing disaster recovery on VMware Aria Automation Config.

Note: Do not confuse disaster recovery with high availability (HA). Disaster recovery involves the complete loss of the original site, including all managed systems.

Disaster recovery can be performed on these types of sites:

  • Cold sites - the least expensive backup. Business operations can be restored only after a substantial loss of time.
  • Warm sites - a facility that has its own hardware and network infrastructure in place, with limited performance capabilities. Data and operations can be restored quickly, but not without some time delay.
  • Hot sites - the ideal backup solution (especially when the environment cannot afford downtime). A hot site is a complete replica of the primary site's IT infrastructure, with data synchronized in real time. Hot sites are costly and are considered only for mission-critical operations.

When performing disaster recovery, use these best practices:

  • Use a multi-Master configuration. If one Master goes down, the other Master continues to process requests.
  • The number of minions per Master should be configured based on the recommendations in the installation guide.
  • Back up your Master every week and treat it as a mission-critical system.
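
The weekly backup recommended above could be scripted along the following lines. This is a minimal sketch, not a documented Automation Config procedure: the source paths (`/etc/salt`, `/srv/salt`, `/srv/pillar`) are assumed Salt defaults, the function name `backup_master` and the archive naming are hypothetical, and the sketch does not cover any database that your deployment uses.

```python
import tarfile
import time
from pathlib import Path

# Assumed default locations on a Salt Master; adjust to your installation.
DEFAULT_SOURCES = ["/etc/salt", "/srv/salt", "/srv/pillar"]


def backup_master(sources, dest_dir, stamp=None):
    """Archive the existing source paths into one gzip tarball and return its path."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"salt-master-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in map(Path, sources):
            if src.exists():  # skip paths that are absent on this host
                # Keep the full path (minus the leading slash) so that
                # /etc/salt and /srv/salt do not collide inside the archive.
                tar.add(src, arcname=str(src).lstrip("/"))
    return archive
```

Calling `backup_master(DEFAULT_SOURCES, "/var/backups/salt")` from a weekly scheduler (the destination directory is also an assumption) would produce one timestamped archive per run. Store the archives away from the primary site so they survive the loss of that site.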

Disaster Recovery on a Single Master

Note: A single Master configuration is not recommended for disaster recovery.
  1. Plan for system downtime.
  2. Restore the Master from a Master backup.
  3. Rerun jobs from VMware Aria Automation Config that were scheduled to run while the Master was offline.
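
To decide which jobs to rerun after the restore, it can help to compare each job's scheduled time against the outage window. The sketch below illustrates that comparison only. `ScheduledJob` and `jobs_missed` are hypothetical names, not Automation Config API objects, and the job list is assumed to have been exported or written down by hand.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ScheduledJob:
    # Hypothetical record for illustration; not an Automation Config object.
    name: str
    run_at: datetime


def jobs_missed(jobs, offline_start, offline_end):
    """Return jobs scheduled inside the outage window, oldest first."""
    missed = [j for j in jobs if offline_start <= j.run_at < offline_end]
    return sorted(missed, key=lambda j: j.run_at)
```

Rerunning the missed jobs in their original order is an assumption made here so that jobs that depend on earlier ones see the expected state.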

Disaster Recovery for Multi-Master Configurations

The steps depend on the type of site:

  • Hot site - No downtime. Rerun jobs from VMware Aria Automation Config that were scheduled to run while the Master was offline. In a multi-Master configuration, the remaining Master picks up and continues processing while the other Master comes back up.
  • Warm site - Plan for a few hours to a day of downtime. Rerun jobs from VMware Aria Automation Config that were scheduled to run while the Master was offline. In a multi-Master configuration, the remaining Master picks up and continues processing while the other Master comes back up.
  • Cold site - Plan for several days of downtime, then complete these steps:
  1. Manually back up your Salt Master.
  2. Restore from backup.
  3. Accept the minion keys.
  4. On the file server:
    1. Copy the files to the new Master (local file server).
    2. Create a shared file system between the Masters in the multi-Master configuration to maintain parity.
  5. Accept the Master key in Automation Config.
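
After copying the file server contents to the new Master, one way to check parity is to compare file hashes between the two directory trees. The following is a minimal sketch under stated assumptions: both trees are reachable as local paths (for example, through a mount), and the helper names `tree_digest` and `parity_report` are hypothetical rather than part of Salt or Automation Config.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Map each file's path relative to root to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def parity_report(source_root, target_root):
    """Return (missing, extra, changed) relative paths in target versus source."""
    src, dst = tree_digest(source_root), tree_digest(target_root)
    missing = sorted(src.keys() - dst.keys())  # in source, absent from target
    extra = sorted(dst.keys() - src.keys())    # in target, absent from source
    changed = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, extra, changed
```

Three empty lists indicate that the two trees hold the same files with the same contents. Any non-empty list names the relative paths to copy again or investigate.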

After performing disaster recovery, use the VMware Aria Automation Config Dashboard to monitor unexpected spikes in network/database activity and the load on each Master.

Disaster Recovery for Automation Config Cloud

VMware continuously monitors all Automation Config services and alerts customers if there is a disaster recovery event or downtime. VMware communicates any specific actions that customers need to take and resolves issues according to the SLA.