To increase the availability of the vRealize Orchestrator services, start multiple vRealize Orchestrator server instances in a cluster with a shared database. vRealize Orchestrator works as a single instance until it is configured to work as part of a cluster.
Multiple vRealize Orchestrator server instances with identical server and plug-in configurations work together in a cluster and share one database.
All vRealize Orchestrator server instances communicate with each other by exchanging heartbeats. Each heartbeat is a timestamp that the node writes to the shared database of the cluster at a fixed interval, the heartbeat interval. Network problems, an unresponsive database server, or overload might cause a vRealize Orchestrator cluster node to stop responding. If an active vRealize Orchestrator server instance fails to send heartbeats within the failover timeout period, it is considered non-responsive. The failover timeout is equal to the heartbeat interval multiplied by the number of failover heartbeats. It defines when a node is treated as unreliable, and you can customize it according to the available resources and the production load.
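The failover timeout arithmetic described above can be illustrated with a short Python sketch. This is not vRealize Orchestrator code: the function names `failover_timeout` and `is_responsive` are hypothetical, and the values used (a 5000 ms heartbeat interval and 12 failover heartbeats) are example inputs chosen for illustration, not values taken from this document.

```python
from datetime import datetime, timedelta


def failover_timeout(heartbeat_interval_ms: int, failover_heartbeats: int) -> timedelta:
    """Failover timeout = heartbeat interval x number of failover heartbeats."""
    return timedelta(milliseconds=heartbeat_interval_ms * failover_heartbeats)


def is_responsive(last_heartbeat: datetime, now: datetime, timeout: timedelta) -> bool:
    """A node counts as responsive if its last heartbeat timestamp
    falls within the failover timeout period (hypothetical check)."""
    return now - last_heartbeat <= timeout


# Example inputs (assumptions for illustration only).
timeout = failover_timeout(heartbeat_interval_ms=5000, failover_heartbeats=12)
now = datetime(2024, 1, 1, 12, 0, 0)

print(timeout.total_seconds())                                   # 60.0
print(is_responsive(now - timedelta(seconds=30), now, timeout))  # True
print(is_responsive(now - timedelta(seconds=90), now, timeout))  # False
```

With these example inputs, a node whose last heartbeat is older than 60 seconds would be treated as non-responsive. Raising either value makes the cluster more tolerant of slow nodes but delays failover.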
A vRealize Orchestrator node enters standby mode when it loses connection to the database, and remains in this mode until the database connection is restored. The other nodes in the cluster take control of the active work by resuming all interrupted workflows from their last unfinished items, such as scriptable tasks or workflow invocations.
You can monitor the state of your vRealize Orchestrator cluster from the System tab of the vRealize Orchestrator Client dashboard. To configure the cluster heartbeat, the number of failover heartbeats, and the number of active nodes, navigate to the Orchestrator Cluster Management page of the vRealize Orchestrator Control Center.