By default, the Orchestrator server runs as a single instance in standalone mode. To increase the availability of the Orchestrator services, you can set up the Orchestrator server to work in cluster mode and start multiple Orchestrator server instances in a cluster with a shared database.
Orchestrator supports two server modes.
Standalone mode: The Orchestrator server runs as a standalone instance.
Cluster mode: Multiple Orchestrator server instances with identical server and plug-in configurations work together in a cluster and share one database. Only the active Orchestrator server instances respond to client requests and run workflows.
All Orchestrator server instances communicate with each other by exchanging heartbeats. Each heartbeat is a timestamp that the node writes to the shared cluster database at a fixed time interval. Network problems, an unresponsive database server, or overloading might cause an Orchestrator cluster node to stop responding. If an active Orchestrator server instance fails to send heartbeats for the duration of the failover timeout, it is considered non-responsive. The failover timeout is equal to the heartbeat interval multiplied by the number of failover heartbeats. It defines when a node is treated as unreliable, and you must customize it according to the available resources and the production load.
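The failover timeout rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Orchestrator's implementation: the function names are hypothetical, and the values of 5 seconds and 12 heartbeats are example figures, not documented defaults.

```python
from datetime import datetime, timedelta


def failover_timeout(heartbeat_interval: timedelta, failover_heartbeats: int) -> timedelta:
    """Failover timeout = heartbeat interval x number of failover heartbeats."""
    return heartbeat_interval * failover_heartbeats


def is_non_responsive(last_heartbeat: datetime, now: datetime,
                      heartbeat_interval: timedelta, failover_heartbeats: int) -> bool:
    """A node is treated as non-responsive once its most recent heartbeat
    timestamp is older than the failover timeout."""
    timeout = failover_timeout(heartbeat_interval, failover_heartbeats)
    return now - last_heartbeat > timeout


# Illustrative values only: a 5-second interval and 12 failover heartbeats.
interval = timedelta(seconds=5)
heartbeats = 12
last_seen = datetime(2024, 1, 1, 12, 0, 0)

print(failover_timeout(interval, heartbeats).total_seconds())
print(is_non_responsive(last_seen, last_seen + timedelta(seconds=59), interval, heartbeats))
print(is_non_responsive(last_seen, last_seen + timedelta(seconds=61), interval, heartbeats))
```

A longer interval or a higher heartbeat count makes the cluster more tolerant of short stalls under load, at the cost of a slower reaction to a real failure.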
The non-responsive node is automatically shut down, and one of the inactive instances takes control to resume all interrupted workflows from their last uncompleted items, such as scriptable tasks, workflow invocations, and so on. You can restart the node that was shut down either manually or by using an external script based on the Orchestrator REST API.
Orchestrator does not provide a built-in tool for monitoring the cluster status and sending notifications in case of a failover. You can monitor the cluster state by using an external component such as a load balancer. To determine whether a node is running, you can check whether the REST API of this node is responding properly.
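As a rough sketch of such an external check, the following Python function reports whether a node answers an HTTP request with a success status. The function name is hypothetical, and the URL passed to it is a placeholder: substitute whichever REST API endpoint your Orchestrator version exposes. The opener parameter exists only so the check can be exercised without a live server.

```python
import urllib.error
import urllib.request


def node_is_responding(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the node answers the request with a 2xx status.

    Any connection failure, timeout, or HTTP error status is reported
    as "not responding" rather than raised to the caller.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    # Placeholder address (an assumption): replace with a REST API URL of your node.
    node_url = "https://orchestrator-node.example.com/"
    print(node_is_responding(node_url))
```

A load balancer health probe or a scheduled script could call a check like this for every node and send a notification when the result changes. A node whose certificate cannot be verified is also reported as not responding, because certificate errors surface as URLError.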
In cluster mode, the use of the Orchestrator client is not supported when more than one Orchestrator server is active. If different users modify the same resource through different active Orchestrator nodes, concurrency problems occur. To run more than one active Orchestrator server node in a cluster, first develop the workflows that you need while Orchestrator is in standalone mode, and then set up Orchestrator to work in cluster mode.