This section explains elastic HA for NSX Advanced Load Balancer Service Engines.
High Availability Modes
NSX Advanced Load Balancer supports the following two modes:
Service Engine Elastic HA mode: This combines scale-out performance and high availability.
N+M mode (the default mode)
Legacy HA mode: This enables a smooth migration from legacy appliance-based load balancers.
Elastic HA N+M Mode
The N+M mode is the default mode of Elastic HA. In this mode, each virtual service is placed on only one SE.
The 'N' in N+M is the minimum number of SEs required to place virtual services in the SE group. The NSX Advanced Load Balancer Controller calculates it from the Virtual Services per Service Engine parameter. 'N' varies over time as virtual services are placed on or removed from the group. The maximum number of Service Engines is labeled 'E'.
The 'M' in N+M is the number of additional SEs the NSX Advanced Load Balancer Controller spins up to handle 'M' SE failures without reducing the capacity of the SE group. 'M' appears in the Buffer Service Engines field.
The minimum scale per virtual service is labeled as 'B' and the maximum scale per virtual service is labeled as 'C'.
The buffer SE value in N+M mode is the number of SE failures that the system can tolerate while keeping the virtual services up and operational (placed on at least one SE), though not necessarily at the same capacity. If a minimum scale per virtual service is set in the SE group and an additional SE is required, increase the buffer SE value according to the calculations.
You can select N+M mode parameters by navigating to . You can either create a new SE group or edit the existing one.
High Availability Mode options are available under the Placement tab.
Elastic HA N+M Mode Example
As shown on the left side of the image below, twenty virtual services are placed on an SE group.
With virtual services per SE set to 8, N is 3 (20/8 = 2.5, which rounds up to 3).
With M = 1, a total of N+M = 3 + 1 = 4 SEs are required in the group.
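The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in this section, not the Controller's implementation; the function and parameter names are hypothetical.

```python
import math


def required_service_engines(num_virtual_services, vs_per_se, buffer_se):
    """Return (N, N + M) for an Elastic HA N+M group.

    N is the minimum number of SEs needed to place all virtual services,
    rounded up to a whole SE. M (buffer_se) is added on top of N.
    The names used here are illustrative only.
    """
    n = math.ceil(num_virtual_services / vs_per_se)
    return n, n + buffer_se


# Values from the example: 20 virtual services, 8 per SE, M = 1.
n, total = required_service_engines(20, 8, 1)
print(n, total)  # 3 4
```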
Note that no single SE in the group is completely idle. The Controller places virtual services on all available SEs. In N+M mode, NSX Advanced Load Balancer ensures enough buffer capacity exists in aggregate to handle one (M=1) SE failure. In this example, each of the four SEs has five virtual services placed. A total of 12 spare slots are still available for additional virtual service placements, which is sufficient to handle one SE failure.
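As a rough check of the spare-capacity figures in this example, the following sketch counts free slots in aggregate and tests whether the group could absorb a given number of SE failures. It assumes every SE offers the same number of slots (the virtual services per SE value) and ignores any other placement constraint; the function names are hypothetical.

```python
def spare_slots(num_se, vs_per_se, placed):
    """Free virtual service slots across the whole SE group."""
    return num_se * vs_per_se - placed


def tolerates_failures(num_se, vs_per_se, placed, failures=1):
    """True if the surviving SEs still have room for every placed virtual service."""
    return (num_se - failures) * vs_per_se >= placed


# Four SEs, 8 slots each, 20 virtual services placed.
print(spare_slots(4, 8, 20))         # 12
print(tolerates_failures(4, 8, 20))  # True
```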
The imbalance in loading disappears over time if one or both of the following happen:
New virtual services are placed on the group. As many as four virtual services can be placed without compromising the M=1 condition. They will be placed on SE5 because NSX Advanced Load Balancer chooses the least-loaded SE first.
The Auto-Rebalance option is selected.
With 'M' set to 1, the SE group is single-SE fault tolerant. Customers desiring multiple-SE fault tolerance can set 'M' higher. NSX Advanced Load Balancer permits 'M' to be dynamically increased by the administrator without interrupting any services. You can start with M=1 (typical of most N+M deployments), and increase it if the conditions warrant.
If an N+M group is scaled out to the maximum number of Service Engines and 'N' times the virtual services per SE value have been placed, NSX Advanced Load Balancer still permits additional virtual service placements (into the spare capacity represented by 'M'), but an HA_COMPROMISED event is logged.
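One way to read the rule above is sketched below. It assumes that, at maximum scale-out, 'N' equals the maximum number of Service Engines minus 'M'. That assumption, the function name, and the "OK" and "NO_CAPACITY" labels are illustrative and do not come from the product; only HA_COMPROMISED is named in this section.

```python
def placement_status(placed, vs_per_se, max_se, buffer_se):
    """Classify a fully scaled-out N+M group by how many virtual services it holds.

    Assumption: with the group at max_se, N = max_se - buffer_se.
    """
    guaranteed = (max_se - buffer_se) * vs_per_se  # slots that leave the buffer untouched
    total = max_se * vs_per_se                     # every slot, buffer included
    if placed <= guaranteed:
        return "OK"
    if placed <= total:
        return "HA_COMPROMISED"  # placement permitted, but it eats into the buffer
    return "NO_CAPACITY"         # hypothetical label: no free slot is left


# Example: at most 4 SEs, 8 virtual services per SE, M = 1.
print(placement_status(24, 8, 4, 1))  # OK
print(placement_status(25, 8, 4, 1))  # HA_COMPROMISED
```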
As shown in the above image, only four slots remain just after the five re-placements. If NSX Advanced Load Balancer’s orchestrator mode is set to write access, NSX Advanced Load Balancer spins up SE5 to meet the M=1 condition, which in this case requires at least eight slots to be available for re-placements.
To provide time to identify the cause of a failure, the first SE that fails in an SE group is not automatically deleted even after five minutes. You can then troubleshoot the failed SE and delete the virtual machine manually if restoration is not possible. The Controller deletes the SE virtual machine after three days if you have not deleted it manually.
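The scale-out decision in this example can be approximated as follows. The sketch assumes the M=1 condition means keeping at least M times the virtual services per SE value free (eight slots here, matching the figure above) and that each new SE adds one full SE's worth of slots. The function name is hypothetical.

```python
import math


def extra_ses_needed(num_se, vs_per_se, placed, buffer_se):
    """Number of additional SEs needed to restore the buffer of free slots."""
    spare = num_se * vs_per_se - placed
    required_spare = buffer_se * vs_per_se
    shortfall = max(0, required_spare - spare)
    return math.ceil(shortfall / vs_per_se)


# After one of four SEs fails: 3 SEs, 8 slots each, 20 virtual services placed.
# Only 4 slots are free, 8 are required, so one more SE (SE5) is needed.
print(extra_ses_needed(3, 8, 20, 1))  # 1
```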
Elastic HA Active/Active
In active/active mode, NSX Advanced Load Balancer places each virtual service on more than one SE, as specified by the Minimum Scale per Virtual Service parameter. The default minimum is two. If an SE in the group fails:
Virtual services that had been running are not interrupted. They continue to run on other SEs with degraded capacity until they can be placed once again.
If NSX Advanced Load Balancer’s orchestrator mode is set to write access, a new SE is automatically deployed to bring the SE group back to its previous capacity. After waiting for the new SE to spin up, the Controller places on it the virtual services that had been running on the failed SE.
Elastic HA Active/Active Example
The images illustrate SE failure and full recovery. The first image shows an SE group with the following specifications:
Virtual Services per Service Engine = 3 (label A in the UI)
Minimum Scale per Virtual Service = 2 (label B)
Maximum Scale per Virtual Service = 4 (label C)
Max Number of Service Engines = 6 (label E)
Over a span of time, five virtual services (VS1-VS5) are placed. VS3 is scaled from its initial two placements to a third, illustrating NSX Advanced Load Balancer’s support for 'N-way active' virtual services. The below image depicts five virtual services placed on an active/active SE group.
The below image shows that SE3 has failed. As a result, one of the two VS2 instances and one of the three VS3 instances have failed. However, the other three virtual services (VS1, VS4, VS5) are unaffected. Also, neither VS2 nor VS3 is interrupted, because the instances previously placed on SE4, SE5, and SE6 continue to work, with degraded performance. The below image shows a single SE failure in an active/active SE group.
The NSX Advanced Load Balancer Controller deploys SE7 as a replacement for SE3 and places VS2 and VS3 on it, which brings both virtual services back up to their prior level of performance. The below image shows the recovery of a single SE in an active/active SE group.
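The effect of a single SE failure on an active/active group can be illustrated with a small model. The placement map below is hypothetical and is not necessarily the layout shown in the images. It only respects the parameters listed above (two placements per virtual service, three for VS3, at most three virtual services per SE) and puts VS2 and VS3 on SE3.

```python
def fail_service_engine(placement, failed_se):
    """Return (surviving placement, degraded VS names, interrupted VS names)."""
    surviving = {vs: ses - {failed_se} for vs, ses in placement.items()}
    # Degraded: lost an instance on the failed SE but still runs elsewhere.
    degraded = sorted(
        vs for vs, ses in placement.items() if failed_se in ses and surviving[vs]
    )
    # Interrupted: no instance is left on any SE.
    interrupted = sorted(vs for vs, ses in surviving.items() if not ses)
    return surviving, degraded, interrupted


# Hypothetical layout, for illustration only.
placement = {
    "VS1": {"SE1", "SE2"},
    "VS2": {"SE3", "SE4"},
    "VS3": {"SE3", "SE4", "SE5"},
    "VS4": {"SE1", "SE5"},
    "VS5": {"SE2", "SE5"},
}

_, degraded, interrupted = fail_service_engine(placement, "SE3")
print(degraded)     # ['VS2', 'VS3']
print(interrupted)  # []
```

In this model, VS2 and VS3 keep running with fewer instances, which corresponds to the degraded capacity described above, and no virtual service is interrupted.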
When Compact placement is enabled, NSX Advanced Load Balancer uses the minimum number of SEs required. When Distributed placement is enabled, NSX Advanced Load Balancer uses as many SEs as required, up to the limit set by the maximum number of Service Engines. By default, Compact placement is enabled for Elastic HA N+M (buffer) mode, and Distributed placement is enabled for Elastic HA Active/Active mode.
Compact Placement Example
The image below shows the effect of compact placement on an Elastic HA, N+M mode SE group where the maximum number of Service Engines is four. In both the compact placement and distributed placement examples, you can observe the following:
Eight virtual services are created in sequence.
After VS1 is placed, SE2 is deployed because M=1 (handles one SE failure).
When VS2 requires placement, NSX Advanced Load Balancer assigns it to the idle SE2 to make the best use of all the running SEs.
At this point, placement behavior diverges as follows:
Compact Placement ON: Subsequent placements of VS3 through VS8 do not require additional SEs to maintain HA (M=1 => one SE failure). With Compact placement ON, NSX Advanced Load Balancer prefers to place virtual services on existing SEs.
Distributed Placement ON: Subsequent placements of VS3 and VS4 result in scaling the SE group out to its maximum of four SEs, illustrating NSX Advanced Load Balancer’s preference for performance at the expense of resources. After reaching four deployed SEs, which is the maximum number of SEs for this group, NSX Advanced Load Balancer places virtual services VS5 through VS8 on pre-existing, least-loaded SEs. The below image shows the Elastic HA N+1 SE group with Compact placement ON and OFF, with eight successive virtual service placements.
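The divergence between the two placement options can be reproduced with a toy model. This is a simplified sketch rather than the Controller's algorithm: it ignores the per-SE slot limit, keeps one SE plus the buffer deployed at all times, fills an idle SE first, and otherwise either scales out (distributed) or picks the least-loaded SE (compact). All names are hypothetical.

```python
def simulate_placement(num_vs, max_se, distributed, buffer_se=1):
    """Return the number of virtual services on each deployed SE."""
    loads = []  # loads[i] is the number of virtual services on SE(i+1)
    for _ in range(num_vs):
        idle = [i for i, n in enumerate(loads) if n == 0]
        if idle:
            target = idle[0]                 # use an idle SE first
        elif not loads or (distributed and len(loads) < max_se):
            loads.append(0)                  # deploy a new SE
            target = len(loads) - 1
        else:
            target = loads.index(min(loads))  # least-loaded existing SE
        loads[target] += 1
        # Keep at least one SE plus the buffer deployed (bounded by max_se).
        while len(loads) < min(max_se, 1 + buffer_se):
            loads.append(0)
    return loads


# Eight virtual services, maximum of four SEs, M = 1.
print(simulate_placement(8, 4, distributed=False))  # [4, 4]
print(simulate_placement(8, 4, distributed=True))   # [2, 2, 2, 2]
```

With compact placement the model never grows beyond SE1 and SE2, while with distributed placement VS3 and VS4 each trigger a new SE before the remaining virtual services fall back to the least-loaded SEs.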
Interaction of Compact Placement with Elastic HA Modes
Compact placement interacts in a subtle way with the elastic HA modes with respect to timing.
Elastic HA N+M mode: Since compact placement is ON by default in N+M mode, the NSX Advanced Load Balancer Controller prefers to pack the virtual services densely onto existing SEs immediately and to defer the deployment of spare capacity.
Elastic HA active/active mode: Since the distributed placement option is ON by default in active/active mode, the NSX Advanced Load Balancer Controller delays the placement of VS2 and VS3 until the replacement SE (SE7) spins up. No additional load is placed on the four surviving SEs (SE1, SE2, SE4, SE5). Instead, both virtual services are placed on a fresh SE so that all the virtual services perform as they did before the failure.
The Auto-Rebalance option applies only to the Elastic HA modes, and it is deactivated by default. If Auto-Rebalance is left in its default state, an event is logged instead of migrations being performed automatically. To enable Auto-Rebalance, see How To Configure Auto-rebalance Using NSX Advanced Load Balancer CLI.