This section describes use cases for scaling Service Engines (SEs).
Scale Use Cases
A non-scaled virtual service offers the optimal packet path, from the client through Avi Load Balancer to the server. Scaling SEs may add an extra hop for ingress packets on some traffic, specifically traffic pushed to secondary SEs. Scaling works well for the following use cases:
Traffic that involves minimal ingress and greater egress traffic, such as client/server applications, HTTP, or video streaming protocols. For instance, SEs may exist on hosts with single 10-Gbps NICs, yet when scaled out, the virtual service can still deliver 30 Gbps of traffic to clients.
Protocols or virtual service features that consume significant CPU resources, such as compression or Secure Sockets Layer (SSL)/Transport Layer Security (TLS).
Concurrent connection counts that exceed the memory of a single SE.
Scaling does not work well for the following use case:
Traffic that involves significant client uploads beyond the network or packets-per-second capacity of a single SE (or, more specifically, the underlying virtual machine). Because all ingress packets traverse the primary SE, scaling may provide little benefit. For packets-per-second limitations, see the documentation for the relevant platform or hypervisor.
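The difference between the egress-heavy and upload-heavy cases above can be illustrated with a rough capacity estimate. The following is a minimal sketch under simplifying assumptions that are not taken from the product: all ingress is capped by the primary SE's NIC, egress is capped by the combined NICs of all SEs, and the cost of forwarding packets between SEs is ignored. The function name and parameters are hypothetical.

```python
def max_throughput_gbps(num_ses: int, nic_gbps: float, ingress_fraction: float) -> float:
    """Estimate the upper bound on total virtual service throughput.

    Simplified model (an assumption, not product behavior):
    - all ingress packets cross the primary SE, so ingress <= nic_gbps
    - egress leaves from the SE handling the connection, so egress <= num_ses * nic_gbps
    - ingress_fraction is the share of total traffic that is client-to-server
    """
    if num_ses < 1 or nic_gbps <= 0 or not 0.0 <= ingress_fraction <= 1.0:
        raise ValueError("invalid input")

    limits = []
    if ingress_fraction > 0:
        # Total traffic at which the primary SE's NIC saturates on ingress.
        limits.append(nic_gbps / ingress_fraction)
    if ingress_fraction < 1:
        # Total traffic at which the combined egress capacity saturates.
        limits.append(num_ses * nic_gbps / (1.0 - ingress_fraction))
    return min(limits)


# Egress-heavy traffic (5% ingress) benefits from three SEs;
# upload-heavy traffic (80% ingress) is bound by the primary either way.
streaming = max_throughput_gbps(3, 10.0, 0.05)
uploads_scaled = max_throughput_gbps(3, 10.0, 0.80)
uploads_single = max_throughput_gbps(1, 10.0, 0.80)
```

In this model, the upload-heavy estimate is identical for one SE and for three, which is why scaling out does not help that workload.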
Impact on Existing Connections
Existing connections are not impacted by scaling out, as only new connections are eligible to be scaled to another SE. When scaling in, connections on the secondary SE are given 30 seconds to finish and are then terminated by the secondary SE. These connections will be flagged in the virtual service’s significant logs. Subsequent packets for the connection or client are eligible to be re-load balanced by the primary SE.
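The scale-in behavior can be sketched as a simple partition of the connections on the secondary SE. This is an illustrative model only: the 30-second grace period comes from the text above, while the `Connection` type, its `remaining_seconds` field, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

DRAIN_GRACE_SECONDS = 30.0  # grace period stated in the text above


@dataclass
class Connection:
    conn_id: str
    remaining_seconds: float  # hypothetical: time the connection still needs to finish


def drain_secondary(
    connections: Iterable[Connection], grace: float = DRAIN_GRACE_SECONDS
) -> Tuple[List[str], List[str]]:
    """Split connections into those that finish within the grace period
    and those that the secondary SE terminates when it expires."""
    completed: List[str] = []
    terminated: List[str] = []
    for conn in connections:
        if conn.remaining_seconds <= grace:
            completed.append(conn.conn_id)
        else:
            # Terminated connections are the ones flagged in the significant logs.
            terminated.append(conn.conn_id)
    return completed, terminated


done, cut = drain_secondary([Connection("a", 2.5), Connection("b", 120.0)])
```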
Secondary SE Failure
If a secondary SE fails, the primary will detect the failure quickly and forward subsequent packets to the remaining SEs handling the virtual service. Depending on the high availability mode selected, a new SE may also be automatically added to the group to fill the gap in capacity. Aside from the potential increase in connections, traffic to other SEs is not affected.
Primary SE Failure
If the primary SE fails, a new primary will be automatically chosen among the secondary SEs. Similar to a non-scaled failover event, the new primary will send a gratuitous ARP for the virtual service IP address. If the virtual service was using source IP persistence, the newly promoted primary will have a mirrored copy of the persistence table. Other persistence methods, such as cookies and secure HTTPS, are maintained by the client; therefore, no mirroring is necessary. TCP and UDP connections that were previously delegated to the newly promoted primary SE continue as normal, although these packets no longer incur the extra hop from the primary to the secondary.
For connections that were owned by the failed primary or by other secondary SEs, the new primary will need to rebuild their mapping in its connection table. When the new primary receives a non-SYN packet for a connection it does not recognize, it queries the remaining SEs to see if they had been processing the connection. If one had, the connection flow is re-established to the same SE. If no SE announces that it had been handling the flow, the flow is assumed to have been owned by the failed primary. The connection is then reset for TCP, or load balanced to a remaining SE for UDP.
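The recovery decision described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the flow key, the `flows_by_se` lookup (standing in for querying the remaining SEs), and the "fewest known flows" rule for choosing a UDP target are all assumptions made for the example.

```python
from enum import Enum
from typing import Dict, Optional, Set, Tuple

# Hypothetical flow key: (src_ip, src_port, dst_ip, dst_port)
Flow = Tuple[str, int, str, int]


class Action(Enum):
    FORWARD = "forward"            # flow re-established to the SE that owned it
    RESET = "reset"                # TCP flow owned by the failed primary
    LOAD_BALANCE = "load_balance"  # UDP flow assigned to a remaining SE


def recover_flow(
    flow: Flow, protocol: str, flows_by_se: Dict[str, Set[Flow]]
) -> Tuple[Action, Optional[str]]:
    """Decide what the new primary does with a non-SYN packet for an unknown flow.

    flows_by_se maps each remaining SE (including the new primary) to the
    flows it reports it was handling.
    """
    if not flows_by_se:
        raise ValueError("at least one remaining SE is required")

    # Ask the remaining SEs whether any of them was processing this flow.
    for se_name, flows in flows_by_se.items():
        if flow in flows:
            return Action.FORWARD, se_name

    # No SE claimed the flow, so assume the failed primary owned it.
    if protocol.lower() == "tcp":
        return Action.RESET, None

    # UDP: pick a remaining SE. Fewest known flows is a placeholder policy.
    target = min(flows_by_se, key=lambda name: (len(flows_by_se[name]), name))
    return Action.LOAD_BALANCE, target


remaining = {"se-2": {("10.0.0.5", 40000, "192.0.2.10", 443)}, "se-3": set()}
known = recover_flow(("10.0.0.5", 40000, "192.0.2.10", 443), "tcp", remaining)
orphan_tcp = recover_flow(("10.0.0.9", 40001, "192.0.2.10", 443), "tcp", remaining)
orphan_udp = recover_flow(("10.0.0.9", 5353, "192.0.2.10", 53), "udp", remaining)
```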
Relation to HA Modes
Scaling is different from high availability, but the two are heavily intertwined. A scaled-out virtual service will experience no more than a performance degradation if a single SE in the group fails. Legacy HA active/standby mode, a two-SE configuration, does not support scaling. Instead, service continuity depends on the existence of initialized standby virtual services on the surviving SE. These are capable of taking over with a single command.
Avi Load Balancer’s default HA mode is elastic HA N+M mode, which starts each virtual service for the SE group in a non-scaled mode on a single SE. In such a configuration, failure of an SE running non-scaled virtual services causes a brief service outage (of those virtual services only), during which the Controller places the affected virtual services on spare SE capacity. In contrast, a virtual service that has scaled to two or more SEs in an N+M group suffers no outage, but instead a potential performance reduction.
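The contrast between the non-scaled and scaled cases in an N+M group can be summarized in a short sketch. The function name is hypothetical, and the remaining-capacity figure assumes each SE carries an equal share of the virtual service, which is a simplification for illustration.

```python
from typing import Tuple


def impact_of_se_failure(se_count: int) -> Tuple[str, float]:
    """Return (impact, fraction of capacity remaining) immediately after
    one SE hosting the virtual service fails.

    Assumes the virtual service load is spread evenly across its SEs.
    """
    if se_count < 1:
        raise ValueError("a virtual service must be placed on at least one SE")
    if se_count == 1:
        # Non-scaled: brief outage until the Controller places the
        # virtual service on spare SE capacity.
        return "brief outage", 0.0
    # Scaled out: the surviving SEs keep serving, at reduced capacity.
    return "reduced performance", (se_count - 1) / se_count


non_scaled = impact_of_se_failure(1)
scaled = impact_of_se_failure(4)
```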