Application Continuity

With Tanzu Service Mesh, application owners can publish an application to the outside world in a highly available manner by configuring and exposing a global server load balancing (GSLB) service from within a global namespace.

By automating the process of publishing the application, Tanzu Service Mesh saves the organization significant time compared with the usual process of publishing a service by opening tickets with different teams and setting up GSLB services. In addition, by automatically discovering new deployments of a service, Tanzu Service Mesh supports “bursting” use cases, where applications expand to new clusters and sites and the service mesh adapts to the new configuration automatically. For example, you can publish a service through integration with NSX Advanced Load Balancer (formerly Avi Networks), AWS Route 53, or both, and perform the configuration directly from the global namespace, enabling health checks for that service for failure detection.

When a new instance is deployed, the system automatically detects it and points the GSLB service to the new location. The system can also handle certificate deployment on the cluster where the published service is deployed, which adds a level of security to application operations.

The diagram below shows an example of application traffic flow with configured GSLB. The clusters are in the same global namespace. A new instance of the application has been deployed to Cluster - Site US - East on the GSLB site. Tanzu Service Mesh automatically updates the GSLB configuration with the new application endpoint and sets the external load balancer (NSX Advanced Load Balancer in this case; also applies to AWS Route 53) to route some of the traffic to the application service instances on Cluster - Site US - East.

Figure 1. GSLB Application Flow with NSX Advanced Load Balancer

Tanzu Service Mesh integrates public access and GSLB capabilities directly into the service mesh. Thanks to this integration, Tanzu Service Mesh detects problems and failures within the application chain that are not visible to traditional global load balancers and automatically initiates failovers to healthy service instances.

As an additional benefit, this integration provides enhanced high availability by helping to reduce downtime in scenarios where the published service is down but the cluster and gateways are still up. Every GSLB service has a timeout. Until that timeout is reached, the GSLB service continues to send traffic to the endpoint even if the published service is down. This is sometimes referred to as “blackholed traffic.” In AWS Route 53, for example, this timeout is 60 seconds by default, which means that if the published service is down, traffic can be “blackholed” for up to 60 seconds. With Tanzu Service Mesh, the ingress gateway running on the destination cluster detects the failure first and routes traffic to a healthy service instance in another cluster until GSLB catches up.
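The timing described above can be illustrated with a minimal Python sketch. It is a simplified model, not Tanzu Service Mesh code: the function name, its parameters, and the cluster names are hypothetical, and it assumes the ingress gateway detects the failure immediately, which a real gateway would not.

```python
def route_request(seconds_since_failure, gslb_timeout, failed_cluster,
                  healthy_cluster, mesh_failover):
    """Return the cluster that serves a request, or None if it is blackholed.

    Hypothetical model: the published service on `failed_cluster` went down
    `seconds_since_failure` seconds ago, while its gateway is still up.
    """
    if seconds_since_failure >= gslb_timeout:
        # The GSLB timeout has elapsed, so GSLB no longer sends traffic
        # to the failed endpoint.
        return healthy_cluster
    # Before the timeout, GSLB still sends traffic to the failed cluster.
    if mesh_failover:
        # The ingress gateway on the failed cluster forwards the request
        # to a healthy instance in another cluster.
        return healthy_cluster
    # Without the mesh, nothing answers the request.
    return None


# A request arriving 10 seconds after the failure, with a 60-second timeout.
without_mesh = route_request(10, 60, "us-west", "us-east", mesh_failover=False)
with_mesh = route_request(10, 60, "us-west", "us-east", mesh_failover=True)
```

In this model, without_mesh is None (the request is blackholed) and with_mesh is "us-east". After the 60-second timeout, both cases return "us-east".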

When configuring a public service in a global namespace, you can choose from the following global load balancing algorithms:

Round Robin. The round robin algorithm distributes user requests equally among the service instances by forwarding requests to each instance in turn.
Weighted. The weighted load balancing algorithm splits traffic to the service into percentage-based portions according to relative service weights.
High Availability, or Active-Passive. With this method, you can configure a failover routing policy by defining active and passive groups of service instances. Traffic is routed to the active group as long as at least one instance in the active group is healthy. If health checks determine that all the active instances are unhealthy, the load balancing algorithm fails over to the instances in the passive group.
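
The three algorithms above can be sketched in a few lines of Python. This illustrates the selection logic only, and it is not how Tanzu Service Mesh, NSX Advanced Load Balancer, or AWS Route 53 implement it. The Instance class, its fields, and the function names are hypothetical.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical model of one published service instance."""
    name: str
    healthy: bool = True
    weight: int = 1      # used only by the weighted policy
    active: bool = True  # used only by the active-passive policy


def round_robin(instances):
    """Return an endless iterator that visits each instance in turn."""
    return itertools.cycle(instances)


def weighted(instances, rng=random):
    """Pick one instance with probability proportional to its weight."""
    weights = [inst.weight for inst in instances]
    return rng.choices(instances, weights=weights, k=1)[0]


def active_passive(instances):
    """Return the healthy active instances, or the healthy passive
    instances when every active instance is unhealthy."""
    healthy_active = [i for i in instances if i.active and i.healthy]
    if healthy_active:
        return healthy_active
    return [i for i in instances if not i.active and i.healthy]


east = Instance("us-east", weight=3)
west = Instance("us-west", weight=1)

# Round robin alternates between the two instances.
rr = round_robin([east, west])
first_four = [next(rr).name for _ in range(4)]

# Active-passive fails over only when the whole active group is unhealthy.
down = Instance("active-1", healthy=False)
standby = Instance("passive-1", active=False)
targets = [i.name for i in active_passive([down, standby])]
```

Here first_four alternates between us-east and us-west, and targets contains only passive-1 because the single active instance is unhealthy. With weights of 3 and 1, the weighted policy sends roughly three quarters of requests to us-east over many calls.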