Tanzu Service Mesh Service Autoscaling Overview

Autoscaling is the ability of a service to scale up or down automatically so that it handles changes in demand efficiently. With Tanzu Service Mesh Service Autoscaler, developers and operators can automatically scale microservices to meet changing levels of demand based on metrics such as CPU or memory usage.

These metrics are available to Tanzu Service Mesh without needing additional code changes or metrics plugins.

TSM Autoscaler supports configuring an autoscaling policy for services inside a global namespace (GNS) through the UI or the API. For more information, see Approach 1: Configure GNS-Scoped Autoscaling Policy Using Tanzu Service Mesh UI.

TSM Autoscaler also provides a Kubernetes Custom Resource Definition (CRD) to configure autoscaling for services directly in cluster namespaces. For more information, see Approach 3: Deploying the Tanzu Service Mesh Service Autoscaler Through CRD. Configuring autoscaling with the CRD is available only for org-scoped autoscaling policies.

Once an autoscaling policy is configured, Tanzu Service Mesh starts to monitor the autoscaler metric for the service and scales the service accordingly. You can also optionally configure an actionable SLO. For more information about configuring an SLO that controls autoscaling behavior, see Use Case 4: Using Actionable SLOs to Drive Autoscaling.

Tanzu Service Mesh tracks autoscaling status and behavior and displays them in real time in its user interface.

Features of Tanzu Service Mesh Autoscaler

- Performance and Efficiency modes. Tanzu Service Mesh Autoscaler supports two scaling modes:
  - In Performance mode, Tanzu Service Mesh Autoscaler scales up service instances to meet an increase in demand but does not scale them down when demand decreases. Service instances are scaled up to optimize for speed and performance. This mode exists because, in practice, some stateful services, such as an in-memory database, tend to remain scaled out once they have been scaled out.
  - In Efficiency mode, Tanzu Service Mesh Autoscaler scales service instances both up and down as demand changes. Services are scaled to optimize the use of infrastructure resources.
- CI/CD and GitOps integration. You can deploy autoscaling definitions (org-scoped autoscaling policies with a Kubernetes CRD only) alongside application manifests in existing CI/CD pipelines and GitOps workflows without further code changes. You can deploy Tanzu Service Mesh Autoscaler non-intrusively in production, or as part of the development process for testing and simulation purposes.
- GNS-scoped autoscaling policies. You can configure autoscaling policies inside a global namespace through the Tanzu Service Mesh Console UI or the API.
- Actionable SLOs. With a GNS-scoped actionable SLO, you can link a service level objective (SLO) to a Tanzu Service Mesh autoscaling policy as a remediation action. This makes the autoscaling policy SLO-aware: in addition to monitoring the autoscaler trigger metrics (typically CPU or memory), the autoscaler also monitors service level indicator (SLI) violations before it makes any autoscaling decision for the service. For more information about actionable SLOs in Tanzu Service Mesh, see Using Actionable SLOs to Drive Autoscaling.
- Configurable trigger metrics. You can configure an autoscaling trigger metric from a range of service metrics available in Tanzu Service Mesh. For a list of supported metrics, see the Tanzu Service Mesh Service Autoscaler Configuration Reference.
- Simulation mode. Tanzu Service Mesh supports configuring autoscaling policies in simulation mode. In simulation mode, autoscaling trigger metrics are monitored, and the desired instance count is calculated and displayed. However, the autoscaling policy is not enforced.
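The difference between Performance mode, Efficiency mode, and simulation mode can be illustrated with a small Python sketch. This is not the Tanzu Service Mesh implementation or its API: the `Policy` fields, the threshold names, and the one-instance step size are all hypothetical, chosen only to make the behavior described above concrete.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Mode(Enum):
    PERFORMANCE = "performance"  # scales up only, never down
    EFFICIENCY = "efficiency"    # scales up and down


@dataclass
class Policy:
    # All field names are illustrative assumptions, not TSM configuration keys.
    mode: Mode
    scale_up_threshold: float    # metric value above which an instance is added
    scale_down_threshold: float  # metric value below which an instance is removed
    min_instances: int
    max_instances: int
    simulation: bool = False     # when True, compute the desired count but do not enforce it


def desired_instances(policy: Policy, current: int, metric: float) -> int:
    """Return the instance count the policy would ask for (step size of 1 is assumed)."""
    if metric > policy.scale_up_threshold:
        return min(current + 1, policy.max_instances)
    # Only Efficiency mode is allowed to scale down when demand drops.
    if metric < policy.scale_down_threshold and policy.mode is Mode.EFFICIENCY:
        return max(current - 1, policy.min_instances)
    return current


def reconcile(policy: Policy, current: int, metric: float) -> Tuple[int, int]:
    """Return (desired, applied). In simulation mode the applied count stays unchanged."""
    desired = desired_instances(policy, current, metric)
    applied = current if policy.simulation else desired
    return desired, applied
```

With low demand, a Performance-mode policy leaves the instance count alone while an Efficiency-mode policy reduces it; with `simulation=True`, the desired count is still computed but the applied count never changes.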

For more information about the benefits and applicability of the Tanzu Service Mesh Autoscaler, see Use Cases.

Summary of Tanzu Service Mesh Autoscaler Configuration Approaches

Approach 1: a GNS-scoped autoscaling policy through the UI. You can now create autoscaling policies through the Tanzu Service Mesh Console UI. The UI supports creating GNS-scoped autoscaling policies. This means that the autoscaling policy will be applied to all the instances of the target service inside the global namespace.
Approach 2: a GNS-scoped autoscaling policy through API. You can also configure a GNS-scoped autoscaling policy through the Tanzu Service Mesh API.
Approach 3: an org-scoped autoscaling policy (cluster-scoped autoscaler). You can create autoscaling definitions through a Kubernetes Custom Resource Definition (CRD) in your clusters. The autoscaling definition is applied directly to the target service in the same cluster only. If the service is running in multiple clusters, you need to apply the CRD in each one of them. By contrast, a GNS-scoped policy applies to the service in all the namespaces where it is running, as long as those namespaces are part of the global namespace.

Note:

Approaches 1 and 2 are the recommended approaches.
An autoscaling configuration must target a service version in Tanzu Service Mesh that corresponds to a specific Deployment, StatefulSet, or ReplicaSet in Kubernetes. This service version is the target service. Whenever a new service version is introduced, you must explicitly create a corresponding autoscaling configuration.

Warning:

If you create a GNS-scoped autoscaling policy through the UI and an org-scoped autoscaling policy through CRD for the same target service, the GNS-scoped policy will override the org-scoped policy. There can be only one autoscaling policy per service version.
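The precedence rule in this warning can be sketched in Python as follows. This is only an illustration of the rule as stated here (one effective policy per service version, with GNS-scoped taking priority over org-scoped); the `AutoscalingPolicy` class, the `"gns"` and `"org"` scope labels, and the `effective_policy` helper are hypothetical and are not part of any Tanzu Service Mesh API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AutoscalingPolicy:
    # Hypothetical model of a policy; field names are assumptions for illustration.
    name: str
    service_version: str  # the target service version the policy applies to
    scope: str            # "gns" (UI/API) or "org" (CRD)


def effective_policy(
    policies: Iterable[AutoscalingPolicy], service_version: str
) -> Optional[AutoscalingPolicy]:
    """Pick the single policy in effect for a service version, preferring GNS scope."""
    candidates = [p for p in policies if p.service_version == service_version]
    for scope in ("gns", "org"):  # GNS-scoped policies override org-scoped ones
        for policy in candidates:
            if policy.scope == scope:
                return policy
    return None  # no autoscaling policy targets this service version
```

For example, if both a CRD-based org-scoped policy and a UI-created GNS-scoped policy target the same service version, `effective_policy` returns the GNS-scoped one; a service version targeted only by an org-scoped policy keeps that policy.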