You can deploy the cluster autoscaler to automatically adjust the number of worker nodes in a TKG Service cluster based on the demands of your workloads.

About Cluster Autoscaling

The TKG Service Cluster Autoscaler is an implementation of the Kubernetes Cluster Autoscaler. For more information, refer to the cluster autoscaler documentation.

The cluster autoscaler supports scaling out and scaling in of cluster nodes. If you are running the cluster on a multi-zone Supervisor, the autoscaler can scale node pools assigned to a specific availability zone.

The cluster autoscaler is delivered as a standard package that you install on the cluster using either kubectl or the Tanzu CLI. The cluster autoscaler runs as a deployment on the TKG cluster using service account credentials.

There is a 1-to-1 relationship between the autoscaler package minor version and the TKr minor version. For example, if you are using TKr 1.27.11, you should install v1.27.2 of the autoscaler. If there is a version mismatch, package reconciliation will fail.
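The minor-version rule above can be checked before you install anything. The following is a minimal sketch, not part of the product: the function names `minor_version` and `versions_compatible` are hypothetical, and the sketch assumes version strings begin with an optional `v` followed by `major.minor`.

```python
import re


def minor_version(version: str) -> tuple[int, int]:
    """Return (major, minor) from a string such as 'v1.27.2' or '1.27.11'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(tkr_version: str, autoscaler_version: str) -> bool:
    """True when the TKr and the autoscaler package share major.minor."""
    return minor_version(tkr_version) == minor_version(autoscaler_version)


if __name__ == "__main__":
    # The pairing from the example above: TKr 1.27.11 with autoscaler v1.27.2.
    print(versions_compatible("1.27.11", "v1.27.2"))
    # A mismatched pairing, which would cause package reconciliation to fail.
    print(versions_compatible("1.28.8", "v1.27.2"))
```

Only the major and minor components are compared, so patch versions such as 11 and 2 are free to differ.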

Although the cluster autoscaler can scale worker nodes both out and in, it will not remove a node that is running certain types of pods. See "What types of pods can prevent CA from removing a node?" in the cluster autoscaler documentation.

Version Requirements

Cluster autoscaler has the following version requirements.
  • The minimum vSphere version is vSphere 8 U3
  • The minimum TKr version is TKr 1.27.x for vSphere 8
  • The minor version of the TKr and the minor version of the Cluster Autoscaler package must match

Package Requirements

The cluster autoscaler is delivered as a standard package. The minor version of the package must match the minor version of the TKr being used. For example, if you are using TKr 1.27.11, you should install v1.27.2 of the autoscaler. If there is a version mismatch, package reconciliation will fail.

You may need to locate the target package in a later version of the standard package repository. For example, v1.27.2 of the autoscaler is in the v2024.4.12 version of the standard package repository. Later autoscaler package versions, such as 1.28.x, 1.29.x, and 1.30.x, are located in later repository versions. To list all versions of the standard package repository, run the following command:
imgpkg tag list -i projects.registry.vmware.com/tkg/packages/standard/repo

Workflow

The high-level workflow for enabling cluster autoscaling is as follows:
  1. Create a new TKG cluster, or update an existing TKG cluster, so that it includes the autoscaler annotations and omits the replicas field under spec.topology.workers.machineDeployments.
  2. Install the package repository on the TKG cluster you created or updated.
  3. Install the autoscaler package on the TKG cluster you created or updated.

    The autoscaler is installed on the TKG cluster as a deployment in the kube-system namespace.
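Step 1 of the workflow can be sketched as a small transformation of a cluster manifest that has already been loaded into a Python dictionary. This is an illustration under stated assumptions rather than the documented procedure: the helper name `enable_autoscaling` is hypothetical, and the sketch assumes the annotations belong under `metadata.annotations` of each entry in `spec.topology.workers.machineDeployments`, which is how Cluster API spells that field.

```python
import copy

# Annotation keys read by the autoscaler for each node pool.
MIN_SIZE = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
MAX_SIZE = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"


def enable_autoscaling(cluster: dict, min_size: int, max_size: int) -> dict:
    """Return a copy of the manifest with annotations added and replicas removed.

    Assumption: every node pool receives the same bounds.
    """
    updated = copy.deepcopy(cluster)
    pools = updated["spec"]["topology"]["workers"]["machineDeployments"]
    for pool in pools:
        annotations = pool.setdefault("metadata", {}).setdefault("annotations", {})
        # Annotation values must be strings, as in the documented examples.
        annotations[MIN_SIZE] = str(min_size)
        annotations[MAX_SIZE] = str(max_size)
        # Drop the fixed replica count so the autoscaler controls the pool size.
        pool.pop("replicas", None)
    return updated


if __name__ == "__main__":
    # Hypothetical manifest fragment with a single node pool.
    manifest = {
        "spec": {
            "topology": {
                "workers": {
                    "machineDeployments": [
                        {"class": "node-pool", "name": "np-1", "replicas": 3}
                    ]
                }
            }
        }
    }
    print(enable_autoscaling(manifest, min_size=3, max_size=5))
```

The function returns a modified copy and leaves the input untouched, which makes it easy to compare the manifest before and after the change.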

Known Limitations

For vSphere 8 U3, the Cluster Autoscaler has the following known limitations.

No Scaling to or from Zero

The current TKG Service does not support scaling from zero or to zero. For example, setting the minimum size annotation to 0, such as cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "0", is not supported.

No Equivalent Scaling

For TKr versions prior to v1.30, the Cluster Autoscaler does not support setting the maximum size equal to the minimum size in the autoscaler annotations. For example, the autoscaler rejects the following annotations as invalid input because the two values are equal.
  • cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
  • cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "1"
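Both limitations can be checked before a manifest is applied. The following is a minimal sketch under stated assumptions: the function name `validate_node_group_sizes` is hypothetical, the TKr version is passed as a `(major, minor)` tuple, and the check that the maximum is not below the minimum is an added sanity check that the limitations above do not state.

```python
def validate_node_group_sizes(
    min_size: int, max_size: int, tkr_minor: tuple[int, int]
) -> list[str]:
    """Return a list of problems; an empty list means the sizes look usable."""
    problems = []
    if min_size < 1:
        # Scaling to or from zero is not supported.
        problems.append("minimum size must be at least 1")
    if max_size < min_size:
        # Assumption: not listed as a limitation, but never a valid range.
        problems.append("maximum size must not be smaller than minimum size")
    elif max_size == min_size and tkr_minor < (1, 30):
        # Equal bounds are rejected for TKr versions prior to v1.30.
        problems.append("maximum size must be greater than minimum size before TKr v1.30")
    return problems


if __name__ == "__main__":
    # The rejected example from above: min "1" and max "1" on TKr 1.27.
    print(validate_node_group_sizes(1, 1, (1, 27)))
    # A range that satisfies both limitations.
    print(validate_node_group_sizes(1, 3, (1, 27)))
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in one pass.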