Use CPU Entitlement Utilization as a scaling metric with App Autoscaler

You can configure App Autoscaler to use the CPU Entitlement Utilization metric to scale apps in your VMware Tanzu Application Service for VMs deployment.

Every app running on TAS for VMs is given a CPU Entitlement: the share of the CPU to which the app is entitled, relative to the other apps running on the platform. The CPU Entitlement is calculated based on the amount of memory configured for the app. For example, an app with a 2G memory limit has double the CPU Entitlement of an app with a 1G memory limit.
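
As a rough illustration of that proportionality, the following sketch compares the CPU Entitlement of two apps from their memory limits. It assumes that entitlement scales strictly linearly with the memory limit, which is all that the paragraph above states. The function name `entitlement_ratio_percent` is hypothetical and is not part of any TAS for VMs tooling.

```shell
# Hypothetical helper: CPU Entitlement of app A relative to app B, as a percentage.
# Assumes entitlement is strictly proportional to the memory limit (given in MB).
entitlement_ratio_percent() {
  mem_a_mb=$1
  mem_b_mb=$2
  echo $(( mem_a_mb * 100 / mem_b_mb ))
}

# A 2G app compared with a 1G app
entitlement_ratio_percent 2048 1024   # prints 200
```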

You can configure App Autoscaler to scale out additional instances of your app when the CPU Entitlement utilization of the app crosses a threshold. This can be useful for CPU-intensive apps.

You might also want to use CPU Entitlement Utilization as a scaling metric when other metrics are not applicable. For example, if your app depends on a backend service that can become slow, you might not be able to scale the app usefully based on HTTP Request Latency.

Note

The CPU Entitlement Utilization metric replaces the deprecated CPU scaling rule type. VMware recommends that you perform the following steps:

Replace the older CPU scaling metric with the CPU Entitlement Utilization scaling metric
When migrating autoscaling rules from the old CPU scaling metric to the CPU Entitlement Utilization metric, review the thresholds to ensure expected behavior

You can configure Autoscaler to use CPU Entitlement Utilization as the scaling metric for an app:

Through the Cloud Foundry Command-Line Interface (cf CLI). For more information, see Configure CPU Entitlement Utilization as the scaling metric for an app through the cf CLI.
Through Apps Manager. For more information, see Configure CPU Entitlement Utilization as the scaling metric for an app through Apps Manager.

To monitor when Autoscaler scales an app based on changes in CPU Entitlement Utilization, see Review autoscaling events for changes in CPU Entitlement Utilization.

For information about use cases that might complicate or prevent you from configuring CPU Entitlement Utilization as the scaling metric for an app, see Special considerations for using CPU Entitlement Utilization as a scaling metric.

VMware recommends that you load-test your app to verify that the autoscaling rules you configured are effective. For more information, see Load-testing your app in Using Autoscaler in Production.

Configure CPU Entitlement Utilization as the scaling metric for an app through the cf CLI

The procedures in this section describe how to configure Autoscaler to use CPU Entitlement Utilization as the scaling metric for an app through the cf CLI.

You can configure Autoscaler to use CPU Entitlement Utilization as the scaling metric for an app in the following ways:

Using a manifest file. For more information, see Configure an autoscaling rule by using a manifest file.
Using CLI commands. For more information, see Configure an autoscaling rule by using CLI commands.

For the procedures in this section, you must use the App Autoscaler CLI plug-in. To download and install the App Autoscaler CLI plug-in, see Install the App Autoscaler CLI plug-in in Using the App Autoscaler CLI.

Configure an autoscaling rule by using a manifest file

You can configure autoscaling rules declaratively through a manifest file. This manifest file only configures Autoscaler, and does not interfere with any other existing app manifest files in your TAS for VMs deployment.

To configure an autoscaling rule that defines CPU Entitlement Utilization as its scaling metric using a manifest file:

In a terminal window, target the space in which the app you want to scale is deployed by running:
```
cf target -o ORG-NAME -s SPACE-NAME
```
Where:
- ORG-NAME is the name of the org containing the space in which the app you want to scale is deployed.
- SPACE-NAME is the name of the space in which the app you want to scale is deployed.
If the space in which the app you want to scale is deployed does not already have an Autoscaler service instance deployed in it, create an Autoscaler service instance by running:
```
cf create-service app-autoscaler PLAN-NAME SERVICE-INSTANCE-NAME
```
Where:
- PLAN-NAME is the name of the service plan you want to use for the Autoscaler service instance.
- SERVICE-INSTANCE-NAME is the name you want to give the Autoscaler service instance. For example, autoscaler.
If there is already an Autoscaler service instance in the space in which the app you want to scale is deployed, skip this step.
Bind the Autoscaler service instance you created in the previous step to the app you want to scale by running:
```
cf bind-service APP-NAME SERVICE-INSTANCE-NAME
```
Where:
- APP-NAME is the name of the app you want to scale.
- SERVICE-INSTANCE-NAME is the name of the Autoscaler service instance from the previous step.
To create a manifest file for Autoscaler that configures an autoscaling rule with CPU Entitlement Utilization as its scaling metric, create a YAML file that includes the following configuration parameters:
```
---
instance_limits:
  min: LOWER-SCALING-LIMIT
  max: UPPER-SCALING-LIMIT
rules:
- rule_type: cpu_entitlement
  threshold:
    min: MINIMUM-CPU-PERCENT-THRESHOLD
    max: MAXIMUM-CPU-PERCENT-THRESHOLD
scheduled_limit_changes: []
```
Where:
- LOWER-SCALING-LIMIT is the minimum number of instances you want Autoscaler to create for the app.
- UPPER-SCALING-LIMIT is the maximum number of instances you want Autoscaler to create for the app.
- MINIMUM-CPU-PERCENT-THRESHOLD is the minimum CPU Entitlement Utilization threshold as a percentage. If the average CPU Entitlement Utilization falls below this number, Autoscaler scales the number of app instances down. VMware recommends 30% as a default minimum threshold value.
- MAXIMUM-CPU-PERCENT-THRESHOLD is the maximum CPU Entitlement Utilization threshold as a percentage. If the average CPU Entitlement Utilization rises above this number, Autoscaler scales the number of app instances up. VMware recommends 80% as a default maximum threshold value.
The following example shows an Autoscaler manifest file with a minimum CPU Entitlement Utilization threshold of 30% and a maximum CPU Entitlement Utilization threshold of 80%:
```
---
instance_limits:
  min: 10
  max: 100
rules:
- rule_type: cpu_entitlement
  threshold:
    min: 30
    max: 80
scheduled_limit_changes: []
```
Apply the autoscaling rule you configured in the previous step to the app you want to scale by running:
```
cf configure-autoscaling APP-NAME MANIFEST-FILENAME
```
Where:
- APP-NAME is the name of the app.
- MANIFEST-FILENAME is the filename of the manifest file you created in the previous step. For example, autoscaler.yaml.
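
The manifest from the example above can also be generated from a few values instead of being edited by hand. The following is a minimal sketch, not an official tool: the function name `render_autoscaler_manifest` is hypothetical, and the ordering checks are a sanity check added for this sketch, not a statement of what Autoscaler itself validates.

```shell
# Hypothetical helper: print an Autoscaler manifest for a cpu_entitlement rule.
# Arguments: LOWER-SCALING-LIMIT UPPER-SCALING-LIMIT MIN-CPU-PERCENT MAX-CPU-PERCENT
render_autoscaler_manifest() {
  # Sanity check for this sketch only: limits and thresholds must be in order.
  if [ "$1" -gt "$2" ] || [ "$3" -ge "$4" ]; then
    echo "error: limits or thresholds are out of order" >&2
    return 1
  fi
  cat <<EOF
---
instance_limits:
  min: $1
  max: $2
rules:
- rule_type: cpu_entitlement
  threshold:
    min: $3
    max: $4
scheduled_limit_changes: []
EOF
}

# Print the manifest from the example above.
# Redirect the output to a file such as autoscaler.yaml to save it.
render_autoscaler_manifest 10 100 30 80
```

The saved file can then be passed to `cf configure-autoscaling` as described in the last step.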

Configure an autoscaling rule by using CLI commands

To configure an autoscaling rule that defines CPU Entitlement Utilization as its scaling metric using CLI commands:

In a terminal window, target the space in which the app you want to scale is deployed by running:
```
cf target -o ORG-NAME -s SPACE-NAME
```
Where:
- ORG-NAME is the name of the org containing the space in which the app you want to scale is deployed.
- SPACE-NAME is the name of the space in which the app you want to scale is deployed.
If the space in which the app you want to scale is deployed does not already have a service instance of Autoscaler deployed in it, create an Autoscaler service instance by running:
```
cf create-service app-autoscaler PLAN-NAME SERVICE-INSTANCE-NAME
```
Where:
- PLAN-NAME is the name of the service plan you want to use for the Autoscaler service instance.
- SERVICE-INSTANCE-NAME is the name you want to give the Autoscaler service instance. For example, autoscaler.
If there is already an Autoscaler service instance in the space in which the app you want to scale is deployed, skip this step.
Bind the Autoscaler service instance you created in the previous step to the app you want to scale by running:
```
cf bind-service APP-NAME SERVICE-INSTANCE-NAME
```
Where:
- APP-NAME is the name of the app you want to scale.
- SERVICE-INSTANCE-NAME is the name of the Autoscaler service instance from the previous step.
Configure upper and lower scaling limits for the app by running:
```
cf update-autoscaling-limits APP-NAME LOWER-SCALING-LIMIT UPPER-SCALING-LIMIT
```
Where:
- APP-NAME is the name of the app.
- LOWER-SCALING-LIMIT is the minimum number of instances you want Autoscaler to create for the app.
- UPPER-SCALING-LIMIT is the maximum number of instances you want Autoscaler to create for the app.
Enable Autoscaler to begin making scaling decisions for the app by running:
```
cf enable-autoscaling APP-NAME
```
Where APP-NAME is the name of the app.
Create a cpu_entitlement autoscaling rule by running:
```
cf create-autoscaling-rule APP-NAME cpu_entitlement MINIMUM-CPU-PERCENT-THRESHOLD MAXIMUM-CPU-PERCENT-THRESHOLD
```
Where:
- APP-NAME is the name of the app for which you want to create an autoscaling rule.
- MINIMUM-CPU-PERCENT-THRESHOLD is the minimum CPU Entitlement Utilization threshold as a percentage. If the average CPU Entitlement Utilization falls below this number, Autoscaler scales the number of app instances down. VMware recommends 30% as a default minimum threshold value.
- MAXIMUM-CPU-PERCENT-THRESHOLD is the maximum CPU Entitlement Utilization threshold as a percentage. If the average CPU Entitlement Utilization rises above this number, Autoscaler scales the number of app instances up. VMware recommends 80% as a default maximum threshold value.
The following example command configures a cpu_entitlement autoscaling rule for the example-app app, with a minimum CPU Entitlement Utilization threshold of 30% and a maximum CPU Entitlement Utilization threshold of 80%:
```
cf create-autoscaling-rule example-app cpu_entitlement 30 80
```
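
The last three steps of this procedure can be collected into one script. The sketch below assumes that the space is already targeted and that the Autoscaler service instance is already bound to the app. The `run` wrapper, the `DRY_RUN` variable, and the function name are hypothetical; with `DRY_RUN=1`, which is the default here, the script only prints each command so that you can review it before running anything.

```shell
# Sketch: print (or run) the limit, enable, and rule steps in order.
# DRY_RUN=1 only prints each command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Arguments: APP-NAME LOWER-SCALING-LIMIT UPPER-SCALING-LIMIT MIN-CPU-PERCENT MAX-CPU-PERCENT
configure_cpu_entitlement_scaling() {
  app=$1; lower=$2; upper=$3; min_cpu=$4; max_cpu=$5
  run cf update-autoscaling-limits "$app" "$lower" "$upper"
  run cf enable-autoscaling "$app"
  run cf create-autoscaling-rule "$app" cpu_entitlement "$min_cpu" "$max_cpu"
}

configure_cpu_entitlement_scaling example-app 10 100 30 80
```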

Configure CPU Entitlement Utilization as the scaling metric for an app through Apps Manager

To configure Autoscaler to use CPU Entitlement Utilization as the scaling metric for an app using Apps Manager:

Log in to Apps Manager. For more information, see Logging in to Apps Manager.
Select the org that contains the space in which the app you want to scale is deployed.
Select the space in which the app you want to scale is deployed.
Under Processes and Instances, click Autoscaling Activated. The Manage Autoscaling window appears.
Next to Scaling Rules, click Add Rule. The Add Rule window appears.
From the Rule Type options, select CPU Entitlement Utilization. Then click Next.
1. For Scale down if less than, enter the minimum CPU Entitlement Utilization percentage threshold. If the average CPU Entitlement Utilization falls below this number, Autoscaler scales the number of app instances down. VMware recommends 30% as a default minimum threshold value.
2. For Scale up if more than, enter the maximum CPU Entitlement Utilization percentage threshold. If the average CPU Entitlement Utilization rises above this number, Autoscaler scales the number of app instances up. VMware recommends 80% as a default maximum threshold value.
Click Save.
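
The two thresholds entered above form a band: utilization above the maximum leads to scaling up, utilization below the minimum leads to scaling down, and anything in between leaves the instance count unchanged. The following sketch encodes only that rule. It is not Autoscaler's actual implementation, it says nothing about how the average is computed or how many instances are added or removed, and the function name `scaling_decision` is hypothetical.

```shell
# Hypothetical sketch of the threshold rule described above.
# Arguments: AVERAGE-UTILIZATION-PERCENT MIN-THRESHOLD MAX-THRESHOLD
scaling_decision() {
  awk -v avg="$1" -v min="$2" -v max="$3" 'BEGIN {
    # Adding 0 forces numeric comparison, so decimal percentages work.
    if (avg + 0 > max + 0)      print "scale up"
    else if (avg + 0 < min + 0) print "scale down"
    else                        print "no change"
  }'
}

scaling_decision 172.28 30 80   # prints "scale up"
scaling_decision 45 30 80       # prints "no change"
```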

Review autoscaling events for changes in CPU Entitlement Utilization

When Autoscaler scales the number of app instances up after the CPU Entitlement Utilization metric increases above the maximum threshold, Autoscaler records an autoscaling event.

You can monitor the autoscaling events that Autoscaler records for changes in CPU Entitlement Utilization:

Through the cf CLI. See Review autoscaling events for changes in CPU Entitlement Utilization through the cf CLI.
Through Apps Manager. See Review autoscaling events for changes in CPU Entitlement Utilization through Apps Manager.

Review autoscaling events for changes in CPU Entitlement Utilization through the cf CLI

To review the autoscaling events that Autoscaler records for changes in CPU Entitlement Utilization through the cf CLI:

In a terminal window, run:
```
cf autoscaling-events APP-NAME
```
Where APP-NAME is the name of the app for which you want to review autoscaling events.

If Autoscaler has scaled the number of app instances up due to increases in the CPU Entitlement Utilization metric, the above command returns output that contains autoscaling events similar to the following example:
```
Time                   Description
2024-03-13T21:47:45Z   Scaled up from 10 to 11 instances. Current CPU Entitlement usage of 172.28% is above upper threshold of 80.00%.
```
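
If you want to post-process these events, for example to track how often an app scales up, the description can be split into fields. The sketch below assumes that the description wording matches the example output above exactly; other event types or a different wording produce no output. The function name `parse_scale_up_event` is hypothetical.

```shell
# Hypothetical helper: extract instance counts, usage, and threshold from scale-up event lines.
# Reads event lines on standard input; lines that do not match are ignored.
parse_scale_up_event() {
  sed -n 's/.*Scaled up from \([0-9][0-9]*\) to \([0-9][0-9]*\) instances\. Current CPU Entitlement usage of \([0-9.]*\)% is above upper threshold of \([0-9.]*\)%\..*/from=\1 to=\2 usage=\3 threshold=\4/p'
}

echo '2024-03-13T21:47:45Z   Scaled up from 10 to 11 instances. Current CPU Entitlement usage of 172.28% is above upper threshold of 80.00%.' | parse_scale_up_event
# prints: from=10 to=11 usage=172.28 threshold=80.00
```

In practice, you would pipe the output of `cf autoscaling-events APP-NAME` into the function instead of using `echo`.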

Review autoscaling events for changes in CPU Entitlement Utilization through Apps Manager

To review the autoscaling events that Autoscaler records for changes in CPU Entitlement Utilization through Apps Manager:

Log in to Apps Manager. For more information, see Logging in to Apps Manager.
Select the org that contains the space in which the app you want to scale is deployed.
Select the space in which the app you want to scale is deployed.
Under Processes and Instances, click Autoscaling Activated.
Under Event History, click View More. A list of autoscaling events appears. If Autoscaler has scaled the number of app instances up due to increases in the CPU Entitlement Utilization metric, the list of autoscaling events includes events similar to the following example:
```
Scaled up from 10 to 11 instances. Current CPU Entitlement usage of 172.28% is above upper threshold of 80.00%.
```

Special considerations for using CPU Entitlement Utilization as a scaling metric

This section describes use cases that might complicate or prevent you from configuring CPU Entitlement Utilization as the scaling metric for an app.

Review current CPU Entitlement Utilization

The cf CLI displays the older CPU metric rather than the CPU Entitlement Utilization metric. To review the CPU Entitlement Utilization of an app, you can install the Log Cache cf CLI plug-in and view the cpu_entitlement metric.

Install the Log Cache plug-in by running:
```
cf install-plugin -r CF-Community "log-cache"
```
Log Cache is a component of TAS for VMs that caches logs and metrics from across the platform.
View the CPU Entitlement Utilization metric values for the app as they are emitted by running:
```
cf tail example-app --name-filter cpu_entitlement --follow
```
The --follow flag appends output as metrics are emitted.

Log Cache ejection

Within Log Cache, each app has its own bucket, which contains both app metrics and logs. By default, Log Cache can hold a maximum of 100,000 envelopes per app. Because the platform generates several envelopes per request, and recent app logs are held in the same bucket, older envelopes for a busy app can be ejected quickly. As a result, Log Cache might not retain enough CPU Entitlement Utilization history for Autoscaler to use in scaling decisions.
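
To get a feel for how quickly the bucket turns over, you can divide the envelope cap by the rate at which the app produces envelopes. The following back-of-the-envelope sketch assumes the default cap of 100,000 envelopes and a steady rate of logs and metrics combined. The rates shown are hypothetical inputs, not measurements, and the function name is hypothetical.

```shell
# Rough estimate of how many seconds of history fit in an app's Log Cache bucket.
# Argument: total envelopes (logs plus metrics) the app produces per second.
log_cache_history_seconds() {
  envelopes_per_second=$1
  max_envelopes=100000   # default per-app cap stated above
  echo $(( max_envelopes / envelopes_per_second ))
}

log_cache_history_seconds 50     # prints 2000 (about 33 minutes)
log_cache_history_seconds 2000   # prints 50
```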

For more information, see Log Cache in Operating App Autoscaler.