This topic describes App Autoscaler. It includes information about default and custom scaling rules as well as App Autoscaler architecture.


App Autoscaler is a Marketplace service that scales apps in your environment based on app performance metrics or a schedule. Scaling apps in this way helps control the cost of running them while maintaining app performance.

You can use App Autoscaler to do the following:

  • Configure scaling rules that adjust app instance counts based on metrics thresholds
  • Modify the maximum and minimum number of instances for an app, either manually or following a schedule
  • Configure scaling factors so that the app scales up or down by more than one instance at a time. Use caution when setting these factors.

For example, you can configure App Autoscaler to scale down the number of instances for an app over the weekend. You can also configure App Autoscaler to scale up the number of instances for an app when the value of the CPU Usage metric increases above a custom threshold.
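
The weekend example above can be sketched in a few lines of Python. This is only an illustration of schedule-based instance limits, not App Autoscaler's implementation: the `InstanceLimits` type, the `limits_for` and `clamp_instances` helpers, and the numeric limits are all hypothetical, and the way a running count is brought inside a new range is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class InstanceLimits:
    """Hypothetical container for an app's allowed instance range."""
    minimum: int
    maximum: int


# Illustrative values only; these are not App Autoscaler defaults.
WEEKDAY_LIMITS = InstanceLimits(minimum=4, maximum=20)
WEEKEND_LIMITS = InstanceLimits(minimum=1, maximum=5)


def limits_for(moment: datetime) -> InstanceLimits:
    """Return the instance limits that apply at the given time."""
    # datetime.weekday() returns 5 for Saturday and 6 for Sunday.
    if moment.weekday() >= 5:
        return WEEKEND_LIMITS
    return WEEKDAY_LIMITS


def clamp_instances(current: int, limits: InstanceLimits) -> int:
    """Bring an instance count inside the allowed range (assumed behavior)."""
    return max(limits.minimum, min(current, limits.maximum))
```

With these illustrative values, an app running 12 instances on a Saturday would be brought down to the weekend maximum of 5 by `clamp_instances(12, limits_for(some_saturday))`.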

About App Autoscaler Scaling Rules

This section describes how App Autoscaler decides when to scale an app up or down.

It also provides information about the custom metrics, comparison metrics, and default metrics that you can use when you create scaling rules for an app in App Autoscaler.

How App Autoscaler Decides When to Scale

Every 35 seconds, App Autoscaler makes a decision about whether to scale up, scale down, or keep the same number of instances.

To make a scaling decision, App Autoscaler averages the values of a given metric for the most recent 120 seconds.

Note: Operators can edit the 35-second scaling interval and the 120-second metric collection interval for all apps within the org. For more information, see (Optional) Configure App Autoscaler in Configuring TAS for VMs.

App Autoscaler scales apps as follows:

  • Increases the number of instances by the app’s scale up factor when any metric exceeds its maximum threshold.
  • Decreases the number of instances by the app’s scale down factor when all metrics fall below their minimum thresholds.
  • Keeps the same number of instances in all other cases, for example when every metric stays between its minimum and maximum thresholds.

The following diagram provides an example of how App Autoscaler makes scaling decisions:

The text after the diagram describes the example that it shows.

As shown in the diagram, an app has a maximum threshold of 200 milliseconds and a minimum threshold of 80 milliseconds for an HTTP latency metric. The scale up factor and scale down factor are not set in the scaling manifest, so the default value is one.

If HTTP latency averages 220 milliseconds over a 120-second window, App Autoscaler scales the app up by one instance.

If HTTP latency then averages 70 milliseconds over the next 120-second window and the app’s other scaling metrics also fall below their minimum thresholds, App Autoscaler scales the app down by one instance.

If the average value for HTTP latency over a 120-second window is below the maximum threshold of 200 milliseconds and above the minimum threshold of 80 milliseconds, App Autoscaler maintains the same number of instances for the app.

You can also set a maximum and minimum number of instances. For example, if an app exceeds the maximum threshold of a given metric, but the number of instances is already at the maximum number of allowed instances, App Autoscaler does not scale up the app.
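
The decision rules in this section can be summarized in a short Python sketch. It is a minimal illustration under stated assumptions, not the code that App Autoscaler runs: the `Rule` type, the `decide_instances` function, and all parameter names are hypothetical, and `samples` is assumed to already hold the metric values collected during the most recent window.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical scaling rule: thresholds for one metric."""
    metric: str
    min_threshold: float
    max_threshold: float


def decide_instances(
    current: int,
    rules: Sequence[Rule],
    samples: Mapping[str, Sequence[float]],
    min_instances: int,
    max_instances: int,
    scale_up_factor: int = 1,
    scale_down_factor: int = 1,
) -> int:
    """Return the target instance count for one scaling decision."""
    # Average each metric over the collection window.
    # fmean raises StatisticsError if a metric has no samples.
    averages = {rule.metric: fmean(samples[rule.metric]) for rule in rules}

    if any(averages[r.metric] > r.max_threshold for r in rules):
        # Any metric above its maximum threshold: scale up.
        target = current + scale_up_factor
    elif rules and all(averages[r.metric] < r.min_threshold for r in rules):
        # Every metric below its minimum threshold: scale down.
        target = current - scale_down_factor
    else:
        target = current

    # Never leave the configured instance limits.
    return max(min_instances, min(target, max_instances))
```

Using the HTTP latency example, a rule `Rule("http_latency", 80, 200)` with samples averaging 220 milliseconds moves an app from 3 to 4 instances. The same samples leave the count at 5 when the app already runs the maximum of 5 instances.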

Default Metrics for Scaling Rules

App Autoscaler includes several default metrics for which you can create scaling rules.

Note: VMware recommends that you define custom metrics for scaling rules instead of using the default metrics. Custom metrics allow you to more accurately monitor the performance of your apps based on your environment.

The following table lists the default metrics for App Autoscaler:

Metric | Description | Notes
CPU Utilization | Average CPU percentage for all instances of the app. | App CPU utilization data can vary greatly based on the number of CPU cores on Diego Cells and app density. For more information, see App Autoscaler advisory for scaling Apps based on the CPU utilization in the Knowledge Base.
Container Memory Utilization | Average memory percentage for all instances of the app. | (none)
HTTP Throughput | Total HTTP requests per second, divided by the total number of app instances. | (none)
HTTP Latency | Average latency of the app’s response to HTTP requests. This does not include Gorouter processing time or other network latency. | The average is calculated on the middle 99% or middle 95% of all HTTP requests.
RabbitMQ Depth | The queue length of the specified queue. | (none)
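
The HTTP Throughput and HTTP Latency entries above describe simple arithmetic, which the following Python sketch illustrates. Both function names are hypothetical. In particular, reading "middle 99%" as symmetric trimming (dropping an equal share of the lowest and highest samples) is an assumption made for this sketch; the source does not say how App Autoscaler selects the middle of the distribution.

```python
from statistics import fmean
from typing import Sequence


def throughput_per_instance(total_requests_per_second: float, instances: int) -> float:
    """Divide total HTTP requests per second by the number of app instances."""
    if instances <= 0:
        raise ValueError("instances must be positive")
    return total_requests_per_second / instances


def middle_percent_mean(latencies_ms: Sequence[float], middle_percent: int = 99) -> float:
    """Average the middle `middle_percent` percent of the latency samples.

    Assumption: the excluded share is split evenly between the fastest
    and slowest requests.
    """
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    if not 0 < middle_percent <= 100:
        raise ValueError("middle_percent must be between 1 and 100")
    ordered = sorted(latencies_ms)
    # Integer arithmetic avoids floating-point rounding surprises.
    drop_each_side = len(ordered) * (100 - middle_percent) // 200
    kept = ordered[drop_each_side:len(ordered) - drop_each_side]
    return fmean(kept)
```

For example, 300 requests per second spread over 4 instances gives 75 requests per second per instance. With 200 latency samples and the default of 99, one sample is dropped from each end before averaging, so a single very slow request does not move the result.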

Custom Metrics for Scaling Rules

VMware recommends that you define custom metrics for App Autoscaler scaling rules. Custom metrics allow you to define the metrics that are the best indicators of app performance for your environment.

You can configure apps to emit custom metrics through the Loggregator Firehose using Metric Registrar. For steps on how to configure your apps to emit custom metrics with Metric Registrar, see Registering Custom App Metrics.

Comparison Metrics for Scaling Rules

You can use the Comparison Metric field in App Autoscaler to define a scaling rule that divides one custom metric by another.

When you add a scaling rule, the Metric field is the dividend and the Comparison Metric field is the divisor.
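
As a minimal sketch of that division, the following Python function computes the value that a comparison rule would evaluate. The function name is hypothetical, and returning `None` when the divisor is zero is an assumption made for this example; the source does not describe how App Autoscaler handles that case.

```python
from typing import Optional


def comparison_value(metric: float, comparison_metric: float) -> Optional[float]:
    """Divide the Metric value (dividend) by the Comparison Metric value (divisor)."""
    if comparison_metric == 0:
        # Assumed handling: no value is produced, so no threshold is compared.
        return None
    return metric / comparison_metric
```

For instance, if a hypothetical custom metric for pending jobs reports 120 and a hypothetical custom metric for active workers reports 4, the rule evaluates 120 / 4 = 30 against its thresholds.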

App Autoscaler Architecture

The following diagram shows the components and architecture of App Autoscaler. It also shows how App Autoscaler components interact with VMware Tanzu Application Service for VMs (TAS for VMs) components to make app scaling decisions.

Two boxes represent the components of App Autoscaler. Several other boxes represent the Cloud Foundry components with which App Autoscaler interacts. The two boxes that represent App Autoscaler components are titled Autoscale GO app and Autoscale api. They appear on the right side of the diagram, within a box called Autoscaling Space, which is within another box called System Org. This indicates that the App Autoscaler components run in a space that is within an org on your Cloud Foundry deployment.

The diagram also includes several arrows. An arrow points from Autoscale GO app to the Cloud Foundry Load Balancer and Gorouter. Additional arrows go from the Load Balancer and Gorouter boxes to boxes titled Log Cache and Cloud Controller. These arrows indicate that the Autoscale app makes requests to the Log Cache and Cloud Controller for app metrics and that these requests are routed through the Load Balancer and Gorouter.

Another arrow points from Autoscale GO app to a box titled MySQL proxy. This arrow indicates that the Autoscale app reads scaling rules that are stored in a MySQL database.

Arrows also point from Autoscale api to MySQL proxy and to a box titled UAA. These arrows indicate that the Autoscale API authenticates using UAA and that the API stores scaling rules in the MySQL database.

Finally, an arrow points to Autoscale api from a box that represents both the Cloud Foundry Command Line Interface and Apps Manager. This arrow indicates that you can access the Autoscale API from either the Cloud Foundry Command Line Interface or Apps Manager.


As demonstrated in the architecture diagram, App Autoscaler makes scaling decisions based on autoscaling rules that users configure by using either the Cloud Foundry Command Line Interface (cf CLI) or Apps Manager. The Autoscale API stores these autoscaling rules in a MySQL database.

At a predefined interval, known as the scaling interval, the App Autoscaler app reads the scaling rules and retrieves app metric data from the Loggregator Log Cache. Then, App Autoscaler makes a scaling decision and communicates with the Cloud Controller to scale the app, if necessary.
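
The flow in the previous paragraph can be sketched as a single pass that a caller would repeat once per scaling interval. This is a simplified illustration, not the Autoscale app's code: the `AppRule` type and the four callables are hypothetical stand-ins for the MySQL-backed rule store, Log Cache, and Cloud Controller, and the sketch assumes one rule per app, a factor of one, and a floor of one instance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AppRule:
    """Hypothetical stored rule: one metric with thresholds for one app."""
    app_guid: str
    metric: str
    min_threshold: float
    max_threshold: float


def run_scaling_pass(
    read_rules: Callable[[], List[AppRule]],        # stands in for the rule database
    fetch_average: Callable[[str, str], float],     # stands in for Log Cache
    get_instances: Callable[[str], int],            # stands in for Cloud Controller
    set_instances: Callable[[str, int], None],      # stands in for Cloud Controller
) -> Dict[str, int]:
    """Read rules, fetch metrics, decide, and scale only if necessary.

    Returns the new instance counts that were requested, keyed by app.
    """
    changes: Dict[str, int] = {}
    for rule in read_rules():
        average = fetch_average(rule.app_guid, rule.metric)
        current = get_instances(rule.app_guid)
        if average > rule.max_threshold:
            target = current + 1
        elif average < rule.min_threshold:
            target = max(1, current - 1)  # assumed floor of one instance
        else:
            target = current
        if target != current:
            # Only contact the controller when the count actually changes.
            set_instances(rule.app_guid, target)
            changes[rule.app_guid] = target
    return changes
```

Passing the dependencies in as callables keeps the decision logic separate from the components it talks to, which also makes the pass easy to exercise with in-memory fakes.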

For more information about Loggregator Log Cache, see Loggregator Architecture. For more information about the Cloud Controller, see Cloud Controller.
