Troubleshoot when App Autoscaler fails to properly scale apps for your VMware Tanzu Application Service for VMs (TAS for VMs) deployment.
This section describes possible causes for Autoscaler failing to scale an app as you expect in your TAS for VMs deployment. To determine the cause, see the following sections.
Confirm that you configured the autoscaling rules for the app to respond as you expect to the traffic that the app is receiving. For example, when you use RabbitMQ queue depth as your scaling metric, you can compare the metrics that VMware Tanzu RabbitMQ for VMs (RabbitMQ) emits for the specified queue to the autoscaling rules you configured. If the autoscaling rules are not configured as you expect, you can adjust them accordingly.
Autoscaler averages metrics over the metric collection interval you configure in the App Autoscaler pane of the TAS for VMs tile.
If the values of the scaling metric for an app suddenly increase, the increase may not be reflected in an autoscaling event until the average for the metric increases.
The number of instances for the app may have already reached the upper scaling limit you configured.
To determine whether the app has reached the maximum number of instances:
In a terminal window, run:
cf autoscaling-events APP-NAME
Where APP-NAME
is the name of your app.
If Autoscaler has attempted to scale the app above the configured upper scaling limit, the above command returns output that contains an event description similar to the following example:
can not scale up. At max limit of 20 instances.
To review the upper and lower scaling limits you configured for the app, run:
cf autoscaling-apps
The previous command returns output similar to this example:
Name Guid Enabled Min Instances Max Instances Scale Up Factor Scale Down Factor example-app 85a6aff3-c739-4364-8df3-44f1aa50662b true 1 20 5 1
To resolve this issue, you can configure a higher upper scaling limit for the app. For more information, see Autoscaling limits in Using App Autoscaler in production.
The app may have already reached the number of app instances or used the amount of memory that the quota plan for the space or org allows. When an app reaches a resource quota for the space or org in which it is deployed, Autoscaler does not record an autoscaling event. For more information, see Reviewing space and org quota plans in Using App Autoscaler in production.
Review the quotas for both the space and the org in which your app is deployed to ensure that there are enough resources available in their quota plans for Autoscaler to create new app instances.
To review the available resource quotas for the space and org in which an app is deployed:
In a terminal window, view the name of the quota plan for the space in which the app is deployed by running:
cf space SPACE-NAME
Where SPACE-NAME
is the name of the space in which the app is deployed.
In the terminal output that appears, the value of space quota
is the name of the quota plan for the space.
Record the name of the quota plan for the space that you retrieved in the previous step.
View the name of the quota plan for the org containing the space in which the app is deployed by running:
cf org ORG-NAME
Where ORG-NAME
is the name of the org containing the space in which the app is deployed.
In the terminal output that appears, the value of quota
is the name of the quota plan for the org.
Record the name of the quota plan for the org that you retrieved in the previous step.
Review the available resource quotas for the space in which the app is deployed by running:
cf space-quota QUOTA-PLAN-NAME
Where QUOTA-PLAN-NAME
is the name of the space quota plan that you recorded in a previous step.
The previous command returns output similar to the following example:
Getting quota example-space info as [email protected]... OK Total Memory 2G Instance Memory 1G Routes 10 Services 10 Paid service plans allowed App instance limit 10 Reserved Route Ports 0
Review the available resource quotas for the org in which the app is deployed by running:
cf org-quota QUOTA-PLAN-NAME
Where QUOTA-PLAN-NAME
is the name of the org quota plan that you recorded in a previous step.
This command returns output similar to the following example:
Getting quota example-org info as [email protected]... OK Total Memory 10G Instance Memory 1G Routes 100 Services 100 Paid service plans allowed App instance limit 100 Reserved Route Ports 0
You can also try manually scaling the app. If the app has already reached one or more resource quota limits for your space or org, the Cloud Controller returns an error similar to the following example:
memory quota_exceeded FAILED
To scale an app manually, see Scale an app manually in Managing apps and service instances using Apps Manager or Scaling an app using cf scale.
Additionally, operators can configure verbose logging for Autoscaler that shows logs if Autoscaler fails to scale an app as a result of reaching a resource quota limit. For more information, see Configure verbose logging for Autoscaler in Operating Autoscaler.
In order for Autoscaler to function in a TAS for VMs deployment, you must set the App Autoscaler Errand to On in the Errands pane of the TAS for VMs tile, and you must configure Autoscaler to scale the apps you want to autoscale.
Autoscaling for an app:
Ensure that the App Autoscaler Errand is set to On in the Errands pane of the TAS for VMs tile. To configure the App Autoscaler Errand to deploy Autoscaler, see Configure errands in Configuring TAS for VMs.
Configure Autoscaler to scale the app through either the App Autoscaler Command-Line Interface (CLI) or Apps Manager. To configure autoscaling for an app through the App Autoscaler CLI, see Enable Autoscaling in Using the App Autoscaler CLI. To configure autoscaling for an app through Apps Manager, see Create and bind the App Autoscaler service in Scaling an App using App Autoscaler.
To see whether autoscaling is configured for the apps you want to scale:
In a terminal window, run:
cf autoscaling-apps
The above command returns a list of apps that are bound to Autoscaler, similar to the following example:
Presenting autoscaler apps in org example-org / space example-space as user Name Guid Enabled Min Instances Max Instances Scale Up Factor Scale Down Factor example-app e0077xa5-34mp-42l7-e377-e0fcb14d67f3 true 1 4 1 1 example-app-2 3e8x6a84-0m6p-4le9-9335-0aad3979d672 false 10 40 5 2 OKIf autoscaling is configured for an app that is bound to Autoscaler, the value shown in the `Enabled` column of the terminal output is `true`. If autoscaling is not configured for an app that is bound to Autoscaler, the value shown in the `Enabled` column of the terminal output is `false`.
Autoscaler may fail to scale an app if one of the TAS for VMs components on which it depends is failing or underscaled. For more information about these components, see Failure modes in Using debug logs for App Autoscaler.
These components might need to be restarted or scaled up before Autoscaler can scale your app. To restart TAS for VMs components, see Stopping and starting individual TAS for VMs VMs in Stopping and Starting Virtual Machines. To scale TAS for VMs components up, see Scaling up TAS for VMs in Scaling TAS for VMs.
This section describes possible causes for Autoscaler failing to scale an app using RabbitMQ queue depth as its scaling metric.
Autoscaler cannot scale an app if the RabbitMQ queue name specified in your autoscaling rule is incorrect. Ensure that the queue name you configured in the autoscaling rule for the app is correct.
You may see an autoscaling event similar to the following example:
Autoscaler did not receive any metrics for rabbitmq.example-queue during the scaling window. Scaling down is deferred until these metrics are available.
To receive RabbitMQ queue depth metrics, Autoscaler must be able to retrieve service binding credentials for the RabbitMQ HTTP Management API from Cloud Controller. However, when a TAS for VMs deployment is configured to store credentials in the runtime CredHub, Autoscaler cannot retrieve service binding credentials from Cloud Controller.
If the operator of your deployment has selected Yes under On Demand - Secure Service Instance Credentials with Runtime CredHub in the Global Settings for On-Demand Plans pane of the VMware Tanzu RabbitMQ for VMs tile and activated the Secure service instance credentials check box in the CredHub pane of the TAS for VMs tile, Autoscaler is unable to use RabbitMQ queue depth as the scaling metric for apps bound to RabbitMQ service instances.
As a workaround, you can configure your app to emit a Custom App Metric containing the RabbitMQ queue depth and configure Autoscaler to use this metric as its scaling metric for the app. For more information about Custom App Metrics, see Using custom scaling metrics.
This section describes possible causes for Autoscaler scaling an app up to a certain number of instances, but not to the level that you expect. To determine the cause, see the following sections.
The values of the scaling metric you configured Autoscaler to use for your app may be lower than you expect. For example, if you are load-testing your app and your load-testing tool is not working correctly, you may be generating less load on the app than you expect. For more information about load-testing an app, see Load-testing your app in Using App Autoscaler in production.
To review the values of your scaling metric, review the observability service that you use for your TAS for VMs deployment.
If any TAS for VMs components on which Autoscaler depends are underscaled, Autoscaler cannot scale the number of app instances up enough to accommodate the amount of traffic that the app is receiving.
For more information about how to determine which TAS for VMs components might be too underscaled to adequately support Autoscaler, see Failure modes in Using debug logs for App Autoscaler.
When the averaged metric values that you see in autoscaling events are lower than you expect, it often means that Log Cache is underscaled. If Log Cache is underscaled, Autoscaler might not be able to retrieve all of the metrics associated with your app. When Autoscaler cannot retrieve enough metrics, it cannot calculate an accurate average metric value to record in autoscaling events.
To determine whether Log Cache might be underscaled:
In a terminal window, run:
cf autoscaling-events
Thia command returns a list of autoscaling events that Autoscaler has recorded, which include averaged values of the scaling metric for your app.
If the averaged metric values in the autoscaling events for an app are lower than the actual load that the app is processing, Log Cache may be evicting some metrics from its cache.
To fix this issue, try the following actions:
Scale Log Cache horizontally by increasing the number of Log Cache instances in the Resource Config pane of the TAS for VMs tile. To scale Log Cache horizontally, see Scaling up TAS for VMs in Scaling TAS for VMs.
Scale Log Cache vertically by increasing the amount of memory in each instance of Log Cache. To scale Log Cache vertically, see Scaling an app using cf scale.
Increase the maximum number of envelopes that Log Cache can store per app by configuring the Maximum number of envelopes stored in Log Cache per source text boxes in the Advanced Features pane of the TAS for VMs tile. To configure the maximum number of envelopes that Log Cache can store, see Maximum number of envelopes stored in Log Cache per source in Configuring TAS for VMs.
For more information about how Log Cache supports Autoscaler, see Log Cache in Operating App Autoscaler.
Autoscaler is itself an app that TAS for VMs deploys. By default, there are three instances of the Autoscaler app, autoscale
, that can scale other apps in your TAS for VMs deployment. Each instance of Autoscaler queries the MySQL database in your TAS for VMs deployment to claim and process a set of apps that might must be scaled. Each Autoscaler instance claims a different set of apps to process and scale independently of the other two Autoscaler instances. After an Autoscaler instance has processed and scaled one set of apps, it queries the MySQL database again and repeats the process.
Each Autoscaler instance evaluates 10 apps at a time to verify whether they must be scaled. The speed at which an Autoscaler instance can process a batch of apps is verified primarily by the number of metrics it retrieves from Log Cache for the apps within that batch. If Log Cache holds a large number of metrics for an app, Autoscaler does not begin processing the next batch of apps until either it retrieves all of those metrics or the request that Autoscaler makes to Log Cache for those metrics times out.
To ensure that Autoscaler can scale all the apps for which you configured autoscaling,, you must increase the number of Autoscaler instances for your TAS for VMs deployment.
To increase the number of Autoscaler instances:
Go to the VMware Tanzu Operations Manager Installation Dashboard.
Click the VMware Tanzu Application Service tile.
Click App Autoscaler.
In the App Autoscaler instance count text box, enter the number of Autoscaler instances you want to deploy.
Click Save.
Return to the Tanzu Operations Manager Installation Dashboard.
Click Review Pending Changes.
Under the VMware Tanzu Application Service tile, click Errands. The Errands menu expands.
Ensure that the App Autoscaler Errand check box is activated.
Click Apply Changes.
Autoscaler can scale an app to and from zero instances only when it is configured to use RabbitMQ queue depth as its scaling metric.
Because other scaling metrics are not emitted when an app has zero instances, Autoscaler cannot scale an app to and from zero instances when it is configured to use any scaling metric other than RabbitMQ queue depth.
When Autoscaler scales an app up, the new app instances can be slow to respond to requests. As a result, if you configure Autoscaler to use HTTP Request Latency as the scaling metric for the app, Autoscaler might scale the number of app instances up even more to compensate for the slow response time of the app instances it had previously created.
When a new app instance is created, Diego runs health checks for the app instance. After all health checks are finished successfully, Diego considers the app instance to be healthy. Diego then registers a route to the app instance with the Gorouter, and the Gorouter begins routing requests to the app instance.
However, the app instance having passed all health checks may not indicate that the app instance is ready to respond to requests. Depending on the type of health check that you have configured for the app, the Gorouter may begin to route requests to app instances that are not ready to receive requests.
The following list describes when the Gorouter begins routing requests to an app instance after each type of health check:
If you configured a process
health check, the Gorouter can begin routing requests to the app instance when all processes within the app are running.
If you configured a port
health check, the Gorouter can begin routing requests to the app instance when the app instance receives a TCP. connections, even if it is not fully initialized.
If you configured an http
health check, the Gorouter can begin routing requests to the app instance when the app instance can provide an HTTP 200
response to the health check, which indicates that the app instance is ready to respond to HTTP requests from the Gorouter.
If the new app instances that Autoscaler has created are slow to respond to requests, you can configure Diego to run http
health checks for your app to ensure that the Gorouter cannot route requests to app instances before they are ready to respond.
For more information about how Diego runs an app, see How Diego runs an app in Diego Components and Architecture. For more information about app health checks, including how to configure health checks and the types of health checks you can configure, see Using app Health Checks.
Alternatively you can configure the cooldown period for the app to better align with the app’s startup time to avoid further scaling actions during that period. Note that the cooldown period does not change the metric interval used to make the scaling decision after the cooldown period has elapsed. So a cooldown period of slightly longer than the actual startup time of the app may provide a better representation of the app’s performance for the value used to make the subsequent scaling decision.
Only one service instance of Autoscaler can exist in each space in an org. If you try to create more than one Autoscaler service instance in a space, you see the following error:
Unexpected Response Response code: 409 CC code: 10001 CC error code: CF-ServiceBrokerConflict Request ID: b3b6236b-82f5-42a7-5015-338369d5fea2::9d7c8b3a-0c4f-4935-b493-d6fe3f417f3b Description: Service broker error: Only One App Autoscaler Service Instance Approved ed Per Space FAILED
You must use the existing Autoscaler service instance in your space and bind it to the app you want to autoscale. For more information, see Create and bind the App Autoscaler service in Scaling an App using App Autoscaler.
There are two versions of the App Autoscaler CLI extension for the Cloud Foundry Command-Line Interface (cf CLI):
The VMware Tanzu App Autoscaler CLI Plug-in, used with TAS for VMs
The Cloud Foundry App Auto-Scaler plug-in, used with open-source Cloud Foundry deployments
If you attempt to use the Cloud Foundry App Auto-Scaler plug-in to control the App Autoscaler instance in your TAS for VMs deployment, you see the following error:
Error: Invalid AutoScaler API endpoint
You must instead use the VMware Tanzu App Autoscaler CLI Plug-in. To download the VMware Tanzu App Autoscaler CLI plug-in, go to Broadcom Support.