When scaling app instances using App Autoscaler, here is what you must consider for your TAS for VMs deployment.
Autoscaler can ensure that an app always has the number of instances it needs to handle the amount of traffic that it receives. However, in order for Autoscaler to create more app instances, you must ensure that TAS for VMs is sufficiently scaled.
Using Autoscaler to scale your apps might cause the workload that your TAS for VMs deployment manages to fluctuate more widely and often. As a result, provisioning resources for a TAS for VMs deployment that includes Autoscaler is more challenging than provisioning resources for a TAS for VMs deployment in which apps are static or scaled manually.
Before you configure Autoscaler to scale your apps, you must ensure that your Diego Cells have enough resources to meet the autoscaling demands of your apps. To ensure that your Diego Cells can accommodate autoscaling, VMware recommends monitoring the total available memory, total available disk, and available free chunks across all Diego Cells in your TAS for VMs deployment.
Using observability tools such as Healthwatch for VMware Tanzu and VMware Tanzu Observability by Wavefront can help you monitor the resources that your Diego Cells use and alert you if your Diego Cells are likely to run out of resources. For more information, see the Healthwatch documentation and the Wavefront documentation.
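If you install the Log Cache cf CLI plug-in (discussed later in this topic), you can also spot-check remaining Diego Cell capacity from a terminal. The following is a minimal sketch: CapacityRemainingMemory and CapacityRemainingDisk are standard Diego rep metrics, but whether you can query them this way depends on your deployment and your user's permissions:

    # Remaining memory and disk (in MiB) reported by each Diego Cell
    cf query 'CapacityRemainingMemory'
    cf query 'CapacityRemainingDisk'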
Quota plans ensure that inefficient autoscaling rules or malicious requests do not cause Autoscaler to scale beyond the capabilities of your TAS for VMs deployment. Before you configure Autoscaler to scale an app, ensure that the space and org in which the app is deployed have quota plans that can accommodate your autoscaling needs.
When configuring Autoscaler to scale an app, you must ensure that the resource quotas allocated to the app are sufficient so Autoscaler can scale the app within the limits you configure. If the resource quotas that are available for the app to use are lower than the resources required to meet the upper scaling limit you configured, Autoscaler cannot scale up to that upper scaling limit.
For more information about creating and editing quota plans for your TAS for VMs deployment, see Creating and modifying quota plans.
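As an illustration, the following sketch uses cf CLI v7 command names (in cf CLI v6, the equivalent commands are cf create-quota and cf set-quota); the org name, quota name, and limits are hypothetical:

    # Create an org quota plan with headroom for autoscaling: 20 GB total
    # memory, 2 GB per instance, and up to 40 app instances
    cf create-org-quota autoscaled-apps -m 20G -i 2G -a 40
    cf set-org-quota example-org autoscaled-apps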
When creating or changing quota plans for a space or org in your TAS for VMs deployment, consider whether you want to provision more or fewer resources for the apps you want Autoscaler to scale.
Each of the following scenarios has benefits and drawbacks:

- You can provision enough resources for all apps to use their maximum allocated resource quotas simultaneously. However, you risk over-provisioning resources for your TAS for VMs deployment.
- You can provision fewer resources and, if you require more later, provision additional Diego Cells from the IaaS on which your VMware Tanzu Operations Manager foundation is deployed. However, you risk Autoscaler being unable to accommodate scaling requests.
If you have already configured Autoscaler to scale an app, and the scaling metric indicates that the number of app instances needs to scale up but Autoscaler does not scale the app further, the app might have reached its allocated quota limit for one or more resources.
Autoscaler does not record when resource quota limits prevent it from scaling an app. However, you can see when this occurs by configuring verbose logging for Autoscaler.

To configure verbose logging for Autoscaler:
1. Go to the Tanzu Operations Manager Installation Dashboard.
2. Click the VMware Tanzu Application Service tile.
3. Select App Autoscaler.
4. Activate the Allow verbose logging check box.
5. Click Save.
6. Return to the Tanzu Operations Manager Installation Dashboard.
7. Click Review Pending Changes.
8. Under the VMware Tanzu Application Service tile, click Errands. The Errands menu expands.
9. Ensure that the App Autoscaler Errand check box is selected.
10. Click Apply Changes.
Deploying TAS for VMs again briefly interrupts Autoscaler processes.
After you deploy TAS for VMs again, you can review the verbose logs for Autoscaler to identify which apps have reached one or more of their allocated resource quota limits. To identify these apps, see Identify apps that have reached resource quota limits.
To identify which apps have reached one or more of their allocated resource quota limits:
1. In a terminal window, target the system org and the autoscaling space, in which the autoscale app runs:

      cf target -o system -s autoscaling
2. View the verbose logs for Autoscaler by running:

      cf logs autoscale

   If any apps have reached one or more of their allocated resource quota limits, the command returns output that contains logs similar to the following example:

      2019-10-18T12:40:58.19-0400 [APP/PROC/WEB/0] OUT time="2019-10-18T16:40:58Z" level=info msg="Unable to scale. App instance quota has been reached. for app e59357x1-395a-4mp7-le36-ffbf4ec3de04 in space 9482047e-5x58-7a2m-p25l-es1w19o44b9a"
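   Because verbose logs are high volume, you might filter for the relevant message. The following is a plain shell sketch, assuming a Unix-like terminal:

      cf logs autoscale | grep "Unable to scale"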
3. Record the global unique identifiers (GUIDs) for the apps and spaces that you identified in the previous step. For example, in the output above, the GUID for the app is e59357x1-395a-4mp7-le36-ffbf4ec3de04, and the GUID for the space is 9482047e-5x58-7a2m-p25l-es1w19o44b9a.
4. Send a GET request to the Cloud Controller API to learn more about the resource quotas allocated to the apps and spaces that you identified:

   - To review the resource quotas allocated to an app, run:

        cf curl /v3/apps/APP-GUID

     Where APP-GUID is the GUID for the app that you recorded in the previous step.

   - To review the resource quotas allocated to a space, run:

        cf curl /v3/spaces/SPACE-GUID

     Where SPACE-GUID is the GUID for the space that you recorded in the previous step.
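   For example, using the hypothetical GUIDs from the log output above, you can follow the chain from the app to its space and to any space quota applied there. This is a sketch of the v3 Cloud Controller API flow; in the space response, a non-null relationships.quota entry contains the GUID of the applied space quota:

      # Inspect the app and the space it runs in
      cf curl /v3/apps/e59357x1-395a-4mp7-le36-ffbf4ec3de04
      cf curl /v3/spaces/9482047e-5x58-7a2m-p25l-es1w19o44b9a

      # If the space response names a quota, inspect its limits, such as
      # apps.total_memory_in_mb and apps.total_instances
      cf curl /v3/space_quotas/SPACE-QUOTA-GUID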
If you are adding autoscaling to an existing app, VMware recommends that you review historical request traffic patterns for the app to determine how many app instances are necessary to sufficiently handle future request traffic.
For example, if the overall app traffic follows a natural pattern, such as increased traffic during the local business day, review historical request traffic patterns for the app to see the following:
- When the largest number of requests occurs each day.
- The current static number of app instances that are necessary to accommodate the largest number of requests.
The number of app instances needed to accommodate app traffic outside of peak hours might be significantly lower than the number needed during peak hours. In cases like this, VMware recommends configuring Autoscaler so that your app does not consume the resources it needs only for peak traffic during the hours when traffic is low.
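For example, using the App Autoscaler cf CLI plug-in, you might set scaling limits that cover the observed peak while letting the app shrink outside of peak hours. The app name and limits below are hypothetical:

    # Allow Autoscaler to scale example-app between a floor of 2 instances
    # (off-peak) and a ceiling of 10 instances (observed peak requirement)
    cf update-autoscaling-limits example-app 2 10
    cf enable-autoscaling example-app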
Autoscaler scales the number of app instances based primarily on the metric envelopes it receives from Log Cache. To ensure that Log Cache can support Autoscaler, you must ensure that Log Cache stores enough metric envelopes for Autoscaler to make appropriate scaling decisions.
By default, Log Cache can store a maximum of 100,000 envelopes per app. Because TAS for VMs generates several envelopes for every request, and Log Cache stores recent app logs in the same bucket, the history that Log Cache stores for busy apps might be insufficient for Autoscaler to make appropriate scaling decisions.
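For example, with illustrative numbers only: if a busy app receives 200 requests per second and each request produces roughly five envelopes, the 100,000-envelope bucket covers only about 100 seconds of history, which might be shorter than the window of metrics that Autoscaler evaluates when making a scaling decision.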
To view the number of envelopes that Log Cache stores and the cache duration for an app:
1. Download the Log Cache cf CLI plug-in from the Log Cache cf CLI Plugin repository on GitHub.

2. In a terminal window, run:

      cf log-meta --source-type application

   This command returns output similar to the following example:

      Source ID                             Source         Source Type  Count  Expired  Cache Duration
      e2xa9m8p-28l4-4ebf-8408-0548c8573c66  example-app-1  application  82142  79749    21h54m48s
      8e3798xa-7m2p-40l6-9605-e7cae998b855  example-app-2  application  21040  23694    21h54m47s

3. Review the values in the Count and Cache Duration columns to see the number of envelopes that Log Cache stores for each app and the cache duration.
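In this example, the Count column shows how many envelopes are currently cached for each app, and the Expired column shows how many envelopes Log Cache has already dropped. A busy app typically shows a much shorter Cache Duration, meaning less metric history is available to Autoscaler.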
If necessary, you can increase the maximum number of envelopes that Log Cache can store per app by configuring the Maximum number of envelopes stored in Log Cache per source text box in the Advanced Features pane of the TAS for VMs tile. To configure the maximum number of envelopes that Log Cache can store, see Maximum number of envelopes stored in Log Cache per source in Configuring TAS for VMs.
Additionally, Log Cache might evict envelopes if it has an insufficient amount of memory. To fix this issue, you can scale Log Cache horizontally or vertically; see Scaling an app using cf scale.
Apps that use scaling metrics that are emitted less frequently, such as container metrics, are more likely to be affected by Log Cache evicting envelopes.