The app container lifecycle on the Diego architecture

The lifecycle stages of an app container for your VMware Tanzu Application Service for VMs (TAS for VMs) deployments running on the Diego architecture include deployment, crash events, evacuation, and shutdown.

Deployment

The app deployment process involves uploading, staging, and starting the app in a container. Your app must successfully complete each of these phases within certain time limits. The default time limits for the phases are as follows:

Upload: 15 minutes
Stage: 15 minutes
Start: 60 seconds

Your administrator can change these defaults. Check with your administrator for the actual time limits set for app deployment.

Developers can change the time limit for starting apps through an app manifest or on the command line. For more information, see Deploying with App Manifests and Using App Health Checks.

Crash Events

If an app instance crashes, TAS for VMs automatically restarts it, rescheduling the instance on another container, up to three times. After three failed restarts, TAS for VMs waits thirty seconds before attempting another restart. The wait time doubles with each subsequent restart until the ninth restart, then stays at that duration until the 200th restart. After the 200th restart, TAS for VMs stops trying to restart the app instance.
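The restart schedule described above can be sketched as a small function. This is only an illustration of one reading of the paragraph, not Diego's actual implementation: it assumes the first three restarts happen with no wait, that the fourth restart is the first to wait thirty seconds, and that the doubling stops at the ninth restart. The function and constant names are hypothetical.

```python
# Illustrative model of the crash restart schedule described in the text.
# All names are hypothetical; the numbers come from the paragraph above.
IMMEDIATE_RESTARTS = 3      # restarts attempted with no wait (assumption)
INITIAL_WAIT_SECONDS = 30   # wait before the fourth restart
LAST_DOUBLING_RESTART = 9   # the wait stops growing at this restart
MAX_RESTARTS = 200          # no restarts are attempted after this one


def wait_before_restart(attempt):
    """Return the seconds to wait before restart number `attempt` (1-based).

    Returns None once the platform has stopped trying to restart the instance.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if attempt > MAX_RESTARTS:
        return None
    if attempt <= IMMEDIATE_RESTARTS:
        return 0
    # Number of times the initial wait has doubled, capped at the ninth restart.
    doublings = min(attempt, LAST_DOUBLING_RESTART) - (IMMEDIATE_RESTARTS + 1)
    return INITIAL_WAIT_SECONDS * 2 ** doublings
```

Under these assumptions, `wait_before_restart(4)` is 30 seconds and the wait levels off at 960 seconds (16 minutes) from the ninth restart onward.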

Evacuation

Certain operator actions require restarting VMs with containers hosting app instances. For example, an operator who updates stemcells or installs a new version of TAS for VMs must restart all the VMs in a deployment.

TAS for VMs automatically relocates the instances on VMs that are shutting down through a process called evacuation. TAS for VMs recreates the app instances on another VM, waits until they are healthy, and then shuts down the old instances. During an evacuation, developers may see their app instances in a duplicated state for a brief period.

During this app duplication process, singleton app instances may become temporarily unavailable if the replacement instance does not become healthy within the Diego Cell’s evacuation timeout, which defaults to 10 minutes. Because of this, app developers with a low tolerance for brief downtime may prefer to run several instances of their app. See Run Multiple Instances to Increase Availability.

Shutdown

TAS for VMs requests a shutdown of your app instance in the following scenarios:

When a user runs cf scale, cf stop, cf push, cf delete, or cf restart-app-instance
As a result of a system event, such as the replacement procedure during Diego Cell evacuation or when an app instance stops because of a failed health check probe

To shut down the app, TAS for VMs sends the app process in the container a SIGTERM. By default, the process has ten seconds to shut down gracefully. If the process has not exited after ten seconds, TAS for VMs sends a SIGKILL.

This means that, by default, an app must finish its in-flight jobs within ten seconds of receiving the SIGTERM. For instance, a web app must finish processing existing requests and stop accepting new requests.

If your apps require a longer period of time to finish in-flight jobs and gracefully shut down, you can increase the graceful shutdown period. On the TAS for VMs tile or IST tile, go to the Advanced Settings tab and edit the “app graceful shutdown period” property. Note: Increasing this period may increase the time it takes to drain Diego Cells, which can lengthen deployments. For more information, see Configure Advanced Features in Configuring TAS for VMs.
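As a rough illustration of handling the SIGTERM described above, the following sketch shows one way an app process might stop picking up new work while letting the current job finish. It is a minimal example, not a TAS for VMs API: the names `install_handler`, `handle_sigterm`, and `drain` are hypothetical, and the work queue is a plain list standing in for whatever the app actually processes.

```python
import signal
import threading

# Set when the platform asks the process to shut down.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the work loop checks this flag."""
    shutdown_requested.set()


def install_handler():
    """Register the SIGTERM handler. Call once from the main thread at startup."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def drain(jobs, process):
    """Process jobs in order until shutdown is requested.

    A job that has already started always runs to completion; no new job is
    started once the flag is set. Returns the jobs that were not started.
    """
    remaining = list(jobs)
    while remaining and not shutdown_requested.is_set():
        process(remaining.pop(0))
    return remaining
```

With this shape, each individual job needs to complete well within the graceful shutdown period (ten seconds by default), because a job that is still running when the period ends is cut off by the SIGKILL.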

Note: One exception to the cases mentioned above is when monit restarts a crashed Diego Cell rep or Garden server. In this case, TAS for VMs immediately stops any apps that are still running by sending a SIGKILL.