Troubleshooting Redis for VMware Tanzu Application Service

This topic for operators gives you troubleshooting techniques for Redis for VMware Tanzu Application Service.

Some of the troubleshooting approaches in this topic suggest potentially destructive operations. VMware recommends that you back up both your Tanzu Operations Manager and deployments before attempting such operations. For more information about backing up your setup and exporting your Tanzu Operations Manager installation, see Backing Up Deployments with BBR.

Useful debugging commands

Before debugging, gather the following information about your deployment:

Current version of Redis for Tanzu Application Service, and, if upgrading, the previous version of Redis for Tanzu Application Service
Current version of Tanzu Operations Manager, and, if upgrading, the previous version of Tanzu Operations Manager

cf CLI commands

See the following table for Cloud Foundry Command Line Interface (cf CLI) commands commonly used while debugging:

To view the...	Command
API endpoint, org, and space	`cf target`
Service offerings available in the targeted org and space	`cf marketplace`
Apps deployed to the targeted org and space	`cf apps`
Service instances deployed to the targeted org and space	`cf services`
GUID for a specific service instance	`cf service SERVICE-INSTANCE --guid`
Service instance or application logs	`cf tail SERVICE-INSTANCE/APP`

BOSH CLI commands

See the following table for BOSH CLI commands commonly used while debugging:

Purpose	Command
View the targeted BOSH Director, version, and CPI	`bosh env`
View the deployments deployed through the targeted BOSH Director	`bosh deployments`
View the VMs for a given deployment	`bosh -d DEPLOYMENT vms`
SSH into a given deployment's VM	`bosh -d DEPLOYMENT ssh VM`

You can obtain general information after you SSH into a broker or service instance as follows:

To see system logs, go to /var/vcap/sys/log.
To check process health, run sudo monit summary.
To obtain a list of all processes, run ps aux.
To see disk usage, run df -h.
To see memory usage, run free -m.
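
As a rough sketch, the process health check above can be filtered so that only problem processes are shown. The helper name `unhealthy_processes` is hypothetical, and the sketch assumes that `monit summary` prints lines of the form `Process 'NAME'   STATUS`, which might differ between monit versions.

```shell
# Hypothetical helper: reads `monit summary` output on stdin and prints
# "NAME: STATUS" for every process whose status is not exactly "running".
# Assumption: process lines look like  Process 'redis-server'   running
unhealthy_processes() {
  awk -v q="'" '
    $1 == "Process" {
      name = $2
      gsub(q, "", name)
      status = ""
      for (i = 3; i <= NF; i++) {
        sep = (i > 3) ? " " : ""
        status = status sep $i
      }
      if (status != "running") print name ": " status
    }'
}
```

On a VM, this could be used as `sudo monit summary | unhealthy_processes`. No output means every listed process reports `running`.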

You can obtain information specific to the cf-redis broker as follows:

For shared-VM instances, the Redis processes are co-located with the cf-redis broker. To check these processes, run ps aux | grep redis-server.
Shared-VM data is stored in /var/vcap/store/cf-redis-broker/redis-data.

About the Redis CLI

The redis-cli is a command line tool used to access a Redis server. You can use the redis-cli for create, read, update, and delete (CRUD) actions, and to set configuration values. For more information about the redis-cli, see redis-cli, the Redis command line interface in the Redis documentation.

To access the redis-cli, do the following:

Follow the instructions in Access the Redis Service to retrieve the password and port number for the service instance.
SSH into the service instance.
Connect to the Redis server and enter the redis-cli interactive mode by running:
```
LD_LIBRARY_PATH=/var/vcap/packages/openssl/lib/ /var/vcap/packages/redis/bin/redis-cli -p PORT -a PASSWORD
```
Where:
- PORT is the port number retrieved in step one.
- PASSWORD is the password retrieved in step one.
For more information about the redis-cli interactive mode, see [Interactive Mode](https://redis.io/topics/rediscli#interactive-mode) in the Redis documentation.
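
The redis-cli can also run a single command non-interactively, which is convenient when you only need one value. The following sketch extracts one field from the output of the Redis `INFO` command. The helper name `redis_info_field` is hypothetical; the sketch assumes `INFO` output lines of the form `key:value` terminated by CRLF.

```shell
# Hypothetical helper: prints the value of one field from Redis INFO output
# read on stdin. INFO lines end in CRLF, so carriage returns are stripped.
redis_info_field() {
  field=$1
  tr -d '\r' | awk -F: -v key="$field" \
    '$1 == key { print substr($0, length(key) + 2); exit }'
}
```

For example, appending `INFO persistence | redis_info_field rdb_last_bgsave_status` to the redis-cli command above would print the status of the last background save, assuming the server reports that field.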

Troubleshooting errors

Start here if you are responding to a specific error or error messages.

Common services errors

The following errors occur in multiple services:

Failed installation
Cannot create or delete service instances
Broker request timeouts
Instance does not exist
Cannot bind to or unbind from service instances
Cannot connect to a service instance
Upgrade all service instances errand fails
Missing logs and metrics

Failed installation
Symptom

Redis for Tanzu Application Service fails to install.

Cause

Reasons for a failed installation include:

- Certificate issues: The on-demand broker (ODB) requires valid certificates.
- Deploy fails. There are multiple possible causes.
- Networking problems:
  - Cloud Foundry cannot reach the Redis for Tanzu Application Service broker.
  - Cloud Foundry cannot reach the service instances.
  - The service network cannot access the BOSH Director.
- The register broker errand fails.
- The smoke test errand fails.
- Resource sizing issues: These occur when the resource sizes selected for a plan are lower than Redis for Tanzu Application Service requires to function.
- Other service-specific issues.

Solution

To troubleshoot:

- Certificate issues: Ensure that your certificates are valid and generate new ones if necessary. To generate new certificates, contact Support.
- Deploy fails: View the logs using Tanzu Operations Manager to find out why the deployment is failing.
- Networking problems: For how to troubleshoot, see Networking problems.
- Register broker errand fails: For how to troubleshoot, see Register broker errand.
- Resource sizing issues: Verify your resource configuration in Tanzu Operations Manager and ensure that the configuration matches that recommended by the service.

Cannot create or delete service instances
Symptom

Developers report errors such as:

Instance provisioning failed: There was a problem completing your request. Please contact your operations team providing the following information: service: redis-acceptance, service-instance-guid: ae9e232c-0bd5-4684-af27-1b08b0c70089, broker-request-id: 63da3a35-24aa-4183-aec6-db8294506bac, task-id: 442, operation: create

Cause

Reasons include:

- Problems with the deployment manifest
- Authentication errors
- Network errors
- Quota errors

Solution

To troubleshoot:

1. If the BOSH error shows a problem with the deployment manifest, open the manifest in a text editor to inspect it.
2. To continue troubleshooting, log in to BOSH and target the Redis for Tanzu Application Service instance using the instructions on parsing a Cloud Foundry error message.
3. Retrieve the BOSH task ID from the error message and run:
   bosh task TASK-ID
4. See Access the broker logs and use the `broker-request-id` from the error message to search the logs for more information. Check for:
   - Authentication errors
   - Network errors
   - Quota errors

Broker request timeouts
Symptom

Developers report errors such as:

Server error, status code: 504, error code: 10001, message: The request to the service broker timed out: https://BROKER-URL/v2/service_instances/e34046d3-2379-40d0-a318-d54fc7a5b13f/service_bindings/aa635a3b-ef6d-41c3-a23f-55752f3f651b

Cause

Cloud Foundry might not be connected to the service broker, or there might be a large number of queued tasks.

Solution

To troubleshoot:

1. Confirm that Cloud Foundry (CF) is connected to the service broker.
2. Verify the BOSH queue size:
   a. Log in to BOSH as an admin.
   b. Run bosh tasks
3. If there are a large number of queued tasks, the system might be under too much load. BOSH is configured with two workers and one status worker, which might not be enough for the level of load. If the task queue is long, advise app developers to try again after the system is under less load.

Instance does not exist
Symptom

Developers report errors such as:

Server error, status code: 502, error code: 10001, message: Service broker error: instance does not exist

Cause

The instance might have been deleted.

Solution

To troubleshoot, confirm that the Redis for Tanzu Application Service instance exists in BOSH:

1. Obtain the GUID of the service instance from CF by running:
   cf service MY-INSTANCE --guid
2. Using the GUID that you obtained, run:
   bosh -d service-instance_GUID vms

If the BOSH deployment is not found, it was deleted from BOSH. Contact VMware Tanzu Support for help.

Cannot bind to or unbind from service instances
Symptom

Developers report errors such as:

Server error, status code: 502, error code: 10001, message: Service broker error: There was a problem completing your request. Please contact your operations team providing the following information: service: example-service, service-instance-guid: 8d69de6c-88c6-4283-b8bc-1c46103714e2, broker-request-id: 15f4f87e-200a-4b1a-b76c-1c4b6597c2e1, operation: bind

Cause

This might be due to authentication or network errors.

Solution

To find out the issue with the binding:

1. Access the service broker logs.
2. Search the logs for the `broker-request-id` string listed in the error message above.
3. Check for:
   - Authentication errors
   - Network errors

Contact VMware Tanzu Support for help if you are unable to resolve the problem.

Cannot connect to a service instance
Symptom

Developers report that their app cannot use service instances that they created and bound.

Cause

The error might originate from the service or be network related.

Solution

To solve this issue, ask the user to send application logs that show the connection error.

If the error originates from the service, follow Redis for Tanzu Application Service-specific instructions.

If the issue appears to be network-related:

1. Verify that application security groups are configured correctly. Access must be configured for the service network that the tile is deployed to.
2. Ensure that the network the TAS for VMs tile is deployed to has network access to the service network. You can find the network definition for this service network in the BOSH Director tile.
3. In Tanzu Operations Manager, go into the service tile and see the service network that is configured in the networks tab.
4. In Tanzu Operations Manager, go into the TAS for VMs tile and see the network it is assigned to. Ensure that these networks can access each other.

Upgrade all service instances errand fails
Symptom	The `upgrade-all-service-instances` errand fails.
Cause	There might be a problem with a particular instance.
Solution	To troubleshoot: Look at the errand output in the Tanzu Operations Manager log. If an instance has failed to upgrade, debug and fix it before running the errand again to prevent any failure issues from spreading to other on-demand instances. After the Tanzu Operations Manager log no longer lists the deployment as `failing`, re-run the errand to upgrade the rest of the instances.

Missing logs and metrics
Symptom

No logs are being emitted by the on-demand broker.

Cause

Syslog might not be configured correctly, or you might have network access issues.

Solution

To troubleshoot:

1. Ensure that you have configured syslog for the tile.
2. Verify that your syslog forwarding address is correct in Tanzu Operations Manager.
3. Ensure that you have network connectivity between the networks that the tile is using and the syslog destination. If the destination is external, use the public IP VM extension feature available in your Tanzu Operations Manager tile configuration settings.
4. Verify that Loggregator is emitting metrics:
   a. Install the `cf log-cache` plug-in. For instructions, see the Log Cache CLI Plugin GitHub repository.
   b. Find logs from your service instance by running:
      cf tail -f SERVICE_INSTANCE
   c. If no metrics appear within five minutes, verify that the broker network has access to the Loggregator system on all required ports.

If you are unable to resolve the issue, contact Support.

Redis for Tanzu Application Service-specific errors

The following troubleshooting errors are specific to Redis for Tanzu Application Service:

AOF file corrupted, cannot start Redis instance
Saving error
Failed backup
Orphaned instances: BOSH Director cannot see your instances
Orphaned instances: The deployment cannot see your instances
Failed to set credentials in runtime CredHub
Service outage after deactivating TLS

AOF File Corrupted, Cannot Start Redis Instance

Symptom

One or more VMs might fail to start the Redis server during pre-start with the error message logged in syslog:

[ErrorLog-TimeStamp] # Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix `filename`

For more information about remote syslog forwarding, see Configure syslog forwarding.

Cause

In cases of hard crashes, for example, due to power loss or VM termination without running drain scripts, your AOF file might become corrupted. The error log printed out by Redis provides a clear means of recovery.

Solution

Solution for shared-VM instances:

SSH into your cf-redis-broker instance.
Navigate to the directory where your AOF file is stored. This is usually /var/vcap/store/cf-redis-broker/redis-data/SERVICE-INSTANCE-GUID/, where SERVICE-INSTANCE-GUID is the GUID for the affected service instance.

Run the following command:

/var/vcap/packages/redis/redis-check-aof --fix appendonly.aof

Exit the SSH session on the cf-redis-broker instance, then restart the instance by running:
```
bosh restart INSTANCE-GROUP/INSTANCE-ID
```

Solution for on-demand-VM instances:

SSH into your affected service instance.
Navigate to the directory where your AOF file is stored. This is usually /var/vcap/store/redis/.

Run:

/var/vcap/packages/redis/redis-check-aof --fix appendonly.aof

SSH out of the service instance and restart it by running:
```
bosh restart INSTANCE-GROUP/INSTANCE-ID
```
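
The Redis error message advises making a backup of the AOF file before repairing it, because the repair modifies the file in place. A minimal sketch of that backup step follows. The helper name `backup_aof` and the `.bak.TIMESTAMP` suffix are assumptions for illustration, not part of the product.

```shell
# Hypothetical helper: copies an AOF file to FILE.bak.TIMESTAMP next to the
# original and prints the backup path. Fails if the file does not exist.
backup_aof() {
  aof=$1
  [ -f "$aof" ] || { echo "no such file: $aof" >&2; return 1; }
  backup="$aof.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$aof" "$backup" && echo "$backup"
}
```

Run `backup_aof appendonly.aof` from the AOF directory before running redis-check-aof. Check that the persistent disk has room for the copy first, for example with df -h.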

Saving Error

Symptom

One of the following error messages is logged in syslog:

Background saving error

Failed opening the RDB file dump.rdb (in server root dir /var/vcap/store/redis) for saving: No space left on device

For more information about remote syslog forwarding, see Configure syslog forwarding.

Cause

This might be logged when the configured disk size is too small, or if the Redis AOF uses all the disk space.

Solution

To prevent this error, do the following:

Ensure the disk is configured to at least 2.5x the VM memory for the on-demand broker and 3.5x the VM memory for cf-redis-broker.
Check if the AOF is using too much disk space by doing the following:
1. BOSH SSH into the affected service instance VM.
2. List the size of each file by running:
```
cd /var/vcap/store/redis; ls -la
```
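
The sizing guidance above can be turned into a quick calculation. This sketch is illustrative only: the helper name `min_disk_mb` and its arguments are hypothetical, and it rounds up to the next whole megabyte.

```shell
# Hypothetical helper: prints the minimum persistent disk size in MB for a
# given VM memory size in MB, using the 2.5x (on-demand broker) and 3.5x
# (cf-redis-broker) multipliers. Integer math, rounded up.
min_disk_mb() {
  memory_mb=$1
  kind=$2
  case "$kind" in
    on-demand)       echo $(( (memory_mb * 25 + 9) / 10 )) ;;
    cf-redis-broker) echo $(( (memory_mb * 35 + 9) / 10 )) ;;
    *) echo "unknown broker type: $kind" >&2; return 1 ;;
  esac
}
```

For a VM with 4096 MB of memory, `min_disk_mb 4096 on-demand` prints 10240 and `min_disk_mb 4096 cf-redis-broker` prints 14336.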

Failed Backup
Symptom	The following error message is logged: Backup has failed. Redis must be running for a backup to run
Cause	This is logged if a backup is initiated against a Redis server that is down.
Solution	Ensure that the Redis server being backed up is running. To do this, run `bosh restart` against the affected service instance VM.

Orphaned instances: BOSH Director cannot see your instances
Symptom

When you run `cf curl /v2/service_instances`, some service instances are visible that are not visible to the BOSH Director. These orphaned instances can create issues. For example, they might hold on to a static IP address, causing IP conflicts.

Cause

Orphaned instances can occur in the following situations:

- Both TAS for VMs and BOSH maintain state. Orphaned instances can occur if the TAS for VMs state is out of sync with BOSH. For example, the deployments or VMs have been de-provisioned by BOSH but the call to update the TAS for VMs state failed.
- A call to de-provision a service instance was made directly to BOSH rather than through the cf CLI.

Solution

You can solve this issue by doing one of the following:

- If this is the first occurrence: VMware recommends that you purge instances by running:
  cf purge-service-instance SERVICE-INSTANCE
- If this is a repeated occurrence: Contact VMware Tanzu Support for further help, and include the following:
  - A snippet of your `broker.log` around the time of the incident
  - The deployment manifest of failed instances, hiding private information like passwords
  - Any recent logs that you can recover from the failed service instance

Orphaned instances: The deployment cannot see your instances
Symptom

The deployment cannot see your broker or service instances. These instances exist, but cannot receive communication.

Cause

If you run `cf purge-service-instance` while your service instance or broker still exists, your service instance becomes orphaned.

Solution

If the deployment lost the details of your instances, but BOSH still has the deployment details, you can solve this issue by backing up the data on your service instance and creating a new service.

To back up your data and create a new service instance:

1. Retrieve your orphaned service instance GUID by running:
   bosh -d MY-DEPLOYMENT run-errand orphan-deployments
   Where `MY-DEPLOYMENT` is the name of your deployment.
2. SSH into your orphaned service instance by running:
   bosh -e MY-ENV -d MY-DEPLOYMENT ssh VM-NAME/GUID
   Where:
   - `MY-ENV` is the name of your environment.
   - `MY-DEPLOYMENT` is the name of your deployment.
   - `VM-NAME/GUID` is the name of your service instance and the GUID that you obtained in step 1.
3. Create a new RDB file by running:
   /var/vcap/jobs/redis-backups/bin/backup --snapshot
   This creates a new RDB file in `/var/vcap/store/redis-backup`.
4. Push the RDB file to your backup location by running:
   /var/vcap/jobs/service-backup/bin/manual-backup
   For information about backup locations, see Configuring Automated Service Backups.
5. Create a new service instance with the same configuration as the database that you backed up.
6. Retrieve your new service instance GUID by running:
   bosh -e MY-ENV -d MY-DEPLOYMENT vms
   Where:
   - `MY-ENV` is the name of your environment.
   - `MY-DEPLOYMENT` is the name of your deployment.
7. SSH into your new service instance by repeating step 2 above with the GUID that you retrieved in step 6.
8. Create a new directory in the new service instance by running:
   mkdir /var/vcap/store/MY-BACKUPS
   Replace `MY-BACKUPS` with the name of your backups directory.
9. Save the RDB file in `/var/vcap/store/MY-BACKUPS/` to transfer it to the new instance.
10. Verify that the RDB file has not been corrupted by running:
    md5sum RDB-FILE
    Where `RDB-FILE` is the path to your RDB file.
11. Restore your data by running:
    sudo /var/vcap/jobs/redis-backups/bin/restore --sourceRDB RDB-FILE
    Where `RDB-FILE` is the path to your RDB file.
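
The md5sum check is only meaningful if you compare the result against a checksum recorded on the original instance before the transfer. The following sketch shows one way to do that comparison. The helper name `verify_md5` is hypothetical, and the sketch assumes that you noted the checksum of the RDB file on the source VM.

```shell
# Hypothetical helper: succeeds only if the MD5 checksum of FILE matches the
# EXPECTED checksum recorded on the source instance.
# Usage: verify_md5 EXPECTED FILE
verify_md5() {
  expected=$1
  actual=$(md5sum "$2" | awk '{ print $1 }')
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

For example, `verify_md5 CHECKSUM RDB-FILE && echo "checksum matches"`, where `CHECKSUM` is the value printed by md5sum on the orphaned instance.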

Failed to set credentials in runtime CredHub
Symptom

Developers report errors such as:

error: failed to set credentials in credential store: The request includes an unrecognized parameter 'mode'. Please update or remove this parameter and retry your request.

error for user: There was a problem completing your request. Please contact your operations team providing the following information: service: p.redis, service-instance-guid: , broker-request-id: , operation: bind

Cause

Your service instances might not be running the latest version of Redis for Tanzu Application Service. You might experience compatibility issues with CredHub if your service instances are running Redis for Tanzu Application Service v1.14.3 or earlier.

Solution

1. Ensure you have the latest patch version of Redis for Tanzu Application Service installed. For more information about the latest patch, see the Redis for VMware Tanzu Application Service Release Notes.
2. Run the `upgrade-all-service-instances` errand to ensure all service instances are running the latest service offering. For how to run the errand, see Upgrade All Service Instances. Running this errand causes a short period of downtime.

Service outage after deactivating TLS

Symptom

After deactivating TLS, apps that require on-demand Redis service instances become unresponsive.

Cause

When TLS is first activated, all on-demand service instances are re-created with two ports. Every new or re-created app receives the new credentials. Spring and Steeltoe apps are configured for activated TLS by default, but other languages and frameworks require further configuration.

When TLS is deactivated, the TLS port is removed from all on-demand instances. This prevents the apps from connecting to the instance.

Solution

First, consider activating TLS. The compliance body that oversees your apps might require TLS to be activated. Also, switching between activated and deactivated TLS incurs downtime.

To activate TLS, follow these steps:

In your Tanzu Operations Manager home page, select the Redis tile.
Navigate to On-Demand Service Settings.
In the Enable TLS section, ensure that TLS is set to Optional.
Click Save.
Navigate back to the Tanzu Operations Manager home page and click Review Pending Changes.
Ensure the Recreate All On-Demand Service Instances errand is enabled under the Redis section and then click Apply Changes.

To continue with TLS deactivated, follow these steps:

Unbind, bind, and re-stage every app that was affected by deactivating TLS. For more information, see Introduction for App Developers. This makes Spring and Steeltoe apps default to non-TLS configuration.
Manually configure any other relevant languages and frameworks to work with TLS deactivated.

Troubleshooting components

This section provides guidance on checking for, and fixing, issues in cf-redis and on-demand service components.

BOSH problems

Large BOSH queue

On-demand service brokers add tasks to the BOSH request queue, which can back up and cause delay under heavy loads. An app developer who requests a new Redis for Tanzu Application Service instance sees create in progress in the Cloud Foundry Command Line Interface (cf CLI) until BOSH processes the queued request.

Tanzu Operations Manager deploys two BOSH workers to process its queue.
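
To gauge how far the queue has backed up, you can count the queued tasks. This is a sketch under an assumption: it expects `bosh tasks` output in which the task state is the second whitespace-separated column. The helper name `count_queued_tasks` is hypothetical.

```shell
# Hypothetical helper: counts tasks in the "queued" state from `bosh tasks`
# output read on stdin. Assumes the state is the second column of each row.
count_queued_tasks() {
  awk '$2 == "queued" { n++ } END { print n + 0 }'
}
```

For example, `bosh tasks | count_queued_tasks` prints the number of queued tasks. A count that stays well above the number of workers suggests that requests are waiting on BOSH rather than failing.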

Configuration

Service instances in failing state

The VM or disk type that you configured in the plan page of the tile in Tanzu Operations Manager might not be large enough for the Redis for Tanzu Application Service service instance to start. See tile-specific guidance on resource requirements.

Authentication

UAA changes

If you rotated any UAA user credentials, you might see authentication issues in the service broker logs.

To resolve this, redeploy the Redis for Tanzu Application Service tile in Tanzu Operations Manager. This provides the broker with the latest configuration.

Caution: Ensure that any changes to UAA credentials are reflected in the Tanzu Operations Manager credentials tab of the VMware Tanzu Application Service for VMs tile.

Networking

Common issues with networking include:

Issue	Solution
Latency when connecting to the Redis for Tanzu Application Service service instance to create or delete a binding.	Try again or improve network performance.
Firewall rules are blocking connections from the Redis for Tanzu Application Service service broker to the service instance.	Open the Redis for Tanzu Application Service tile in Tanzu Operations Manager and verify that the two networks configured in the Networks pane allow access to each other.
Firewall rules are blocking connections from the service network to the BOSH Director network.	Ensure that service instances can access the Director so that the BOSH agents can report in.
Apps cannot access the service network.	Configure Cloud Foundry application security groups to allow runtime access to the service network.
Problems accessing BOSH’s UAA or the BOSH Director.	Follow network troubleshooting and verify that the BOSH Director is online.

Validate service broker connectivity to service instances

To validate connectivity:

View the BOSH deployment name for your service broker by running:
```
bosh deployments
```
SSH into the Redis for Tanzu Application Service service broker by running:
```
bosh -d DEPLOYMENT-NAME ssh
```
If no BOSH task-id appears in the error message, look in the broker log using the broker-request-id from the error message.

Validate app access to a service instance

Use the cf ssh command to access the app container, then connect to the Redis for Tanzu Application Service service instance using the binding included in the VCAP_SERVICES environment variable.

Quotas

Plan quota issues

If developers report errors such as:

Message: Service broker error: The quota for this service plan has been exceeded.
Please contact your Operator for help.

Do the following:

1. Verify your current plan quota.
2. Increase the plan quota:
   a. Log in to Tanzu Operations Manager.
   b. Reconfigure the quota on the plan page.
   c. Deploy the tile.
3. Find out who is using the plan quota and take the appropriate action.

Global quota issues

If developers report errors such as:

Message: Service broker error: The quota for this service has been exceeded.
Please contact your Operator for help.

Do the following:

1. Verify your current global quota.
2. Increase the global quota:
   a. Log in to Tanzu Operations Manager.
   b. Reconfigure the quota on the on-demand settings page.
   c. Deploy the tile.
3. Find out who is using the quota and take the appropriate action.

Failing jobs and unhealthy instances

To find out if there is an issue with the Redis for Tanzu Application Service deployment:

Inspect the VMs by running:

bosh -d service-instance_GUID vms --vitals

For additional information, run:

bosh -d service-instance_GUID instances --ps --vitals

If the VM is failing, follow the service-specific information. Any unadvised corrective actions (such as running bosh restart on a VM) can cause issues in the service instance.

Techniques for troubleshooting

This section contains instructions on:

Interacting with the on-demand service broker
Interacting with on-demand service instance BOSH deployments
Performing general maintenance and housekeeping tasks

Parse a Cloud Foundry (CF) error message

Failed operations (create, update, bind, unbind, delete) cause an error message. You can retrieve the error message later by running the cf CLI command cf service INSTANCE-NAME.

$ cf service myservice

Service instance: myservice
Service: super-db
Bound apps:
Tags:
Plan: dedicated-vm
Description: Dedicated Instance
Documentation url:
Dashboard:

Last Operation
Status: create failed
Message: Instance provisioning failed: There was a problem completing your request.
     Please contact your operations team providing the following information:
     service: redis-acceptance,
     service-instance-guid: ae9e232c-0bd5-4684-af27-1b08b0c70089,
     broker-request-id: 63da3a35-24aa-4183-aec6-db8294506bac,
     task-id: 442,
     operation: create
Started: 2017-03-13T10:16:55Z
Updated: 2017-03-13T10:17:58Z

Use the information in the Message field to debug further. Provide this information to Support when filing a ticket.

The task-id field maps to the BOSH task ID. For more information about a failed BOSH task, run bosh task TASK-ID.

The broker-request-id field maps to the portion of the On-Demand Service Broker log containing the failed step. Access the broker log through your syslog aggregator, or download the BOSH logs for the broker by running bosh -d DEPLOYMENT-NAME logs. If you have more than one broker instance, check the logs for each instance.
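
When an error message is long, pulling out individual fields by hand is error prone. The sketch below extracts one `key: value` field, such as task-id or broker-request-id, from the Message text. The helper name `cf_error_field` is hypothetical, and the sketch assumes that values contain no commas or spaces, as in the example above.

```shell
# Hypothetical helper: prints the value of a "key: value" field found in
# `cf service` output read on stdin. Prints the first match only.
cf_error_field() {
  sed -n "s/^.*$1: *\([^, ]*\).*$/\1/p" | head -n 1
}
```

For example, `cf service myservice | cf_error_field task-id` would print 442 for the output shown above, and the result can then be passed to bosh task.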

Access broker and instance logs and VMs

Before following these procedures, log in to the cf CLI and the BOSH CLI.

Access broker logs and VMs

You can access logs using Tanzu Operations Manager by clicking on the Logs tab in the tile and downloading the broker logs.

To access logs using the BOSH CLI:

To identify the on-demand broker (ODB) deployment, run:
```
bosh deployments
```
To view VMs in the deployment, run:
```
bosh -d DEPLOYMENT-NAME instances
```
To SSH into the VM, run:
```
bosh -d DEPLOYMENT-NAME ssh
```
To download the broker logs, run:
```
bosh -d DEPLOYMENT-NAME logs
```

The archive generated by BOSH includes the following logs:

Log Name	Description
broker.stdout.log	Requests to the on-demand broker and the actions the broker performs while orchestrating the request (e.g. generating a manifest and calling BOSH). Start here when troubleshooting.
bpm.log	Control script logs for starting and stopping the on-demand broker.
post-start.stderr.log	Errors that occur during post-start verification.
post-start.stdout.log	Post-start verification.
drain.stderr.log	Errors that occur while running the drain script.

Access service instance logs and VMs

To target an individual service instance deployment, retrieve the GUID of your service instance with the following cf CLI command:
```
cf service MY-SERVICE --guid
```
To view VMs in the deployment, run:
```
bosh -d service-instance_GUID instances
```
To SSH into a VM, run:
```
bosh -d service-instance_GUID ssh
```
To download the instance logs, run:
```
bosh -d service-instance_GUID logs
```
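
The commands above all use a deployment name built from the service instance GUID. As a small sketch, the mapping can be captured in a helper so that the GUID is looked up only once. The name `deployment_for_guid` is hypothetical.

```shell
# Hypothetical helper: prints the BOSH deployment name for a service instance
# GUID, following the service-instance_GUID naming used in this topic.
deployment_for_guid() {
  guid=$1
  [ -n "$guid" ] || { echo "usage: deployment_for_guid GUID" >&2; return 1; }
  echo "service-instance_$guid"
}
```

For example, `bosh -d "$(deployment_for_guid "$(cf service MY-SERVICE --guid)")" instances` lists the VMs for MY-SERVICE in one step.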

Run service broker errands to manage brokers and instances

From the BOSH CLI, you can run service broker errands that manage the service brokers and perform mass operations on the service instances that the brokers created. These service broker errands include:

register-broker registers a broker with the Cloud Controller and lists it in the Marketplace.
deregister-broker deregisters a broker from the Cloud Controller and removes it from the Marketplace.
upgrade-all-service-instances upgrades existing instances of a service to its latest installed version.
delete-all-service-instances deletes all instances of a service.
orphan-deployments detects "orphan" instances that are running on BOSH but not registered with the Cloud Controller.

To run an errand:

bosh -d DEPLOYMENT-NAME run-errand ERRAND-NAME

For example:

bosh -d my-deployment run-errand deregister-broker

Register broker

The register-broker errand:

Registers the service broker with Cloud Controller.
Activates service access for any plans that are enabled on the tile.
Deactivates service access for any plans that are deactivated on the tile.
Does nothing for any plans that are set to manual on the tile.

Run this errand whenever the broker is re-deployed with new catalog metadata to update the Marketplace.

Plans with deactivated service access are only visible to admin Cloud Foundry users. Non-admin Cloud Foundry users, including Org Managers and Space Managers, cannot see these plans.

Deregister broker

This errand deregisters a broker from Cloud Foundry.

The errand:

Deletes the service broker from Cloud Controller
Fails if there are any service instances, with or without bindings

Use the Delete All Service Instances errand to delete any existing service instances.

To run the errand:

bosh -d DEPLOYMENT-NAME run-errand deregister-broker

Upgrade all service instances

The upgrade-all-service-instances errand:

Collects all the service instances that the on-demand broker has registered.
Issues an upgrade command and deploys a new manifest to the on-demand broker for each service instance.
Adds to a retry list any instances that have ongoing BOSH tasks at the time of upgrade.
Retries any instances in the retry list until all instances are upgraded.

When you make changes to the plan configuration, the errand upgrades all the Redis for Tanzu Application Service service instances to the latest version of the plan.

If any instance fails to upgrade, the errand fails immediately. This prevents systemic problems from spreading to the rest of your service instances.

Delete all service instances

This errand uses the Cloud Controller API to delete all instances of your broker service offering in every Cloud Foundry org and space. It deletes only instances the Cloud Controller knows about. It does not delete orphan BOSH deployments.

Important Orphan BOSH deployments do not correspond to a known service instance. While rare, orphan deployments can occur. Use the orphan-deployments errand to identify them.

The delete-all-service-instances errand:

Unbinds all apps from the service instances.
Deletes all service instances sequentially. Each service instance deletion includes:
1. Running any pre-delete errands
2. Deleting the BOSH deployment of the service instance
3. Removing any ODB-managed secrets from BOSH CredHub
4. Checking for instance deletion failure, which causes the errand to fail immediately
Determines whether any instances were created while the errand was running. If new instances are detected, the errand returns an error. In this case, VMware recommends running the errand again.

Caution Use extreme caution when running this errand. Use it only when you want to destroy all of the on-demand service instances in an environment.

To run the errand:

bosh -d DEPLOYMENT-NAME run-errand delete-all-service-instances

Detect orphaned service instances

A service instance is defined as "orphaned" when the BOSH deployment for the instance is still running, but the service is no longer registered in Cloud Foundry.

The orphan-deployments errand collates a list of service deployments that have no matching service instances in Cloud Foundry and returns the list to the operator. The operator is then responsible for removing the orphaned BOSH deployments.

To run the errand:

bosh -d DEPLOYMENT-NAME run-errand orphan-deployments

If orphan deployments exist, the errand script does the following:

Exits with exit code 10
Outputs a list of deployment names under a [stdout] header
Provides a detailed error message under a [stderr] header

For example:

[stdout]
[{"deployment_name":"service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b"}]

[stderr]
Orphan BOSH deployments detected with no corresponding service instance in Cloud Foundry. Before deleting any deployment it is recommended to verify the service instance no longer exists in Cloud Foundry and any data is safe to delete.

Errand 'orphan-deployments' completed with error (exit code 10)

These details are also available through the BOSH /tasks/ API endpoint for use in scripting:

$ curl 'https://bosh-user:bosh-password@bosh-url:25555/tasks/task-id/output?type=result' | jq .
{
  "exit_code": 10,
  "stdout": "[{\"deployment_name\":\"service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b\"}]\n",
  "stderr": "Orphan BOSH deployments detected with no corresponding service instance in Cloud Foundry. Before deleting any deployment it is recommended to verify the service instance no longer exists in Cloud Foundry and any data is safe to delete.\n",
  "logs": {
    "blobstore_id": "d830c4bf-8086-4bc2-8c1d-54d3a3c6d88d"
  }
}
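When scripting against this output, you might want one deployment name per line. The following is a minimal sketch that assumes the JSON list from the [stdout] section is supplied on standard input. The function name orphan_deployment_names is hypothetical, and the text matching is deliberately simple.

```shell
# Hypothetical helper: read the errand's [stdout] JSON list on standard
# input and print one deployment name per line.
# This is plain text matching, not a full JSON parser.
orphan_deployment_names() {
  grep -o '"deployment_name":"[^"]*"' |
    sed 's/^"deployment_name":"//; s/"$//'
}

# Example usage with the sample output shown above:
#   printf '%s\n' '[{"deployment_name":"service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b"}]' \
#     | orphan_deployment_names
```

If jq is installed, the filter jq -r '.[].deployment_name' is a more robust alternative for the same list.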

If no orphan deployments exist, the errand script does the following:

Exits with exit code 0
Outputs an empty list of deployments under the [stdout] header
Outputs None under the [stderr] header

For example:

[stdout]
[]

[stderr]
None

Errand 'orphan-deployments' completed successfully (exit code 0)

If the errand encounters an error while running, the errand script does the following:

Exits with exit code 1
Leaves stdout empty
Outputs any error messages under stderr
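The three outcomes can be told apart by the exit code in the summary line that ends the errand output. The sketch below assumes you have captured that line as text. The function names errand_exit_code and orphan_status, and the status words they print, are hypothetical.

```shell
# Hypothetical helper: extract the numeric exit code from a summary line such as
#   Errand 'orphan-deployments' completed with error (exit code 10)
# Prints nothing if the line does not contain an exit code.
errand_exit_code() {
  printf '%s\n' "$1" | sed -n 's/.*(exit code \([0-9][0-9]*\)).*/\1/p'
}

# Hypothetical helper: map the exit codes described above to a short status word.
orphan_status() {
  case "$1" in
    0)  echo "none" ;;           # no orphan deployments
    10) echo "orphans-found" ;;  # orphan deployments listed under [stdout]
    1)  echo "errand-error" ;;   # the errand itself failed, see [stderr]
    *)  echo "unknown" ;;
  esac
}
```

For example, orphan_status "$(errand_exit_code "$summary_line")" prints orphans-found when summary_line holds the exit code 10 line shown earlier.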

To clean up orphaned instances, run the following command on each instance:

Caution Running this command might leave IaaS resources in an unusable state.

bosh -d service-instance_SERVICE-INSTANCE-GUID delete-deployment

Get admin credentials for a service instance

To retrieve the admin credentials for a service instance from BOSH CredHub:

Use the cf CLI to find the GUID associated with the service instance for which you want to retrieve credentials by running:
```
cf service SERVICE-INSTANCE-NAME --guid
```
For example:
```
$ cf service my-service-instance --guid

12345678-90ab-cdef-1234-567890abcdef
```
If you do not know the name of the service instance, you can list service instances in the space with cf services.
To SSH into the Tanzu Operations Manager VM, follow the steps in Gather Credential and IP Address information and Log in to the Tanzu Operations Manager VM with SSH in Advanced Troubleshooting with the BOSH CLI.
From the Tanzu Operations Manager VM, log in to your BOSH Director with the BOSH CLI. See Authenticate with the BOSH Director VM in Advanced Troubleshooting with the BOSH CLI.
Find the values for BOSH_CLIENT and BOSH_CLIENT_SECRET:
1. In the Tanzu Operations Manager Installation Dashboard, click the BOSH Director tile.
2. Click the Credentials tab.
3. In the BOSH Director section, click the link to the BOSH Commandline Credentials.
4. Record the values for BOSH_CLIENT and BOSH_CLIENT_SECRET.

Set the API target of the CredHub CLI to your BOSH CredHub server by running:

credhub api https://BOSH-DIRECTOR-IP:8844 \
      --ca-cert=/var/tempest/workspaces/default/root_ca_certificate

Where BOSH-DIRECTOR-IP is the IP address of the BOSH Director VM.

For example:

$ credhub api https://10.0.0.5:8844 \
      --ca-cert=/var/tempest/workspaces/default/root_ca_certificate

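Log in to CredHub by running:

credhub login \
    --client-name=BOSH-CLIENT \
    --client-secret=BOSH-CLIENT-SECRET

Where BOSH-CLIENT and BOSH-CLIENT-SECRET are the values for BOSH_CLIENT and BOSH_CLIENT_SECRET that you recorded earlier.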

For example:

$ credhub login \
      --client-name=credhub \
      --client-secret=abcdefghijklm123456789

Use the CredHub CLI to retrieve the credentials:

Retrieve the password for the admin user by running:

credhub get -n /p-bosh/service-instance_GUID/admin_password

In the output, the password appears under value. Record the password.
For example:

$ credhub get \
  -n /p-bosh/service-instance_70d30bb6-7f30-441a-a87c-05a5e4afff26/admin_password 

  id: d6e5bd10-3b60-4a1a-9e01-c76da688b847
  name: /p-bosh/service-instance_70d30bb6-7f30-441a-a87c-05a5e4afff26/admin_password
  type: password
  value: UMF2DXsqNPPlCNWMdVMcNv7RC3Wi10
  version_created_at: 2018-04-02T23:16:09Z

Reinstall a tile

To reinstall a tile in the same environment where it was previously uninstalled:

Ensure that the previous tile was correctly uninstalled as follows:
1. Log in as an admin by running:
```
cf login
```
2. Confirm that the Marketplace does not list Redis for Tanzu Application Service by running:
```
cf m
```
3. Log in to BOSH as an admin by running:
```
bosh log-in
```
4. Display your BOSH deployments to confirm that the output does not show the Redis for Tanzu Application Service deployment by running:
```
bosh deployments
```
5. Run the "delete-all-service-instances" errand to delete every instance of the service.
6. Run the "deregister-broker" errand to delete the service broker.
7. Delete the service broker BOSH deployment by running:
```
bosh -d BROKER-DEPLOYMENT-NAME delete-deployment
```
8. Reinstall the tile.

View resource saturation and scaling

To view usage statistics for any service, run:
```
bosh -d DEPLOYMENT-NAME vms --vitals
```
To view process-level information, run:
```
bosh -d DEPLOYMENT-NAME instances --ps
```

Identify apps using a service instance

To identify which apps are using a specific service instance using the name of the BOSH deployment:

Take the deployment name and strip the service-instance_ prefix, leaving only the GUID.
Log in to Cloud Foundry as an admin.

Obtain a list of all service bindings by running:

cf curl /v2/service_instances/GUID/service_bindings

The output is a list of resources. Each item references a service binding, which contains the APP-URL. To find the name, org, and space for the app, run:
1. cf curl APP-URL and record the app name under entity.name.
2. cf curl SPACE-URL to obtain the space, using the entity.space_url from the app response. Record the space name under entity.name.
3. cf curl ORGANIZATION-URL to obtain the org, using the entity.organization_url from the space response. Record the organization name under entity.name.

Important When running cf curl, ensure that you query all pages, because the responses are limited to a certain number of bindings per page. The default is 50. To find the next page, curl the value under next_url.
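The first and third steps of this procedure can be sketched as two small helpers: one strips the service-instance_ prefix from a deployment name, and one builds the path passed to cf curl. The function names guid_from_deployment and bindings_path are hypothetical and are not part of the cf CLI.

```shell
# Hypothetical helper: strip the service-instance_ prefix from a BOSH
# deployment name, leaving the service instance GUID.
guid_from_deployment() {
  printf '%s\n' "${1#service-instance_}"
}

# Hypothetical helper: build the path that lists the bindings for a GUID.
bindings_path() {
  printf '/v2/service_instances/%s/service_bindings\n' "$1"
}

# Example usage, assuming you are logged in to Cloud Foundry as an admin:
#   guid="$(guid_from_deployment service-instance_70d30bb6-7f30-441a-a87c-05a5e4afff26)"
#   cf curl "$(bindings_path "$guid")"
```

A name without the prefix is returned unchanged, so a bare GUID can be passed as well.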

Monitor the quota saturation and service instance count

Quota saturation and total number of service instances are available through ODB metrics emitted to Loggregator. These are the metric names:

Metric Name	Description
`on-demand-broker/SERVICE-NAME-MARKETPLACE/quota_remaining`	global quota remaining for all instances across all plans
`on-demand-broker/SERVICE-NAME-MARKETPLACE/PLAN-NAME/quota_remaining`	quota remaining for a particular plan
`on-demand-broker/SERVICE-NAME-MARKETPLACE/total_instances`	total instances created across all plans
`on-demand-broker/SERVICE-NAME-MARKETPLACE/PLAN-NAME/total_instances`	total instances created for a given plan

Important Quota metrics are not emitted if no quota was set.
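The four metric names in the table follow one pattern, with the plan name as an optional path segment. As a minimal sketch for building the names in monitoring scripts, the helper below assembles them. The function name odb_metric_name and the sample names my-service and my-plan are hypothetical placeholders.

```shell
# Hypothetical helper: assemble an ODB metric name.
#   $1 = service name as shown in the Marketplace
#   $2 = metric (quota_remaining or total_instances)
#   $3 = optional plan name; omit it for the all-plans metric
odb_metric_name() {
  if [ -n "${3:-}" ]; then
    printf 'on-demand-broker/%s/%s/%s\n' "$1" "$3" "$2"
  else
    printf 'on-demand-broker/%s/%s\n' "$1" "$2"
  fi
}

# Example usage:
#   odb_metric_name my-service quota_remaining
#   odb_metric_name my-service total_instances my-plan
```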

VMware Tanzu Support articles

The following are VMware Tanzu Support articles about Redis for Tanzu Application Service: