This topic for operators gives you troubleshooting techniques for Redis for VMware Tanzu Application Service.
Important: Some of the troubleshooting approaches in this topic suggest potentially destructive operations. VMware recommends that you back up both your Tanzu Operations Manager and deployments before attempting such operations. For more information about backing up your setup and exporting your Tanzu Operations Manager installation, see Backing Up Deployments with BBR.
Before debugging, gather the following information about your deployment:
See the following table for Cloud Foundry Command Line Interface (cf CLI) commands commonly used while debugging:
To view the... | Command |
---|---|
API endpoint, org, and space | cf target |
Service offerings available in the targeted org and space | cf marketplace |
Apps deployed to the targeted org and space | cf apps |
Service instances deployed to the targeted org and space | cf services |
GUID for a specific service instance | cf service SERVICE-INSTANCE --guid |
Service instance or application logs | cf tail SERVICE-INSTANCE/APP |
See the following table for BOSH CLI commands commonly used while debugging:
Purpose | Command |
---|---|
View the targeted BOSH Director, version, and CPI | bosh env |
View the deployments deployed through the targeted BOSH Director | bosh deployments |
View the VMs for a given deployment | bosh -d DEPLOYMENT vms |
SSH into a given deployment's VM | bosh -d DEPLOYMENT ssh VM |
You can obtain general information after you SSH into a broker or service instance as follows:
/var/vcap/sys/log
sudo monit summary
ps aux
df -h
free -m
You can obtain information specific to the cf-redis broker as follows:
ps aux | grep redis-server
/var/vcap/store/cf-redis-broker/redis-data
The redis-cli is a command line tool used to access a Redis server. You can use the redis-cli for create, read, update, and delete (CRUD) actions, and to set configuration values. For more information about the redis-cli, see redis-cli, the Redis command line interface in the Redis documentation.
To access the redis-cli, do the following:
Follow the instructions in Access the Redis Service to retrieve the password and port number for the service instance.
SSH into the service instance.
Connect to the Redis server and enter the redis-cli interactive mode by running:
/var/vcap/packages/redis/bin/redis-cli -p PORT -a PASSWORD
Where:
PORT is the port number retrieved in step one.
PASSWORD is the password retrieved in step one.
Start here if you are responding to a specific error or error messages.
The following errors occur in multiple services:
Failed installation | |
---|---|
Symptom | Redis for Tanzu Application Service fails to install. |
Cause | Reasons for a failed installation include: |
Solution | To troubleshoot: |
Cannot create or delete service instances | |
---|---|
Symptom | Developers report errors such as: Instance provisioning failed: There was a problem completing your request. Please contact your operations team providing the following information: service: redis-acceptance, service-instance-guid: ae9e232c-0bd5-4684-af27-1b08b0c70089, broker-request-id: 63da3a35-24aa-4183-aec6-db8294506bac, task-id: 442, operation: create |
Cause | Reasons include: |
Solution | To troubleshoot: |
Broker request timeouts | |
---|---|
Symptom | Developers report errors such as: Server error, status code: 504, error code: 10001, message: The request to the service broker timed out: https://BROKER-URL/v2/service_instances/e34046d3-2379-40d0-a318-d54fc7a5b13f/service_bindings/aa635a3b-ef6d-41c3-a23f-55752f3f651b |
Cause | Cloud Foundry might not be connected to the service broker, or there might be a large number of queued tasks. |
Solution | To troubleshoot: |
Instance does not exist | |
---|---|
Symptom | Developers report errors such as: Server error, status code: 502, error code: 10001, message: Service broker error: instance does not exist |
Cause | The instance might have been deleted. |
Solution | To troubleshoot: |
Cannot bind to or unbind from service instances | |
---|---|
Symptom | Developers report errors such as: Server error, status code: 502, error code: 10001, message: Service broker error: There was a problem completing your request. Please contact your operations team providing the following information: service: example-service, service-instance-guid: 8d69de6c-88c6-4283-b8bc-1c46103714e2, broker-request-id: 15f4f87e-200a-4b1a-b76c-1c4b6597c2e1, operation: bind |
Cause | This might be due to authentication or network errors. |
Solution | To find out the issue with the binding: |
Cannot connect to a service instance | |
---|---|
Symptom | Developers report that their app cannot use service instances that they created and bound. |
Cause | The error might originate from the service or be network related. |
Solution | To solve this issue, ask the user to send application logs that show the connection error. If the error originates from the service, then follow Redis for Tanzu Application Service-specific instructions. If the issue appears to be network-related, then: |
Upgrade all service instances errand fails | |
---|---|
Symptom | The upgrade-all-service-instances errand fails. |
Cause | There might be a problem with a particular instance. |
Solution | To troubleshoot: |
Missing logs and metrics | |
---|---|
Symptom | No logs are being emitted by the on-demand broker. |
Cause | Syslog might not be configured correctly, or you might have network access issues. |
Solution | To troubleshoot: |
The following troubleshooting errors are specific to Redis for Tanzu Application Service:
AOF File Corrupted, Cannot Start Redis Instance | |
---|---|
Symptom | One or more VMs might fail to start the Redis server during pre-start with the error message logged in syslog: [ErrorLog-TimeStamp] # Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix `filename`. For more information about remote syslog forwarding, see Configure syslog forwarding. |
Cause | In cases of hard crashes, for example, due to power loss or VM termination without running drain scripts, your AOF file might become corrupted. The error log printed out by Redis provides a clear means of recovery. |
Solution | Solution for shared-VM instances: |
Saving Error | |
---|---|
Symptom | One of the following error messages is logged in syslog: "Background saving error" or "Failed opening the RDB file dump.rdb (in server root dir /var/vcap/store/redis) for saving: No space left on device". For more information about remote syslog forwarding, see Configure syslog forwarding. |
Cause | This might be logged when the configured disk size is too small, or if the Redis AOF uses all the disk space. |
Solution | To prevent this error, do the following: |
Failed Backup | |
---|---|
Symptom | The following error message is logged: Backup has failed. Redis must be running for a backup to run |
Cause | This is logged if a backup is initiated against a Redis server that is down. |
Solution | Ensure that the Redis server being backed up is running. To do this, run bosh restart against the affected service instance VM. |
Orphaned instances: BOSH Director cannot see your instances | |
---|---|
Symptom | When you run cf curl /v2/service_instances, some service instances are visible that are not visible to the BOSH Director. These orphaned instances can create issues. For example, they might hold on to a static IP address, causing IP conflicts. |
Cause | Orphaned instances can occur in the following situations: |
Solution | You can solve this issue by doing one of the following: |
Orphaned instances: The deployment cannot see your instances | |
---|---|
Symptom | The deployment cannot see your broker or service instances. These instances exist, but cannot receive communication. |
Cause | If you run cf purge-service-instances while your service instance or broker still exists, your service instance becomes orphaned. |
Solution | If the deployment lost the details of your instances, but BOSH still has the deployment details, you can solve this issue by backing up the data on your service instance and creating a new service. To back up your data and create a new service instance: |
Failed to set credentials in runtime CredHub | |
---|---|
Symptom | Developers report errors such as: error: failed to set credentials in credential store: The request includes an unrecognized parameter 'mode'. Please update or remove this parameter and retry your request. error for user: There was a problem completing your request. Please contact your operations team providing the following information: service: p.redis, service-instance-guid: , broker-request-id: , operation: bind |
Cause | Your service instances might not be running the latest version of Redis for Tanzu Application Service. You might experience compatibility issues with CredHub if your service instances are running Redis for Tanzu Application Service v1.14.3 or earlier. |
Solution | |
Service outage after deactivating TLS | |
---|---|
Symptom | After deactivating TLS, apps that require on-demand Redis service instances become unresponsive. |
Cause | When TLS is first activated, all on-demand service instances are re-created with two ports. Every new or re-created app receives the new credentials. Spring and Steeltoe apps are configured for activated TLS by default, but other languages and frameworks require further configuration. When TLS is deactivated, the TLS port is removed from all on-demand instances. This prevents the apps from connecting to the instance. |
Solution | First, consider activating TLS. The compliance body that oversees your apps might require TLS to be activated. Also, switching between activated and deactivated TLS incurs downtime. To activate TLS, follow these steps: To continue with TLS deactivated, follow these steps: |
This section provides guidance on checking for, and fixing, issues in cf-redis and on-demand service components.
On-demand service brokers add tasks to the BOSH request queue, which can back up and cause delay under heavy loads. An app developer who requests a new Redis for Tanzu Application Service instance sees create in progress in the Cloud Foundry Command Line Interface (cf CLI) until BOSH processes the queued request.
Tanzu Operations Manager deploys two BOSH workers to process its queue.
The VM or disk type that you configured in the plan page of the tile in Tanzu Operations Manager might not be large enough for the Redis for Tanzu Application Service service instance to start. See tile-specific guidance on resource requirements.
If you rotated any UAA user credentials, you might see authentication issues in the service broker logs.
To resolve this, redeploy the Redis for Tanzu Application Service tile in Tanzu Operations Manager. This provides the broker with the latest configuration.
Caution: Ensure that any changes to UAA credentials are reflected in the Tanzu Operations Manager Credentials tab of the VMware Tanzu Application Service for VMs tile.
Common issues with networking include:
Issue | Solution |
---|---|
Latency when connecting to the Redis for Tanzu Application Service service instance to create or delete a binding. | Try again or improve network performance. |
Firewall rules are blocking connections from the Redis for Tanzu Application Service service broker to the service instance. | Open the Redis for Tanzu Application Service tile in Tanzu Operations Manager and verify that the two networks configured in the Networks pane allow access to each other. |
Firewall rules are blocking connections from the service network to the BOSH Director network. | Ensure that service instances can access the Director so that the BOSH agents can report in. |
Apps cannot access the service network. | Configure Cloud Foundry application security groups to allow runtime access to the service network. |
Problems accessing BOSH’s UAA or the BOSH Director. | Follow network troubleshooting and verify that the BOSH Director is online. |
To validate connectivity:
View the BOSH deployment name for your service broker by running:
bosh deployments
SSH into the Redis for Tanzu Application Service service broker by running:
bosh -d DEPLOYMENT-NAME ssh
If no BOSH task-id appears in the error message, look in the broker log using the broker-request-id from the task.
Use the cf ssh command to access the app container, then connect to the Redis for Tanzu Application Service service instance using the binding included in the VCAP_SERVICES environment variable.
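Reading the binding out of VCAP_SERVICES can be sketched in Python as follows. This is a minimal illustration, not the documented procedure: it assumes the binding is listed under the service label p.redis and that its credentials block contains host, port, and password keys. The sample value is hypothetical.

```python
import json

def redis_credentials(vcap_services_json, label="p.redis"):
    """Return (host, port, password) for the first instance bound under `label`.

    The label and the credential key names are assumptions for illustration.
    """
    services = json.loads(vcap_services_json)
    bindings = services.get(label, [])
    if not bindings:
        raise LookupError(f"no bound service found under label {label!r}")
    creds = bindings[0]["credentials"]
    return creds["host"], int(creds["port"]), creds["password"]

# Hypothetical VCAP_SERVICES value; inside an app container you would pass
# os.environ["VCAP_SERVICES"] instead.
sample = json.dumps({
    "p.redis": [{
        "name": "my-redis",
        "credentials": {
            "host": "10.0.8.4",
            "port": 6379,
            "password": "example-password",
        },
    }]
})

host, port, password = redis_credentials(sample)
print(f"connect to {host}:{port}")
```

The returned host, port, and password are the values a client needs to reach the instance from inside the container.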
If developers report errors such as:
Message: Service broker error: The quota for this service plan has been exceeded. Please contact your Operator for help.
If developers report errors such as:
Message: Service broker error: The quota for this service has been exceeded. Please contact your Operator for help.
To find out if there is an issue with the Redis for Tanzu Application Service deployment:
Inspect the VMs by running:
bosh -d service-instance_GUID vms --vitals
For additional information, run:
bosh -d service-instance_GUID instances --ps --vitals
If the VM is failing, follow the service-specific information. Any unadvised corrective actions (such as running bosh restart on a VM) can cause issues in the service instance.
This section contains instructions on:
Failed operations (create, update, bind, unbind, delete) cause an error message. You can retrieve the error message later by running the cf CLI command cf service INSTANCE-NAME.
$ cf service myservice
Service instance: myservice
Service: super-db
Bound apps:
Tags:
Plan: dedicated-vm
Description: Dedicated Instance
Documentation url:
Dashboard:
Last Operation
Status: create failed
Message: Instance provisioning failed: There was a problem completing your request. Please contact your operations team providing the following information: service: redis-acceptance, service-instance-guid: ae9e232c-0bd5-4684-af27-1b08b0c70089, broker-request-id: 63da3a35-24aa-4183-aec6-db8294506bac, task-id: 442, operation: create
Started: 2017-03-13T10:16:55Z
Updated: 2017-03-13T10:17:58Z
Use the information in the Message field to debug further. Provide this information to Support when filing a ticket.
The task-id field maps to the BOSH task ID. For more information about a failed BOSH task, run bosh task TASK-ID.
The broker-request-id field maps to the portion of the On-Demand Service Broker log containing the failed step. Access the broker log through your syslog aggregator, or access BOSH logs for the broker by running bosh logs broker 0. If you have more than one broker instance, repeat this process for each instance.
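When many failures need triage, the identifiers in the Message field can be pulled out with a script. The following Python sketch is an illustration only. It assumes the message uses the comma-separated "key: value" layout shown in the example output, and the helper name parse_broker_error is hypothetical.

```python
import re

# Identifier names as they appear in the example Message field.
FIELDS = ("service", "service-instance-guid", "broker-request-id",
          "task-id", "operation")

def parse_broker_error(message):
    """Extract the 'key: value' identifiers from a broker error message.

    Keys that are absent, or present with an empty value, are omitted.
    """
    found = {}
    for key in FIELDS:
        # The lookbehind stops "service" from matching inside longer
        # hyphenated keys; the value runs up to the next comma or space.
        pattern = r"(?<![\w-])" + re.escape(key) + r":\s*([^,\s]+)"
        match = re.search(pattern, message)
        if match:
            found[key] = match.group(1)
    return found

message = (
    "Instance provisioning failed: There was a problem completing your "
    "request. Please contact your operations team providing the following "
    "information: service: redis-acceptance, service-instance-guid: "
    "ae9e232c-0bd5-4684-af27-1b08b0c70089, broker-request-id: "
    "63da3a35-24aa-4183-aec6-db8294506bac, task-id: 442, operation: create"
)

details = parse_broker_error(message)
print(details["task-id"], details["broker-request-id"])
```

The task-id value can then be passed to bosh task, and the broker-request-id value used to search the broker log.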
Before following these procedures, log in to the cf CLI and the BOSH CLI.
You can access logs using Tanzu Operations Manager by clicking on the Logs tab in the tile and downloading the broker logs.
To access logs using the BOSH CLI:
To identify the on-demand broker (ODB) deployment, run:
bosh deployments
To view VMs in the deployment, run:
bosh -d DEPLOYMENT-NAME instances
To SSH into the VM, run:
bosh -d DEPLOYMENT-NAME ssh
To download the broker logs, run:
bosh -d DEPLOYMENT-NAME logs
The archive generated by BOSH includes the following logs:
Log Name | Description |
---|---|
broker.stdout.log | Requests to the on-demand broker and the actions the broker performs while orchestrating the request (e.g. generating a manifest and calling BOSH). Start here when troubleshooting. |
bpm.log | Control script logs for starting and stopping the on-demand broker. |
post-start.stderr.log | Errors that occur during post-start verification. |
post-start.stdout.log | Post-start verification. |
drain.stderr.log | Errors that occur while running the drain script. |
To target an individual service instance deployment, retrieve the GUID of your service instance with the following cf CLI command:
cf service MY-SERVICE --guid
To view VMs in the deployment, run:
bosh -d service-instance_GUID instances
To SSH into a VM, run:
bosh -d service-instance_GUID ssh
To download the instance logs, run:
bosh -d service-instance_GUID logs
From the BOSH CLI, you can run service broker errands that manage the service brokers and perform mass operations on the service instances that the brokers created. These service broker errands include:
register-broker registers a broker with the Cloud Controller and lists it in the Marketplace.
deregister-broker deregisters a broker with the Cloud Controller and removes it from the Marketplace.
upgrade-all-service-instances upgrades existing instances of a service to its latest installed version.
delete-all-service-instances deletes all instances of a service.
orphan-deployments detects "orphan" instances that are running on BOSH but not registered with the Cloud Controller.
To run an errand:
bosh -d DEPLOYMENT-NAME run-errand ERRAND-NAME
For example:
bosh -d my-deployment run-errand deregister-broker
The register-broker errand:
Run this errand whenever the broker is re-deployed with new catalog metadata to update the Marketplace.
Plans with deactivated service access are only visible to admin Cloud Foundry users. Non-admin Cloud Foundry users, including Org Managers and Space Managers, cannot see these plans.
This errand deregisters a broker from Cloud Foundry.
The errand:
Use the Delete All Service Instances errand to delete any existing service instances.
To run the errand:
bosh -d DEPLOYMENT-NAME run-errand deregister-broker
The upgrade-all-service-instances errand:
When you make changes to the plan configuration, the errand upgrades all the Redis for Tanzu Application Service service instances to the latest version of the plan.
If any instance fails to upgrade, the errand fails immediately. This prevents systemic problems from spreading to the rest of your service instances.
This errand uses the Cloud Controller API to delete all instances of your broker service offering in every Cloud Foundry org and space. It deletes only instances the Cloud Controller knows about. It does not delete orphan BOSH deployments.
Important: Orphan BOSH deployments do not correspond to a known service instance. While rare, orphan deployments can occur. Use the orphan-deployments errand to identify them.
The delete-all-service-instances errand:
Caution: Use extreme caution when running this errand. Use it only when you want to destroy all of the on-demand service instances in an environment.
To run the errand:
bosh -d DEPLOYMENT-NAME run-errand delete-all-service-instances
A service instance is defined as "orphaned" when the BOSH deployment for the instance is still running, but the service is no longer registered in Cloud Foundry.
The orphan-deployments errand collates a list of service deployments that have no matching service instances in Cloud Foundry and returns the list to the operator. It is then up to the operator to remove the orphaned BOSH deployments.
To run the errand:
bosh -d DEPLOYMENT-NAME run-errand orphan-deployments
If orphan deployments exist, the errand script lists them under a [stdout] header, prints an error message under a [stderr] header, and exits with code 10. For example:
[stdout]
[{"deployment_name":"service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b"}]
[stderr]
Orphan BOSH deployments detected with no corresponding service instance in Cloud Foundry. Before deleting any deployment it is recommended to verify the service instance no longer exists in Cloud Foundry and any data is safe to delete.
Errand 'orphan-deployments' completed with error (exit code 10)
These details are also available through the BOSH /tasks/ API endpoint for use in scripting:
$ curl 'https://bosh-user:bosh-password@bosh-url:25555/tasks/task-id/output?type=result' | jq .
{
  "exit_code": 10,
  "stdout": "[{\"deployment_name\":\"service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b\"}]\n",
  "stderr": "Orphan BOSH deployments detected with no corresponding service instance in Cloud Foundry. Before deleting any deployment it is recommended to verify the service instance no longer exists in Cloud Foundry and any data is safe to delete.\n",
  "logs": {
    "blobstore_id": "d830c4bf-8086-4bc2-8c1d-54d3a3c6d88d"
  }
}
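A script that consumes this result has to decode JSON twice, because the stdout field is itself a JSON document stored as a string. The following Python sketch shows one way to do that. The function name is hypothetical and the sample is abbreviated from the output above.

```python
import json

def orphan_deployments(task_result_json):
    """Return (exit_code, [deployment names]) from an errand task result.

    The "stdout" field holds a JSON list encoded as a string, so it is
    decoded separately from the outer document.
    """
    result = json.loads(task_result_json)
    stdout = result.get("stdout", "").strip()
    deployments = json.loads(stdout) if stdout else []
    names = [item["deployment_name"] for item in deployments]
    return result["exit_code"], names

# Sample shaped like the task result shown above (stderr shortened).
sample = json.dumps({
    "exit_code": 10,
    "stdout": '[{"deployment_name":"service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b"}]\n',
    "stderr": "Orphan BOSH deployments detected with no corresponding service instance in Cloud Foundry.\n",
})

exit_code, names = orphan_deployments(sample)
for name in names:
    print(name)
```

Each returned name is a BOSH deployment to review before any cleanup.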
If no orphan deployments exist, the errand script prints an empty list under [stdout], prints None under [stderr], and exits with code 0. For example:
[stdout]
[]
[stderr]
None
Errand 'orphan-deployments' completed successfully (exit code 0)
If the errand encounters an error while running, the errand script does the following:
To clean up orphaned instances, run the following command on each instance:
Caution: Running this command might leave IaaS resources in an unusable state.
bosh -d service-instance_SERVICE-INSTANCE-GUID delete-deployment
To retrieve the admin credentials for a service instance from BOSH CredHub:
Find the GUID of the service instance by running:
cf service SERVICE-INSTANCE-NAME --guid
For example:
$ cf service my-service-instance --guid
12345678-90ab-cdef-1234-567890abcdef
If you do not know the name of the service instance, you can list service instances in the space with cf services.
Find and record the values for BOSH_CLIENT and BOSH_CLIENT_SECRET.
Set the API target of the CredHub CLI to the BOSH CredHub server by running:
credhub api https://BOSH-DIRECTOR-IP:8844 \
--ca-cert=/var/tempest/workspaces/default/root_ca_certificate
Where BOSH-DIRECTOR-IP is the IP address of the BOSH Director VM.
For example:
$ credhub api https://10.0.0.5:8844 \
--ca-cert=/var/tempest/workspaces/default/root_ca_certificate
Log in to CredHub by running:
credhub login \
--client-name=BOSH-CLIENT \
--client-secret=BOSH-CLIENT-SECRET
For example:
$ credhub login \
--client-name=credhub \
--client-secret=abcdefghijklm123456789
Use the CredHub CLI to retrieve the credentials:
credhub get -n /p-bosh/service-instance_GUID/admin_password
In the output, the password appears under value. Record the password.
For example:
$ credhub get \
-n /p-bosh/service-instance_70d30bb6-7f30-441a-a87c-05a5e4afff26/admin_password
id: d6e5bd10-3b60-4a1a-9e01-c76da688b847
name: /p-bosh/service-instance_70d30bb6-7f30-441a-a87c-05a5e4afff26/admin_password
type: password
value: UMF2DXsqNPPlCNWMdVMcNv7RC3Wi10
version_created_at: 2018-04-02T23:16:09Z
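If the credhub get output is captured by a script, the password can be read from the value line. The following Python sketch assumes the plain "key: value" line layout shown in the example output. The function name is hypothetical.

```python
def credhub_value(output):
    """Return the text after 'value:' in captured `credhub get` output."""
    for line in output.splitlines():
        # Split on the first colon only, so values that contain colons
        # (timestamps, for example) stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "value":
            return value.strip()
    raise ValueError("no 'value' field found in the output")

# Sample copied from the example output above.
sample_output = """\
id: d6e5bd10-3b60-4a1a-9e01-c76da688b847
name: /p-bosh/service-instance_70d30bb6-7f30-441a-a87c-05a5e4afff26/admin_password
type: password
value: UMF2DXsqNPPlCNWMdVMcNv7RC3Wi10
version_created_at: 2018-04-02T23:16:09Z
"""

admin_password = credhub_value(sample_output)
```

Treat the extracted value as a secret: avoid printing it or writing it to logs.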
To reinstall a tile in the same environment where it was previously uninstalled:
cf login
cf m
bosh log-in
bosh deployments
bosh -d BROKER-DEPLOYMENT-NAME delete-deployment
To view usage statistics for any service, run:
bosh -d DEPLOYMENT-NAME vms --vitals
To view process-level information, run:
bosh -d DEPLOYMENT-NAME instances --ps
To identify which apps are using a specific service instance using the name of the BOSH deployment:
Strip the service-instance_ prefix from the deployment name, leaving you with the GUID.
List the service bindings for the instance by running:
cf curl /v2/service_instances/GUID/service_bindings
The output contains a list of resources, with each item referencing a service binding, which contains the APP-URL.
To find the name, org, and space for the app, run:
cf curl APP-URL and record the app name under entity.name.
cf curl SPACE-URL to obtain the space, using the entity.space_url from the previous curl. Record the space name under entity.name.
cf curl ORGANIZATION-URL to obtain the org, using the entity.organization_url from the previous curl. Record the organization name under entity.name.
Important: When running cf curl, ensure that you query all pages, because the responses are limited to a certain number of bindings per page. The default is 50. To find the next page, curl the value under next_url.
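The GUID extraction and the paging described above can be sketched in Python. This is an illustration under stated assumptions: fetch stands in for whatever runs cf curl and parses its JSON, the page contents are made up, and the entity.app_url key for the APP-URL is assumed.

```python
def guid_from_deployment(deployment_name, prefix="service-instance_"):
    """Strip the deployment-name prefix, leaving the service instance GUID."""
    if not deployment_name.startswith(prefix):
        raise ValueError(f"{deployment_name!r} does not start with {prefix!r}")
    return deployment_name[len(prefix):]

def all_resources(fetch, first_url):
    """Collect "resources" from every page by following next_url links.

    `fetch(url)` must return one parsed JSON page as a dict.
    """
    resources, url = [], first_url
    while url:
        page = fetch(url)
        resources.extend(page.get("resources", []))
        url = page.get("next_url")  # None or missing on the last page
    return resources

guid = guid_from_deployment("service-instance_80e3c5a7-80be-49f0-8512-44840f3c4d1b")
first_url = f"/v2/service_instances/{guid}/service_bindings"

# Hypothetical two-page response used in place of real cf curl calls.
fake_pages = {
    first_url: {
        "next_url": first_url + "?page=2",
        "resources": [
            {"entity": {"app_url": "/v2/apps/app-1"}},
            {"entity": {"app_url": "/v2/apps/app-2"}},
        ],
    },
    first_url + "?page=2": {
        "next_url": None,
        "resources": [{"entity": {"app_url": "/v2/apps/app-3"}}],
    },
}

bindings = all_resources(fake_pages.__getitem__, first_url)
app_urls = [binding["entity"]["app_url"] for binding in bindings]
print(app_urls)
```

In real use, fetch could be a small wrapper that runs cf curl with the given URL through subprocess and passes the output to json.loads.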
Quota saturation and total number of service instances are available through ODB metrics emitted to Loggregator. These are the metric names:
Metric Name | Description |
---|---|
on-demand-broker/SERVICE-NAME-MARKETPLACE/quota_remaining | Global quota remaining for all instances across all plans |
on-demand-broker/SERVICE-NAME-MARKETPLACE/PLAN-NAME/quota_remaining | Quota remaining for a particular plan |
on-demand-broker/SERVICE-NAME-MARKETPLACE/total_instances | Total instances created across all plans |
on-demand-broker/SERVICE-NAME-MARKETPLACE/PLAN-NAME/total_instances | Total instances created for a given plan |
Important: Quota metrics are not emitted if no quota was set.
The following are VMware Tanzu Support articles about Redis for Tanzu Application Service: