This topic tells you about the set of smoke tests that Redis for VMware Tanzu Application Service runs during installation to confirm system health.

The tests run in the org system and in the space tanzu-services. The tests run as an app instance with a restrictive App Security Group (ASG).

Smoke test steps

For each available service plan, the smoke test suite does the following:

  1. Targets the org system and space tanzu-services (creating them if they do not exist).
  2. Deploys an instance of the CF Redis Example App to this space.
  3. Creates a Redis instance and binds it to the CF Redis Example App.
  4. Creates a service key to retrieve the Redis instance IP address.
  5. Creates a restrictive security group, redis-smoke-tests-sg, and binds it to the space.
  6. Checks that the CF Redis Example App can write to and read from the Redis instance.
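
For readers who want to reproduce these steps by hand, the sketch below lists cf CLI commands that roughly correspond to steps 1 through 5. It is an illustration only, not the errand's actual implementation. The org, space, and security group names come from this topic; the service offering name, app name, instance name, key name, and rules file path are hypothetical placeholders, and the `--space` flag on `cf bind-security-group` assumes cf CLI v7 or later.

```python
import shlex

ORG = "system"                           # from this topic
SPACE = "tanzu-services"                 # from this topic
SECURITY_GROUP = "redis-smoke-tests-sg"  # from this topic


def smoke_test_commands(service, plan,
                        app="redis-example-app",      # hypothetical app name
                        instance="smoke-test-redis",  # hypothetical instance name
                        key="smoke-test-key",         # hypothetical key name
                        rules_path="rules.json"):     # hypothetical rules file
    """Return cf CLI command lines mirroring steps 1-5 (nothing is executed)."""
    return [
        # Step 1: make sure the org and space exist, then target them.
        ["cf", "create-org", ORG],
        ["cf", "create-space", SPACE, "-o", ORG],
        ["cf", "target", "-o", ORG, "-s", SPACE],
        # Step 2: deploy the example app (run from the app's directory).
        ["cf", "push", app],
        # Step 3: create a Redis instance and bind it to the app.
        ["cf", "create-service", service, plan, instance],
        ["cf", "bind-service", app, instance],
        # Step 4: create a service key to look up the instance address.
        ["cf", "create-service-key", instance, key],
        # Step 5: create the restrictive security group and bind it to the space.
        ["cf", "create-security-group", SECURITY_GROUP, rules_path],
        ["cf", "bind-security-group", SECURITY_GROUP, ORG, "--space", SPACE],
    ]


if __name__ == "__main__":
    # "<service>" and "<plan>" are placeholders; substitute real values.
    for cmd in smoke_test_commands("<service>", "<plan>"):
        print(shlex.join(cmd))
```

Step 6, the read/write check, is performed against the deployed app itself, so it is left out of this sketch.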

Security groups

Smoke tests create a new App Security Group (ASG), redis-smoke-tests-sg, for the CF Redis Example App and delete it after the tests finish. This security group has the following rules:

[
  {
    "protocol": "tcp",
    "destination": "<broker IP address>",
    "ports": "32768-61000"
  }
]

This allows outbound traffic from the test app to the Redis shared-VM service instances, which are assigned ports in the ephemeral range 32768-61000.
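
As a minimal sketch of how an equivalent rules file could be generated, the following Python builds the rule shown above and serializes it. The function name and the example IP address (taken from the 192.0.2.0/24 documentation range) are hypothetical; substitute the real broker IP address.

```python
import json


def smoke_test_asg_rules(broker_ip, ports="32768-61000"):
    """Build a single-rule ASG allowing TCP egress to the broker's port range."""
    return [
        {
            "protocol": "tcp",
            "destination": broker_ip,
            "ports": ports,  # ephemeral range used by shared-VM instances
        }
    ]


# 192.0.2.10 is a placeholder address, not a real broker.
rules_json = json.dumps(smoke_test_asg_rules("192.0.2.10"), indent=2)

if __name__ == "__main__":
    print(rules_json)
```

Because `json.dumps` emits strict JSON, which has no comment syntax, any annotation about the port range has to live outside the file.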

Smoke test resilience

Smoke tests can fail for reasons outside of the Redis deployment, such as network latency causing timeouts or the Cloud Foundry instance dropping requests. They might also fail because they are being run in the wrong space.

The smoke tests implement a retry policy for commands issued to CF, for two reasons:

  • To avoid smoke test failures due to temporary issues such as the ones previously mentioned.
  • To ensure that the service instances and bindings created for testing are cleaned up.

Smoke tests retry failed commands against CF. They use a linear back-off with a baseline of 0.2 seconds, up to a maximum of 30 attempts per command. Assuming that the first attempt is at 0s and that every attempt fails instantly, the waits between attempts are 0.2s, 0.4s, 0.6s, and so on, so subsequent retries occur at 0.2s, 0.6s, 1.2s, and so on until either the command succeeds or the maximum number of attempts is reached.
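
The timing above can be checked with a short sketch. It assumes the wait before retry k is k times the 0.2-second baseline, which is the rule implied by the example times but not stated explicitly. The function names and the `command` callable are hypothetical and are not part of the smoke test code.

```python
import time

BASELINE_SECONDS = 0.2  # from this topic
MAX_ATTEMPTS = 30       # from this topic


def backoff_waits(baseline=BASELINE_SECONDS, max_attempts=MAX_ATTEMPTS):
    """Waits between consecutive attempts: baseline, 2*baseline, ... (assumed rule)."""
    return [round(k * baseline, 3) for k in range(1, max_attempts)]


def attempt_times(baseline=BASELINE_SECONDS, max_attempts=MAX_ATTEMPTS):
    """Start time of each attempt, assuming every attempt fails instantly."""
    times = [0.0]
    for wait in backoff_waits(baseline, max_attempts):
        times.append(round(times[-1] + wait, 3))
    return times


def run_with_retries(command, max_attempts=MAX_ATTEMPTS,
                     baseline=BASELINE_SECONDS, sleep=time.sleep):
    """Call command() until it returns True; return the number of attempts used."""
    waits = backoff_waits(baseline, max_attempts)
    for attempt in range(1, max_attempts + 1):
        if command():
            return attempt
        if attempt < max_attempts:
            sleep(waits[attempt - 1])  # linear back-off before the next try
    raise RuntimeError(f"command failed after {max_attempts} attempts")


if __name__ == "__main__":
    print(attempt_times()[:4])
```

Under these assumptions, the 30th and final attempt starts 87 seconds after the first, which bounds how long a single persistently failing command keeps retrying.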

The linear back-off was selected as a good middle ground between:

  • Situations where the system is generally unstable, such as load-balancing issues, where it is preferable to make as many retries as possible.
  • Situations where the system is experiencing a failure that lasts a few seconds, such as the restart of a Cloud Foundry VM, where it is preferable to wait before reattempting the command.

Considerations

The retry policy does not guard against longer-lasting Cloud Foundry downtime or network connectivity issues. In these cases, commands fail after the maximum number of attempts and might leave claimed instances behind. For upgrades or scheduled downtime, VMware recommends deactivating automatic smoke tests and manually releasing any claimed instances.

Troubleshooting

If errors occur while the smoke tests run, they are summarized at the end of the errand log output. Detailed logs appear at the point in the output where the failure occurs. Here are some common failures:

Error: Failed to target Cloud Foundry
Cause: Your deployment is unresponsive.
Solution: Examine the detailed error message in the logs and see Troubleshooting deployment problems for advice.

Error: Failed to bind Redis service instance to test app.
Cause: Your deployment's broker has not been registered.
Solution: Examine the broker-registrar installation step output and troubleshoot any problems.

If you encounter an error while running smoke tests, it can be helpful to search the log for other instances of the error summary printed at the end of the tests, for example, Failed to target Cloud Foundry. Look out for TIP: … in the logs next to any error output for further troubleshooting hints.
