This topic gives you troubleshooting information for Tanzu Cloud Service Broker for AWS.

Troubleshoot Errors

Start here if you have a specific error or error message.

Common Services Errors

The following errors can occur in multiple services:

Error Broker trying to recreate the instance when updating it
Operation update
Symptom The instance status is update failed and the message is similar to
update failed: Error: Instance cannot be destroyed on main.tf **** has lifecycle.prevent_destroy set, but the plan calls for this resource to be destroyed.
Cause The update request for a field is failing because one of the following is true:
  • The field cannot be updated
  • The new value for a property, or combination of properties, would cause an instance recreation
The failing update request might be an indication of an out-of-band update performed on the instance.
Examples An out-of-band upgrade of the Redis version to a newer major version causes the broker to try to downgrade to the previous version, which causes instance recreation.
Solution
  • If the property can be updated, pass the parameter in the update request to match the IaaS configuration.
  • If the property can be updated but is specified in the instance plan, possible solutions include:
    • Rolling back the change in the IaaS
    • Changing the value in the instance plan
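As a minimal sketch of the first solution, the update request can pass the parameter with the value that is already in effect in AWS. The function name `match_iaas_redis_version` and the example version are hypothetical. `redis_version` is the Redis property referenced in the ElastiCache errors in this topic.

```shell
# Hypothetical helper: send an update that sets redis_version to the value
# already running in AWS, so the broker has nothing to downgrade.
match_iaas_redis_version() {
  instance_name="$1"   # name of the service instance in Cloud Foundry
  iaas_version="$2"    # Redis version currently reported by AWS, for example 7.0
  cf update-service "$instance_name" -c "{\"redis_version\": \"${iaas_version}\"}"
}
```

For example, `match_iaas_redis_version my-redis 7.0` asks the broker to keep version 7.0 instead of reverting to the version stored in its state.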
Error Broker trying to recreate the instance when changing plan
Operation update plan
Symptom The instance status is update failed and the message is similar to
update failed: Error: Instance cannot be destroyed on main.tf **** has lifecycle.prevent_destroy set, but the plan calls for this resource to be destroyed.
Cause The update request for the plan is failing because the new plan contains incompatible property values.
Examples A plan update for Redis moves the instance to a plan that sets a Redis version earlier than the one the instance was created with. The downgrade is not allowed because it requires recreating the instance.
Solution Update the instance to a plan with compatible values.

Amazon ElastiCache for Redis Errors

The following errors can occur in Amazon ElastiCache for Redis:

Error Invalid parameter group
Operation create or update
Symptom Errors containing:
  • InvalidParameterCombination
  • InvalidParameterValue
Cause The value of parameter_group_name points to a parameter group that is not compatible with the version of Redis specified in redis_version.
Solution
  • Set the parameter_group_name to "" so that the default is used.
  • Set the parameter_group_name to a parameter group whose family matches the Redis version.
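Both solutions can be sketched with the AWS CLI and the cf CLI. The function names below are hypothetical. `aws elasticache describe-cache-parameter-groups` lists each parameter group with its family (for example `redis7`), which is the value that has to match the Redis version.

```shell
# List ElastiCache parameter groups with their family, to pick a group
# whose family matches the redis_version of the instance.
list_parameter_groups() {
  aws elasticache describe-cache-parameter-groups \
    --query 'CacheParameterGroups[].[CacheParameterGroupName,CacheParameterGroupFamily]' \
    --output text
}

# Reset parameter_group_name to "" so that the default group is used.
use_default_parameter_group() {
  instance_name="$1"
  cf update-service "$instance_name" -c '{"parameter_group_name": ""}'
}
```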
Error Snapshotting state while adding or removing nodes
Operation update
Symptom Errors containing:
unexpected state 'snapshotting', wanted target 'available'.
Cause An AWS snapshot was started during the operation.
Solution Retry the operation.
Error Unable to create instance without specifying minor version (Redis 7 only)
Operation create or update
Symptom Errors containing:
  • InvalidParameterCombination: Cannot find version 7.x for redis
  • engine_version: Redis versions must match major.minor when using version 6 or higher
Cause There is an [underlying error in the AWS API](https://github.com/hashicorp/terraform-provider-aws/issues/27918) preventing this scenario.
Solution
  • There is no workaround for omitting the minor version. Specify a minor version when using Redis version 7 and set auto_minor_version_upgrade to false.
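A create request that follows this solution might look like the sketch below. The function name is hypothetical, and the offering name `csb-aws-redis` is an assumption modeled on the `csb-aws-postgresql` and `csb-aws-mysql` names used in this topic, so check `cf marketplace` for the name in your foundation.

```shell
# Hypothetical helper: create a Redis 7 instance with an explicit minor
# version and automatic minor upgrades turned off.
create_redis7() {
  plan="$1"; instance_name="$2"; version="$3"   # version as major.minor, for example 7.0
  case "$version" in
    7.[0-9]*) ;;   # accept only 7.<digit>..., reject a bare "7" or "7.x"
    *) echo "expected a 7.x version with an explicit minor, got: $version" >&2
       return 1 ;;
  esac
  # csb-aws-redis is an assumed offering name
  cf create-service csb-aws-redis "$plan" "$instance_name" \
    -c "{\"redis_version\": \"${version}\", \"auto_minor_version_upgrade\": false}"
}
```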

Amazon General RDS Errors

The following errors can occur in any Amazon RDS instance:

Error Reaching AWS subnets quota in a subnet group for RDS
Operation create
Symptom Errors containing:
  • DBSubnetQuotaExceededFault
Cause AWS enforces a resource quota called Subnets per database subnet group, which limits each database subnet group to 20 subnets in each supported region.
When operators or developers do not supply an existing subnet group in the plan or at provision time, the CSB creates a subnet group and adds all the subnets present in the specified VPC to it. For example, suppose the operator:
  • Specifies a VPC with 25 subnets through the tile.
  • Does not specify a database subnet group in the plan.
  • Does not specify a database subnet group at provisioning time.
Then the CSB creates a database subnet group and adds all subnets, 25 in this example, to the database subnet group. Hence, this operation breaches the AWS resource quota.
Solution Create a custom database subnet group through the AWS console and add the desired subnets for RDS instances to use. Then use the database subnet group name as a plan or provision parameter.
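The same result can be reached from the AWS CLI instead of the console. The sketch below uses hypothetical function names and a placeholder description. `aws ec2 describe-subnets` and `aws rds create-db-subnet-group` are standard AWS CLI commands.

```shell
# Count the subnets in a VPC. More than 20 means a subnet group that
# contains all of them would exceed the quota.
count_vpc_subnets() {
  vpc_id="$1"
  aws ec2 describe-subnets --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'length(Subnets)' --output text
}

# Create a DB subnet group from an explicit list of subnet IDs.
create_db_subnet_group() {
  group_name="$1"; shift   # remaining arguments are subnet IDs
  aws rds create-db-subnet-group \
    --db-subnet-group-name "$group_name" \
    --db-subnet-group-description "Subnets for CSB RDS instances" \
    --subnet-ids "$@"
}
```

After the group exists, pass its name at provision time, for example with `-c '{"rds_subnet_group": "GROUP_NAME"}'`, or set it in the plan.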
Error Major engine version should be specified when auto_minor_version_upgrade is enabled
Operation create or update
Symptom Errors containing:
  • Resource postcondition failed ............ .......................................... A Major engine version should be specified when auto_minor_version_upgrade is enabled. Expected engine version: x.x - got: x.x.x
Cause A business rule prevents you from creating or updating an RDS instance with a configuration that enables auto_minor_version_upgrade and does not select a major engine version. AWS automatically upgrades the minor versions, but you must pick a major version.
Solution Create or update your RDS instance either with auto minor version upgrade deactivated, or with auto minor version upgrade enabled and a major engine version selected.
To find out the major version, you can run the following command, substituting the engine, aurora-mysql, and the engine version, 5.7.mysql_aurora.2.02.3, with the values that you want:
aws rds describe-db-engine-versions --engine aurora-mysql --engine-version 5.7.mysql_aurora.2.02.3 --include-all --region us-west-2 | jq -r '.DBEngineVersions[] | { engine_version: .EngineVersion, major_version: .MajorEngineVersion }'
Error Engine version not found when using a major version
Operation create or update
Symptom Errors containing:
  • InvalidParameterCombination: Cannot find version (minor engine version x.x.x) for (specific engine)
  • Example: InvalidParameterCombination: Cannot find version 8.0.mysql_aurora.3.04.0 for aurora-mysql
Cause The AWS API occasionally cannot find a minor version within its catalog. Various causes can induce this error, such as:
  • Limited version pool just before the release of a new minor version.
  • Eventual inconsistency between the read API and write API.
Solution Create or update your RDS instance with auto minor version upgrade deactivated and select a specific minor engine version.
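A sketch of this solution follows. It assumes an Aurora instance, where the version property is `engine_version` as in the Aurora upgrade example in this topic. Other RDS offerings may use a different property name. The function names are hypothetical.

```shell
# List the engine versions that AWS currently offers for an engine,
# for example aurora-mysql.
list_engine_versions() {
  engine="$1"
  aws rds describe-db-engine-versions --engine "$engine" \
    --query 'DBEngineVersions[].EngineVersion' --output text
}

# Pin a specific minor version and turn off automatic minor upgrades.
pin_engine_version() {
  instance_name="$1"; version="$2"
  cf update-service "$instance_name" \
    -c "{\"engine_version\": \"${version}\", \"auto_minor_version_upgrade\": false}"
}
```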
Error incompatible-network state
Operation create
Symptom Errors containing:
  • incompatible-network
Cause An incompatible-network state indicates one or more of the following is true of the Amazon RDS DB instance:
  • There are no available IP addresses in the subnet that the Amazon RDS DB instance was launched into.
  • The subnet used in the Amazon RDS DB subnet group no longer exists in the Amazon Virtual Private Cloud (Amazon VPC).
Solution AWS does not make any guarantees as to which subnet from the subnet group an RDS instance is launched in. Although you might assume that AWS balances new instances among all the subnets in the group, in practice it does not. This means one subnet in the group can run out of IP addresses while the others remain largely unused. To work around this issue, create a custom DB subnet group through the AWS console and choose the subnets that still have available IP addresses from the navigation pane. Then use the DB subnet group name as a plan or provision parameter.
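To see which subnets still have free addresses before building the custom subnet group, a query like the following sketch can help. The function name is hypothetical. `AvailableIpAddressCount` is the field that `aws ec2 describe-subnets` returns for each subnet.

```shell
# Print subnet ID, availability zone, and free IP address count for every
# subnet in a VPC.
list_subnet_free_ips() {
  vpc_id="$1"
  aws ec2 describe-subnets --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'Subnets[].[SubnetId,AvailabilityZone,AvailableIpAddressCount]' \
    --output text
}
```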
Error Unreachable publicly accessible DB
Operation create or update
Symptom All of the following conditions are true:
  • Service instance is configured with the property publicly_accessible: true.
  • The database is not reachable from outside your Tanzu Application Service foundation.
  • Apps within your Tanzu Application Service foundation can connect without issues by using a service binding.
Cause Several factors may contribute to the appearance of this error:
  • The service instance was associated with an unexpected VPC.
    Pitfall: when aws_vpc_id is left blank, the service instance is created in whatever VPC is specified in the tile's configuration, or in the AWS default VPC when none is specified there.
  • The service instance was associated with unexpected subnets.
    Pitfall: when rds_subnet_group is left blank, the service instance is associated with whatever subnet group is specified in the tile's Service Offering configuration, if present.
    When none is specified there, a new subnet group containing all subnets present in the VPC is created and assigned to the service instance.
  • The service instance was associated with an unexpected security group.
    Pitfall: when rds_vpc_security_group_ids is left blank, the service instance is associated with whatever security groups are specified in the tile's Service Offering configuration, if present.
    When none are specified there, a new security group that allows all ingress traffic but no egress traffic is created and assigned to the service instance.
  • The subnet group associated with the service instance contains some private subnets.
    Pitfall: according to the official AWS documentation, for a database instance to be publicly accessible, all of the subnets in its database subnet group must be public.
  • The security groups associated with the service instance are missing some rules to allow routing your external traffic, or some rules conflict with one another.
Recommendations For operators:
  • Explicitly specify aws_vpc_id, rds_subnet_group, and rds_vpc_security_group_ids in the plans or specify a default value in the Service Offering configuration when this option is present.
    There is no way to specify at plan level that a property is mandatory and can't be left empty when creating an instance, so if your use case doesn't allow you to set these fields in the plan, keep in mind the pitfalls listed in the preceding Cause section.
  • Set publicly_accessible: false in the plans if your VPC, subnets, and security groups are not designed with public databases in mind or if you want to disallow them.
Solution
  1. Check whether explicitly specifying aws_vpc_id, rds_subnet_group, and rds_vpc_security_group_ids solves the issue.
  2. If any of these fields are enforced by the plan, ask maintainers of the plan if they support public databases.
  3. Check whether you have correctly configured your Security group rules.
  4. Check whether you have correctly configured your database subnet group.
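For steps 3 and 4, it helps to see what AWS actually assigned to the instance. The following sketch uses a hypothetical function name and takes the RDS instance identifier as its argument. The queried fields are part of the `aws rds describe-db-instances` output.

```shell
# Show the public flag, VPC, subnets, and security groups that AWS reports
# for one RDS instance.
describe_db_network() {
  db_instance_id="$1"
  aws rds describe-db-instances --db-instance-identifier "$db_instance_id" \
    --query 'DBInstances[0].{public:PubliclyAccessible,vpc:DBSubnetGroup.VpcId,subnets:DBSubnetGroup.Subnets[].SubnetIdentifier,securityGroups:VpcSecurityGroups[].VpcSecurityGroupId}' \
    --output json
}
```

Compare the result with the VPC, subnets, and security groups that you expected the instance to use.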
Error You can't modify storage type
Operation Update
Symptom Errors containing:
  • InvalidParameterCombination and the text
    You can't modify storage type
Cause AWS does not allow some modifications to storage_type, including but not limited to:
  • Changing from io1 to standard (magnetic) or vice versa.
Solution For non-production test instances where data is irrelevant the most straightforward solution is to delete the service instance with the wrong storage_type and create a new one. For production instances accidentally created with storage_type: standard, the only solution is to back up the existing instance and restore it in a new instance with the right storage_type. For production instances created with a different storage_type that you want to migrate to storage_type: standard.
Error You can't currently modify the storage of this DB instance because the previous storage change is being optimized
Operation Update
Symptom Errors containing:
  • InvalidParameterCombination: and the text
     You can't currently modify the storage of this DB instance because the previous storage change is being optimized.
Cause Scaling storage usually doesn't cause any outage or performance degradation of the database instance. After you modify the storage size for a database instance, the status of the database instance is storage-optimization. Storage optimization can take several hours. You can't make further storage modifications for either six (6) hours or until storage optimization has completed on the instance, whichever is longer.
Solution Update any other properties not related to disk immediately and postpone the modification of disk-related properties. If you need to upscale your storage capacity frequently, enabling storage autoscaling might be a better option. See Amazon RDS for MSSQL configuration Parameters - max_allocated_storage.
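A minimal sketch of both suggestions follows, with hypothetical function names. It assumes that `max_allocated_storage` takes the upper storage limit in GB, so check the configuration parameters of your offering before relying on it.

```shell
# Show the current RDS status; storage-optimization means further storage
# changes are still blocked.
db_status() {
  db_instance_id="$1"
  aws rds describe-db-instances --db-instance-identifier "$db_instance_id" \
    --query 'DBInstances[0].DBInstanceStatus' --output text
}

# Enable storage autoscaling by setting an upper limit (assumed to be in GB).
enable_storage_autoscaling() {
  instance_name="$1"; max_gb="$2"
  cf update-service "$instance_name" -c "{\"max_allocated_storage\": ${max_gb}}"
}
```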
Error InvalidParameterCombination: You can't specify IOPS or storage throughput for engine postgres and a storage size less than 400
Operation Create
Symptom PostgreSQL/MySQL instances fail to create with this error if storage_type is set to gp3 and storage_gb is less than 400 GB, even when iops is not specified.
Cause The broker has a default value for IOPS of 3000 that is used if no value is specified and IOPS configuration is possible for the `storage_type` requested. However, this value cannot be set if the `storage_gb` value is below a certain threshold. For more information, see the [AWS documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html#gp3-storage).
Solution Specifying a value of 0 for iops prevents the broker from setting iops in the instance. Baseline storage performance is still maintained by AWS as [documented](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html#gp3-storage). You have two alternatives:
  • Specify "iops":0 in the plans. This value is configured for all instances of the plan and can't be overridden on an instance-per-instance basis.
  • Specify "iops":0 as a provision parameter. This can be configured in each service instance individually:
    cf create-service csb-aws-postgresql PLAN_NAME SERVICE_INSTANCE_NAME -c '{"iops":0}'
Error InvalidParameterCombination: You can't specify IOPS or storage throughput for engine postgres and a storage size less than 400
Operation Update and Upgrade
Symptom PostgreSQL/MySQL instances fail to update with this error if storage_type is set to gp3 and storage_gb is less than 400 GB, even when iops is not specified.
Cause While creating instances with "storage_type": "gp3" and a "storage_gb" value below 400 GB can be achieved by setting `iops: 0` as instructed in [gp3-iops--issue](#gp3-iops-create-issue), this doesn't work for updates. The AWS API sets a default iops value, and setting it to 0 is interpreted as trying to change that default value.
Solution Specifying a value of 3000 for iops preserves the default set by AWS. Instances with the specified conditions must be created with "iops": 0 and then updated or moved to a plan that specifies "iops": 3000 for further update and upgrade operations to work. You have two alternatives:
  • Specify "iops":3000 in the plans. This value is configured for all instances of the plan and can't be overridden on an instance-per-instance basis.
  • Specify "iops":3000 as an update parameter. This can be configured in each service instance individually:
    cf update-service SERVICE_INSTANCE_NAME -c '{"iops":3000}'
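The two gp3 errors call for different iops values depending on the operation. The sketch below captures that rule in a hypothetical helper, `gp3_small_volume_iops`. It only covers volumes below 400 GB, because this topic does not describe values for larger volumes.

```shell
# Print the iops value described in this topic for a gp3 volume below 400 GB.
# Arguments: OPERATION (create, update, or upgrade) and STORAGE_GB.
gp3_small_volume_iops() {
  operation="$1"; storage_gb="$2"
  if [ "$storage_gb" -ge 400 ]; then
    echo "storage_gb >= 400: this workaround does not apply" >&2
    return 1
  fi
  case "$operation" in
    create) echo 0 ;;               # stops the broker from sending iops
    update|upgrade) echo 3000 ;;    # preserves the default set by AWS
    *) echo "unknown operation: $operation" >&2
       return 1 ;;
  esac
}
```

The result can be placed directly in the parameters, for example `cf update-service SERVICE_INSTANCE_NAME -c "{\"iops\": $(gp3_small_volume_iops update 100)}"`.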

Amazon PostgreSQL Errors

The following errors can occur in any Amazon PostgreSQL instance:

Error User does not have permission for tables created by other user in the PUBLIC schema
Operation Bindings modifying/reading tables created by other bindings in the PUBLIC schema
Symptom Errors mentioning lack of permissions/ownership:
  • must be owner of table
  • permission denied for table
Cause The Cloud Foundry binding model implies that multiple bindings can query or edit the same tables. This is particularly useful for rotating credentials where unbind and bind operations are needed. Additionally, the broker does not support creating bindings with different levels of access to the objects created. This means that all bindings need the same access to all objects and can query and edit them regardless of what binding created them in the first place. This conflicts with the PostgreSQL permission model, where the user that created an object is the owner and is the only one who can edit tables and query them, unless permission is explicitly granted to other roles. This applies for bindings and service keys.
Specifically, the following issues can happen:
  • Binding A not having access to a table that binding B created, when binding B created the table after binding A was created.
  • Binding A cannot read tables created by binding B until a new binding C is created.
  • Binding A cannot change tables created by binding B until binding B is deleted (unbound from its app).
Solution All the database users created when binding and creating service keys with Tanzu Cloud Service Broker for AWS are assigned the role binding_user_group. This implies they all have access to tables created by the binding_user_group role. Creating any objects with the binding_user_group role instead of the binding user resolves any of the issues mentioned here. You can achieve this by running SET ROLE binding_user_group before any other instruction in the SQL script that creates your objects, or in the framework that performs your database migrations. If you have issues with tables already created, use one of the following options:
  1. Unbind the application that has created the objects and bind again (or delete the service key that has created the objects). This is because when unbinding, Tanzu Cloud Service Broker for AWS automatically transfers ownership of existing objects to the binding_user_group role.
  2. Manually transfer ownership to the binding_user_group with the following statement "ALTER TABLE tab_name OWNER TO binding_user_group;". You must run this statement after logging in to the database with the credentials from the binding/service key used to create the objects.
  3. If you only need other bindings to perform data operations, you can create new bindings for interacting with these objects. This is because Tanzu Cloud Service Broker for AWS assigns permissions to all existing tables whenever a new binding is created. However, this new binding does not have permissions to perform DDL operations until option 1 or 2 is applied.
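One way to apply the SET ROLE advice to an existing migration script is to prepend the statement before the script reaches the database. The helper name `with_group_role` is hypothetical, and this is only a sketch of the idea.

```shell
# Read SQL from standard input and emit it with SET ROLE as the first
# statement, so that created objects are owned by binding_user_group.
with_group_role() {
  printf 'SET ROLE binding_user_group;\n'
  cat
}
```

For example, `with_group_role < migration.sql | psql "$DATABASE_URI"`, where `migration.sql` and the `DATABASE_URI` variable (a connection URI taken from the binding credentials) are placeholders.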

Amazon MySQL Errors

The following errors can occur in any Amazon MySQL instance:

Error InvalidParameterCombination: You can't specify IOPS or storage throughput for engine mysql and a storage size less than 400
Operation Create
Symptom The instance fails to create with the preceding error if `storage_type` is set to `gp3` and `storage_gb` is less than 400 GB, even when `iops` is not specified.
Cause The broker has a default value for iops of 3000 that is used if no value is specified and iops configuration is possible for the `storage_type` requested. However, this value cannot be set if the `storage_gb` value is below a certain threshold. For more information, see the [AWS documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html#gp3-storage).
Solution Specifying a value of `0` for `iops` prevents the broker from setting `iops` in the instance. Baseline storage performance is still maintained by AWS as [documented](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html#gp3-storage). You have two alternatives:
  • Specify `"iops":0` in the plans. This value is configured for all instances of the plan and can't be overridden on an instance-per-instance basis.
  • Specify `"iops":0` as a provision parameter. This can be configured in each service instance individually:
    cf create-service csb-aws-mysql PLAN_NAME SERVICE_INSTANCE_NAME -c '{"iops":0}'

Amazon MSSQL Errors

The following errors can occur in any Amazon MSSQL instance:

Error InvalidParameterValue: Backup retention cannot be set to zero for DB Instance xxxxx since it has Multi-AZ enabled on it.
Operation Update
Symptom The instance fails to update with this error if `backup_retention_period` is set to `0` and `multi_az` is set to `false` in the same update operation.
Cause The broker handles the two changes as asynchronous parallel operations within the update and has no ability to set their order of execution.
Solution Apply the updates sequentially:
  1. Disable `multi_az` and wait until the operation finishes:
     cf update-service SERVICE_INSTANCE_NAME -c '{"multi_az": false}'
  2. Set `backup_retention_period` to `0`:
     cf update-service SERVICE_INSTANCE_NAME -c '{"backup_retention_period":0}'
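The two updates can be chained so that the second one starts only after the first one succeeds. This sketch assumes cf CLI v8, where `cf update-service` accepts a `--wait` flag. On older CLI versions, poll `cf service SERVICE_INSTANCE_NAME` between the two commands instead. The function name is hypothetical.

```shell
# Disable Multi-AZ first, then set backup retention to zero.
disable_multi_az_then_backups() {
  instance_name="$1"
  # --wait (assumed cf CLI v8) blocks until the asynchronous update finishes;
  # && stops the sequence if the first update fails.
  cf update-service "$instance_name" -c '{"multi_az": false}' --wait &&
    cf update-service "$instance_name" -c '{"backup_retention_period": 0}' --wait
}
```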

Amazon Aurora Errors

The following errors can occur in any Aurora instance:

Error The following error message is displayed in the pg_upgrade_server.log logs: FATAL: shared memory segment sizes are configured too large
Operation Update
Symptom The major version upgrade fails and pg_upgrade_server.log (viewable from the AWS console) shows the preceding error.
Cause Major version upgrade in Amazon Aurora for PostgreSQL with small instance types is not straightforward. In particular, upgrade for small instances or serverless with low max capacity, for example 2 ACUs, causes an error even through the AWS console.
Solution Perform the following changes sequentially:
  1. First, do one of the following:
    • Temporarily update to a bigger instance_class type.
    • If using a serverless instance type, increase the max capacity, serverless_max_capacity, to at least 4 ACUs. For example:
      cf update-service SERVICE_INSTANCE_NAME -c '{"serverless_max_capacity": 4}'
  2. Once that's done, retry the major version upgrade. For example:
     cf update-service SERVICE_INSTANCE_NAME -c '{"engine_version": "14"}'
  3. Finally, scale down instance_class or serverless_max_capacity to its previous value, because the extra capacity was only needed during the actual upgrade. For example:
     cf update-service SERVICE_INSTANCE_NAME -c '{"serverless_max_capacity": 2}'