This page lists breaking changes when upgrading VMware Tanzu Application Service for VMs to v6.0.
New installations of TAS for VMs v6.0 no longer default to include cflinuxfs3 in the list of stacks. Operators can continue to Configure Cloud Controller to install cflinuxfs3, if desired.
Upgrading an existing foundation to TAS for VMs v6.0 does not remove cflinuxfs3. Apps running on cflinuxfs3 continue to run as normal, and developers can continue to push applications using the cflinuxfs3 stack. However, if operators remove cflinuxfs3 from an existing foundation, then cflinuxfs3 is no longer installed when deploying TAS for VMs, unless the default stack list is configured to include cflinuxfs3 (see above).
Support for the cflinuxfs3 stack is deprecated and will be removed in a future release of TAS for VMs. If you have not already, migrate all applications off of cflinuxfs3 and remove the stack from your foundations.
The absolute_entitlement and absolute_usage metrics are no longer emitted for each container. They are replaced by the cpu_entitlement metric. If you have any dashboards that reference the absolute_entitlement and absolute_usage metrics, update the dashboards to use the new metric.
Due to the removal of these metrics, the experimental CPU Entitlement Plug-in no longer functions. If you use this plug-in to view CPU entitlement usage, you can instead view the cpu_entitlement metric, for example using the Log Cache cf CLI plug-in.
If you are upgrading from TAS 5.0 to TAS 6.0, review the following breaking changes to ensure a smooth upgrade.
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11.37, 2.13.19, 3.0.9, 4.0.1
The App Autoscaler API may error during deployment due to it now requiring JRE 17.
You are impacted if you have customized the java offline buildpack to use a JRE other than OpenJDK and the default JRE version in the buildpack or defined by an environment variable group is not JRE 17.
You are likely impacted if the following line appears in the logs for the autoscale-api application:
ERR java.lang.UnsupportedClassVersionError: org/springframework/boot/loader/JarLauncher has been compiled by a more recent version of the Java Runtime (class file version 61.0), this version of the Java Runtime only recognizes class file versions up to 55.0
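The class file versions in this error map directly to Java releases (major version = Java release + 44), which is why a Java 11 runtime (class file 55.0) cannot load classes compiled for Java 17 (class file 61.0). A quick sketch of the mapping:

```shell
# Class file major version N corresponds to Java release (N - 44).
for v in 55 61; do
  echo "class file version ${v}.0 => compiled for Java $((v - 44))"
done
```

So the autoscale-api artifact in the error above was compiled for Java 17, while the runtime loading it is Java 11.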
Ensure that JRE 17 is available in your java offline buildpack.
If the java runtime you are using is the Oracle JRE, then upgrade to a version of TAS that ships with cf-autoscaling version 249.2.6. As of that version of cf-autoscaling, the autoscale-api application configures JBP_CONFIG_ORACLE_JRE to self-select Oracle JRE 17.
Alternatively, customers can temporarily override their buildpack defaults in order to run the deploy-autoscaler errand:
Set an environment variable group that changes the default version of Java to 17 across all applications in a foundation, e.g.
$ cf set-staging-environment-variable-group '{"JBP_DEFAULT_OPEN_JDK_JRE":"{jre: {version: 17.+ }}"}'
Generate the correct parameters for your java runtime by viewing the options available in the java buildpack. Also be careful to merge your new parameters with any parameters already set in the environment variable group.
Trigger the deploy-autoscaler errand and see it succeed.
Remove the parameters that you added to the environment variable group and re-set the remaining parameters to undo your changes.
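The merge mentioned in the steps above can be sketched as follows. This is a hedged illustration: it naively splices two flat, non-empty JSON objects together as strings, and the variable names and existing group contents are invented for the example.

```shell
# Existing group, e.g. from: cf staging-environment-variable-group
existing='{"EXISTING_VAR":"some-value"}'
# The Java 17 override for the OpenJDK JRE
override='"JBP_DEFAULT_OPEN_JDK_JRE":"{jre: { version: 17.+ }}"'
# Splice the override into the existing object (flat JSON only)
merged="${existing%\}},${override}}"
echo "$merged"
# then: cf set-staging-environment-variable-group "$merged"
```

This preserves any parameters already present in the group instead of overwriting them.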
The App Autoscaler API was bumped to JRE 17. Affected versions set the required JRE version with an OpenJDK JRE specific environment variable.
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11, 2.13, 3.0, 4.0, 5.0
The Spring AutoReconfiguration library supplied by the buildpack is deprecated, as well as the Spring Cloud Connectors project. It is recommended to use the java-cfenv library instead for accessing bound services in Spring Boot apps. To make it easier to migrate to this library, the Java Buildpack will now install the Java CfEnv library for Spring Boot 3 apps only, and the Spring AutoReconfiguration library will no longer be installed for these apps.
Apps may be affected if they have already been migrated to Spring Boot 3 and are relying on Spring Autoreconfiguration, i.e.
You have not set the environment variable JBP_CONFIG_SPRING_AUTO_RECONFIGURATION to '{enabled: false}'
The app is bound to a service of one of these types:
The startup logs for the app will show a log entry such as 'dataSource' bean of type with 'javax.sql.DataSource' reconfigured with 'mysql' bean
Apps using Spring Boot 2.x will continue to receive the Spring AutoReconfiguration library and should not be affected by this change.
In most cases, the Java CfEnv library should replace the functionality of the Spring Autoreconfiguration library for Spring Boot 3 apps. The Java CfEnv library examines bound services of the above types (except SMTP) and sets well-known Spring Boot properties so that Spring Boot’s Autoconfiguration can kick-in.
The Spring AutoReconfiguration library uses the Spring Cloud Connectors project which has been deprecated since 2019. Java CfEnv is the recommended library for accessing bound services.
UAA’s ability to act as a SAML identity provider has been removed in preparation for replacing its dependency on Spring SAML Extension with Spring Security SAML 2 support. You can no longer use UAA as your SAML IdP.
Note that UAA’s ability to integrate with an upstream SAML identity provider as a SAML service provider is unaffected by this change.
If you use UAA as a SAML IdP, you are impacted. If you are unsure, get the list of registered SAML service providers from UAA’s “/saml/service-providers” endpoint. (See https://docs.cloudfoundry.org/api/uaa/version/76.31.0/#list.) If you get a non-empty list in response, then you are using UAA as a SAML IdP.
Note that the “/saml/service-providers” endpoint has also been removed from the latest UAA version as part of the SAML IdP functionality removal.
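The check can be scripted along these lines. This is a sketch: the commented curl call, token variable, and system domain are assumptions about your environment, and the response below is simulated (an empty list means no SAML service providers are registered).

```shell
# response=$(curl -s -H "Authorization: Bearer ${UAA_ADMIN_TOKEN}" \
#   "https://uaa.SYSTEM-DOMAIN/saml/service-providers")
# Simulated response for illustration:
response='[]'
if [ "$response" = "[]" ]; then
  verdict="no registered SAML service providers"
else
  verdict="UAA is acting as a SAML IdP"
fi
echo "$verdict"
```

A non-empty list means you must plan a migration before upgrading.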
UAA now supports acting as an identity provider over OIDC. If the system that acts as a SAML service provider can also integrate with OIDC identity providers, you should switch it to use that protocol instead.
Spring SAML Extension has reached the end of support. UAA is replacing it with Spring Security SAML 2 support to keep the SAML feature compatible with latest Spring versions. Since Spring Security does not provide Identity Provider support, UAA is dropping its SAML IdP functionality.
If you have customized the password policy settings for local UAA users, these settings will be restored to the default policy unless you take action.
If you currently have configured the local UAA user password policy by setting the fields under the “Internal user store” option in the Authentication and Enterprise SSO pane, then your existing settings will be affected.
You can instead configure the password policies for local UAA users in a new section on the UAA pane of the TAS tile. This configuration section is available in TAS 4.0 and 5.0 and will be preserved in the upgrade to 6.0.
In previous TAS versions, the local UAA password policies could be configured only if you selected the Internal user store option in the Authentication and Enterprise SSO pane. The local UAA password policy configuration option is now moved to the UAA pane, where you can customize the settings regardless of the Authentication and Enterprise SSO option you chose.
This change is only relevant to foundations with third-party services listening on port 7070.
SSH into a router VM and use lsof -Pi to determine if port 7070 is being used. Be sure to check any routers deployed by IST as well.
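The check above can be sketched as a small script to run on each router VM (assuming lsof is available there; the flags follow the lsof man page):

```shell
port=7070
# -P: show numeric ports, -i: network files; restrict to listening TCP sockets
if lsof -Pi ":${port}" -sTCP:LISTEN >/dev/null 2>&1; then
  status="in use"
else
  status="free"
fi
echo "port ${port} is ${status}"
```

If the port is in use by a third-party service, reconfigure the route services port as described below.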
If your foundation is impacted, use Tanzu Operations Manager to configure use of a different port.
For TAS, go to the Networking tab and scroll down to the Route Services section. If route services are enabled, you will see a text box labeled “The port used for internal route services requests.” Set the value to a known available port, or set the value to 0 to allow the operating system to choose an available port at deploy time.
For IST, go to the Networking tab and scroll down to the text box labeled “The port used for internal route services requests.” Follow the same instructions as you did for TAS.
Your TAS deployment will fail or become insecure if you are using MySQL 5.7 as TAS’s external system database, as MySQL 5.7 will soon reach End-Of-Life.
You are impacted if you have configured a MySQL 5.7 database instance as the external system database for your TAS deployment. You can check your current system database settings in the “Databases” pane of the TAS for VMs tile. If, under the “System databases location” section, the “External database server” option is selected and the configured database server is based on MySQL 5.7, you are impacted.
Upgrade your external system database to a supported MySQL version (such as MySQL 8.0).
This change was triggered by the fact that as MySQL 5.7 reaches its official End-Of-Life date (31 Oct 2023), many TAS components, as well as the database client libraries that they depend on, will follow suit and remove support and testing for MySQL 5.7.
Operators can ensure successful deployment of NATS servers, which propagate routes from services and apps to Gorouter, by confirming that their nats-release instances have successfully migrated to NATS v2.
If your TAS environment is already on v2.11.26 or greater, make sure that your NATS instances have successfully migrated by checking that NATS 2.0 is running (see KB article in link).
If your deployment fails, check nats instance logs. Migration details, including any possible error messages, can be found under /var/vcap/sys/log/nats-tls/nats-tls-wrapper.stdout.log
In TAS v2.11.26, nats-release upgraded its underlying software package from NATS 1.0 (package name gnatsd) to NATS 2.0 (package name nats-server). The nats-release contains NATS 1.0, which will start as a fallback in case the migration to NATS 2.0 fails. In preparation for the future removal of NATS 1.0, nats-release will fail in post-start if it detects that NATS 1.0 is running instead of NATS 2.0.
Existing clients of the App Autoscaler API may break if they make requests to resources including trailing slashes.
If you have written code that makes requests to the App Autoscaler API with trailing slashes, or have documentation that describes making requests with trailing slashes, you are impacted.
Requests that specify trailing slashes will now receive a 404 response.
This does not affect users who are using the autoscaler cf CLI plugin.
Modify your client or documentation to no longer make requests with trailing slashes.
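If your client builds resource URLs, stripping any trailing slash before issuing the request is a one-line fix (the hostname below is illustrative):

```shell
url="https://autoscale.sys.example.com/api/v2/apps/"
url="${url%/}"   # remove a single trailing slash, if present
echo "$url"
```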
This is a Spring Framework default change to improve security posture:
https://github.com/spring-projects/spring-framework/issues/28552
Existing automation that parses logs from TASW components may break if it is still expecting timestamps formatted as epoch time. Specifically, logs from groot and garden_windows have been converted to RFC 3339 timestamps.
You are impacted if you have log parsing for TASW’s groot and garden_windows jobs that relies on timestamps being in epoch format.
If it was necessary to adjust log parsing for TAS/TASW 4.0 to account for the change in timestamp format, update log parsing in the same manner, but for the TASW groot and garden_windows jobs.
Non-standardized and non-human readable timestamps in logs make debugging TAS more difficult. Starting in TAS/TASW 4.0, timestamps have been logged using RFC 3339 format in all cases but groot and garden_windows.
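For reference when updating parsers, an old epoch timestamp converts to the RFC 3339 form now used by groot and garden_windows like this (assumes GNU date; the epoch value is illustrative):

```shell
epoch=1662517200
# -u: UTC; -d "@N": interpret N as seconds since the epoch (GNU date)
ts=$(date -u -d "@${epoch}" +"%Y-%m-%dT%H:%M:%SZ")
echo "${epoch} => ${ts}"
```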
Existing automation that configures a log timestamp format will break if product config is not updated.
You are impacted if you set the logging_timestamp_format product config property.
This property has had no effect since TAS 4.0. Update your configuration to no longer specify the property.
Non-standardized and non-human readable timestamps in logs make debugging TAS more difficult.
Existing automation that configures aggregate syslog drains will break if product config is not updated.
You are impacted if you have configured aggregate syslog drains with the syslog_agent_aggregate_drains property.
References to the old property should be updated to refer to the new mtls_syslog_agent_aggregate_drains property.
This property is used to define aggregate syslog drains with keys and certificates for Mutual TLS, as well as to define aggregate drains that are not using Mutual TLS.
Comma-separated strings are no longer accepted; instead, pass an array of aggregate drains:
.properties.mtls_syslog_agent_aggregate_drains:
value:
- url: syslog-tls://HOSTNAME:PORT
- url: syslog-tls://ANOTHER-HOSTNAME:PORT
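If your automation still holds the old comma-separated value, it can be converted mechanically into the new one-URL-per-entry form. A sketch (drain URLs are illustrative, and any Mutual TLS cert/key fields must be added separately):

```shell
old_drains="syslog-tls://logs1.example.com:6514,syslog-tls://logs2.example.com:6514"
# Split on commas and prefix each drain as a YAML array entry
converted=$(echo "$old_drains" | tr ',' '\n' | sed 's/^/- url: /')
echo "$converted"
```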
Additionally, consider if your syslog aggregate drains should be updated to use Mutual TLS.
From TAS 5.0 operators can configure aggregate drains that support Mutual TLS for improved security. This necessitated changing the structure of the syslog aggregate drains property to allow these new fields to be specified.
The behavior of any custom integration using Prometheus to scrape the Metrics Agent will change. The Metrics Agent is now protected by mutual TLS and requires a certificate issued by TAS for access. We do not know of any products or integrations which use this agent.
You are impacted if you have built a custom integration that scrapes all VMs via Prometheus on port 14726.
If you are using the Metrics Agent, it is possible in TAS 5.0 to re-enable it by unchecking the checkbox in the System Logging configuration page. We would encourage you instead to switch to using the Prometheus Exporter in the OpenTelemetry Collector. If you are unable to migrate to the Prometheus Exporter, please let VMware know, as the Metrics Agent is slated for removal in TAS 6.0.
The Metrics Agent was designed as a way for Healthwatch and other components to retrieve metrics without using the Loggregator Firehose. No components ended up adopting it, and with the introduction of the OpenTelemetry Collector we have much more flexible options for metrics egress, so we are slating the Metrics Agent for removal instead of driving adoption toward it.
If you are upgrading from TAS v3.0 to TAS v6.0, review the following breaking changes to ensure a smooth upgrade.
From TAS 4.0.0, component logs are always output with RFC 3339 timestamps.
Check the “System Logging” pane in the TAS configuration. If “Timestamp format for component logs” is set to “Maintain previous format” then you are impacted.
The corresponding product configuration is where .properties.logging_timestamp_format is set to deprecated.
If you have systems that rely on the old timestamp format for component logs when parsing log lines then these will need to be updated to handle the RFC 3339 timestamp format.
Non-standardized and non-human readable timestamps in logs make debugging TAS more difficult.
The option to set a global app log rate limit in TAS/IST/TASW was removed.
If you were using this feature to limit your overall app log throughput, then you may see an increase in log load after upgrading.
If you have enabled the App log rate limit option in the TAS/IST/TASW “App Containers” tab then you are impacted.
You should look to replace use of this feature with org-, space-, and app-level byte-based log rate limits. For details, see App Log Rate Limits.
If you are impacted and concerned about log load in your system during, and after, upgrading, consider either:
Scaling up your logging system before upgrading to TAS v4.0 to compensate for the increased log load.
Upgrading to TAS v3.0 first to set app-level log rate limits while retaining the global app log rate limit.
Since the introduction of org, space, and app log rate limits in TAS v3.0, this feature was moved from beta to deprecated.
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/4.0/tas-for-vms/runtime-rn.html#breaking-changes
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/4.0/tas-for-vms/app-log-rate-limits.html
This will cause versions of the cf CLI before v6.52.0 to no longer be able to retrieve logs when called with cf logs --recent.
If you have users or automation that are using cf CLI versions that are older than v6.52.0, they are impacted.
Upgrade clients to a newer version of the cf CLI.
The cf CLI has been using Log Cache as the source for cf logs --recent since v6.52.0. This change removes a deprecated path for log retrieval. Because cf CLI v6 is not compatible with TAS 2.13 or higher, this is a special notice of removal.
If you have invalid CA certificates listed in your Ops Manager and TAS configurations, you will need to update these in order to deploy TAS 4.0.
Any entries that are not valid for CA certificates cause an error in Ops Manager. You must remove or replace invalid entries.
Check any CA certs specified in your Director or TAS tiles with openssl x509 -text -in <cert-file>. If any errors are returned, fix or regenerate the certificate.
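That check can be scripted; openssl exits nonzero for anything that does not parse as an X.509 certificate. The malformed file below is fabricated purely to demonstrate the failure path:

```shell
# A deliberately malformed "certificate" to demonstrate detection
cat > /tmp/bad-ca.pem <<'EOF'
-----BEGIN CERTIFICATE-----
this is not base64-encoded DER
-----END CERTIFICATE-----
EOF
if openssl x509 -noout -text -in /tmp/bad-ca.pem >/dev/null 2>&1; then
  result="valid"
else
  result="invalid"
fi
echo "/tmp/bad-ca.pem is ${result}"
```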
Replace the invalid CA certificates with up-to-date and valid certificates. Alternatively, if there is no replacement, simply remove the invalid CA certificate.
Previously, Gorouter and Ops Manager would ignore the invalid certificates. This change has been made to help make operators aware of any copy/paste problems when applying CA certificates.
If end-users are sending requests with large request headers to your environment, they may now experience 431 status code errors.
You will see 431 status codes in your Gorouter access logs, with the X-Cf-RouterError: max-request-size-exceeded response header set.
If you want to accommodate large request headers, you can increase the default size in the Networking tab of Ops Manager under the field Maximum request header size. We do not recommend this accommodation as a long-term option.
NOTE: This limit is specifically for the Method, Request URI, and protocol line of an HTTP request, as well as any HTTP Headers in the request.
Max request header size now defaults to 48 KB. Upgrading to 4.0 sets the Max request header size in KB to 48, unless the existing configuration was already lower. Lowering this value establishes a better security posture; large request headers could consume router resources or leak memory.
If your apps haven’t been upgraded to be compatible with Ruby 3.1, you may experience errors.
You will see app errors in the output of cf logs [APP NAME] --recent.
Update your apps to be compatible with Ruby 3.1. If you need a temporary workaround while upgrading apps, you can upload both the Ruby Buildpack 1.9.0 as well as a previous buildpack to your environment until apps have been upgraded and the previous buildpack can be removed.
If you create custom iptables rules on your Diego Cells, they may not be visible when running iptables -L because we have upgraded the default version of iptables. However, these custom rules still apply.
You can see any custom rules you have by running iptables-legacy, which invokes iptables 1.6.x.
Update your custom iptables logic to use the /sbin/iptables* binaries, which run iptables 1.8.x (backed by nftables).
If you need to interact with your custom rules before upgrading, you can use the command iptables-legacy as a short-term workaround.
On the Jammy Jellyfish stemcell, the iptables command uses the nftables framework (1.8.x) instead of the legacy iptables firewall (1.6.x). We made Garden default to using the system iptables binary. However, we make both versions available to keep custom iptables rules accessible in the short term.
If you are upgrading from TAS v4.0 to TAS v6.0, review the following breaking changes to ensure a smooth upgrade.
Versions Introduced
Product | TAS | Isolation Segment | TAS Windows
---|---|---|---
Version(s) | 2.11.26, 2.12.18, 3.0.0 | 2.11.20, 2.12.13, 3.0.0 | 2.11.20, 2.12.12, 3.0.0
The behavior of line-based application log rate limiting has changed. Previously application logs would be buffered to some extent and then released at the configured rate. Now application logs that exceed the rate limit are dropped immediately.
You are impacted if you have configured the deprecated line-based application log rate limiting and have applications that emit logs in excess of the configured limit.
This is expected behavior and helps ensure that logs that are output are timely. VMware recommends use of quota-based log rate limits for fine-grained control over application log rates.
This behavior was modified as part of changes to Diego to support granular log rate limits on orgs, spaces and individual apps in TAS 3.0.
Versions Introduced
Product | TAS | Isolation Segment | TAS Windows
---|---|---|---
Version(s) | 2.11.19, 2.12.12, 2.13.5, 3.0.0, 4.0.0 | 2.11.13, 2.12.7, 2.13.2, 3.0.0, 4.0.0 | 2.11.13, 2.12.7, 2.13.2, 3.0.0, 4.0.0
System Logging will be impacted if certificates signed with the older SHA-1 hash function are used.
Review your configured application syslog drains. If you have a certificate for a syslog drain configured with a SHA-1 hash, you are impacted.
Regenerate impacted certificates so that they don’t use the SHA-1 hash function.
Logging components provided by loggregator-agent-release have been upgraded to Go 1.18. From Go 1.18 the treatment of certificates is stricter and certificates signed with the SHA-1 hash function will be rejected. Go stopped accepting certificates signed with the SHA-1 hash function because of security concerns.
Versions Introduced
Product | TAS
---|---
Version(s) | 3.0.0, 4.0.0
Applications pushed to TAS 3.0+ will inherit a default operator-configurable log rate limit if one is not specified.
You are impacted if you are upgrading to TAS 3.0+ and are pushing applications to the platform. Applications that already exist on the platform prior to upgrade will default to not being log rate limited.
Determine if the default log rate for new applications (16 K/s) is appropriate for your needs. If you have applications with verbose logs exceeding the default limit and do not wish their logs to be dropped for exceeding the limit then consider:
pushing your applications with a specified app log rate limit
modifying the platform default log rate limit
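The first of those options can be expressed directly in the app manifest. A sketch, assuming the standard Cloud Foundry app manifest attribute for log rate limits; the app name and limit value are illustrative:

```yaml
applications:
- name: verbose-app
  # explicit per-instance app log rate limit; overrides the platform default
  log-rate-limit-per-second: 32K
```

Pushing with this manifest sets the limit explicitly, so the platform default no longer applies to the app.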
The default log rate limit for new applications is configurable in the “App Developer Controls” tab in the TAS product configuration. The corresponding property is .properties.cloud_controller_default_log_rate_limit_app.
The ability to set the log rate limit in a granular way is intended to allow operators to protect both the logging system within TAS and external integrations that receive logs. Setting a platform default limit for new applications makes it more likely that applications will have a log rate limit set allowing quotas to be imposed.
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11.20, 2.12.13, 2.13.5
Upgrading to a version of TAS that introduces a new database index for App Autoscaler on the service bindings table may error if you have previously manually created the same index.
You are impacted if you have previously manually added a database index following the instructions in the knowledge base article Autoscale application fails with MySQL Deadlock errors and are upgrading to one of the following TAS versions:
2.13.5
2.13.6
Upgrade to a newer patch release of TAS that does not have this limitation.
An error was made in the implementation of the database migration that meant that an attempt was made to add the database index despite it already existing in the database.
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/2.13/tas-for-vms/runtime-rn.html#2-13-5
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/2.13/tas-for-vms/runtime-rn.html#2-13-7
https://broadcomcms-software.wolkenservicedesk.com/external/article?articleNumber=298085
Versions Introduced
Product | TAS
---|---
Version(s) | 3.0.0, 4.0.0
The BOSH System Metrics Forwarder is removed from TAS for VMs.
You are impacted if you have been using the deprecated BOSH System Metrics Server and Forwarder. You are using the BOSH System Metrics Server if you have selected the “Enable BOSH System Metrics Server (deprecated)” option in the Ops Manager Director product configuration.
To continue receiving system metrics, you must select the “Enable System Metrics” checkbox in the Director Config pane of the BOSH Director tile.
To avoid deployment, platform automation, or data collection failures, you must update any queries that reference bosh-system-metrics-forwarder as the source_id for metrics to reference system_metrics_agent instead.
Additionally, metric names now include underscores instead of periods. For example, the metric named system.cpu.sys in previous versions of TAS for VMs is named system_cpu_sys in TAS for VMs v3.0.
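Queries can be migrated mechanically, since the rename only swaps periods for underscores:

```shell
old_metric="system.cpu.sys"
# Replace every period in the metric name with an underscore
new_metric=$(echo "$old_metric" | tr '.' '_')
echo "${old_metric} => ${new_metric}"
```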
BOSH System Metrics Server and Forwarder are deprecated in favor of System Metrics server and scraper.
If your environment was running an HAProxy instance group, you must reconfigure your networking configurations or else will get a deployment error when upgrading to TAS 3.0.
To check if you are running HAProxy, open the Networking tab in Ops Manager. Under the field “Routing TLS Termination”, you’ll see “HAProxy”.
Change your networking and resource configurations to point directly to Gorouter. Full directions are in the “HAProxy Removed” doc above.
HAProxy was previously supported as an instance group in front of gorouter in order to support features not offered by gorouter. Now that gorouter supports TLS termination, HAProxy is no longer needed.
Versions Introduced
Product | TAS
---|---
Version(s) | 3.0.0, 4.0.0
From TAS 3.0.0 Log Cache uses syslog ingress, and the option to use nozzle ingress has been removed.
You are impacted if you have not checked the “Enable Log Cache syslog ingestion” option, or have set the corresponding .properties.enable_log_cache_syslog_ingestion product property to false. From TAS 3.0.0 this is no longer a supported configuration.
In addition, Diego Cells with high logging volume might experience higher CPU usage than they did prior to this change.
Confirm that your instances, including those within isolation segments, are permitted to establish connections to Log Cache nodes on port 6067. You may need to update firewall rules to allow logs to flow directly from your instances to the Log Cache syslog server.
Consider scaling your Diego Cells if you have applications with high logging volume due to the increased load from the syslog agent on the Diego Cell.
Log Cache ingestion via the Reverse Log Proxy (RLP) has been removed.
If you were monitoring ActiveLocks on your environment, upgrading to TAS 3.0.0 will increase your metric by 1.
You are impacted if your environment Key Performance Indicators include Locket ActiveLocks.
Remove this metric from any KPI monitoring you maintain.
ActiveLocks is a metric emitted by Locket, a distributed locking service used by multiple services in Cloud Foundry. With the introduction of a new, opt-in service that uses Locket, the expected number of active locks increased by 1. Our community noted that a feature-dependent lock count is a sign of a poor performance indicator; while the information is useful in debugging, it is not useful as a performance indicator and we no longer recommend monitoring it as an environment health metric.
cflinuxfs3 (backed by Ubuntu Bionic) has been deprecated in favor of cflinuxfs4 (backed by Ubuntu Jammy). If apps are configured to use the latest stack, upgrading TAS will upgrade app stacks and may cause a push failure.
You can see what stack your apps are configured to use by running:
cf audit-stack
If apps are followed by cflinuxfs3, they are running an outdated stack.
You should re-stage your app to use cflinuxfs4:
cf push <APP_NAME> -s cflinuxfs4
Resolve any push errors prior to upgrading TAS. This may entail updating the buildpack that your app uses. If fixing app staging is not possible before the TAS upgrade, you can configure apps to run on cflinuxfs3 as a temporary workaround.
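Filtering the audit output down to the apps that still need re-staging can be scripted. This is a sketch: the two lines below simulate `cf audit-stack` output from the Stack Auditor plugin, and the org/space/app names are illustrative:

```shell
# audit_output=$(cf audit-stack)
audit_output="org1/dev/app-a cflinuxfs3
org1/dev/app-b cflinuxfs4"
# Keep only apps still on cflinuxfs3
needs_restage=$(echo "$audit_output" | awk '$2 == "cflinuxfs3" {print $1}')
echo "$needs_restage"
```

Each remaining entry can then be re-staged with cf push -s cflinuxfs4 as shown above.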
“Stacks” are the pre-built root file systems that Cloud Foundry uses to create app containers. Keeping these up to date with operating system updates is key to remediating security vulnerabilities and other issues.
TAS cflinuxfs4 Migration Proposal discusses stack change behaviors for each TAS release
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11.27, 2.12.28, 2.13.13
Diego and Routing components have been updated to be more strict with TLS protocols. External services and databases, end users, and services making requests to gorouter should be using TLS 1.2 and have certs signed with a newer hash than SHA-1, or else they will experience TLS errors.
For services talking to Diego components, if your environment is impacted they will not be able to successfully make a connection to Diego. For end users and services talking to Gorouter, you should see TLS errors in your access logs.
If you are using an external database, Diego will throw errors trying to start processes like Locket and BBS.
You can check the Signature Algorithm of your external database and external services with the following command:
echo '' | openssl s_client -connect <HOSTNAME>:<PORT> -servername <HOSTNAME> 2>/dev/null | openssl x509 -noout -text | grep 'Signature Algorithm'
You can check that TLS 1.2 is supported with the following command:
echo '' | openssl s_client -connect <HOSTNAME>:<PORT> -servername <HOSTNAME> -tls1_2 >/dev/null 2>&1 || echo "TLS 1.2 unsupported"
If you see no output, TLS 1.2 is supported. If you see the TLS 1.2 unsupported message, it is unsupported and will need to be updated.
Make sure that services talking to your Diego components and services talking to gorouter are not using TLS 1.0 or 1.1, or using SHA-1 certificates.
Diego and Routing components, including Gorouter, use an up-to-date version of Golang. As of Golang 1.18, TLS requirements have been made more strict in two major ways. First, TLS 1.0 and 1.1 are disabled in favor of TLS 1.2. Second, crypto/x509 now rejects certificates signed with the SHA-1 hash function. See the Golang release notes for details.
Versions Introduced
Product | TAS | Isolation Segment | TAS Windows
---|---|---|---
Version(s) | 2.11.14, 2.12.7, 2.13.0, 3.0.0, 4.0.0 | 2.11.10, 2.12.4, 2.13.0, 3.0.0, 4.0.0 | 2.11.10, 2.12.5, 2.13.0, 3.0.0, 4.0.0
Logging components provided by loggregator-agent-release and syslog-release have been upgraded to Go 1.17.
From Go 1.17 the treatment of IP addresses is stricter, and any IP address in which an octet has a leading zero is now invalid.
Review your configured application syslog drains and the address of any configured syslog destination in the TAS “System Logging” tab. If your configuration contains an IP address with an octet that has a leading zero then you are impacted.
Modify your configuration to express the IP address without leading zeros.
The Go developers chose to disallow IP addresses with leading-zero octets because they present a security concern: some parsers interpret such octets as octal values and others as decimal, which can lead to inconsistent address resolution.
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11.8, 2.12.1, 2.13.0, 3.0.0, 4.0.0
The frequency with which system metrics are scraped increased from a fixed frequency of every 1 minute to a configurable default of every 15 seconds. This may cause increased load on your logging and metrics infrastructure.
If you have enabled system metrics in the Ops Manager director config and you are upgrading to a version of TAS after the versions that introduced this change then you are impacted.
You are not impacted if you are using the deprecated BOSH system metrics server instead of the recommended system metrics scraper.
The default 15 second scrape frequency is recommended. You may not need to take action if your logging and metrics infrastructure is already scaled sufficiently to handle the increased number of metrics. Aside from scaling, another option is to reduce the scrape frequency.
The scrape interval can be changed by modifying the .properties.system_metrics_scraper_scrape_interval TAS property. This property is configurable within Ops Manager under the System Logging tab as “System metrics scrape interval”.
Prior to the versions in which this change was introduced, the system metrics scrape interval was hard-coded to one minute.
Changes internal to the Cloud Controller worker may cause cf push downtime during upgrade.
You are impacted if you have a significant number (>100) of cloud_controller_worker VMs.
While there is an existing mitigation in place, it may not be sufficient for large foundations, which might still experience downtime when running cf push. See https://github.com/cloudfoundry/cloud_controller_ng/issues/2748 for more details.
Rails 6 introduced a new interface for the Error class, which means Rails 5 servers cannot de-serialize the new Error class. The CAPI release in TAS 2.13 upgrades its Ruby on Rails dependency from Rails 5 to Rails 6.1.
While upgrading Cloud Controller VMs to this release, new jobs created by API servers containing the Rails 6 upgrade may serialize the new Errors interface in the database as part of jobs for the Worker VMs to pick up. Worker VMs which have not upgraded, and are still running Cloud Controller with Rails 5, will fail to de-serialize the Error class’s new interface.
TAS for VMs v2.13 does not support cf CLI v6; this means that cf CLI v6 and TAS for VMs v2.13 and later are not tested together, and if you or your users encounter issues using the v6 CLI, you will need to upgrade to v7 to get effective support.
The change from cf CLI v6 to v7 (and v7 to v8) is itself a major version bump, with its own breaking changes.
Two examples:
- The flags supported by cf push have changed. See the articles on Upgrading for more details.
- The way quota is counted for staging apps is more conservative, and this can cause push failures. See the linked KB for details.
You are impacted if any automation or clients targeting the platform (especially app developer pipelines or custom service brokers) have not already explicitly updated to cf CLI v7 or greater.
Your users must upgrade to cf CLI v7 or cf CLI v8. To upgrade to a supported cf CLI version, see Upgrading to cf CLI v7 or Upgrading to cf CLI v8.
The newer versions of the cf CLI target the v3 Cloud Controller API. cf CLI v6 relies on the deprecated v2 Cloud Controller API, which may be removed entirely in a future TAS version line.
As of TAS for VMs v2.13, the Log Cache component runs on its own Log Cache instance group, and is no longer deployed on Doppler instances. Operators should ensure they scale Log Cache appropriately.
You are impacted if you are upgrading to TAS 2.13.0 or higher.
Scale up your Log Cache instance count. VMware recommends matching the VM count and memory of your Doppler instances before the upgrade. Starting larger and then adjusting for actual use is safer than a deployment failure.
You can also consider reducing the memory allocation for Doppler instances now that Log Cache is no longer deployed there.
Separating Log Cache into its own instance group allows it to be scaled independently of Dopplers and Traffic Controllers, for example, to provide more memory for storing logs and metrics.
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/2.13/tas-for-vms/runtime-rn.html#log-metric-topology
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/2.13/tas-for-vms/runtime-rn.html#separate-log-cache
The logging and metrics topology has changed. As of TAS 2.13.0, Log Cache uses syslog ingress by default.
You may be impacted if you are not seeing logs and metrics from Diego Cells deployed by TAS or by the Isolation Segment or Tanzu Application Service for VMs [Windows] products.
In addition, Diego Cells with high logging volume might experience higher CPU usage than they did prior to this change.
Confirm that your instances, including those within isolation segments, are permitted to establish connections to Log Cache nodes on port 6067. You may need to update firewall rules to allow logs to flow directly from your instances to the Log Cache syslog server.
Consider scaling your Diego Cells if you have applications with high logging volume due to the increased load from the syslog agent on the Diego Cell.
During the upgrade, Diego Cells may receive the new Log Cache BOSH DNS name log-cache.service.cf.internal and attempt to send logs and metrics over syslog. VMware recommends upgrading to at least the following patch versions prior to the upgrade, and additionally re-deploying the Isolation Segment and TAS for VMs [Windows] products so that the new Log Cache BOSH DNS name is resolvable.
TAS for VMs v2.11.16
TAS for VMs v2.12.9
As of TAS for VMs v2.13, the Log Cache component runs on its own Log Cache instance group, and is no longer deployed on Doppler instances. In addition, syslog ingestion was set as the default for Log Cache to use industry standard protocols.
Service instance metrics might not be retrievable using the Log Cache cf CLI plugin.
If you use the Log Cache cf CLI plugin to retrieve service instance metrics and your service tiles do not support Log Cache syslog ingestion, you are impacted.
If you need to retrieve metrics from service tiles that do not support this feature:
- Upgrade to a version of the service tile that allows for syslog ingestion, OR
- For TAS 2.13 only, deactivate the Enable Log Cache syslog ingestion checkbox in the System Logging pane of the TAS for VMs tile. The associated product property is .properties.enable_log_cache_syslog_ingestion. Note that this is a temporary solution, as this setting is no longer available in TAS 3.0.0.
As of TAS for VMs v2.13, the Log Cache component runs on its own Log Cache instance group, and is no longer deployed on Doppler instances. In addition, syslog ingestion was set as the default for Log Cache.
V1 firehose nozzles such as the Splunk Nozzle for VMware Tanzu may fail to connect to the firehose during an upgrade.
You will be impacted during the upgrade if both of the following are true:
- You have a V1 firehose nozzle deployed.
- You are upgrading to TAS v2.13.0 through v2.13.12 or TAS v3.0.0 through v3.0.2.
VMware recommends that you upgrade to a more recent version of TAS that does not have this issue.
The Traffic Controller component, which V1 nozzles connect to, had previously blocked on startup until Log Cache was available. In TAS 2.13, Log Cache was separated into its own instance group, meaning that Traffic Controller blocks and is unavailable until the new Log Cache instances become available.
If you have IPv4 addresses that contain decimal components with leading zeros (for example, 192.168.020.100), you will receive a deploy error. You will need to reformat your IP (192.168.20.100) and deploy again.
This affects properties that feed into all releases that use Golang v1.17. If impacted, you will receive BOSH templating errors during deploy.
Operators can remove the leading zeros and try deploying again.
From Golang release notes: The ParseIP and ParseCIDR functions in Golang’s net library now reject IPv4 addresses which contain decimal components with leading zeros. These components were always interpreted as decimal, but some operating systems treat them as octal. This mismatch could hypothetically lead to security issues if a Go application was used to validate IP addresses which were then used in their original form with non-Go applications which interpreted components as octal.
Routing-release keeps up-to-date with Golang, and Golang 1.17 requires certs to include a Subject Alternative Name (SAN). (This is an enforcement of a deprecation that was introduced in Golang 1.15.) If any certificates for services that terminate TLS connections in Gorouter lack a SAN, clients cannot connect to servers and deployment fails. External systems that the Gorouter connects to must also have certificates with a valid SAN, or requests experience a failed TLS handshake.
For all certs in Ops Manager, copy the cert text to a file and decode it with the following command to see if it contains a SAN:
openssl x509 -noout -text -in [FILE]
Follow this same process for certs on external services that the Gorouter connects to. See the Knowledge Base article above for detailed instructions.
If any certs do not contain a SAN, you must rotate certs with a newly-generated cert that contains a SAN. See the Ops Manager documentation above.
Golang’s crypto/x509 library uses certs to verify the server or client hostname. In the past, operators could use the Common Name field to input the hostname; as of Golang 1.15, the Common Name field is deprecated for hostname verification and the Subject Alternative Name must be provided instead.
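The requirement can be demonstrated with Go's own crypto/x509 package. In this sketch (certificate names are illustrative), a self-signed certificate that carries a SAN passes hostname verification, while an otherwise identical CN-only certificate fails:

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned issues a throwaway cert with the given Subject Alternative Names.
// An empty dnsNames slice mimics a legacy CN-only certificate.
func selfSigned(cn string, dnsNames []string) *x509.Certificate {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		DNSNames:     dnsNames, // the SANs that Go 1.17 requires
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		panic(err)
	}
	return cert
}

func main() {
	withSAN := selfSigned("router.internal", []string{"router.internal"})
	cnOnly := selfSigned("router.internal", nil)
	// Hostname verification consults SANs only; the Common Name is ignored.
	fmt.Println(withSAN.VerifyHostname("router.internal")) // <nil>
	fmt.Println(cnOnly.VerifyHostname("router.internal"))  // non-nil error
}
```

Rotating to certificates that include the intended hostname in the SAN list resolves the failed TLS handshakes.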
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11.3 2.13.0
If your clients or proxies that access apps cannot handle a chunked response, or expect a Content-Length header, they will break.
One common symptom is that applications now return duplicate Transfer-Encoding headers, and gorouter logs the error “too many transfer encodings”.
Fix your clients to be able to handle a chunked response.
Previous versions of TAS could, for short responses, silently remove the Transfer-Encoding header and replace it with a Content-Length header. This convenience depended on Golang 1.15 behavior and led to a false sense of mitigation.
Golang 1.16 now prioritizes flushing partial responses to the client and no longer rewrites the response to be un-chunked.
https://broadcomcms-software.wolkenservicedesk.com/external/article?articleNumber=298288
https://broadcomcms-software.wolkenservicedesk.com/external/article?articleNumber=298108
TAS 2.11.16 moved aggregate drain configuration to the syslog binding cache to improve deploy speed. Depending on your configuration, this could cause the Smoke Test errand to fail.
You are impacted if “Enable Log Cache syslog ingestion” is checked, “Default loggregator drain metadata” is unchecked, and you attempt to upgrade to a version of TAS from 2.11.16 through 2.11.21.
The corresponding product properties are:
- .properties.enable_log_cache_syslog_ingestion
- .properties.default_loggregator_drain_metadata
Upgrade to TAS 2.11.22 or greater.
Operators may configure aggregate drains to send all application logs to a syslog destination. The same mechanism may be used within TAS to populate Log Cache. If metadata is not enabled for the Log Cache aggregate drain, then Log Cache will not have the metadata it needs to function correctly.
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/2.11/tas-for-vms/runtime-rn.html#2-11-16
https://docs.vmware.com/en/VMware-Tanzu-Application-Service/2.11/tas-for-vms/runtime-rn.html#2-11-22
To minimize downtime for developers pushing apps, upgrade from TAS for VMs v2.11.9 or later. Upgrading from earlier patch versions can result in an Unknown Error when pushing apps.
You are impacted if your current version of TAS is 2.11.0 through 2.11.8.
You should upgrade to TAS 2.11.9 or higher before upgrading to 2.12 or higher.
In affected versions of Cloud Controller, there is a chance that an app's lifecycle_type is nil when the app lifecycle is determined.
If applications do not support HTTP/2 when using Container to Container (C2C) communication through the Envoy proxy (ports 61001 or 61443), requests between these apps will fail.
If you are using container to container networking, and have network policies allowing apps to talk to other apps on ports 61001 or 61443, you may be affected.
Use the cf network-policies command to list policies, and look for those ports.
Investigate the destination applications for these policies to determine whether they support HTTP/2 requests. If not, the client applications need to be updated to negotiate down to HTTP/1.1 when making their requests.
ALPN (Application-Layer Protocol Negotiation) is a TLS extension that allows clients and servers to negotiate which protocol will be used over the connection.
Envoy is a proxy sitting alongside each application container to facilitate TLS termination.
Versions Introduced
Product | TAS
---|---
Version(s) | 2.11.10 2.12.6
There is a chance that your apps are not capable of processing the longer 16-byte Trace-ID header. If so, you may receive 400-level errors after gorouter forwards requests to your app.
Foundations that do not enable Zipkin are not affected. To see if Zipkin is enabled, look at the Networking tab of the TAS tile in Ops Manager and check whether Enable Zipkin is selected.
If Zipkin is enabled, applications incompatible with this change will throw errors in their application or access logs, related to Zipkin header length.
The header size was increased from 8 bytes to 16 bytes in accordance with the W3C standard; you must update your app to be able to handle 16-byte Trace-ID request headers.
Zipkin is a library that allows users to trace a request through multiple components, with the help of a Trace-ID that stays the same for the lifecycle of one request. The longer Trace-ID complies with W3C standards and also decreases the chance of generating duplicate IDs.
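Apps that parse the trace ID themselves should accept both widths. A tolerant check might look like the following sketch (the helper name is ours, for illustration only):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// validTraceID accepts both B3 trace ID widths: the older 8 bytes
// (16 hex characters) and the newer 16 bytes (32 hex characters).
func validTraceID(id string) bool {
	if len(id) != 16 && len(id) != 32 {
		return false
	}
	_, err := hex.DecodeString(id)
	return err == nil
}

func main() {
	fmt.Println(validTraceID("463ac35c9f6413ad"))                 // true
	fmt.Println(validTraceID("463ac35c9f6413ad48485a3953bb6124")) // true
	fmt.Println(validTraceID("not-a-trace-id"))                   // false
}
```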
After these changes are handled, continue reading the next sections.