VMware Telco Cloud Automation 3.0 | 30 Nov 2023 | Build: TCA 22803370, TCA Cloud Native 22803361, Airgap 22829662 | Release Code: R152
TCA now offers a new automation layer called Workflow Hub, which enables the definition and automated execution of pipelines. It ships with several built-in workflows, a GUI designer, and a domain-specific language for creating new workflows, and it can also call third-party tools.
Workflow Hub serves as an umbrella orchestrator that seamlessly integrates with VMware Telco Cloud Automation, various VMware telco cloud products, and third-party tools. This empowers network operators to effortlessly create customized multi-cloud workflows, bolstered by Workflow Hub's support for the Serverless Workflow DSL.
The built-in workflows are:
Ansible Integration
Using the "Execute an Ansible Playbook" workflow, you can run Ansible playbooks from Workflow Hub. Ansible playbooks are executed as Kubernetes jobs. You can run this playbook on any cluster you choose by providing it's kubeconfig in the workflow input. The Git repository where the playbook scripts are hosted must be the Kubernetes cluster that the workflow uses, and the Kubernetes cluster must be accessible to Workflow Hub.
Supports Generic Rest API Endpoints
Workflow Hub can call any OpenAPI-compliant REST API through the rest-type function defined in the Serverless Workflow DSL.
To support non-OpenAPI endpoints, you can upload an equivalent OpenAPI schema to Workflow Hub, which then interacts with the endpoint using the stored schema. Workflow Hub supports the GET, POST, PUT, and DELETE operations.
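As a minimal sketch, the following Serverless Workflow definition shows how a rest-type function can be referenced; the OpenAPI document URL and the operationId (getClusterInfo) are illustrative placeholders, not actual TCA endpoints:
id: query-cluster
version: '1.0'
specVersion: '0.8'
start: QueryCluster
functions:
  - name: getClusterInfo
    type: rest
    # operation = <OpenAPI document URL>#<operationId>; both values here are placeholders
    operation: https://tca.example.com/openapi.json#getClusterInfo
states:
  - name: QueryCluster
    type: operation
    actions:
      - functionRef: getClusterInfo
    end: true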
Supports custom Certificate Authority
When communicating with HTTPS endpoints whose certificates are not signed by a well-known CA, REST calls fail server certificate validation. To trust such CAs, you can add them to Workflow Hub.
You can also bypass this validation when the endpoints are local and there is no risk of man-in-the-middle attacks. Whether to bypass the validation is specified in the workflow and can differ for each endpoint that the workflow accesses.
Multi-tenancy ensures that the CAs of the tenants are isolated.
Secret Manager Service
You can register an AWS Secrets Manager instance with Workflow Hub and use it in workflows to fetch sensitive data instead of relying on workflow input.
Multi-tenancy and RBAC
Workflow Hub supports multi-tenancy with controls for managing resource utilization between tenants. Users within a tenant have controlled access to resources based on their privileges. Each resource in Workflow Hub has its own set of privileges, enabling fine-grained access control via RBAC.
Prebuilt workflows are accessible to all tenants but not editable. To modify a prebuilt workflow, you can clone it and make changes within your tenant space.
Workflow Schedules
Using workflow schedules, you can run workflows at a specific time, or periodically on specific days of the week or month. You can create the following types of schedules (an illustrative cron example follows the list):
Calendar: Runs the workflow on particular days of the week or month.
Interval: Runs the workflow periodically at a specified interval.
CronString: Runs the workflow as specified by a cron expression.
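A minimal sketch of a CronString schedule; the field names are illustrative assumptions rather than the exact Workflow Hub schema, and only the cron expression itself follows standard cron semantics:
scheduleType: CronString
# fields: minute hour day-of-month month day-of-week
cronExpression: "0 2 * * 1,4"   # run at 02:00 every Monday and Thursday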
Orchestration of complex workflows
Workflow Hub supports orchestration of complex workflows. You can stitch the bundled workflows together to create a more complex end-to-end (E2E) orchestration. Additionally, you can define your own Serverless Workflows and execute them on Workflow Hub.
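The sketch below shows how two workflows might be chained with subFlowRef in the Serverless Workflow DSL; the workflow IDs (mgmt-cluster-create, workload-cluster-create) are illustrative placeholders, not the actual IDs of the bundled workflows:
id: e2e-cluster-bringup
version: '1.0'
specVersion: '0.8'
start: CreateManagementCluster
states:
  - name: CreateManagementCluster
    type: operation
    actions:
      - subFlowRef: mgmt-cluster-create      # placeholder workflow ID
    transition: CreateWorkloadCluster
  - name: CreateWorkloadCluster
    type: operation
    actions:
      - subFlowRef: workload-cluster-create  # placeholder workflow ID
    end: true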
The following workflows are available by default as part of Workflow Hub which you can readily use:
ZTP-based host deployment
Creation of Management Cluster
Creation of Workload Cluster
Creation of Nodepool
Upgrade of Management Cluster
Upgrade of Workload Cluster
Upgrade of Nodepool
Creation of Add-on
CNF Instantiation
CNF Upgrade
Ansible Integration
CaaS and CNF-related workflows have been developed for TCA 2.3 feature parity, and the same feature set has been tested on TCA 3.0 as well. ZTP workflows have been developed and tested against TCA 3.0.
Kafka Message Publishing
As a Workflow Hub user, you can publish messages that report workflow status over Kafka to another entity. For example, you can report errors to a higher-layer orchestrator that can take actions such as ticketing and RCA. You can create schemas for these messages as required; the message schemas support the AsyncAPI schema format.
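A minimal AsyncAPI-style sketch of such a message schema; the channel name and payload fields are illustrative assumptions, not a schema shipped with Workflow Hub:
asyncapi: '2.6.0'
info:
  title: Workflow status events
  version: '1.0.0'
channels:
  workflow.status:                 # placeholder channel/topic name
    publish:
      message:
        name: WorkflowStatus
        payload:
          type: object
          properties:
            workflowId:
              type: string
            status:
              type: string
              enum: [RUNNING, COMPLETED, FAILED]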
Multi-tenancy provides tenant awareness capability to the VMware Telco Cloud Automation resources. Multi-tenancy allows you to perform the following:
Manage multiple identity providers (IDP)
Manage multiple tenants
Share resources between tenants
Enable tenant-specific Role Based Access Control (RBAC) and Attribute Based Access Control (ABAC)
Enable tenant-specific audit logs
Enable tenant-specific resource tagging
Enable automatic read access to the dependent resources
Enable tenant-aware REST APIs
Enable automatic upgrade of roles and privileges when upgrading VMware Telco Cloud Automation from 2.3 to 3.0
Change ownership by modifying the details of the user who created the resource, as required when deleting a tenant.
View the resources owned or shared with the selected tenant.
An Airgap Appliance OVA is introduced to replace the previous Airgap server solution. VMware supports Airgap Appliance's lifecycle including patching and upgrading. The new Airgap OVA is secured by following the Security Technical Implementation Guide (STIG) guidelines.
Support for TKG management and workload cluster upgrades
VMware Telco Cloud Automation, release 3.0 supports the following TKG management and workload cluster upgrades:
TKG management cluster from 1.24.10 to 1.25 and 1.26.5.
TKG workload cluster from 1.24.10 to any k8s cluster versions supported in TCA 3.0.
CaaS Addons
Supports multiple NFS provisioners.
CaaS Upgrade from TCA 2.3 to 3.0
Starting TCA release 3.0:
You can upgrade the TKG management cluster 1.24.10 (in TCA 2.3 ) to 1.25 and 1.26.8.
You can upgrade the TKG workload cluster from 1.24.10 in TCA 2.3 to any supported k8s cluster version.
CaaS Security
TKG K8s STIG hardening security guidelines are enforced for TKG workload clusters, both Standard Classy Cluster and Classy Single Node Cluster.
Standard ClusterClass-based Cluster and ClusterClass-based Single Node Cluster
Besides the legacy workload cluster, which is now renamed as Standard Cluster, the following workload clusters are introduced:
Standard ClusterClass-based Cluster: Based on the CAPI ClusterClass, this cluster includes TKG cluster features such as vertical node scaling, topology variables, NTP configuration support, Kubernetes STIG compliance, and Node IPAM support.
ClusterClass-based Single Node Cluster: A single-node workload cluster based on the CAPI ClusterClass. Classy Single Node Clusters are supported from Kubernetes 1.26.5 onwards.
Kubernetes Node IPAM support for TKG Standard Classy Workload Clusters
VMware Tanzu Kubernetes Grid 2.3.1 uptake
New Kubernetes versions are supported: 1.26.8, 1.25.13, 1.24.17
TKG 2.3.1 uptake with 1.26-based Kubernetes Clusters supports:
Lifecycle management of TKG 2.3.1 clusters.
TKG workload clusters (Standard Cluster, Classy Standard Cluster, and Classy Single Node Cluster) with Kubernetes versions 1.24.17, 1.25.13, and 1.26.8.
TKG management clusters with Kubernetes version 1.26.8.
Enhanced customer experience by simplifying the user interface
Improved inventory management by the introduction of database
Improved management of tasks
A new TCA VM appliance replaces the existing TCA VM appliances: The new TCA appliance is built on Photon OS (v4) that resolves several critical security issues of the legacy OS that was part of the old VM appliance.
TCA 3.0 is offered in the following variants:
TCA Manager + Workflow Hub: Available as an X-large deployment option
Standard TCA Manager: Available as a large deployment option
Standard TCA Control Plane: Available as a medium deployment option
Reduced Scale TCA Control Plane: Available as a small deployment option
For the sizing information, see https://docs.vmware.com/en/VMware-Telco-Cloud-Automation/3.0/tca-deploymentguide/GUID-111A5C6C-D146-478E-BD1A-956BE65C8938.html.
VMware Telco Cloud Automation 3.0 internally runs as a Cloud Native application.
TCA 3.0-compliant deployment:
Provides a simplified and automated system of migration from TCA 2.3 deployment to TCA 3.0.
The migration process replaces the upgrade procedure used for upgrading TCA until release 2.3.
For more information on the ports and firewall requirements, see Important Notes.
For more information on the simplified migration of TCA 2.3 to TCA 3.0, see the VMware Telco Cloud Automation Deployment Guide.
A new username 'tca' is introduced to manage/configure the TCA 3.0 appliances.
The ‘admin’ user is for SSH purposes only.
TCA 3.0 supports CNF rollback of successfully reconfigured or upgraded instances. You can roll back any reconfiguration or upgrade performed on a CNF from TCA 3.0 onwards.
Rollback is performed in the reverse order of the upgrade or reconfigure operation.
Rollback allows you to run the Helm rollback hooks defined in the Helm chart (a generic example follows).
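For reference, a Helm rollback hook is a standard Kubernetes resource annotated with helm.sh/hook. The sketch below is a generic Helm post-rollback hook, not a TCA-specific artifact; the image and command are illustrative placeholders:
apiVersion: batch/v1
kind: Job
metadata:
  name: post-rollback-check
  annotations:
    "helm.sh/hook": post-rollback
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: check
          image: registry.example.com/nf-check:1.0   # placeholder image
          command: ["/bin/sh", "-c", "echo rollback hook executed"]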
You can select multiple node pools during CNF instantiation. Selecting multiple node pools applies the same customizations to all the node pools in parallel during a single CNF instantiation.
The multi-node-pool feature allows you to apply Dynamic Infrastructure Policies (DIP) during instantiation, upgrade, or reconfiguration of a CNF instance. The following options are available:
CNF instantiated with no infrastructure requirements: The CNF is upgraded with infrastructure requirements. You can select multiple node pools and specify the batch size to process node pools in parallel.
CNF instantiated with infrastructure requirements: The CNF is upgraded with additional infrastructure requirements. The node pools and batch size from CNF instantiation are retained.
Integration with VMware Aria Operations for Logs
TCA integrates with VMware Aria Operations for Logs to export the TCA application logs for long-term storage and retrieval.
You can import the TCA content pack into VMware Aria Operations for Logs. The content pack provides predefined filters that you can use to filter logs from specific TCA services.
You can configure the VMware Aria Operations for Logs details through the Appliance Management UI on TCA-M and TCA-CP.
Supports the Open Terminal feature on the cluster where you have deployed TCA or TCA-CP.
A new command ‘debug-tca’ is introduced to perform various operations such as viewing logs, connecting to the database, and debugging services.
VMware Telco Cloud Automation 3.0 provides interoperability support for the following.
Product | Supported Versions
---|---
VMware vCenter Server | 7.0 U3, 7.0 U3o, 8.0 U1, 8.0 U1d
VMware vSphere | 7.0 U3, 7.0 U3o, 8.0 U1, 8.0 U1c
VMware NSX-T | 4.1.1
VMware Tanzu Kubernetes Grid | 2.3.1 (Kubernetes 1.26.8, 1.25.13, 1.24.17)
VMware Cloud Director | 10.4.2
VMware Cloud on AWS | M22
VMware Aria Automation Orchestrator (previously known as vRealize Orchestrator) | 8.12, 8.13
VMware Aria Operations for Logs (previously known as vRealize Log Insight) | 8.12
Harbor | 2.6.3, 2.7.1, 2.8.2 (Note: Harbor 2.7.1 and 2.8.2 are OCI-only versions.)
VMware Integrated OpenStack | 7.3
You can create VNF IPv6 Catalogs with IPv6 network configuration or interface.
You can instantiate IPv6 VNF Catalogs to create IPv6 networks and allocate the IPv6 network to a VNF when deploying in vCD/vC Cloud.
You can perform VNF (with IPv6 interfaces) lifecycle management operations such as Terminate, Scale, Heal, or Operate.
SCP, SSH, and custom vRO workflows whose destination is an IPv6 VNF or vDU do not work if the vRO appliance is deployed as IPv4. Therefore, use the TCA 3.0 SCP and SSH workflows, which do not require vRO interaction during workflow execution.
The following are the license consumption updates:
TCA license consumption mechanism is aligned with the new pricing and packaging model
TCA measures usage across virtualized and containerized infrastructure and workloads
System alerts are sent when license usage crosses the threshold or the license reaches its expiry date
TCA-CP no longer requires a license for activation.
Support for:
Azure deployment which also includes Azure Container Registry (ACR)
Deployment of TCA in AKS
Deployment of cloud-native functions to AKS
Updated password requirements for TCA 3.0 appliances
VMware Telco Cloud Automation appliances come with updated password policies.
The password must have a minimum of:
Eight characters
One digit
One lowercase character
One uppercase character
One special character
Login Username Changes for the Appliance UI in TCA 3.0
The default username for VMware Telco Cloud Automation 3.0 Appliance is ‘tca’. Admin users can no longer log in to the Appliance Management UI.
Ensure that you are logging in as the ‘tca’ user for any of the day-0 and day-1 operations on the Appliance Management UI.
Additional ports and firewall requirements
VMware Telco Cloud Automation 3.0 introduces additional functionality and features that require additional ports and firewalls to be configured, as required.
The following table lists a few major additional ports. For a detailed list, see http://ports.vmware.com/.
Source | Destination | Port | Protocol | Service Description
---|---|---|---|---
VMware Telco Cloud Automation Manager | VMware Telco Cloud Automation Control Plane | 9092 | TCP | Kafka communication between TCA appliances.
VMware Telco Cloud Automation Manager | VMware Telco Cloud Automation Control Plane | 9093, 9094 | TCP | Kafka communication between TCA appliances (HA deployments).
VMware Telco Cloud Automation Manager | VMware Aria Operations for Logs | 9543 | TCP | TLS/SSL log ingestion for the vRLI system, depending on the vRLI configuration.
VMware Telco Cloud Automation Control Plane | VMware Aria Operations for Logs | 9543 | TCP | TLS/SSL log ingestion for the vRLI system, depending on the vRLI configuration.
VMware Tanzu Kubernetes Grid - Workload Cluster | Airgap Server | 8043 | TCP | Download Kubernetes operators for TCA, kernel packages, and other libraries and binaries (for environments without Internet access).
VMware Tanzu Kubernetes Grid - Management Cluster | Airgap Server | 8043 | TCP | Download Kubernetes operators for TCA, kernel packages, and other libraries and binaries (for environments without Internet access).
Administrative Workstation/Jumphost | Airgap Server / TCA Migration Tool | 22 | TCP | SSH access to run the TCA Migration tool.
Airgap Server / TCA Migration Tool | VMware Telco Cloud Automation Manager | 443, 9443 | TCP | Back up, restore, and migrate the TCA Manager appliance.
Airgap Server / TCA Migration Tool | VMware Telco Cloud Automation Control Plane | 9443 | TCP | Back up, restore, and migrate the TCA Control Plane appliance.
Airgap Server / TCA Migration Tool | VMware vCenter Server | 443 | TCP | Access to each vCenter where TCA Manager and Control Plane appliances are deployed; required to upload templates and migrate TCA appliances.
Airgap Server / TCA Migration Tool | Web Server | 443 | TCP | Download the TCA VM appliances.
Downloading the BYOI Template
Ensure that you are using the latest ovftool version to upload the templates.
Download Photon BYOI Templates for VMware Tanzu Kubernetes Grid
To download the Photon BYOI templates:
Go to the VMware Customer Connect site at https://customerconnect.vmware.com/.
From the top menu, select Products and Accounts > All Products.
On the All Downloads page, scroll down to VMware Telco Cloud Automation and click Download Product.
On the Download VMware Telco Cloud Automation page, ensure that the version selected is 3.0.
Click the Drivers & Tools tab.
Expand the category VMware Telco Cloud Automation 3.0 Photon BYOI Templates for TKG.
Corresponding to Photon BYOI Templates for VMware Tanzu Kubernetes Grid 2.2.0 and 2.3.1, click Go To Downloads.
On the Download Product page, download the appropriate Photon BYOI template.
The Tanzu Kubernetes Grid 2.2.0 template (version 1.25.7) is required for upgrading TKG Management Clusters.
Download RAN Optimized BYOI Templates for VMware Tanzu Kubernetes Grid
To download RAN optimized BYOI templates:
Go to the VMware Customer Connect site at https://customerconnect.vmware.com/.
From the top menu, select Products and Accounts > All Products.
On the All Downloads page, scroll down to VMware Telco Cloud Automation and click Download Product.
On the Download VMware Telco Cloud Automation page, ensure that the version selected is 3.0.
Click the Drivers & Tools tab.
Expand the category VMware Telco Cloud Automation 3.0 RAN Optimized BYOI Template for TKG.
Corresponding to RAN Optimized Photon BYOI Templates for VMware Tanzu Kubernetes Grid 2.3.1, click Go To Downloads.
On the Download Product page, download the appropriate Photon BYOI template.
Download RAN Optimized Single Node Cluster BYOI Templates for VMware Tanzu Kubernetes Grid
To download RAN optimized Single Node Cluster BYOI templates:
Go to the VMware Customer Connect site at https://customerconnect.vmware.com/.
From the top menu, select Products and Accounts > All Products.
On the All Downloads page, scroll down to VMware Telco Cloud Automation and click Download Product.
On the Download VMware Telco Cloud Automation page, ensure that the version selected is 3.0.
Click the Drivers & Tools tab.
Expand the category VMware Telco Cloud Automation 3.0 RAN Optimized BYOI Single Node Cluster template for TKG.
Corresponding to RAN Optimized Single Node Cluster Photon BYOI Templates for VMware Tanzu Kubernetes Grid 2.3.1, click Go To Downloads.
On the Download Product page, download the appropriate Photon BYOI template.
Airgap server
The VMware Telco Cloud Automation 3.0 Airgap appliance adopts an updated approach to importing data into an Airgap server deployed in an air-gapped environment:
Deploy two Airgap instances, one in the public network and the other in the private network.
Run the export operation on the public-network Airgap appliance to export the full-size bundle.
Copy the exported bundle onto the private-network Airgap appliance.
Run the import operation on the private-network Airgap appliance.
For more information, see the VMware Telco Cloud Automation Deployment Guide.
Migrating TCA 2.3-based Airgap servers to TCA 3.0 Airgap servers is supported. For more information, see the VMware Telco Cloud Automation Deployment Guide.
BYO Airgap Server - now replaced with Airgap OVA.
Cluster diagnosis - now replaced with a new cluster diagnosis service.
CCLI show nodepolicy/esxinfo subcommand - now allows users to check the NodePolicy CR directly on the UI.
CCLI show addon subcommand - now allows the user to check the addon status on the UI.
CCLI show hostconfig/hostconfigprofile - now the hostconfig information is available on the UI.
CCLI show diagnosis - now the diagnosis is available in the UI.
CaaS v1 workload cluster
CDC/RDC provisioning in Infrastructure Automation
Cloud Native TCA including installations through TKG clusters and so on.
Cloud Native TCA upgrades
CBAM and SVNFM
Fixed Issue 3211123: When editing the CaaS v1 template, the "Next" button is disabled in the "Control Plane Node Configuration" tab
Fixed Issue 3220048: Additional operations are enabled if the Reset State operation is performed when the Heal operation fails on the NF instance
Fixed Issue 3263361: Copying multiple files at various steps of a v3 schema workflow reports an error
Fixed Issue 3145966: Clicking the Workflow Name under Workflow Executions redirects the user to the Workflows tab in the VNF catalog
Fixed Issue 3221536: Files uploaded during the execution of standalone Workflows are corrupted
Fixed Issue 3220082: Workflow catalog filters do not work
Fixed Issue 3150997: When the management cluster enables a machine health check and the management cluster control plane scales out, the new master node is tagged with the node pool label: telco.vmware.com/nodepool={nodepool name}
Issue 3281886: Workflow execution for all prebuilt workflows fails if the execution time exceeds the session duration (default: 60 minutes)
Workaround:
Increase the maximum session duration: log in to the TCA-M Appliance Manager, go to Administration > General Settings > Session Management, and edit the session duration.
Issue 3292201: The workflow gets stuck when you run it immediately after creation
Workaround:
Terminate the workflow execution that is stuck and re-run the workflow.
Issue 3281375: Ansible executor deployment fails if the repository hosting the ansible image is not CA-signed
Workaround:
Cluster Admin must upload the Certificate Authority of the Airgap server hosting the ansible image to the cluster before running the workflow.
The user can make the ansible-executor image available locally on any cluster to be used and provide the kubeconfig of that cluster in the ansible workflow payload.
Issue 3269413: Workflow Hub does not support RBAC filters for resources
Workflow Hub does not support RBAC filters, also known as advanced object filtering for resources.
Issue 3262019: Limited audit log support for Workflow Hub resources
Audit logs for Workflow Hub fail to display specific details, such as the exact resource accessed and the type of service used.
Issue 3295134: Tenant name is case-sensitive when logging into tenant-specific TCA
Issue 3291411: Tenant-related resources are displayed as IDs instead of names
Issue 3262886: Object names should be unique across tenants
You cannot reuse the object name from one tenant to another tenant when creating the same object type.
Creating a management cluster and a workload cluster with the same name is not supported, so take care to avoid it. For example, if there is an existing management or workload cluster named "cluster1", do not create another management or workload cluster named "cluster1".
However, the following Workflow Hub resource names are exceptions and can be duplicated across tenants:
Workflows
Schedules
Schemas
Certs
Secret manager
Workaround
Use unique names across tenants.
Issue 3265525: Changing the default identity provider results in users losing access to existing tenants
Workaround:
Manually update the Active Directory users in the existing Tenants with the default IDP in Telco Cloud Automation Orchestrator. This is also applicable for the default authentication change from Active Directory to vCenter and SSO.
Issues 3257458 and 3213336: Multi-tenancy is available as a limited support feature
Multi-tenancy does not support:
Network Slicing
Infrastructure Automation
Gitops
Techsupport features
Multi-tenancy is available as a limited support feature for Workflow Hub where resource sharing and ownership change are not supported.
Issue 3274085: Execution of workflows and Open Terminal feature do not work when impersonating a user or tenant
Issue 3293171: Single Node Cluster creation fails as it changes the type to Standard
Workaround
After migrating from TCA 2.3 to 3.0, close the TCAM tabs, clear the browser cache, and reopen TCAM tabs.
Issue 3294045: In some cases, the upgrade notifications for the node pool are delayed by a few hours
Workaround:
Upgrade the node pool by following the usual flow of editing the node pool and selecting the TBR you want to upgrade to. The notifications should be displayed automatically in three hours.
Issue 3294863: If the TCA-CP appliance is slow, there may be an error in upgrading the management cluster
If the TCA-CP appliance is slow, there may be an error in upgrading the management cluster. The message ‘Unable to upgrade k8s cluster with error Internal Server Error while fetching data from BootStrapper. Error: 'SocketTimeoutException'. You can retry to upgrade the cluster' is displayed.
Workaround:
1. Log in to the TCA-CP appliance as an admin user.
2. If the management cluster failed to upgrade, go to the KBS pod.
3. Monitor the management cluster status with the command kbsctl show managementclusters.
For example:
admin@tca-cp-230 [ ~ ]$ kubectl exec -it -n tca-cp-cn kbs-tkg230-7c8b7fff59-27bdj -- bash
root [ / ]# kbsctl show managementclusters
Count: 1
----------------------------------------
ID: ba750eeb-c4db-458f-ae49-fd080ffafe61
Name: mc230-1
Status: Running
TKG ID: 111be853-c280-4e9b-81f0-e4f081040488
4. Wait until the status of the management cluster changes to ‘Running’, and then retry to upgrade the cluster from the TCA UI.
Issue 3295328: The option to provide the node pool upgrade strategy (YAML) is hidden in the TCA 3.0 UI
When creating or editing a node pool, the Advanced Options section which allows the user to provide the node pool upgrade strategy (YAML) is hidden in the TCA 3.0 UI.
Workaround:
Go to Custom Resources (CR) and customize the upgrade strategy.
Issue 3292629: Upgrading the Management Cluster from v1.25 to v1.26 fails at times
Upgrading the Management Cluster from v1.25 to v1.26 fails at times and displays the message ‘Cluster upgraded successfully but post configuration failed’.
Workaround:
Log in to the management cluster.
Check the 'tca-addon-bom' secret by running the command: kubectl get secret -n tca-system | grep tca-addon-bom-3.0.0
Delete the 'tca-addon-bom' secret by running the command: kubectl delete secret tca-addon-bom-3.0.0-tca.xxxx
Check the 'tbr-bom' CR by running the command: kubectl get tbr -n tca-system | grep tbr-bom-3.0.0-v1.26.5---vmware.2-tkg.1-tca
Delete the 'tbr-bom' CR by running the command: kubectl delete tbr tbr-bom-3.0.0-v1.26.5---vmware.2-tkg.1-tca.xxxxx -n tca-system
Log in to the TCA portal and retry the failed management cluster upgrade.
Issue 3293611: Node pool deletion fails as CPAV vspheremachine is not deleted
Workaround:
SSH into the management cluster control plane node.
Restart the CAPV pod with the command: kubectl rollout restart deploy/capv-controller-manager -n capv-system
Issue 3102649: Management cluster creation fails if the vCenter password contains a colon (:)
Workaround:
Change the vCenter password so that it does not contain a colon (:).
Issue 3289354: When the node pool upgrade strategy is not enabled, vertical scaling out of the node pool is stuck in the Provisioning state
When the node pool upgrade strategy is not enabled, vertical scaling out of the CPU, memory, and disk of the node pool is stuck in the Provisioning state. This happens if the host/cluster does not have enough resources.
Workaround:
Deploy only one node on the cell site host.
Enable the node pool strategy by editing the node pool in the TCA UI with the following values:
rollingUpdate:
  maxSurge: 0
  maxUnavailable: 1
type: RollingUpdate
This ensures that the old node is deleted, resources are released, and the new node is created. It also reduces the configuration failures caused by insufficient resources.
Issue 3232625: Node pool deletion fails as the CAPV vspherevm controller does not restore the vspherevm CR instance
Workaround:
SSH into the management cluster control plane node.
Restart the CAPV pod with the command: kubectl rollout restart deploy/capv-controller-manager -n capv-system
Issue 3297535: In an air-gapped environment, if the Airgap server misses the TCA 2.3 released images, the management cluster upgrade fails
In an air-gapped environment, if the Airgap server misses the TCA 2.3 released images, the management cluster upgrade from v1.24 to v1.25 fails.
Workaround:
Log in to the management cluster with the capv user.
Create and run the shell script createMgmtDummyCR.sh.
Click Retry from the TCA UI.
Issue 3297043: The onboarding of the Network Function within the PTP configuration enables periodic timesync automatically after node reboot
Workaround:
Disable the tca-ntp-handler.service system service on the target node:
root@workload-ptp-np [ ~ ]# systemctl disable tca-ntp-handler.service
Removed /etc/systemd/system/multi-user.target.wants/tca-ntp-handler.service.
Issue 3294888: When you deploy the vsphere-csi addon with zone and region, the addon is stuck in the Processing state
When you deploy the vsphere-csi addon with zone and region, the addon is stuck in the Processing state. The addon status shows an unauthorized error when retrieving the tag list from vCenter.
Workaround:
Restart the tca-kubecluster-operator pod to re-authenticate with vCenter.
Issue 3288670: For Classy Standard and Standard clusters, certain autoscaler settings do not work
For Classy Standard clusters, the annotation prefix added to the machine deployment changes from 'cluster.k8s.io' to 'cluster.x-k8s.io'. When autoscaler is enabled for the node pool and the user edits the autoscaler minimum and maximum settings, the machine deployment replicas are not within the range of the autoscaler settings.
For Standard clusters, when autoscaler is enabled for cluster creation, the autoscaler setting on the node pool does not work. Even if you disable the autoscaler feature for the node pool, the new autoscaler annotation still exists in machine deployment.
Workaround:
To fix the Classy Standard cluster issue:
SSH into TCA-CP and then log in to the kbs pod.
Go to the management cluster, and for the node pool, remove the replicas from the MachineDeployment field.
To fix the Standard cluster issue:
SSH into TCA-CP and log in to the kbs pod.
Go to the management cluster and, for the node pool, edit the MachineDeployment field (see the sketch after this workaround):
Update the autoscaler annotation from "cluster.k8s.io/cluster-api-autoscaler-node-group-max-size" to "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size".
Update the annotation from "cluster.k8s.io/cluster-api-autoscaler-node-group-min-size" to "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size".
Remove the replicas from the MachineDeployment field.
If you want to disable autoscaler, remove the related annotations, and set the value in the Replicas field with the same value as the node pool replicas.
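For reference, a sketch of the corrected autoscaler annotations on a MachineDeployment; the object name and the min/max values are illustrative placeholders:
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: nodepool-1-md    # placeholder name
  annotations:
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"   # illustrative value
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "3"   # illustrative value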
Issue 3295464: Management cluster deployment fails on vSphere 7.0.3 u3 and NSX IPv6
Management cluster deployment fails on vSphere 7.0.3 u3 and NSX IPv6 segment. The following error occurs:
Creating ClusterClass="tkg-vsphere-default-v1.1.0" Namespace="tkg-system" Retrying with backoff Cause="error creating \"cluster.x-k8s.io/v1beta1, Kind=ClusterClass\" tkg-system/tkg-vsphere-default-v1.1.0: Internal error occurred: failed calling webhook \"default.clusterclass.cluster.x-k8s.io\": failed to call webhook: Post \"https://capi-webhook-service.capi-system.svc:443/mutate-cluster-x-k8s-io-v1beta1-clusterclass?timeout=10s\": context deadline exceeded"
Workaround:
Upgrade vSphere ESXi to 8.x.
Issue 3295383: If vsphere-csi addon is deployed on a workload cluster with a vCenter username and password, the vsphere-csi addon on the new workload cluster is not configured with the vCenter password
If the vsphere-csi addon is deployed on a workload cluster with a vCenter username and password, when the option Copy Specification and Deploy new Cluster is used to create a new workload cluster, vsphere-csi addon on the new workload cluster is not configured with the vCenter password. If the vCenter username is different from the one used to create the workload cluster, the cluster may not create the Persistent Volume backed by vsphere-csi.
Workaround:
Edit the vsphere-csi addon and then enter the vCenter password.
The object-propagation-controller pod crashes out of memory
When you create 200 workload clusters in a management cluster, the object-propagation-controller pod crashes out of memory. For legacy clusters, the pod automatically recovers memory.
Workaround:
For legacy clusters: increase the memory limit of the object-propagation-controller to 1G.
For Classy clusters or Single Node Classy clusters: in the tkg-clusterclass, tanzu-framework, and tkr-service PKGIs, increase the memory limit of the object-propagation-controller, and of the tkr-status-controller and tkr-source-controller if they crash.
Issue 3287197: After the TCA-CP appliance reboots, the kbs-tkg220 pod fails to cache tanzu contexts, kubeconfig, and YAML files v1.24.10 and v1.25.7
After the TCA-CP appliance reboots, the kbs-tkg220 pod fails to cache the tanzu contexts, kubeconfig, and v1.24.10 and v1.25.7 YAML files of the respective management clusters.
The issue occurs in the following scenarios:
TCA deployment upgrade from TCA 2.3 with v1.24.10 or v1.25.7 management cluster.
Only when the TCA-CP appliance reboots.
Only for the v1.24.10 and v1.25.7 management clusters.
Workaround:
Restart the kbs-tkg220 pod manually.
Issue 3286484: Time-out of node customization in Network Functions
Node customization in the Network Function times out when the Network Function is deployed on a Classy Single Node Cluster.
Workaround:
Delete the multus pod.
Issue 3277229: vsphere-csi addon health status is not correct for legacy workload clusters
After upgrading VMware Telco Cloud Automation from 2.3 to 3.0, the vsphere-csi addon health status is not displayed correctly.
Workaround:
Upgrade the workload cluster.
Issue 3288060: HCP data is not migrated when TCA is upgraded or migrated from 2.3 to 3.0
HCP data is not migrated when TCA is upgraded or migrated from 2.3 to 3.0. Therefore, ESXi secrets are not present within HCP, and diagnosis fails.
Workaround:
You must resync (full-resync) the hosts in Infrastructure Automation so that secrets are re-created before performing diagnosis.
Issue 3288039: Bios version and Server type are not validated in RAN Diagnosis
If IPMI details are not provided during host provisioning, the Bios version and Server type are not validated in RAN Diagnosis.
Workaround:
Provide IPMI details while provisioning the CSG host.
Issue 3281844: Datacenter and parent domain name cannot be changed
After creating the vCenter domain, you cannot change the data center.
After creating a cell site group, you cannot change the parent domain.
Workaround:
Delete the domain and recreate it with the required changes.
Issue 3289179: Changes made to the ‘application_properties.ini’ file in the ‘tcf-manager’ container fail to migrate to TCA 3.0
Workaround:
Note the changes from the original application_properties.ini file to the new file and add the same set of properties to the configmap tca-tcf-manager-config-override, which is part of 3.0 and later versions. Instructions for the new configmap format are available in the configmap comments.
1. Go to your environment where kubeconfig is set.
2. Run the following command to edit or override the configmap:
kubectl edit configmap tca-tcf-manager-config-override -n tca-mgr
3. Add values to the configmap by providing key = value pairs in the section to override (add the section if it is not available, for example, [section_name]).
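For orientation only, a sketch of what such an override might look like; the data key name (override.ini) and the section and key values are illustrative placeholders, and the authoritative format is described in the configmap comments:
apiVersion: v1
kind: ConfigMap
metadata:
  name: tca-tcf-manager-config-override
  namespace: tca-mgr
data:
  override.ini: |        # placeholder data key; check the configmap comments for the real format
    [section_name]
    key = value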
Issues 3292757 and 3291489: When the vCenter domain is edited but not resynced, the operations performed on the hosts fail
If the vCenter domain is edited but not resynced:
On resync, hosts remain in the NOT_PROVISIONED status.
On deletion, hosts remain in the TO_BE_DELETED status.
Adding hosts fails with the error "[ZTP101057] Host(s) cannot be added to the CSG domain <domain-name> unless a parent vCenter domain is in provisioned state with latest specifications."
Workaround:
Resync the vCenter domain after editing and then perform all other operations on the hosts.
Issue 3262926: Host provisioning for pre-deployed CSG fails
Host provisioning for pre-deployed CSG fails when the host is not present in the respective datacenter and vCenter folder for that specific CSG. The following is the error:
"Failed to verify the presence of host <host-name> in vCenter. Reason: Pre-deployed host <host-name> must first be added to the VC <vcenter-fqdn>."
Workaround:
For pre-deployed CSG domains, you must add the host to vCenter under the respective datacenter and vCenter folder (CSG domain) before adding the host to Infrastructure Automation.
Issue 3290732: In the TCA 3.0 environment, if a domain is missing the {hostname} keyword, a warning is displayed
In the TCA 2.3 environment, the {hostname} keyword was not mandatory for a custom CSI Zone Tag (under CSG domains > CSI Tagging). However, post-migration to TCA 3.0, the {hostname} keyword is mandatory for a custom CSI Zone Tag. If the domain is missing the {hostname} keyword, a warning message is displayed, and without the keyword, you cannot save the domain after editing it.
Workaround:
Post-migration to TCA 3.0, add the {hostname} keyword to the custom CSI Zone tag for domains whose CSI Zone tag lacks {hostname}, and save the domains.
This applies to domains with a custom CSI Zone tag in the TCA 2.3 environment.
Issue 3285652: Network Service scale operation fails to show the VNFs to be scaled
The Network Service scale operation does not show which VNFs need to be scaled when the Network Service is instantiated using an existing VNF instance.
Workaround:
During the Network Service instantiation, select the option Instantiate New for VNFs.
Issue 3289650: If 'VNF operate' (shutdown) is not completed within the time specified in the TCA UI, the operation fails
If VNF Graceful Operate (VM shutdown) takes longer than the stop timeout specified in the TCA UI, the impacts are as follows:
If the VNF is not powered off, TCA tries to power off the VM forcefully.
If the VM moves to the powered-off state just before the timeout specified in the TCA UI, the forceful power-off operation fails.
Workaround:
Use Graceful VNF Operate with a higher timeout (more than 10 seconds).
Use Forceful VNF Operate instead of Graceful VNF Operate.
Issues 3290637 and 3290641: Manual execution of workflows within CSAR cannot be disabled
Issue 3309655: After backup and restore, logs may not flow into vRLI even if the vRLI configuration is present
Fluentd pods sometimes come up before the configuration can be restored, resulting in logs not flowing to vRLI.
Workaround:
Log in to the Appliance Manager UI > Appliance Summary and restart the Fluent Daemon service.
Issue 3308649: The restored schedule has no effect after backup-restore or migration. No periodic backups are taken even if the schedule is present.
Workaround
Manually edit the schedule and set a new time.
Issue 3284932: After restoring CNF resources with multiple DUs on the same workload cluster, termination of the CNF from TCA UI fails
After restoring CNF resources with multiple DUs on the same workload cluster using the Velero addon, termination of the CNF from TCA UI fails
Workaround:
Manually delete the CNF resources such as namespace in the workload cluster.
Retry from the TCA UI to terminate the CNF.
Issue 3294837: Velero Restic plugin to back up volumes with NFS as backend fails
After a cluster upgrade from TCA 2.3 to 3.0, using the Velero Restic plugin to back up volumes with NFS as the backend fails.
Workaround:
Uninstall the Velero addon and reinstall it.
Remove the failed backup and proceed with a fresh backup.
Issue 3267832: Velero pod restart might lead to backup failure on a workload cluster
The Velero pod might restart while a backup is being created, which leads to backup failure on a workload cluster.
Workaround:
Delete the failed backup on the workload cluster and retry.
Issue 3293997: Static routes configuration does not persist after backup and restore.
Workaround:
Add the static routes configuration manually through the Appliance Management portal after restoring.
Issue 3292755: When configuring FTP server settings, 'Use SSH keys' does not generate the key in the required format
When configuring FTP server settings, 'Use SSH keys' does not generate the key in the expected format.
Expected format (key.txt):
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCDVdkEhSS0J6Ak3rJpR2SRoSMa2eIxGpzOtEkGW0BZOE0a2Fl8b2H8CCIwmBJrZan89h1s9b0OMj6t08GdP31F673S7+YpDA81IuWMtge1uPjjaAsEaFm37yv4mewe3Hpx3IcqlFsoeH9v/HmChzTeVUUIBNt2PHepA45uYn7t3Ja8ol5DAWaFEfE0ytxukeTIIWenvFvOzDPXtrjlnfKvdoM0qROGn9ShWYd8TC4Z/JFxEcBHUZvAJt7LD4BWEc4rkOXiCnHKfMzxtt13/STeU/RDPGvkfKHxb3GyMfQvAiArwfjzsy7H0O+rhTWMAsE/MKIwWUhfQgJ+DyPZwEBj admin@host
Generated format (key.txt):
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA2Gt0ZWi7YhW7G4wxRZgtRxJX006zAGM5huEf5jpGbbUweottdgGkFIXlS3UNx7HsV2hRc9uvBYjG
rCaDKiBYlGyzPEJzY7TogxaecLlYuJVaIFR3MxRUcWzgqP2KMwK4cDP+BOZQDJ+yKQA3lR1tlTq/e+E4blJYIXzeOBO2mU6/i5jS3oZ94WH+nZKLkmvcaNpn
6pOHAAPPPkyFK4Nw0dYI8o3ahV1Q2OyxpbsXEk+H04X6769XeiH6J6f7M6kVmJhrQbXdcbB997fDQSqwQa1QhC7qZK7OH6nfS4zpJrfZp917F3LkoBfYDTYd
nbvamdbPvvd2PnXfWBsFMmT52QIDAQAB
-----END PUBLIC KEY-----
Workaround:
Paste the content from TCA (download or copy to clipboard) to a file.
Change the key to the required format using the command: ssh-keygen -i -m PKCS8 -f key.txt
Log in to the FTP server and append the content from step 2 to <user_home>/.ssh/authorized_keys.
For formatting the file content, see key.txt.
Issue 3282193: Open Terminal on a terminated and re-instantiated CNF requires 10 minutes to work correctly
Open Terminal on a terminated and re-instantiated CNF requires 10 minutes to work correctly if the namespace is manually deleted during termination.
Workaround:
Reinstantiate the same CNF after 10 minutes.
Issue 3292135: agctl sync operation may fail with an unexpected error
The agctl sync operation fails with an internal server error (status code 500) caused by an intermittent Harbor issue.
Issue 3293336: Failure of TKG isolated-cluster plugin sync results in cache that requires a manual cleanup
If the TKG isolated-cluster plugin sync fails when uploading or downloading TKG image tar packages, it leaves cache files that need to be manually cleaned up.
If you do not clean up the cache, the plugin crashes when you upload or download the TKG image tar packages.
Workaround:
1. Delete the cache files manually:
rm -rf /photon-reps/tkg_temp
rm -rf /tmp/imgpkg-*
2. Redo the sync operation.
Issue 3293551: The login banner message on the Open Terminal does not appear post-migration
Workaround:
Configure the login banner from the TCA Console, edit it, and then save it.
Issue 3293171: Creating SNC changes cluster type to Standard, thus failing to create the SNC cluster.
Stale browser cache causes SNC cluster creation to fail post migration.
Workaround:
After a migration from 2.3 to 3.0, close the TCAM tabs, clear the browser cache, and reopen the TCAM tabs.
Issue 3297261: When migrating from TCA 2.3 to TCA 3.0, the migration of the IPv6 appliance fails
When migrating from TCA 2.3 to TCA 3.0, the migration of the IPv6 mode fails due to a mismatch of the OVF parameters on the new appliance.
Workaround
Allow the migration script to complete.
Go to vCenter, power off, and delete the newly deployed TCA 3.0 IPv6 appliance.
Deploy the TCA 3.0 IPv6 appliance manually and power it on.
Rerun the migration script with the same JSON input.
Issue 3293416: Instantiating the Network Function on an IPv6 Single Node Cluster shows the status of the socket between 'sriovdp' and 'kubelet' as abnormal
When instantiating the Network Function on an IPv6 Single Node Cluster, the Infrastructure Configuration step might check nodepolicymachinestatus and show the status of the socket between 'sriovdp' and 'kubelet' as abnormal.
Workaround:
Reinstantiate the Network Function from TCA UI.
Issue 3297282: CNF rollback cannot be performed when the Network Function that is instantiated first has no node customizations
You cannot perform CNF rollback when the Network Function that is instantiated first has no node customizations configured and additional customizations are added through a CNF upgrade.
Issue 3292945: In TCA 3.0 when selecting node pools to customize, the CNF instantiation wizard does not show any warning if the CNFs are already deployed on the selected node pools
Issue 3296014: The Network Service Instance screen does not display the underlying VNFs details under 'General Properties'
Workaround
Click the corresponding VNF instance link under NS instance details to view the VNF details.
Issue 3296417: During NS instantiation, the instantiation fails because the vApp templates drop-down list is not fully populated
During NS instantiation, the drop-down list of available vApp templates for instantiating the VNFs does not populate all the values, and hence the instantiation fails. This applies only to vCloud Director.
Workaround
Use the vAPP template instead of the VNF template to instantiate the Network Service.
Issue 3307853: Uploading a new Web certificate also overrides the Appliance Management certificate. CNVA supports only one certificate for both ports 443 and 9443.
Workaround
None. There is no apparent functional impact.
Issue 3310428: Unable to acknowledge VIM-related alarms
You cannot acknowledge VIM-related alarms. Acknowledgment of non-VIM-related alarms, such as CNF alarms and license-related alarms, works fine.
Issue 3309026: The owner of a resource cannot perform LCM operations if the matching filter criteria set for the logged-in user are no longer met
Consider the following scenario:
Log in as user1 with an ABAC filter set for the tag vendor: nokia.
Onboard a network function catalog (NF1) with user1 and set the tag to vendor: nokia.
Edit the tag of the network function catalog to vendor: samsung.
Try to instantiate NF1.
Because the filter criteria no longer match, the user, although the owner of the resource (NF1), cannot perform LCM operations (such as instantiate) on the resource. Sample error response:
{
"status" : 403,
"detail" : "Access denied: The user [email protected] in tenant with default identifier does not have the required privilege to perform LCM on NF"
}
Workaround:
Set the filters to enable the LCM operations on the resource.
Example: Edit tags for the NF1 catalog and add tags to match the filter criteria set for the user.
If the resource is already in the instantiated state, the recommended workaround is to delete the resource and re-create it with the correct tags matching the filter criteria set for the user.
You can configure the maximum session duration (default: 60 minutes) of API and GUI access to TCA.
You can specify the admin ('tca' user) password for the appliance manager separately.
TCA 3.0 allows encrypted passwords for backup and restore.