VMware Cloud Director Container Service Extension 4.0 | 17 NOV 2022 | Build: 20803846

Check for additions and updates to these release notes.

What's New in November 2022

  • You can now perform cluster life cycle management tasks, such as creating, upgrading, resizing, and deleting Kubernetes clusters, in the Kubernetes Container Clusters UI plug-in of VMware Cloud Director.

  • CSE Management tab: A new service provider persona workflow in the Kubernetes Container Clusters UI plug-in. This workflow guides service providers through the VMware Cloud Director Container Service Extension setup in the UI plug-in, and prepares the environment to allow tenant users to create Kubernetes clusters.

  • Multi-node control plane UI for Tanzu Kubernetes Grid clusters, allowing high availability of the Kubernetes control plane.

  • Heterogeneous clusters with custom sized nodes to build clusters that can accommodate memory or CPU intensive containers.

  • Pre-installation of Tanzu core packages in Tanzu Kubernetes Grid clusters at creation time, which reduces the additional configuration that containerized applications require.

  • GPU support for Tanzu Kubernetes Grid clusters to allow for AI / ML applications.

  • The VMware Cloud Director Container Service Extension UI is localized to the following languages: German (de_DE), French (fr_FR), Italian (it_IT), Spanish (es_ES), Brazilian Portuguese (pt_BR), Japanese (ja_JP), Korean (ko_KR), Simplified Chinese (zh_CN), Traditional Chinese (zh_TW).

  • VMware Cloud Director Container Service Extension is packaged as an appliance and uses Photon OS 3.0.

  • VMware Cloud Director Container Service Extension supports HA deployment to allow high availability of cluster management tasks, such as creating, upgrading, resizing, and deleting a cluster.

  • Support for the deployment of VMware RabbitMQ using VMware Data Solutions Extension. For more information, see VMware Cloud Director extension for VMware Data Solutions.

  • You can select a specific LB VIP and subnet for the control plane to meet additional network security or business continuity requirements.

  • Cluster API for VMware Cloud Director, CAPVCD, 1.0.0 is released alongside VMware Cloud Director Container Service Extension 4.0. You can use CAPVCD 1.0.0 independently to manage the lifecycle of Kubernetes clusters. For more information, see https://github.com/vmware/cluster-api-provider-cloud-director.

Known Issues

  • In the Kubernetes Container Clusters UI plug-in, the cluster delete operation can fail when the cluster status is Error.

    To delete a cluster that is in Error status, it is necessary to force delete the cluster.

    1. Log in to VMware Cloud Director, and from the top navigation bar, select More > Kubernetes Container Clusters.

    2. Select a cluster, and in the cluster information page, click Delete.

    3. In the Delete Cluster page, select the Force Delete checkbox, and click Delete.

  • In VMware Cloud Director Container Service Extension, the creation of Tanzu Kubernetes Grid clusters can fail due to a script execution error.

    The following error appears in the Events tab of the cluster info page in the Kubernetes Container Clusters UI:

    ScriptExecutionTimeout with the following details:

    error while bootstrapping the machine [cluster-name/EPHEMERAL_TEMP_VM]; timeout for post customization phase [phase name of script execution]

    Workaround:

    When this error occurs, it is recommended to activate Auto Repair on Errors from cluster settings. This instructs VMware Cloud Director Container Service Extension to reattempt cluster creation.

    1. Log in to VMware Cloud Director, and from the top navigation bar, select More > Kubernetes Container Clusters.

    2. Select a cluster, and in the cluster information page, click Settings, and activate the Auto Repair on Errors toggle.

    3. Click Save.

    Note:

    It is recommended to deactivate the Auto Repair on Errors toggle when troubleshooting cluster creation issues.

  • Creation of a cluster with multiple control plane nodes or multiple worker nodes goes into an error state. The Events tab in the cluster details page shows an EphemeralVMError event due to the failure to delete the ephemeralVM in VMware Cloud Director.

    The same error events can appear repeatedly if the Auto Repair on Errors setting is activated on the cluster. If the Auto Repair on Errors setting is off, sometimes the cluster can show an error state due to the failure to delete the ephemeralVM in VMware Cloud Director even though the control plane and worker nodes are created successfully.

    This issue occurs in VMware Cloud Director releases and patch releases later than 10.3.3.3, and in releases and patch releases starting with 10.4.1.

    Workaround:

    Create the cluster with one control plane and one worker node, and then resize the cluster to the desired node count.

    This issue is fixed in VMware Cloud Director Container Service Extension 4.0.3.

  • In some instances, nodes cannot join clusters. This occurs randomly due to intermittent issues, even when the cluster is in an available state.

    The following error appears in the Events tab of the cluster info page in the Kubernetes Container Clusters UI:

    VcdMachineScriptExecutionError with the following details:

    script failed with status [x] and reason [Date Time 1 /root/node.sh: exit [x]]

    Workaround:

    Perform this manual workaround on each cluster that has nodes that fail to join. It does not prevent the problem in future cluster creations.

    One or more nodes can fail to join. Repeat the steps below, starting from step 2, for every node that has not joined.

    1. Download the kube config of the cluster that does not have all the nodes.

      1. Log in to VMware Cloud Director, and from the top navigation bar, select More > Kubernetes Container Clusters.

      2. Select a cluster, and in the cluster information page, click Download Kube Config.

        For more information on Kube Config file, refer to the Kubernetes website.

    2. In the Kubernetes Container Clusters UI plug-in, in the Events tab of the cluster information page, click on VcdMachineScriptExecutionError to view the error details, and note the Resource Name.

    3. In kubectl, enter the following command to fetch all the machines on the cluster:

      kubectl get machines -A --kubeconfig=<path of downloaded kubeconfig>

      The node that could not join the cluster should be stuck in a Provisioning state. To identify this node, look for the machine name that matches the resource name that was present in the VcdMachineScriptExecutionError event.

    4. In kubectl, enter the following command:

      kubectl --kubeconfig=<path of downloaded kubeconfig> delete machine -n <clusterNamespace> <machineName>

      Note:

      Ensure the machine name that is being deleted matches the resource name that was present in the VcdMachineScriptExecutionError.

      Once the VM is deleted, it is recreated and the node reattempts to join the cluster.
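Steps 3 and 4 can be scripted. The following Python sketch is illustrative only. It assumes that the machine list is retrieved with `kubectl get machines -A -o json` instead of the table output, that each Machine object reports its phase in `status.phase`, and that the Resource Name from the event equals the machine name exactly. The function names and the sample data are hypothetical.

```python
import json
import shlex


def find_provisioning_machines(machines_json, resource_name=None):
    """Return (namespace, name) pairs for machines stuck in Provisioning.

    machines_json is the text printed by:
        kubectl get machines -A -o json --kubeconfig=<path of downloaded kubeconfig>
    resource_name optionally narrows the result to the Resource Name noted
    from the VcdMachineScriptExecutionError event (assumed to be an exact match).
    """
    items = json.loads(machines_json).get("items", [])
    stuck = []
    for item in items:
        meta = item.get("metadata", {})
        phase = item.get("status", {}).get("phase")
        if phase != "Provisioning":
            continue
        if resource_name is not None and meta.get("name") != resource_name:
            continue
        stuck.append((meta.get("namespace"), meta.get("name")))
    return stuck


def delete_command(kubeconfig, namespace, name):
    """Build the kubectl delete command from step 4 as an argument list."""
    return ["kubectl", f"--kubeconfig={kubeconfig}",
            "delete", "machine", "-n", namespace, name]


# Hypothetical sample data, for illustration only.
SAMPLE = json.dumps({"items": [
    {"metadata": {"namespace": "demo-ns", "name": "demo-worker-1"},
     "status": {"phase": "Running"}},
    {"metadata": {"namespace": "demo-ns", "name": "demo-worker-2"},
     "status": {"phase": "Provisioning"}},
]})

if __name__ == "__main__":
    # Print the delete command for review instead of running it.
    for namespace, name in find_provisioning_machines(SAMPLE):
        print(shlex.join(delete_command("kubeconfig.yaml", namespace, name)))
```

The sketch prints the command rather than executing it, so that the machine name can still be checked against the event details before anything is deleted.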

  • ERROR: failed to create cluster: failed to pull image failure

    This error occurs in the following circumstances:

    • A user attempts to create a Tanzu Kubernetes Grid cluster using VMware Cloud Director Container Service Extension 4.0, and the creation fails intermittently.

    • An image pull error due to an HTTP 408 response is reported.

    This issue can occur if there is difficulty reaching the Internet from the EPHEMERAL_TEMP_VM to pull the required images.

    Potential causes:

    • Slow or intermittent Internet connectivity.

    • The network IP Pool cannot resolve DNS (docker pull error).

    • The network MTU behind a firewall must be set lower.

    To resolve the issue, ensure that there are no networking connectivity issues stopping the EPHEMERAL_TEMP_VM from reaching the Internet.

  • Users may encounter authorization errors when executing cluster operations in the Kubernetes Container Clusters UI plug-in if a Legacy Rights Bundle exists for their organization.

    • After you upgrade VMware Cloud Director from version 9.1 or earlier, the system may create a Legacy Rights Bundle for each organization. This Legacy Rights Bundle includes the rights that are available in the associated organization at the time of the upgrade and is published only to this organization. To begin using the rights bundles model for an existing organization, you must delete the corresponding Legacy Rights Bundle. For more information, see Managing Rights and Roles.

    • In the Administration tab in the service provider portal, you can delete Legacy Rights Bundles. For more information, see Delete a Rights Bundle. The CSE Management tab of the Kubernetes Container Clusters UI plug-in has a server setup process that automatically creates the Kubernetes Clusters Rights Bundle and publishes it to all tenants. The rights bundle contains all rights that are involved in Kubernetes cluster management in VMware Cloud Director Container Service Extension 4.0.

  • Resizing or upgrading a Tanzu Kubernetes Grid cluster using kubectl.

    After a cluster has been created in the Kubernetes Container Clusters UI plug-in, you can use kubectl to manage workloads on Tanzu Kubernetes Grid clusters.

    If you also want to manage the cluster lifecycle, including resize and upgrade, through kubectl instead of the Kubernetes Container Clusters UI plug-in, complete the following steps:

    1. Delete the RDE-Projector operator from the cluster:

      kubectl delete deployment -n rdeprojector-system rdeprojector-controller-manager

    2. Detach the Tanzu Kubernetes Grid cluster from Kubernetes Container Clusters UI plug-in.

      1. In the VMware Cloud Director UI, in the Cluster Overview page, retrieve the cluster ID of the cluster.

      2. Update the RDE to set entity.spec.vcdKe.isVCDKECluster to false.

        1. Get the payload of the cluster: GET https://<vcd>/cloudapi/1.0.0/entities/<Cluster ID>

        2. Copy the payload, and set the JSON path entity.spec.vcdKe.isVCDKECluster to false.

        3. PUT https://<vcd>/cloudapi/1.0.0/entities/<Cluster ID> with the modified payload. It is necessary to include the entire payload as the body of PUT operation.

      3. At this point the cluster is detached from VMware Cloud Director Container Service Extension 4.0.0 and 4.0.1, and it is not possible to manage the cluster through VMware Cloud Director Container Service Extension 4.0.0 and 4.0.1. It is now possible to use kubectl to manage, resize, or upgrade the cluster by applying CAPI YAML, the Cluster API specification, directly.
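The payload edit in step 2 can be sketched in Python. This is a minimal illustration, not a complete client: it covers only the change to the JSON path, and assumes that the GET response has already been parsed into a dictionary. The function name and the trimmed sample payload are hypothetical, and a real RDE contains many more fields.

```python
import copy
import json


def detach_payload(rde_payload):
    """Return a copy of the RDE payload with isVCDKECluster set to False.

    The rest of the payload is left untouched, because the PUT request
    needs the entire payload as its body.
    """
    updated = copy.deepcopy(rde_payload)
    try:
        vcd_ke = updated["entity"]["spec"]["vcdKe"]
    except KeyError as missing:
        raise ValueError(f"payload has no key {missing}") from None
    vcd_ke["isVCDKECluster"] = False
    return updated


# Hypothetical, heavily trimmed payload, for illustration only.
SAMPLE_PAYLOAD = {
    "id": "<Cluster ID>",
    "name": "demo-cluster",
    "entity": {"spec": {"vcdKe": {"isVCDKECluster": True,
                                  "exampleOtherField": "kept as-is"}}},
}

if __name__ == "__main__":
    # The printed text is what would be sent as the body of the PUT request.
    print(json.dumps(detach_payload(SAMPLE_PAYLOAD), indent=2))
```

Working on a deep copy keeps the original GET response available for comparison, which helps confirm that only the one field changed before the PUT is sent.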

  • Cluster creation fails in VMware Cloud Director Container Service Extension due to invalid GitHub Token with Error: 401 Bad Credentials

    This error is expected behavior. If an invalid GitHub access token is configured, cluster creation fails and the following error appears:

    error creating the GitHub repository client: failed to get GitHub latest version: failed to get repository 
    versions: failed to get repository versions: failed to get the list of releases: GET 
    https://api.github.com/repos/kubernetes-sigs/cluster-api/releases: 401 Bad credentials

    When you configure the VMware Cloud Director Container Service Extension server, enter a valid GitHub access token.
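A token can be checked before it is entered in the server configuration. The following Python sketch is one possible approach, not part of the product: it sends the same GET request that appears in the error message and reports whether GitHub accepts the token. The function names are hypothetical, and the check needs Internet access from the machine where it runs.

```python
import urllib.error
import urllib.request

# URL taken from the error message above.
RELEASES_URL = "https://api.github.com/repos/kubernetes-sigs/cluster-api/releases"


def build_request(token):
    """Build a GET request for the releases endpoint using the given token."""
    return urllib.request.Request(
        RELEASES_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def describe_status(status):
    """Translate an HTTP status code into a short verdict."""
    if status == 200:
        return "token accepted"
    if status == 401:
        return "bad credentials: GitHub rejected the token"
    return f"unexpected status {status}"


def check_token(token, timeout=10):
    """Send the request and return a verdict; requires Internet access."""
    try:
        with urllib.request.urlopen(build_request(token), timeout=timeout) as resp:
            return describe_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses such as 401.
        return describe_status(err.code)
```

For example, `print(check_token("<GitHub access token>"))` reports the verdict for the token that is about to be configured.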

  • Policies selection in VMware Cloud Director Container Service Extension 4 plug-in does not populate the full list after selection for the purpose of policy modification

    When a user selects a sizing policy in the Kubernetes Container Clusters plug-in and they want to change it, the dropdown menu only displays the selected sizing policy, and does not automatically load alternative sizing policies.

    The user has to delete the text manually to allow the alternative sizing policies to appear. This also occurs in the dropdown menu when the user selects placement policies and storage policies.

    This behavior is intentional. It is how the Clarity combobox HTML web component works.

    Note: Clarity is the web framework that VMware Cloud Director UI is built on.

    The dropdown box uses the input text as a filter. When nothing is in the input field, you can see all selections, and the selections filter as you type.

  • When you create a VMware Cloud Director Container Service Extension cluster, a character capitalization error appears

    In the Kubernetes Container Clusters UI, if you use capital letters, the following error appears:

    • Name must start with a letter, end with an alphanumeric, and only contain alphanumeric or hyphen (-) characters. (Max 63 characters)

    This is a restriction set by Kubernetes. Object names are validated as RFC 1035 labels, which allow only lowercase letters. For more information, refer to the Kubernetes website.
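A cluster name can be checked before it is entered in the UI. The Python sketch below mirrors the RFC 1035 label rule as Kubernetes documents it: lowercase alphanumeric characters or hyphens, a letter at the start, an alphanumeric character at the end, and at most 63 characters. It is an illustration of that rule and makes no claim about how the UI plug-in itself implements validation. The function name is hypothetical.

```python
import re

# RFC 1035 label as applied by Kubernetes: lowercase letters, digits, and
# hyphens; starts with a letter; ends with a letter or digit.
_LABEL = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
MAX_LENGTH = 63


def is_valid_cluster_name(name):
    """Return True if name is an acceptable RFC 1035 label."""
    return len(name) <= MAX_LENGTH and _LABEL.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my-cluster-01", "MyCluster", "1cluster", "cluster-"):
        print(candidate, is_valid_cluster_name(candidate))
```

Lowercasing a rejected name, for example with `name.lower()`, resolves the capitalization case but not the other rules, so the result should be validated again.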

  • Kubernetes Container Clusters UI plug-in 4.0 does not interoperate with other Kubernetes Container Clusters UI plug-ins, such as 3.5.0.

    It is a known VMware Cloud Director UI limitation that these two plug-ins cannot operate simultaneously without conflict. You can only have one plug-in activated at any given time.

  • When a node of the cluster is deleted due to failure in vSphere or other underlying infrastructure, VMware Cloud Director Container Service Extension does not inform the user, and it does not auto-heal the cluster.

    When a node of a cluster is deleted, basic cluster operations, such as cluster resize and cluster upgrade, continue to work. The deleted node remains in a deleted state, and is still included in computations of the cluster size.

    1. Download the Kubeconfig of the cluster.

    2. Use the following command to delete the machine that continues to use the deleted node configuration:

    # try to match the machine name here; also get namespace
    kubectl --kubeconfig=<path to downloaded kubeconfig> get machines -A
    kubectl -n <namespace name from above> --kubeconfig=<path to downloaded kubeconfig> delete machine <machine name>
    # wait for machine to get deleted

    The above command deletes the machine, and CAPVCD automatically creates a new machine.

  • VMware Cloud Director Container Service Extension fails to deploy clusters with TKG templates that have an unmodifiable placement policy set on them.

    1. Log in to the VMware Cloud Director Tenant Portal as an administrator.

    2. Click Libraries > vApp Templates.

    3. In the vApp Templates window, select the radio button to the left of the template.

    4. In the top ribbon, click Tag with Compute Policies.

    5. Select the Modifiable checkboxes, and click Tag.

  • In VMware Cloud Director 10.4, service providers are unable to log in to the VMware Cloud Director Container Service Extension virtual machine by default.

    In VMware Cloud Director 10.4, after deploying the VMware Cloud Director Container Service Extension virtual machine from the OVA file, the following two checkboxes in the VM settings page are not selected by default:

    • Allow local administrator password

    • Auto-generate password

    It is necessary to select these checkboxes to allow providers to log in to the VMware Cloud Director Container Service Extension virtual machine in the future to perform troubleshooting tasks.

    1. Log in to VMware Cloud Director UI as a service provider, and create a vApp from the VMware Cloud Director Container Service Extension OVA file. For more information, see Create a vApp from VMware Cloud Director Container Service Extension server OVA file.

    2. Once you deploy the vApp, and before you power it on, go to VM details > Guest OS Customization, and select Allow local administrator password and Auto-generate password.

    3. After the vApp update task finishes, power on the vApp.

  • Fast provisioning must be deactivated in Organization VDC in order to resize disks.

    1. Log in to VMware Cloud Director UI as a provider, and select Resources.

    2. In the Cloud Resources tab, select Organization VDCs, and select an organization VDC.

    3. In the organization VDC window, under Policies, select Storage.

    4. Click Edit, and deactivate the Fast provisioning toggle.

    5. Click Save.

  • When you log in as a service provider, after you upload the latest UI plug-in, the CSE Management tab does not display.

    Deactivate the previous UI plug-in that is built into VMware Cloud Director.

    1. Log in to VMware Cloud Director UI as a provider, and select More > Customize Portal.

    2. Select the check box next to the names of the target plug-ins, and click Enable or Disable.

    3. To start using the newly activated plug-in, refresh the Internet browser page.

    Note:

    If there are multiple activated plug-ins with the same name or ID but different versions, the lowest version plug-in is used. Therefore, only activate the highest version plug-in. Deactivate all other versions of the plug-in.

    For more information on managing plug-ins, see Managing Plug-Ins.
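The selection rule in the note above can be illustrated with a short Python sketch. It assumes that plug-in versions are dotted numeric strings, such as 3.5.0 and 4.0.0, and that they are compared numerically part by part. Both points are assumptions made for illustration, and the function names are hypothetical.

```python
def version_key(version):
    """Turn a dotted version such as '4.0.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def plugin_in_use(activated_versions):
    """Per the note above, the lowest activated version is the one used."""
    return min(activated_versions, key=version_key)


def versions_to_deactivate(activated_versions):
    """Every activated version except the highest should be deactivated."""
    highest = max(activated_versions, key=version_key)
    return sorted((v for v in activated_versions if v != highest),
                  key=version_key)


if __name__ == "__main__":
    activated = ["4.0.0", "3.5.0"]
    print("in use:", plugin_in_use(activated))
    print("deactivate:", versions_to_deactivate(activated))
```

With 3.5.0 and 4.0.0 both activated, the older plug-in is the one in use, which is consistent with the CSE Management tab not being displayed until 3.5.0 is deactivated.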
