Use this section as a reference when you troubleshoot VMware Cloud Director Container Service Extension as a service provider administrator.

The VMware Cloud Director Container Service Extension 4.x stack comprises multiple components that run in different virtual machines. To troubleshoot an error, it is necessary to collect and analyze logs from several sources. The following diagram details the various sources of logs for Tanzu Kubernetes Grid cluster lifecycle management workflows.

  • In the above diagram, Kubernetes logs can include CAPI, Kubernetes Cluster API Provider for VMware Cloud Director, Kubernetes Cloud Provider for VMware Cloud Director, Kubernetes Container Storage Interface driver for VMware Cloud Director, RDE Projector, and other pod logs.
  • In the above diagram, cloud-init logs can include cloud-final.out, cloud-final.err, and cloud-****.
Note: The bootstrap VM is relevant only for cluster create and delete operations.

Troubleshooting through the Kubernetes Container Clusters UI

You can view the errors in the Kubernetes Container Clusters UI in the Cluster Information page, in the Events tab.

Log Analysis from VMware Cloud Director Container Service Extension Server

Log in to the VMware Cloud Director Container Service Extension server VM, then collect and analyze the following logs:

  1. ~/cse.log
  2. ~/cse-wire.log (if it exists)
  3. ~/cse.sh.log
  4. ~/cse-init-per-instance.log
  5. ~/config.toml
    Note: Remove the API token from config.toml before you upload the logs.
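The collection and redaction steps above can be sketched as a small shell script. This is a minimal sketch, not part of the product: the bundle layout and the api_token key name in config.toml are assumptions, so adjust them to match your deployment before sharing the result.

```shell
# Sketch: bundle the CSE server logs listed above and redact the API token.
set -eu
BUNDLE=cse-log-bundle
mkdir -p "$BUNDLE"

# Copy each log if present; missing files are reported, not fatal.
for f in "$HOME/cse.log" "$HOME/cse-wire.log" \
         "$HOME/cse.sh.log" "$HOME/cse-init-per-instance.log"; do
  [ -f "$f" ] && cp "$f" "$BUNDLE"/ || echo "not found, skipping: $f"
done

# Redact the API token before sharing. The key name "api_token" is an
# assumption; check your config.toml for the actual field name.
if [ -f "$HOME/config.toml" ]; then
  sed 's/\(api_token *= *\).*/\1"REDACTED"/' "$HOME/config.toml" > "$BUNDLE/config.toml"
fi

tar czf "$BUNDLE.tgz" "$BUNDLE"
echo "wrote $BUNDLE.tgz"
```

Verify that the redaction worked (for example, by grepping the bundled config.toml for the token) before you upload the tarball.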

Log Analysis from Bootstrap VM

Log in to the bootstrap VM, "EPHEMERAL-TEMP-VM". This VM exists in the vApp named <cluster name>. If the VM does not exist, skip this step.
  1. /var/log/cloud-init.out
  2. /var/log/cloud-init.err
  3. /var/log/cloud-config.out
  4. /var/log/cloud-config.err
  5. /var/log/cloud-final.out
  6. /var/log/cloud-final.err
  7. /var/log/script_err.log
  8. Use the following steps to collect and analyze the Kubernetes logs from the KIND cluster running on the bootstrap VM. For more information, see https://github.com/vmware/cloud-provider-for-cloud-director/tree/main/scripts.
    1. Retrieve the kubeconfig of the KIND cluster: >kind get kubeconfig
    2. >chmod u+x generate-k8s-log-bundle.sh
    3. >./generate-k8s-log-bundle.sh <kubeconfig of the KIND cluster>
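Steps 8.1 through 8.3 can be sketched as one guarded shell function. The output file name kind.kubeconfig and the assumption that generate-k8s-log-bundle.sh sits in the current directory are illustrative choices, not product requirements.

```shell
# Sketch of the bootstrap-VM KIND log collection steps above.
collect_kind_logs() {
  # kind is only present on the bootstrap VM; bail out elsewhere.
  if ! command -v kind >/dev/null 2>&1; then
    echo "kind not found; run this on the bootstrap VM" >&2
    return 1
  fi
  # Assumed file name for the exported kubeconfig.
  kind get kubeconfig > kind.kubeconfig
  # Assumes the script was fetched from the scripts repo into this directory.
  if [ ! -f generate-k8s-log-bundle.sh ]; then
    echo "generate-k8s-log-bundle.sh not found in $(pwd)" >&2
    return 1
  fi
  chmod u+x generate-k8s-log-bundle.sh
  ./generate-k8s-log-bundle.sh kind.kubeconfig
}

collect_kind_logs || true
```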

Log Analysis from the Target Cluster

Use the following steps to collect and analyze the Kubernetes logs from the target cluster, which runs on the control plane and worker node VMs. For more information, see https://github.com/vmware/cloud-provider-for-cloud-director/tree/main/scripts.
    1. Download the kubeconfig of the target cluster from the Kubernetes Container Clusters UI.
    2. >chmod u+x generate-k8s-log-bundle.sh
    3. >./generate-k8s-log-bundle.sh <kubeconfig of the target cluster>
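These steps can be sketched in the same guarded style. The local file name target-cluster.kubeconfig is an assumption for the kubeconfig you download from the Kubernetes Container Clusters UI.

```shell
# Sketch of the target-cluster log collection steps above.
collect_target_logs() {
  # Assumed name for the kubeconfig downloaded from the UI.
  kubeconfig=${1:-target-cluster.kubeconfig}
  if [ ! -f "$kubeconfig" ]; then
    echo "kubeconfig not found: $kubeconfig (download it from the UI first)" >&2
    return 1
  fi
  if [ ! -f generate-k8s-log-bundle.sh ]; then
    echo "generate-k8s-log-bundle.sh not found; fetch it from the scripts repo" >&2
    return 1
  fi
  chmod u+x generate-k8s-log-bundle.sh
  ./generate-k8s-log-bundle.sh "$kubeconfig"
}

collect_target_logs || true
```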

Log Analysis from an Unhealthy Control Plane or Worker Node of the Target Cluster

Log in to the problematic VM associated with the Kubernetes node, and collect and analyze the following logs:

  1. /var/log/capvcd/customization/error.log
  2. /var/log/capvcd/customization/status.log
  3. /var/log/cloud-init-output.log
  4. /root/kubeadm.err
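For a quick first look at the files listed above, a sketch like the following prints the tail of each one that is readable on the node. The 20-line tail length is arbitrary; reading /root/kubeadm.err typically requires root.

```shell
# Sketch: surface the tails of the node logs listed above.
show_node_logs() {
  for f in /var/log/capvcd/customization/error.log \
           /var/log/capvcd/customization/status.log \
           /var/log/cloud-init-output.log \
           /root/kubeadm.err; do
    if [ -r "$f" ]; then
      echo "== $f (last 20 lines) =="
      tail -n 20 "$f"
    else
      # Absent on healthy nodes or when not running as root.
      echo "missing or unreadable: $f"
    fi
  done
}

show_node_logs
```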

Analyze the associated Server Configuration and Cluster Info Entities

  • VCDKEConfig RDE Instance: Configuration details for the VMware Cloud Director Container Service Extension server.
    1. Get the result of https://{{vcd}}/cloudapi/1.0.0/entities/types/vmware/VCDKEConfig/1.1.0.
    2. Remove the GitHub personal access token before you upload or share this entity.
  • The capvcdCluster RDE instance associated with the cluster, which represents the current status of the cluster.
    1. Retrieve the RDE ID from the Cluster Information page in the Kubernetes Container Clusters UI.
    2. Get the result of https://{{vcd}}/cloudapi/1.0.0/entities/{{cluster-id}}.
    3. Remove the API token, and the kubeconfig if the RDE version is earlier than 1.2, before you upload or share the entity.
      Note: For RDEs of version 1.2 or later, the API token and kubeconfig details are already hidden and encrypted. No action is necessary.
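One way to retrieve both entities is with curl, as sketched below. The VCD_HOST, VCD_TOKEN, and CLUSTER_ID environment variables, the bearer-token Authorization header, and the Accept-header API version 37.2 are all assumptions; adapt them to how you authenticate against your VMware Cloud Director release. Remember to redact tokens (and the kubeconfig for RDE versions earlier than 1.2) from the saved JSON before sharing it.

```shell
# Sketch: fetch the VCDKEConfig and capvcdCluster entities via the VCD API.
# VCD_HOST and VCD_TOKEN are assumed env vars; CLUSTER_ID comes from the
# Cluster Information page in the Kubernetes Container Clusters UI.
fetch_entity() {
  curl -sS \
       -H "Accept: application/json;version=37.2" \
       -H "Authorization: Bearer $VCD_TOKEN" \
       "https://$VCD_HOST/cloudapi/1.0.0/entities/$1"
}

if [ -z "${VCD_HOST:-}" ] || [ -z "${VCD_TOKEN:-}" ]; then
  echo "set VCD_HOST and VCD_TOKEN to run the live calls"
else
  fetch_entity "types/vmware/VCDKEConfig/1.1.0" > vcdkeconfig.json
  fetch_entity "${CLUSTER_ID:?set CLUSTER_ID from the Cluster Information page}" \
    > capvcd-cluster.json
fi
```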