This section includes tips to help you troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying Tanzu Kubernetes clusters.

Clean Up Docker After Unsuccessful Management Cluster Deployments

Problem

Unsuccessful attempts to deploy Tanzu Kubernetes Grid can leave Docker objects in your system, which consume resources.

Solution

To clean up after failed attempts to deploy the management cluster, remove the Docker containers, images, and volumes that are left behind.

CAUTION: These steps remove all Docker containers, images, and volumes from your system. If you are running Docker processes that are not related to Tanzu Kubernetes Grid on this system, do not run these commands. Instead, remove the unneeded containers, images, and volumes individually.

  1. Remove all kind clusters.

    kind get clusters | xargs -n1 kind delete cluster --name 
    
  2. Remove all containers.

    docker rm -f $(docker ps -aq)
    
  3. Remove all container images.

    docker rmi -f $(docker images -aq)
    
  4. Remove all remaining unused Docker objects, including orphaned volumes.

    docker system prune --all --volumes -f
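
If other Docker workloads share the machine, the commands above are too broad. The following is a minimal sketch of a more selective cleanup. It assumes that the bootstrap containers and kind clusters use the tkg-kind- name prefix shown elsewhere in this section. The function names only_tkg_kind and remove_tkg_kind_containers are hypothetical helpers, not part of the tkg CLI.

```shell
# Keep only names that start with the tkg-kind- prefix (assumed naming convention).
only_tkg_kind() {
  # grep exits with status 1 when nothing matches; treat that as "nothing to clean up".
  grep '^tkg-kind-' || true
}

# Force-remove only the matching containers, together with their anonymous volumes.
remove_tkg_kind_containers() {
  docker ps -a --format '{{.Names}}' | only_tkg_kind | while IFS= read -r name; do
    docker rm -f -v "$name"
  done
}
```

The same filter can be applied to kind clusters, for example kind get clusters | only_tkg_kind | xargs -n1 kind delete cluster --name. Images and named volumes do not carry this prefix, so review them individually.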
    

Clean Up After Unsuccessful Management Cluster Deployments

Problem

Unsuccessful attempts to deploy a Tanzu Kubernetes Grid management cluster leave orphaned objects in your vSphere instance or AWS account.

Solution

There are different ways to clean up unsuccessful deployments. Attempt these methods in the following order of preference.

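  1. Run kubectl delete to delete the cluster manually, then remove the kind bootstrap container. Replace cluster_name and unique_ID with the values from your deployment.

    kubectl delete cluster.cluster.x-k8s.io/cluster_name -n tkg-system
    docker rm -v tkg-kind-unique_ID-control-plane -f
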
  2. In vSphere, locate the VMs created, power them off and delete them from your system.
  3. In AWS, use a tool such as aws-nuke with a specific configuration to delete the objects from Amazon EC2.
  4. In AWS, log in to your Amazon EC2 dashboard and delete the objects manually in the console.
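
For the first method, the unique_ID placeholder has to be read from the name of the leftover bootstrap container. The sketch below shows one way to extract it and to preview the two commands before running them. It assumes that container names follow the tkg-kind-unique_ID-control-plane pattern used above. The function names kind_unique_ids and print_manual_cleanup are hypothetical.

```shell
# Extract the unique_ID part from names of the form tkg-kind-<unique_ID>-control-plane.
# Names that do not match the pattern are ignored.
kind_unique_ids() {
  sed -n 's/^tkg-kind-\(.*\)-control-plane$/\1/p'
}

# Print, without running them, the two cleanup commands for a given
# cluster name ($1) and unique ID ($2).
print_manual_cleanup() {
  printf 'kubectl delete cluster.cluster.x-k8s.io/%s -n tkg-system\n' "$1"
  printf 'docker rm -v -f tkg-kind-%s-control-plane\n' "$2"
}

# Usage on a machine with Docker installed:
#   docker ps -a --format '{{.Names}}' | kind_unique_ids
#   print_manual_cleanup my-cluster <unique_ID>
```

Printing the commands first makes it easy to confirm the cluster name and container name before anything is deleted.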

Kind Cluster Remains after Deleting Management Cluster

Problem

Running tkg delete management-cluster removes the management cluster, but fails to delete the local kind cluster from the bootstrap environment.

Solution

  1. List all running kind clusters and remove the one whose name looks like tkg-kind-unique_ID.

    kind get clusters
    kind delete cluster --name tkg-kind-unique_ID
    
  2. List all containers and identify the one that belongs to the kind cluster.

    docker ps -a
    
  3. Copy the container ID of the kind cluster container, stop the container, and remove it.

    docker kill container_ID
    docker rm -v container_ID
    

Deploying a Tanzu Kubernetes Cluster Times Out, but the Cluster is Created

Problem

Running tkg create cluster fails with a timeout error similar to the following:

I0317 11:11:16.658433 clusterclient.go:341] Waiting for resource my-cluster of type *v1alpha3.Cluster to be up and running
E0317 11:26:16.932833 common.go:29]
Error: unable to wait for cluster and get the cluster kubeconfig: error waiting for cluster to be provisioned (this may take a few minutes): cluster control plane is still being initialized
E0317 11:26:16.933251 common.go:33]
Detailed log about the failure can be found at: /var/folders/_9/qrf26vd5629_y5vgxc1vjk440000gp/T/tkg-20200317T111108811762517.log

However, if you run tkg get cluster, the cluster appears to have been created.

+------------+-------------+
| NAME       | STATUS      |
+------------+-------------+
| my-cluster | Provisioned |
+------------+-------------+

Solution

  1. Use the tkg get credentials command to add the cluster credentials to your kubeconfig.

    tkg get credentials my-cluster
    
  2. Set kubectl to the cluster's context.

    kubectl config use-context my-cluster@user
    
  3. Check whether the cluster nodes are all in the ready state.

    kubectl get nodes
    
  4. Check whether all of the pods are up and running.

    kubectl get pods -A
    
  5. If all of the nodes and pods are running correctly, your Tanzu Kubernetes cluster has been created successfully and you can ignore the error.

  6. If the nodes and pods are not running correctly, attempt to delete the cluster.

    tkg delete cluster my-cluster
    
  7. If tkg delete cluster fails, use kubectl to delete the cluster manually.
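
The node and pod checks in steps 3 and 4 can also be scripted. The following is a minimal sketch that assumes the default column layout of kubectl get nodes --no-headers (NAME, STATUS, ...) and kubectl get pods -A --no-headers (NAMESPACE, NAME, READY, STATUS, ...). The function names all_nodes_ready and pods_not_running are hypothetical helpers.

```shell
# Reads `kubectl get nodes --no-headers` output on stdin.
# Succeeds only if at least one node is listed and every STATUS is exactly "Ready".
all_nodes_ready() {
  awk '$2 != "Ready" { bad = 1 } END { if (NR == 0 || bad) exit 1 }'
}

# Reads `kubectl get pods -A --no-headers` output on stdin and prints
# namespace/name and status for pods that are neither Running nor Completed.
pods_not_running() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage, once kubectl points at the Tanzu Kubernetes cluster:
#   kubectl get nodes --no-headers | all_nodes_ready && echo "all nodes Ready"
#   kubectl get pods -A --no-headers | pods_not_running
```

A pod can report Running while some of its containers are not yet ready, so treat an empty result from pods_not_running as a first check and still review the READY column.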