This section includes tips to help you troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying Tanzu Kubernetes clusters.
Unsuccessful attempts to deploy Tanzu Kubernetes Grid can leave Docker objects in your system, which consume resources.
To clean up after attempts to deploy the management cluster fail, remove the Docker containers, images, and volumes that are left behind.
CAUTION: These steps remove all Docker containers, images, and volumes from your system. If you are running Docker processes that are not related to Tanzu Kubernetes Grid on this system, do not run these commands. Instead, remove unneeded containers, images, and volumes individually.
Remove all kind clusters.
kind get clusters | xargs -n1 kind delete cluster --name
Remove all containers.
docker rm -f $(docker ps -aq)
Remove all container images.
docker rmi -f $(docker images -aq)
Remove all orphaned Docker volumes.
docker system prune --all --volumes -f
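Taken together, the four steps above can be wrapped in a single shell function. The sketch below is illustrative and not part of the Tanzu Kubernetes Grid tooling; the function name is made up, and it carries the same caution as the individual commands.

```shell
#!/bin/sh
# Illustrative helper (not part of the TKG CLI): the cleanup steps
# above wrapped in one function.
# WARNING: calling it removes ALL kind clusters, Docker containers,
# images, and volumes on the host.

cleanup_tkg_docker() {
  # Delete every kind cluster by name.
  kind get clusters | xargs -n1 kind delete cluster --name

  # Force-remove all containers, skipping docker rm when there are none.
  ids=$(docker ps -aq)
  if [ -n "$ids" ]; then docker rm -f $ids; fi

  # Force-remove all container images, again skipping when empty.
  ids=$(docker images -aq)
  if [ -n "$ids" ]; then docker rmi -f $ids; fi

  # Remove remaining networks, build cache, and orphaned volumes.
  docker system prune --all --volumes -f
}
```

Source the file and call `cleanup_tkg_docker` only on a host whose Docker state is dedicated to Tanzu Kubernetes Grid; the function is deliberately not invoked automatically.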
Unsuccessful attempts to deploy a Tanzu Kubernetes Grid management cluster leave orphaned objects in your vSphere instance or AWS account.
There are different ways to clean up unsuccessful deployments. Attempt these methods in the following order of preference.
Use kubectl delete to delete the cluster manually.
kubectl delete cluster.cluster.x-k8s.io/cluster_name -n tkg-system
docker rm -v tkg-kind-unique_ID-control-plane -f
Use aws-nuke with a specific configuration to delete the objects from Amazon EC2.
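aws-nuke reads a YAML configuration that scopes deletion to specific accounts, regions, and resource types. The fragment below is a sketch only: the account IDs, region, and resource-type names are placeholders, so check the aws-nuke documentation for the exact schema of your version before running it.

```yaml
# nuke-config.yml (illustrative): scope aws-nuke to the account and
# region that held the failed management cluster.
regions:
  - us-west-2            # region used for the failed deployment

account-blocklist:
  - "999999999999"       # accounts that aws-nuke must never touch

accounts:
  "000000000000": {}     # account containing the orphaned objects

resource-types:
  targets:               # limit deletion to EC2-related resources
    - EC2Instance
    - EC2Volume
    - EC2SecurityGroup
```

You would then run something like `aws-nuke -c nuke-config.yml`, which performs a dry run by default, and add `--no-dry-run` only after reviewing the list of resources that it reports.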
tkg delete management-cluster removes the management cluster, but fails to delete the local kind cluster from the bootstrap environment.
List all running kind clusters and remove the one whose name looks like tkg-kind-unique_ID.
kind delete cluster --name tkg-kind-unique_ID
List all running containers and identify the kind cluster.
docker ps -a
Copy the container ID of the kind cluster and remove it.
docker kill container_ID
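When many containers are running, you can filter the docker ps output instead of scanning it by eye. The snippet below uses a hypothetical captured listing to show the filter; the container IDs and the tkg-kind suffix are made-up placeholders.

```shell
# Hypothetical 'docker ps -a' listing; IDs and names are placeholders.
sample='CONTAINER ID   IMAGE                  COMMAND                       NAMES
3d1c9a7b42f0   kindest/node:v1.17.3   "/usr/local/bin/entrypoint"   tkg-kind-abcd1234-control-plane
9f8e7d6c5b4a   nginx:latest           "nginx -g daemon off;"        web-server'

# Print the first column (the container ID) of any row whose name
# contains tkg-kind-.
kind_id=$(printf '%s\n' "$sample" | awk '/tkg-kind-/ {print $1}')
echo "$kind_id"
```

Against a live host the same filter becomes a pipeline, for example `docker ps -a | awk '/tkg-kind-/ {print $1}' | xargs docker kill`.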
tkg create cluster fails with a timeout error similar to the following:
I0317 11:11:16.658433 clusterclient.go:341] Waiting for resource my-cluster of type *v1alpha3.Cluster to be up and running
E0317 11:26:16.932833 common.go:29] Error: unable to wait for cluster and get the cluster kubeconfig: error waiting for cluster to be provisioned (this may take a few minutes): cluster control plane is still being initialized
E0317 11:26:16.933251 common.go:33] Detailed log about the failure can be found at: /var/folders/_9/qrf26vd5629_y5vgxc1vjk440000gp/T/tkg-20200317T111108811762517.log
However, if you run tkg get cluster, the cluster appears to have been created.
+------------+-------------+
| NAME       | STATUS      |
+------------+-------------+
| my-cluster | Provisioned |
+------------+-------------+
Run the tkg get credentials command to add the cluster credentials to your kubeconfig.
tkg get credentials my-cluster
Set the context of kubectl to the cluster's context.
kubectl config use-context my-cluster@user
Check whether the cluster nodes are all in the ready state.
kubectl get nodes
Check whether all of the pods are up and running.
kubectl get pods -A
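If you script this readiness check, you can parse the kubectl get nodes output rather than reading it manually. The listing below is a hypothetical capture used to illustrate the filter; the node names are placeholders.

```shell
# Hypothetical 'kubectl get nodes' output; node names are placeholders.
nodes='NAME                           STATUS     ROLES    AGE   VERSION
my-cluster-control-plane-abcde Ready      master   10m   v1.17.3
my-cluster-md-0-xyz            NotReady   <none>   9m    v1.17.3'

# Count the rows (skipping the header) whose STATUS column is not Ready.
not_ready=$(printf '%s\n' "$nodes" | awk 'NR>1 && $2 != "Ready"' | wc -l)

if [ "$not_ready" -eq 0 ]; then
  echo "all nodes Ready"
else
  echo "$not_ready node(s) not Ready"
fi
```

Against a live cluster you would replace the sample with the output of `kubectl get nodes --no-headers` and drop the `NR>1` guard.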
If all of the nodes and pods are running correctly, your Tanzu Kubernetes cluster has been created successfully and you can ignore the error.
If the nodes and pods are not running correctly, attempt to delete the cluster.
tkg delete cluster my-cluster
If tkg delete cluster fails, use kubectl to delete the cluster manually.