This section includes tips to help you to troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying Tanzu Kubernetes clusters.
Unsuccessful attempts to deploy Tanzu Kubernetes Grid can leave Docker objects in your system, which consume resources.
To clean up after attempts to deploy the management cluster fail, remove the Docker containers, images, and volumes that are left behind.
CAUTION: These steps remove all Docker containers, images, and volumes from your system. If you are running Docker processes that are not related to Tanzu Kubernetes Grid on this system, do not run these commands. Instead, remove unneeded containers, images, and volumes individually.
Remove all kind clusters.
kind get clusters | xargs -n1 kind delete cluster --name
Remove all containers.
docker rm -f $(docker ps -aq)
Remove all container images.
docker rmi -f $(docker images -aq)
Remove all orphaned Docker volumes.
docker system prune --all --volumes -f
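If you need to repeat this cleanup, the steps above can be combined into a single function. This is a sketch only, and the same caution applies: it removes every kind cluster and all Docker objects on the host.

```shell
# Removes every kind cluster, container, image, and orphaned volume.
# CAUTION: destructive -- only run on a host with no unrelated Docker workloads.
cleanup_tkg_docker() {
  # Delete each kind cluster by name.
  for c in $(kind get clusters); do
    kind delete cluster --name "$c"
  done
  # Remove all containers and images, guarding against empty lists.
  ids=$(docker ps -aq)
  [ -n "$ids" ] && docker rm -f $ids
  imgs=$(docker images -aq)
  [ -n "$imgs" ] && docker rmi -f $imgs
  # Prune orphaned volumes, networks, and the build cache.
  docker system prune --all --volumes -f
}
```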
Unsuccessful attempts to deploy a Tanzu Kubernetes Grid management cluster leave orphaned objects in your vSphere instance or AWS account.
There are different ways to clean up unsuccessful deployments. Attempt these methods in the following order of preference.
Use kubectl delete to delete the cluster manually.
If the deployment of the management cluster fails, the Tanzu Kubernetes Grid CLI provides a help message that informs you of the location of the kubeconfig file of the bootstrap cluster and prompts you to run the following command, replacing UUID with the ID provided in the help message.
kubectl delete cluster.cluster.x-k8s.io/cluster_name -n tkg-system --kubeconfig ~/.kube-tkg/tmp/config-UUID
If kubectl delete fails, remove the bootstrap kind cluster container manually, replacing unique_ID with the ID provided in the help message.
docker rm -v tkg-kind-unique_ID-control-plane -f
Use aws-nuke with a specific configuration to delete the objects from Amazon EC2.
The tkg delete management-cluster command removes the management cluster, but fails to delete the local kind cluster from the bootstrap environment.
List all running kind clusters and remove the one whose name looks like tkg-kind-unique_ID.
kind delete cluster --name tkg-kind-unique_ID
List all running containers and identify the one that is running the kind cluster.
docker ps -a
Copy the container ID of the
kind cluster and remove it.
docker kill container_ID
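Alternatively, rather than copying the container ID by hand, you can match the bootstrap container by its name prefix. This sketch assumes the container name starts with tkg-kind, as in the examples above:

```shell
# Find and kill the bootstrap kind container by its name prefix.
kill_tkg_kind_container() {
  id=$(docker ps -a --filter "name=tkg-kind" -q)
  # Only call docker kill if a matching container was found.
  [ -n "$id" ] && docker kill $id
}
```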
tkg create cluster fails with a timeout error similar to the following:
I0317 11:11:16.658433 clusterclient.go:341] Waiting for resource my-cluster of type *v1alpha3.Cluster to be up and running
E0317 11:26:16.932833 common.go:29] Error: unable to wait for cluster and get the cluster kubeconfig: error waiting for cluster to be provisioned (this may take a few minutes): cluster control plane is still being initialized
E0317 11:26:16.933251 common.go:33] Detailed log about the failure can be found at: /var/folders/_9/qrf26vd5629_y5vgxc1vjk440000gp/T/tkg-20200317T111108811762517.log
However, if you run
tkg get cluster, the cluster appears to have been created.
+------------+-------------+
| NAME       | STATUS      |
+------------+-------------+
| my-cluster | Provisioned |
+------------+-------------+
Run the tkg get credentials command to add the cluster credentials to your kubeconfig.
tkg get credentials my-cluster
Set kubectl to the cluster's context.
kubectl config use-context my-cluster@user
Check whether the cluster nodes are all in the ready state.
kubectl get nodes
Check whether all of the pods are up and running.
kubectl get pods -A
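The two checks above can be combined into one helper that prints only the unhealthy items. The function name and output format here are illustrative, not part of the Tanzu Kubernetes Grid CLI:

```shell
# Print nodes that are not Ready, and pods that are not Running or Completed.
check_cluster_health() {
  kubectl get nodes --no-headers | \
    awk '$2 != "Ready" { print "NotReady node: " $1 }'
  kubectl get pods -A --no-headers | \
    awk '$4 != "Running" && $4 != "Completed" { print "Unhealthy pod: " $2 }'
}
```

If the function prints nothing, all nodes and pods are healthy.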
If all of the nodes and pods are running correctly, your Tanzu Kubernetes cluster has been created successfully and you can ignore the error.
If the nodes and pods are not running correctly, attempt to delete the cluster.
tkg delete cluster my-cluster
If tkg delete cluster fails, use
kubectl to delete the cluster manually.
When you run the
tkg init --ui command on a Windows system, the UI opens in your default browser, but the graphics and styling are not applied. This happens because a Windows registry entry sets an incorrect content type for CSS files.
In Windows Search, enter regedit to open the Registry Editor utility.
Set the content type value to text/css and click OK.
Run the tkg init --ui command again to relaunch the UI.
If you run the
tkg init command on Mac OS with the latest stable version of Docker Desktop,
tkg init fails with the error message:
Error: : kubectl prerequisites validation failed: kubectl client version v1.15.5 is less than minimum supported kubectl client version 1.17.0
This happens because Docker Desktop symlinks
kubectl 1.15 into the path.
Put a newer version of
kubectl in the path before Docker's version.
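To verify the fix, confirm which kubectl binary your shell resolves after adjusting the path. The directory in the example is an assumption; substitute wherever your newer kubectl is installed.

```shell
# Prepend a directory to PATH so its kubectl shadows Docker Desktop's symlink,
# then report which kubectl now resolves.
# Note: this also changes PATH for the current shell session.
prefer_kubectl_from() {
  PATH="$1:$PATH"
  command -v kubectl
}
# Example: prefer_kubectl_from /usr/local/bin
```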
You can use SSH to connect to individual nodes of management clusters or Tanzu Kubernetes clusters. To do so, the SSH key pair that you created when you deployed the management cluster must be available on the machine on which you run the SSH command. Consequently, you must run
ssh commands on the machine on which you run tkg commands.
The SSH keys that you register with the management cluster, and consequently that are used by any Tanzu Kubernetes clusters that you deploy from the management cluster, are associated with specific user accounts on the node VMs.
To connect to a node by using SSH, run one of the following commands from the machine that you use as the bootstrap environment:
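The following sketch assumes the default node accounts used by Tanzu Kubernetes Grid images, capv on vSphere and ec2-user on Amazon EC2; verify the account names against your deployment before use.

```shell
# Build the ssh invocation for a node, given the platform and node address.
# Account names (capv, ec2-user) are assumptions based on TKG default images.
node_ssh_command() {
  case "$1" in
    vsphere) echo "ssh capv@$2" ;;
    aws)     echo "ssh ec2-user@$2" ;;
    *)       echo "unknown platform: $1" >&2; return 1 ;;
  esac
}
# Example: node_ssh_command vsphere 10.0.0.5  prints "ssh capv@10.0.0.5"
```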
Because the SSH key is present on the system on which you are running the
ssh command, no password is required.
If you have lost the credentials for a management cluster, for example by inadvertently deleting the
.kube-tkg/config file on the system on which you run
tkg commands, you can recover the credentials from the management cluster control plane node.
Run tkg get management-cluster to recreate the .kube-tkg/config file.
Use SSH to log in to the management cluster control plane node.
Access the admin.conf file for the management cluster.
sudo vi /etc/kubernetes/admin.conf
The admin.conf file contains the cluster name, the cluster user name, the cluster context, and the client certificate data.
Copy these values into the .kube-tkg/config file on the system on which you run tkg commands.
In Tanzu Kubernetes Grid 1.0 and 1.1.0,
nfs-utils was not included by default in the Photon OS base image template from which cluster nodes are created. In Tanzu Kubernetes Grid 1.1.2 and later,
nfs-utils is enabled by default.
If you do not require
nfs-utils, you can remove it from cluster node VMs.
To use nfs-utils to mount NFS volumes on nodes, deploy new clusters by using the Kubernetes v1.17.6, v1.17.9, v1.18.3, or v1.18.6 base OS images that are supplied with Tanzu Kubernetes Grid v1.1.2 and 1.1.3. The
nfs-utils package is installed by default.
To enable nfs-utils on clusters that you deployed with Tanzu Kubernetes Grid 1.0 or 1.1.0, upgrade the clusters to use the Kubernetes v1.17.6, v1.17.9, v1.18.3, or v1.18.6 base OS images that are supplied with Tanzu Kubernetes Grid v1.1.2 or 1.1.3. The
nfs-utils package is installed during the upgrade.
To remove nfs-utils from clusters that you deploy with Tanzu Kubernetes Grid 1.1.2 and later, use SSH to log in to the cluster node VMs and run the following command:
tdnf erase nfs-utils