This topic explains how to shut down and restart workload clusters, and how to delete them.
You may need to shut down and restart workload clusters to accommodate planned network maintenance or other scheduled network downtime.
Prerequisite: jq installed locally.
Run the following command to collect information about your etcd database:
kubectl --kubeconfig /etc/kubernetes/admin.conf get pods `kubectl --kubeconfig /etc/kubernetes/admin.conf get pods -A | grep etc | awk '{print $2}'` -n kube-system -o=jsonpath='{.spec.containers[0].command}' | jq
Example output:
[
"etcd",
"--advertise-client-urls=https://192.168.7.154:2379",
"--cert-file=/etc/kubernetes/pki/etcd/server.crt",
"--client-cert-auth=true",
"--data-dir=/var/lib/etcd",
"--initial-advertise-peer-urls=https://192.168.7.154:2380",
"--initial-cluster=workload-vsphere-tkg2-control-plane-fk5hw=https://192.168.7.154:2380",
"--key-file=/etc/kubernetes/pki/etcd/server.key",
"--listen-client-urls=https://127.0.0.1:2379,https://192.168.7.154:2379",
"--listen-metrics-urls=http://127.0.0.1:2381",
"--listen-peer-urls=https://192.168.7.154:2380",
"--name=workload-vsphere-tkg2-control-plane-fk5hw",
"--peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt",
"--peer-client-cert-auth=true",
"--peer-key-file=/etc/kubernetes/pki/etcd/peer.key",
"--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt",
"--snapshot-count=10000",
"--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt"
]
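If you only need the flag values used later in the backup command, you can narrow the same output with a jq filter. This is an optional convenience rather than part of the documented procedure, and the filter pattern is an assumption about which flags to keep:
# Same command as above, but print only the flags needed for the etcdctl backup.
kubectl --kubeconfig /etc/kubernetes/admin.conf get pods `kubectl --kubeconfig /etc/kubernetes/admin.conf get pods -A | grep etc | awk '{print $2}'` -n kube-system -o=jsonpath='{.spec.containers[0].command}' | jq -r '.[] | select(test("advertise-client-urls|trusted-ca-file|cert-file|key-file"))'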
For each control plane node:
Run the ssh command to log in to the node.
Run the following command to locate its etcdctl executable:
find / -type f -name "*etcdctl*" -print
Example output:
/run/containerd/io.containerd.runtime.v1.linux/k8s.io/823581f975804b65048f4babe2015a95cfa7ed6f767073796afe47b9d03299fb/rootfs/usr/local/bin/etcdctl
/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/19/fs/usr/local/bin/etcdctl
Create an etcd backup file and verify it.
Run the following command:
ETCD-EXE snapshot save LOCAL-BACKUP --endpoints=ENDPOINTS --cacert=CA --cert=CERT --key=KEY
Where:
ETCD-EXE is the local path to the etcdctl executable
LOCAL-BACKUP is the local file to back up to, for example /tmp/etcdBackup1.db
ENDPOINTS, CA, CERT, and KEY are the --advertise-client-urls, --peer-trusted-ca-file, --cert-file, and --key-file values recorded above
For example:
/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/19/fs/usr/local/bin/etcdctl snapshot save /tmp/etcdBackup1.db \
--endpoints=https://192.168.7.154:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
Verify that the backup file was created:
ls -l LOCAL-BACKUP
Verify the backup content by checking the status of the snapshot file:
ETCD-EXE --write-out=table snapshot status LOCAL-BACKUP
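For example, using the etcdctl path and backup file from the example above (your paths may differ):
ls -l /tmp/etcdBackup1.db
/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/19/fs/usr/local/bin/etcdctl --write-out=table snapshot status /tmp/etcdBackup1.db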
From your bootstrap machine, run the following sequence of commands to collect cluster information and save it to a file:
tanzu cluster list -A > CLUSTER-INFO-1
kubectl config get-contexts >> CLUSTER-INFO-1
kubectl config use-context tkg-mgmt-vsphere-20211111074850-admin@tkg-mgmt-vsphere-20211111074850 >> CLUSTER-INFO-1
kubectl get nodes -o wide >> CLUSTER-INFO-1
kubectl config use-context mycluster1-admin@mycluster1 >> CLUSTER-INFO-1
kubectl get nodes -o wide >> CLUSTER-INFO-1
cat CLUSTER-INFO-1
Where CLUSTER-INFO-1 is a local text file to save the information to, for example /tmp/SaveClusterInfo1.txt.
Drain all applications from the worker nodes.
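For example, you can drain each worker node with kubectl drain. This is a sketch only; WORKER-NODE-NAME is a placeholder, and the exact flags you need depend on your workloads:
# Evict pods from a worker node; DaemonSet pods are ignored and emptyDir data is discarded.
kubectl drain WORKER-NODE-NAME --ignore-daemonsets --delete-emptydir-data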
Stop all virtual machines on vCenter in the following order:
Restart all virtual machines on vCenter in the following order:
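If you prefer the command line to the vCenter UI for the stop and restart steps, the govc CLI can power VMs off and on. A minimal sketch, assuming govc is installed and configured with GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD; the VM name is hypothetical:
# Shut down the guest OS of a node VM, then check its power state.
govc vm.power -s=true workload-vsphere-tkg2-control-plane-fk5hw
govc vm.info workload-vsphere-tkg2-control-plane-fk5hw
# Power the VM back on during the restart phase.
govc vm.power -on=true workload-vsphere-tkg2-control-plane-fk5hw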
Run the following sequence of commands to collect cluster information and save it to a different file:
tanzu cluster list -A > CLUSTER-INFO-2
kubectl config get-contexts >> CLUSTER-INFO-2
kubectl config use-context tkg-mgmt-vsphere-20211111074850-admin@tkg-mgmt-vsphere-20211111074850 >> CLUSTER-INFO-2
kubectl get nodes -o wide >> CLUSTER-INFO-2
kubectl config use-context mycluster1-admin@mycluster1 >> CLUSTER-INFO-2
kubectl get nodes -o wide >> CLUSTER-INFO-2
cat CLUSTER-INFO-2
Where CLUSTER-INFO-2 is a different local text file to save the information to, for example /tmp/SaveClusterInfo2.txt.
Compare the two cluster information files to verify that they have the same cluster information, for example:
sdiff /tmp/SaveClusterInfo1.txt /tmp/SaveClusterInfo2.txt
To delete a workload cluster, run the tanzu cluster delete command. Depending on the cluster contents and cloud infrastructure, you may need to delete in-cluster volumes and services before you delete the cluster itself.
Important: You must delete workload clusters explicitly; you cannot delete them by deleting their namespace in the management cluster.
List the clusters.
To list all workload clusters within the Tanzu CLI's current login context, run the tanzu cluster list -A command.
tanzu cluster list -A
Delete volumes and services.
If the cluster you want to delete contains persistent volumes or services such as load balancers and databases, you may need to manually delete them before you delete the cluster itself. What you need to pre-delete depends on your cloud infrastructure:
To delete Services of type LoadBalancer in a cluster:
Set kubectl to the cluster's context.
kubectl config use-context my-cluster@user
Retrieve the cluster’s list of services.
kubectl get service
Delete each Service of type LoadBalancer.
kubectl delete service <my-svc>
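To see only the Services of type LoadBalancer across all namespaces, you can filter the output with a jsonpath expression. This filter is an illustration rather than part of the documented steps:
# Print the namespace and name of every Service of type LoadBalancer.
kubectl get services --all-namespaces -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}{"\t"}{.metadata.name}{"\n"}{end}'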
To delete Persistent Volume (PV) and Persistent Volume Claim (PVC) objects in a cluster:
Run kubectl config use-context my-cluster@user to set kubectl to the cluster's context.
Run kubectl get pvc to retrieve the cluster's Persistent Volume Claims (PVCs).
For each PVC:
Run kubectl describe pvc <my-pvc> to identify the PV it is bound to. The PV is listed in the command output as Volume, after Status: Bound.
Run kubectl describe pv <my-pv> to determine whether its Reclaim Policy is Retain or Delete.
Run kubectl delete pvc <my-pvc> to delete the PVC.
If the PV reclaim policy is Retain, run kubectl delete pv <my-pv> and then log in to your cloud portal and delete the PV object there. For example, delete a vSphere CNS volume from your datastore pane > Monitor > Cloud Native Storage > Container Volumes. For more information about vSphere CNS, see Getting Started with Cloud Native Storage in vSphere.
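To review all PVCs, their bound PVs, and each PV's reclaim policy in one pass before deleting anything, you can use custom columns. A minimal sketch, assuming kubectl is already set to the workload cluster's context:
# List every PVC with the PV it is bound to.
kubectl get pvc --all-namespaces -o custom-columns='NAMESPACE:.metadata.namespace,PVC:.metadata.name,PV:.spec.volumeName'
# List every PV with its reclaim policy and the claim that binds it.
kubectl get pv -o custom-columns='PV:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name'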
Delete these resources in the AWS UI or with the kubectl delete command.
Other Services: Any subnet and AWS-backed service in the cluster's VPC, such as an RDS or VPC, and related resources.
Delete these resources in the AWS UI as above or with the aws CLI (see the example sketch after this list).
Persistent Volumes and Persistent Volume Claims: Delete these resources with the kubectl delete command as described in Delete Persistent Volume Claims and Persistent Volumes, below.
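For example, you can use the aws CLI to locate the cluster's VPC before cleaning up the resources inside it. This is a sketch only; the region, cluster name, and the Cluster API Provider AWS tag convention shown here are assumptions to verify against your environment:
# Find the VPC created for the cluster (the tag key shown is an assumption).
aws ec2 describe-vpcs --region us-east-1 --filters "Name=tag-key,Values=sigs.k8s.io/cluster-api-provider-aws/cluster/my-cluster"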
Azure: Deleting a cluster deletes everything that TKG created in the cluster's resource group.
If needed, migrate workloads off of the clusters, for example by using Velero as described in Cluster Migration and Resource Filtering in the Velero documentation.
Delete the clusters.
To delete a cluster, run tanzu cluster delete.
tanzu cluster delete my-cluster
If the cluster is running in a namespace other than the default namespace, you must specify the --namespace option to delete that cluster.
tanzu cluster delete my-cluster --namespace=my-namespace
To skip the yes/no verification step when you run tanzu cluster delete, specify the --yes option.
tanzu cluster delete my-cluster --namespace=my-namespace --yes
To delete a cluster on AWS, the AWS_REGION variable must be set to the region where the cluster is running. You can set AWS_REGION in the local environment or credential profile, as described in Configure AWS Account Credentials. To delete the cluster in a different region, prepend the setting to the tanzu cluster delete command:
AWS_REGION=eu-west-1 tanzu cluster delete my-cluster
Important: Do not change context or edit the .kube-tkg/config file while Tanzu Kubernetes Grid operations are running.