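Refer to this topic for techniques to check the health of Supervisor as it relates to TKG components.
Check the Status of Supervisor Pods
Supervisor pods run the TKG infrastructure components.
Check whether all pods on Supervisor are in the Running state.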
kubectl get pods -A | grep "Running"
Note: Alternatively, use grep -v "Running" to return the pods that are not in the Running state.
For example:
NAMESPACE                                  NAME                                                             READY   STATUS      RESTARTS        AGE
kube-system                                coredns-855c5b4cfd-8w4hp                                         1/1     Running     0               27d
kube-system                                coredns-855c5b4cfd-bx2hk                                         1/1     Running     0               27d
kube-system                                coredns-855c5b4cfd-rrb5n                                         1/1     Running     0               27d
kube-system                                docker-registry-423f01b9b30c727e9c237a0031999b14                 1/1     Running     0               27d
kube-system                                docker-registry-423f568f75dcb48725b0d768b7e4bdf5                 1/1     Running     0               27d
kube-system                                docker-registry-423f930ca2413d96beef34526c2e61b4                 1/1     Running     0               27d
kube-system                                etcd-423f01b9b30c727e9c237a0031999b14                            1/1     Running     1 (27d ago)     27d
kube-system                                etcd-423f568f75dcb48725b0d768b7e4bdf5                            1/1     Running     1 (27d ago)     27d
kube-system                                etcd-423f930ca2413d96beef34526c2e61b4                            1/1     Running     1 (27d ago)     27d
kube-system                                kube-apiserver-423f01b9b30c727e9c237a0031999b14                  1/1     Running     1 (27d ago)     27d
kube-system                                kube-apiserver-423f568f75dcb48725b0d768b7e4bdf5                  1/1     Running     1 (27d ago)     27d
kube-system                                kube-apiserver-423f930ca2413d96beef34526c2e61b4                  1/1     Running     1 (27d ago)     27d
kube-system                                kube-controller-manager-423f01b9b30c727e9c237a0031999b14         1/1     Running     0               27d
kube-system                                kube-controller-manager-423f568f75dcb48725b0d768b7e4bdf5         1/1     Running     0               27d
kube-system                                kube-controller-manager-423f930ca2413d96beef34526c2e61b4         1/1     Running     0               27d
kube-system                                kube-proxy-8h499                                                 1/1     Running     0               27d
kube-system                                kube-proxy-bm7qt                                                 1/1     Running     0               27d
kube-system                                kube-proxy-dnmq2                                                 1/1     Running     0               27d
kube-system                                kube-scheduler-423f01b9b30c727e9c237a0031999b14                  2/2     Running     13 (25d ago)    27d
kube-system                                kube-scheduler-423f568f75dcb48725b0d768b7e4bdf5                  2/2     Running     0               27d
kube-system                                kube-scheduler-423f930ca2413d96beef34526c2e61b4                  2/2     Running     0               27d
kube-system                                kubectl-plugin-vsphere-423f01b9b30c727e9c237a0031999b14          1/1     Running     3 (27d ago)     27d
kube-system                                kubectl-plugin-vsphere-423f568f75dcb48725b0d768b7e4bdf5          1/1     Running     3 (27d ago)     27d
kube-system                                kubectl-plugin-vsphere-423f930ca2413d96beef34526c2e61b4          1/1     Running     3 (27d ago)     27d
kube-system                                wcp-authproxy-423f01b9b30c727e9c237a0031999b14                   1/1     Running     0               27d
kube-system                                wcp-authproxy-423f568f75dcb48725b0d768b7e4bdf5                   1/1     Running     0               27d
kube-system                                wcp-authproxy-423f930ca2413d96beef34526c2e61b4                   1/1     Running     0               27d
kube-system                                wcp-fip-423f01b9b30c727e9c237a0031999b14                         1/1     Running     0               27d
kube-system                                wcp-fip-423f568f75dcb48725b0d768b7e4bdf5                         1/1     Running     0               27d
kube-system                                wcp-fip-423f930ca2413d96beef34526c2e61b4                         1/1     Running     0               27d
svc-tmc-c63                                agent-updater-69f6598bcd-zrkwq                                   1/1     Running     0               27d
svc-tmc-c63                                agentupdater-workload-27696934--1-vz5sg                          0/1     Completed   0               35s
svc-tmc-c63                                cluster-health-extension-68948f657-4gpcd                         1/1     Running     0               27d
svc-tmc-c63                                extension-manager-f8886bfb7-vdsm9                                1/1     Running     0               27d
svc-tmc-c63                                extension-updater-79b4787cf6-bwssn                               1/1     Running     0               27d
svc-tmc-c63                                intent-agent-66576db5bd-lj2gk                                    1/1     Running     0               5d6h
svc-tmc-c63                                sync-agent-f9c68cc58-6zddj                                       1/1     Running     0               6d
svc-tmc-c63                                tmc-agent-installer-27696934--1-jgwvw                            0/1     Completed   0               35s
svc-tmc-c63                                tmc-auto-attach-6488b9cd8b-xdfzz                                 1/1     Running     0               18h
svc-tmc-c63                                vsphere-resource-retriever-58985c99cb-68h6v                      1/1     Running     0               18h
vmware-system-appplatform-operator-system  vmware-system-appplatform-operator-mgr-0                         1/1     Running     0               27d
vmware-system-appplatform-operator-system  vmware-system-psp-operator-mgr-587f66646d-xxvmr                  1/1     Running     0               27d
vmware-system-capw                         capi-controller-manager-766c6fc449-4qqvf                         2/2     Running     423 (26d ago)   27d
vmware-system-capw                         capi-controller-manager-766c6fc449-bcpdq                         2/2     Running     410 (26d ago)   27d
vmware-system-capw                         capi-controller-manager-766c6fc449-rnznx                         2/2     Running     0               26d
vmware-system-capw                         capi-kubeadm-bootstrap-controller-manager-58fd767b49-585f2       2/2     Running     402 (25d ago)   27d
vmware-system-capw                         capi-kubeadm-bootstrap-controller-manager-58fd767b49-96q6m       2/2     Running     398 (25d ago)   27d
vmware-system-capw                         capi-kubeadm-bootstrap-controller-manager-58fd767b49-nssgq       2/2     Running     407 (25d ago)   27d
vmware-system-capw                         capi-kubeadm-control-plane-controller-manager-559df997b-762jr    2/2     Running     193 (26d ago)   27d
vmware-system-capw                         capi-kubeadm-control-plane-controller-manager-559df997b-bb42s    2/2     Running     189 (26d ago)   27d
vmware-system-capw                         capi-kubeadm-control-plane-controller-manager-559df997b-wxhqv    2/2     Running     199 (26d ago)   27d
vmware-system-capw                         capw-controller-manager-6dd47d75b-6ncxk                          2/2     Running     400 (25d ago)   27d
vmware-system-capw                         capw-controller-manager-6dd47d75b-k2ph4                          2/2     Running     399 (25d ago)   27d
vmware-system-capw                         capw-controller-manager-6dd47d75b-np9sg                          2/2     Running     403 (25d ago)   27d
vmware-system-capw                         capw-webhook-5484757c7-2pkbt                                     2/2     Running     0               27d
vmware-system-capw                         capw-webhook-5484757c7-fkt7z                                     2/2     Running     0               27d
vmware-system-capw                         capw-webhook-5484757c7-r85kw                                     2/2     Running     0               27d
vmware-system-cert-manager                 cert-manager-6ccbcfcd57-lppgn                                    1/1     Running     1 (27d ago)     27d
vmware-system-cert-manager                 cert-manager-cainjector-796f7b74db-5qvgn                         1/1     Running     3 (27d ago)     27d
vmware-system-cert-manager                 cert-manager-webhook-586948846f-b584m                            1/1     Running     0               27d
vmware-system-csi                          vsphere-csi-controller-6d8cfd75cd-66zbj                          6/6     Running     0               27d
vmware-system-csi                          vsphere-csi-controller-6d8cfd75cd-b4nhz                          6/6     Running     1 (27d ago)     27d
vmware-system-csi                          vsphere-csi-controller-6d8cfd75cd-v6hlf                          6/6     Running     0               27d
vmware-system-kubeimage                    image-controller-ff79fb5fc-kd6ts                                 1/1     Running     0               27d
vmware-system-license-operator             vmware-system-license-operator-controller-manager-7d555768bnxjb  1/1     Running     0               25d
vmware-system-license-operator             vmware-system-license-operator-controller-manager-7d555768j2sb8  1/1     Running     0               25d
vmware-system-license-operator             vmware-system-license-operator-controller-manager-7d555768w7v77  1/1     Running     0               25d
vmware-system-logging                      fluentbit-p24gk                                                  1/1     Running     0               27d
vmware-system-logging                      fluentbit-rj2t8                                                  1/1     Running     0               27d
vmware-system-logging                      fluentbit-xx2lk                                                  1/1     Running     0               27d
vmware-system-nsop                         vmware-system-nsop-controller-manager-65b8445959-66msw           1/1     Running     0               27d
vmware-system-nsop                         vmware-system-nsop-controller-manager-65b8445959-nm6xh           1/1     Running     0               27d
vmware-system-nsop                         vmware-system-nsop-controller-manager-65b8445959-sv5w7           1/1     Running     0               27d
vmware-system-nsx                          nsx-ncp-6f989c9c67-vb4x6                                         1/1     Running     5 (27d ago)     27d
vmware-system-registry                     vmware-registry-controller-manager-7f49485b9-72kh7               2/2     Running     0               27d
vmware-system-tkg                          masterproxy-tkgs-plugin-8npzx                                    1/1     Running     0               27d
vmware-system-tkg                          masterproxy-tkgs-plugin-bjtsz                                    1/1     Running     0               27d
vmware-system-tkg                          masterproxy-tkgs-plugin-v92gt                                    1/1     Running     0               27d
vmware-system-tkg                          tkgs-plugin-server-5fc4c985c7-bz8jh                              1/1     Running     0               27d
vmware-system-tkg                          tkgs-plugin-server-5fc4c985c7-r9wj5                              1/1     Running     0               27d
vmware-system-tkg                          tkgs-plugin-server-5fc4c985c7-sdr55                              1/1     Running     0               27d
vmware-system-tkg                          vmware-system-tkg-controller-manager-7ffcc55df5-dqkkm            2/2     Running     0               25d
vmware-system-tkg                          vmware-system-tkg-controller-manager-7ffcc55df5-hkvx9            2/2     Running     0               25d
vmware-system-tkg                          vmware-system-tkg-controller-manager-7ffcc55df5-txxrf            2/2     Running     0               25d
vmware-system-tkg                          vmware-system-tkg-state-metrics-5bbb6d668c-7c5vt                 2/2     Running     238 (26d ago)   27d
vmware-system-tkg                          vmware-system-tkg-state-metrics-5bbb6d668c-c87zs                 2/2     Running     237 (26d ago)   27d
vmware-system-tkg                          vmware-system-tkg-state-metrics-5bbb6d668c-wc46p                 2/2     Running     237 (26d ago)   27d
vmware-system-tkg                          vmware-system-tkg-webhook-567f9fd68c-425xs                       2/2     Running     0               25d
vmware-system-tkg                          vmware-system-tkg-webhook-567f9fd68c-97d6z                       2/2     Running     0               25d
vmware-system-tkg                          vmware-system-tkg-webhook-567f9fd68c-dnkgt                       2/2     Running     0               25d
vmware-system-ucs                          upgrade-compatibility-service-5745846d58-tpk67                   1/1     Running     0               27d
vmware-system-ucs                          upgrade-compatibility-service-5745846d58-twxkt                   1/1     Running     0               27d
vmware-system-ucs                          upgrade-compatibility-service-5745846d58-wzl8x                   1/1     Running     0               27d
vmware-system-vmop                         vmware-system-vmop-controller-manager-c8499b9df-5h6f9            2/2     Running     0               27d
vmware-system-vmop                         vmware-system-vmop-controller-manager-c8499b9df-6wgr7            2/2     Running     0               27d
vmware-system-vmop                         vmware-system-vmop-controller-manager-c8499b9df-tvbg6            2/2     Running     0               27d
vmware-system-vmop                         vmware-system-vmop-hostvalidator-8498cc5f4d-vqhnk                1/1     Running     0               27d
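A plain grep on "Running" has two blind spots that the listing above hints at: pods of finished jobs report Completed, which is normal, and a pod can report Running while some of its containers are not ready (for example 1/2). The following is a minimal sketch, not part of the product, of a filter that accounts for both. The function name unhealthy_pods is made up for this example, and it assumes the default column layout of kubectl get pods -A, including the header row.

```shell
# Print pods that are neither Running nor Completed, or whose READY count
# (for example 1/2) shows that some containers are not ready.
# Reads `kubectl get pods -A` output, header included, on stdin.
# unhealthy_pods is a hypothetical helper name used only in this sketch.
unhealthy_pods() {
  awk 'NR > 1 {
    if ($4 == "Completed") next          # finished job pods are expected
    split($3, ready, "/")                # READY column looks like "1/1"
    if ($4 != "Running" || ready[1] != ready[2])
      print $1, $2, $3, $4               # namespace, name, ready, status
  }'
}

# Usage against a live cluster:
#   kubectl get pods -A | unhealthy_pods
```

An empty result means every pod is either fully ready and Running, or Completed.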
If any pod on Supervisor is not in the Running state, inspect that pod with the following command.
kubectl describe pod <POD Name> -n <Namespace>
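When several pods need inspection, typing one describe command per pod gets tedious. As a rough sketch, the helper below turns pod listing lines into the describe command shown above. The name describe_commands is hypothetical, and the helper assumes each input line starts with the namespace followed by the pod name, as in kubectl get pods -A output without the header.

```shell
# Turn "NAMESPACE NAME ..." lines on stdin into kubectl describe commands.
# The commands are printed, not executed, so they can be reviewed first.
# describe_commands is a hypothetical helper name used only in this sketch.
describe_commands() {
  awk 'NF >= 2 { printf "kubectl describe pod %s -n %s\n", $2, $1 }'
}

# Usage against a live cluster (Completed pods are listed too):
#   kubectl get pods -A --no-headers | grep -v "Running" | describe_commands
```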
Check the Status of Supervisor Resources
TKG controller resources:
kubectl get tkc
Cluster API resources (CAPI, CABPK, CAPW, CAPV):
kubectl get cluster-api
VM Operator resources:
kubectl get virtualmachines,virtualmachineservices,virtualmachinesetresourcepolicies
VM Operator resources that are cluster-scoped and synchronized from the content library:
kubectl get virtualmachineimages
Storage resources:
kubectl get persistentvolumeclaims,cnsnodevmattachment,cnsvolumemetadatas
Networking resources (NSX-specific):
kubectl get service,lb,lbm,vnet,vnetif,nsxerrors,nsxnetworkinterfaces
Get all Supervisor resources in a namespace and write them to a file:
kubectl api-resources --namespaced -o name | paste -d',' -s | xargs kubectl get -n <namespace> > resources_in_namespace.txt
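The pipeline above works in three stages: kubectl api-resources prints one namespaced resource type per line, paste joins those lines with commas, and xargs appends the joined list to kubectl get. The sketch below separates the joining step so that it can be tried on its own. The function names and the optional output file argument are illustrative, and the --verbs=list filter is an addition that skips resource types that cannot be listed.

```shell
# Join newline-separated names from stdin into one comma-separated line,
# the form that `kubectl get` accepts for multiple resource types.
join_with_commas() {
  paste -d',' -s -
}

# Hypothetical wrapper around the command above. The namespace is the first
# argument; the output file defaults to the name used in this topic.
dump_namespace_resources() {
  ns=$1
  out=${2:-resources_in_namespace.txt}
  kubectl api-resources --verbs=list --namespaced -o name \
    | join_with_commas \
    | xargs kubectl get -n "$ns" > "$out"
}

# Usage against a live cluster (my-namespace is a placeholder):
#   dump_namespace_resources my-namespace
```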
Verify That the Cluster API Deployments Exist
Verify that the CAPI, CAPW, and CAPV deployments exist.
kubectl -n vmware-system-capw get deployments.apps
NAME                                           READY   UP-TO-DATE   AVAILABLE   AGE
capi-controller-manager                        2/2     2            2           18h
capi-kubeadm-bootstrap-controller-manager      2/2     2            2           18h
capi-kubeadm-control-plane-controller-manager  2/2     2            2           18h
capv-controller-manager                        2/2     2            2           10h
capw-controller-manager                        2/2     2            2           18h
capw-webhook                                   2/2     2            2           18h
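Comparing a list of expected deployments against the output by eye is error-prone. The following sketch, offered as one possible approach, prints any expected deployment name that is absent from the listing. The function name missing_deployments is hypothetical; the expected names in the usage line are the ones shown in the sample output above.

```shell
# Print each expected deployment name (given as arguments) that does not
# appear in the NAME column of `kubectl get deployments` output on stdin.
# missing_deployments is a hypothetical helper name used only in this sketch.
missing_deployments() {
  listing=$(awk 'NR > 1 { print $1 }')   # drop the header, keep NAME column
  for name in "$@"; do
    printf '%s\n' "$listing" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# Usage against a live cluster:
#   kubectl -n vmware-system-capw get deployments.apps | missing_deployments \
#     capi-controller-manager capi-kubeadm-bootstrap-controller-manager \
#     capi-kubeadm-control-plane-controller-manager capv-controller-manager \
#     capw-controller-manager capw-webhook
```

No output means that every expected deployment is present.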
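Examine the Support Bundle Files
The commands/ folder in the support bundle contains journalctl logs that provide details about what happened during WCP startup. Files to examine include: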
kubectl_describe_virtualmachine.txt
kubectl_describe_tanzukubernetescluster.txt
kubectl_describe_kubeadmconfig.txt
kubectl-describe-pod_kube-system.txt
kubectl-describe-pod_vmware-system-capw.txt
kubectl-describe-pod_vmware-system-tkg.txt
kubectl-describe-pod_vmware-system-ucs.txt
Kubectl-describe-pod_vmware-system-vmop.txt
kubectl_describe_cluster_resource_virtualmachineimages.txt
docker_images.txt
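One quick way to triage these files is to search the describe dumps for warning events. The sketch below assumes that the dumps are unmodified kubectl describe output, where problem events carry the type Warning, and that they sit in a single directory. The function name bundle_warnings and the directory argument are hypothetical.

```shell
# List lines that mention Warning in the describe dumps under a directory.
# Prints "file:line-number:text" for each match.
# bundle_warnings is a hypothetical helper name used only in this sketch.
bundle_warnings() {
  dir=$1
  # /dev/null forces grep to print file names even when only one dump matches
  grep -n -w 'Warning' "$dir"/*describe*.txt /dev/null
}

# Usage, run from the extracted support bundle:
#   bundle_warnings commands
```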
Check the Health of the TKG Cluster
Check that all cluster nodes (virtual machines) are in the Ready state.
kubectl get nodes -o wide
NAME                                                       STATUS  ROLES                 AGE  VERSION           INTERNAL-IP  EXTERNAL-IP  OS-IMAGE                KERNEL-VERSION  CONTAINER-RUNTIME
tkgs-cluster-13-control-plane-dpmjj                        Ready   control-plane,master  12d  v1.22.9+vmware.1  10.244.0.25  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-control-plane-nb5r6                        Ready   control-plane,master  12d  v1.22.9+vmware.1  10.244.0.18  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-control-plane-zpcgs                        Ready   control-plane,master  12d  v1.22.9+vmware.1  10.244.0.26  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-worker-nodepool-a1-gq458-9d6458d6f-c7t8c   Ready   <none>                12d  v1.22.9+vmware.1  10.244.0.24  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-worker-nodepool-a1-gq458-9d6458d6f-slzvn   Ready   <none>                12d  v1.22.9+vmware.1  10.244.0.19  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-worker-nodepool-a1-gq458-9d6458d6f-vzrsd   Ready   <none>                12d  v1.22.9+vmware.1  10.244.0.22  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-worker-nodepool-a2-tw99z-7b547b7f85-k5h4s  Ready   <none>                12d  v1.22.9+vmware.1  10.244.0.20  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-worker-nodepool-a2-tw99z-7b547b7f85-lkmdx  Ready   <none>                12d  v1.22.9+vmware.1  10.244.0.21  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
tkgs-cluster-13-worker-nodepool-a2-tw99z-7b547b7f85-qwv98  Ready   <none>                12d  v1.22.9+vmware.1  10.244.0.23  <none>       VMware Photon OS/Linux  4.19.225-3.ph3  containerd://1.5.11
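On clusters with many nodes, it can help to print only the nodes that need attention. The following is a small sketch under the assumption that the STATUS column is the second field, as in the output above. The function name not_ready_nodes is made up for this example. A status such as Ready,SchedulingDisabled is reported too, because the node is cordoned.

```shell
# Print the name and status of every node whose STATUS is not exactly Ready.
# Reads `kubectl get nodes` output, header included, on stdin.
# not_ready_nodes is a hypothetical helper name used only in this sketch.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1, $2 }'
}

# Usage against a live cluster:
#   kubectl get nodes -o wide | not_ready_nodes
```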
Check that all pods are in the Running state.
kubectl get pods -A
NAMESPACE                     NAME                                                                       READY   STATUS    RESTARTS      AGE
kube-system                   antrea-agent-58hv7                                                         2/2     Running   0             12d
kube-system                   antrea-agent-6x897                                                         2/2     Running   0             12d
kube-system                   antrea-agent-7d99k                                                         2/2     Running   0             12d
kube-system                   antrea-agent-b7vdv                                                         2/2     Running   0             12d
kube-system                   antrea-agent-dhdlg                                                         2/2     Running   0             12d
kube-system                   antrea-agent-mj4wx                                                         2/2     Running   0             12d
kube-system                   antrea-agent-v7vtv                                                         2/2     Running   0             12d
kube-system                   antrea-agent-x49gz                                                         2/2     Running   1 (12d ago)   12d
kube-system                   antrea-agent-z2gth                                                         2/2     Running   0             12d
kube-system                   antrea-controller-bb59f5fbf-t6cm9                                          1/1     Running   0             12d
kube-system                   antrea-resource-init-65b586c9db-2cbxx                                      1/1     Running   0             12d
kube-system                   coredns-5f64c4fff8-2gsqn                                                   1/1     Running   0             12d
kube-system                   coredns-5f64c4fff8-hvkg9                                                   1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-control-plane-dpmjj                        1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-control-plane-nb5r6                        1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-control-plane-zpcgs                        1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-worker-nodepool-a1-gq458-9d6458d6f-c7t8c   1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-worker-nodepool-a1-gq458-9d6458d6f-slzvn   1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-worker-nodepool-a1-gq458-9d6458d6f-vzrsd   1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-worker-nodepool-a2-tw99z-7b547b7f85-k5h4s  1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-worker-nodepool-a2-tw99z-7b547b7f85-lkmdx  1/1     Running   0             12d
kube-system                   docker-registry-tkgs-cluster-13-worker-nodepool-a2-tw99z-7b547b7f85-qwv98  1/1     Running   0             12d
kube-system                   etcd-tkgs-cluster-13-control-plane-dpmjj                                   1/1     Running   0             12d
kube-system                   etcd-tkgs-cluster-13-control-plane-nb5r6                                   1/1     Running   0             12d
kube-system                   etcd-tkgs-cluster-13-control-plane-zpcgs                                   1/1     Running   0             12d
kube-system                   kube-apiserver-tkgs-cluster-13-control-plane-dpmjj                         1/1     Running   0             12d
kube-system                   kube-apiserver-tkgs-cluster-13-control-plane-nb5r6                         1/1     Running   0             12d
kube-system                   kube-apiserver-tkgs-cluster-13-control-plane-zpcgs                         1/1     Running   0             12d
kube-system                   kube-controller-manager-tkgs-cluster-13-control-plane-dpmjj                1/1     Running   0             12d
kube-system                   kube-controller-manager-tkgs-cluster-13-control-plane-nb5r6                1/1     Running   1 (12d ago)   12d
kube-system                   kube-controller-manager-tkgs-cluster-13-control-plane-zpcgs                1/1     Running   0             12d
kube-system                   kube-proxy-4kp57                                                           1/1     Running   0             12d
kube-system                   kube-proxy-5q8pw                                                           1/1     Running   0             12d
kube-system                   kube-proxy-5th6p                                                           1/1     Running   0             12d
kube-system                   kube-proxy-8m6mx                                                           1/1     Running   0             12d
kube-system                   kube-proxy-dn5lp                                                           1/1     Running   0             12d
kube-system                   kube-proxy-qgmcg                                                           1/1     Running   0             12d
kube-system                   kube-proxy-vbq27                                                           1/1     Running   0             12d
kube-system                   kube-proxy-xhnws                                                           1/1     Running   0             12d
kube-system                   kube-proxy-zgfvn                                                           1/1     Running   0             12d
kube-system                   kube-scheduler-tkgs-cluster-13-control-plane-dpmjj                         1/1     Running   0             12d
kube-system                   kube-scheduler-tkgs-cluster-13-control-plane-nb5r6                         1/1     Running   1 (12d ago)   12d
kube-system                   kube-scheduler-tkgs-cluster-13-control-plane-zpcgs                         1/1     Running   0             12d
kube-system                   metrics-server-774bc4dc99-qp7tb                                            1/1     Running   0             12d
vmware-system-auth            guest-cluster-auth-svc-6m6cd                                               1/1     Running   0             12d
vmware-system-auth            guest-cluster-auth-svc-h44xf                                               1/1     Running   0             12d
vmware-system-auth            guest-cluster-auth-svc-l968n                                               1/1     Running   0             12d
vmware-system-cloud-provider  guest-cluster-cloud-provider-5f87d5d7d8-rmd78                              1/1     Running   1 (12d ago)   12d
vmware-system-csi             vsphere-csi-controller-7d858778bd-h7zhg                                    6/6     Running   4 (12d ago)   12d
vmware-system-csi             vsphere-csi-controller-7d858778bd-rkl98                                    6/6     Running   0             12d
vmware-system-csi             vsphere-csi-controller-7d858778bd-snmk7                                    6/6     Running   0             12d
vmware-system-csi             vsphere-csi-node-22fnt                                                     3/3     Running   1 (12d ago)   12d
vmware-system-csi             vsphere-csi-node-5jtbr                                                     3/3     Running   0             12d
vmware-system-csi             vsphere-csi-node-87lz6                                                     3/3     Running   0             12d
vmware-system-csi             vsphere-csi-node-gp9sf                                                     3/3     Running   0             12d
vmware-system-csi             vsphere-csi-node-k2psv                                                     3/3     Running   0             12d
vmware-system-csi             vsphere-csi-node-mg8bw                                                     3/3     Running   0             12d
vmware-system-csi             vsphere-csi-node-pctmv                                                     3/3     Running   0             12d
vmware-system-csi             vsphere-csi-node-sslrl                                                     3/3     Running   1 (12d ago)   12d
vmware-system-csi             vsphere-csi-node-zbqbq                                                     3/3     Running   0             12d
Get and describe the TKG cluster status.
kubectl get tkc <clustername>
kubectl describe tkc <clustername>
Check TKG Controller Manager Health
Check the status and health of the TKG controller manager.
kubectl get deployments -n vmware-system-tkg vmware-system-tkg-controller-manager -o yaml
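The full YAML is verbose when the only question is whether all replicas are available. As a hedged sketch, the same deployment can be queried for two standard Deployment fields, spec.replicas and status.availableReplicas, and the pair compared. The function name replicas_healthy is hypothetical. A missing second value is treated as zero, because the availableReplicas field is omitted when no replica is available.

```shell
# Succeed when the desired replica count is positive and the available
# count matches it. Arguments: desired, then available (defaults to 0).
# replicas_healthy is a hypothetical helper name used only in this sketch.
replicas_healthy() {
  desired=$1
  available=${2:-0}
  [ "$desired" -gt 0 ] && [ "$available" -eq "$desired" ]
}

# Usage against a live cluster:
#   set -- $(kubectl get deployments -n vmware-system-tkg \
#     vmware-system-tkg-controller-manager \
#     -o jsonpath='{.spec.replicas} {.status.availableReplicas}')
#   replicas_healthy "$@" && echo healthy || echo degraded
```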
Check VM Operator Health
The pods should be running.
kubectl get pods -n vmware-system-vmop
NAME                                                   READY   STATUS    RESTARTS   AGE
vmware-system-vmop-controller-manager-c8499b9df-5h6f9  2/2     Running   0          27d
vmware-system-vmop-controller-manager-c8499b9df-6wgr7  2/2     Running   0          27d
vmware-system-vmop-controller-manager-c8499b9df-tvbg6  2/2     Running   0          27d
vmware-system-vmop-hostvalidator-8498cc5f4d-vqhnk      1/1     Running   0          27d
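VM Operator creates the VirtualNetworkInterface and verifies its status. If a node virtual machine does not get an IP address, check this first: did virtual machine creation get past this stage?
VM Operator is also responsible for reconciling the VirtualMachineService and updating its status. If the Kubernetes API of the TKG cluster cannot be reached through its external IP address, check the VM Operator logs.
For example, pick one of the VM Operator pods, specify the namespace, and specify the manager container. (The logs command operates on a container. Each controller pod has a manager container whose logs you can check.)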
kubectl logs -f vmware-system-vmop-controller-manager-c8499b9df-5h6f9 -n vmware-system-vmop manager
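Controller logs are verbose, so narrowing them to lines that mention errors or failures can speed up the search. The sketch below is a simple case-insensitive filter. The function name error_lines is hypothetical, and the keywords are an assumption, since the exact wording of VM Operator log messages is not covered here.

```shell
# Keep only log lines that mention an error or a failure (case-insensitive).
# Reads log text on stdin.
# error_lines is a hypothetical helper name used only in this sketch.
error_lines() {
  grep -i -E 'error|fail'
}

# Usage against a live cluster, limited to the last hour of logs:
#   kubectl logs --since=1h vmware-system-vmop-controller-manager-c8499b9df-5h6f9 \
#     -n vmware-system-vmop -c manager | error_lines
```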