This topic describes how to run diagnosis on a management cluster to troubleshoot cluster lifecycle and CNF customization issues, either for the management cluster itself or for the workload clusters it manages.
Procedure
Log in to the management cluster.
$ ssh capv@<management-cluster-vip>
Run the diagnosis.
capv@mc2-master-control-plane-9967g [ ~ ]$ run-diagnosis
[INFO] Generate TcaTestSuites CR /tmp/all-cluster-diagnosis.yml successfully.
[INFO] Start to install/upgrade test controller service...
[INFO] Helm command result:
Release "test-controller" does not exist. Installing it now.
NAME: test-controller
LAST DEPLOYED: Wed Nov 24 06:10:01 2021
NAMESPACE: tca-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
[INFO] Start to run test cases...
tcatestsuite.testnf.telco.vmware.com/all-cluster-diagnosis created
[INFO] TcaTestSuite all-cluster-diagnosis is IN_PROGRESS....
[INFO] TcaTestSuite all-cluster-diagnosis is IN_PROGRESS....
[INFO] TcaTestSuite all-cluster-diagnosis is IN_PROGRESS....
...
[INFO] You can view reports here http://10.197.155.23:30007/tca-system/all-cluster-diagnosis/0/0/cluster-diagnosis-reports/report.html
[INFO] You can download all the results here http://10.197.155.23:30007/tca-system/all-cluster-diagnosis/0/0/cluster-diagnosis-reports.tar.gz
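If you want to keep the results outside the cluster, the archive URL printed on the last line can be fetched from any machine that can reach it. The following is a minimal sketch, not part of the product: it assumes the command output was saved to a file (for example with run-diagnosis | tee diagnosis.log) and that grep, curl, and tar are available. The function names, the log file name, and the output directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the results archive URL found in saved
# run-diagnosis output. $1 is the path to the saved output file.
extract_results_url() {
    # Take the first http(s) URL that ends in .tar.gz.
    grep -Eo 'https?://[^[:space:]]+\.tar\.gz' "$1" | head -n 1
}

# Hypothetical helper: download the archive named in the saved output
# ($1) and unpack it into a directory ($2).
fetch_results() {
    url=$(extract_results_url "$1")
    if [ -z "$url" ]; then
        echo "no results URL found in $1" >&2
        return 1
    fi
    mkdir -p "$2" &&
        curl -fsS -o "$2/cluster-diagnosis-reports.tar.gz" "$url" &&
        tar -xzf "$2/cluster-diagnosis-reports.tar.gz" -C "$2"
}
```

With these functions loaded, fetch_results diagnosis.log ./diagnosis-results would download and unpack the archive. The layout of the unpacked files is not described in this topic, so inspect the directory after extraction.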
Results
The diagnosis produces a summary HTML report covering the management cluster and its workload clusters. Open the report URL in a browser, and click a cluster name to see the details for that cluster.
What to do next
To delete all diagnosis reports and uninstall the test controller service, run run-diagnosis with the -u option.
capv@mc2-master-control-plane-9967g [ ~ ]$ run-diagnosis -u
[INFO] Start to uninstall test controller service...
tcatestsuite.testnf.telco.vmware.com "all-cluster-diagnosis" deleted
release "test-controller" uninstalled
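To confirm that the cleanup took effect, you can check that the Helm release and the test suite object named in the output above no longer exist. The sketch below is an assumption-laden illustration rather than a documented procedure: it assumes helm and kubectl are on the PATH and configured for the management cluster, and it reuses the tca-system namespace, the test-controller release name, and the all-cluster-diagnosis test suite name from the sample output. The function name diagnosis_removed is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the diagnosis release and the
# test suite object are both gone. $1 is the namespace (default
# tca-system, taken from the sample output in this topic).
diagnosis_removed() {
    ns=${1:-tca-system}
    # "helm list -q" prints release names only, one per line.
    if helm list -n "$ns" -q | grep -qx 'test-controller'; then
        echo "release test-controller still present in $ns"
        return 1
    fi
    # "kubectl get" exits non-zero when the named object is not found.
    if kubectl get tcatestsuite.testnf.telco.vmware.com \
        all-cluster-diagnosis -n "$ns" >/dev/null 2>&1; then
        echo "TcaTestSuite all-cluster-diagnosis still present in $ns"
        return 1
    fi
    echo "diagnosis service and test suite removed from $ns"
}
```

Calling diagnosis_removed after run-diagnosis -u prints a one-line status and returns a non-zero exit code if either object is still present.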