This topic describes how to run diagnosis on a management cluster to troubleshoot cluster lifecycle and CNF customization issues, either for the management cluster itself or for the workload clusters that it manages.

Procedure

  1. Log in to the management cluster.
    $ ssh capv@<management-cluster-vip>
  2. Run the diagnosis.
    capv@mc2-master-control-plane-9967g [ ~ ]$ run-diagnosis
    [INFO] Generate TcaTestSuites CR /tmp/all-cluster-diagnosis.yml successfully.
    [INFO] Start to install/upgrade test controller service...
    [INFO] Helm command result:
    Release "test-controller" does not exist. Installing it now.
    NAME: test-controller
    LAST DEPLOYED: Wed Nov 24 06:10:01 2021
    NAMESPACE: tca-system
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    [INFO] Start to run test cases...
    tcatestsuite.testnf.telco.vmware.com/all-cluster-diagnosis created
    [INFO] TcaTestSuite all-cluster-diagnosis is IN_PROGRESS....
    [INFO] TcaTestSuite all-cluster-diagnosis is IN_PROGRESS....
    [INFO] TcaTestSuite all-cluster-diagnosis is IN_PROGRESS....
    ...
    [INFO] You can view reports here http://10.197.155.23:30007/tca-system/all-cluster-diagnosis/0/0/cluster-diagnosis-reports/report.html
    [INFO] You can download all the results here http://10.197.155.23:30007/tca-system/all-cluster-diagnosis/0/0/cluster-diagnosis-reports.tar.gz
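The report and archive locations appear only in the command output, so it can help to save that output and extract the URLs from it. The following is a minimal sketch, assuming the URLs are printed on the "[INFO] You can ..." lines as in the sample output above. The function names, the log file path, and the use of curl are illustrative assumptions and are not part of run-diagnosis.

```shell
#!/bin/sh
# Sketch: pull the report and archive URLs out of saved run-diagnosis output.
# Assumption: the URLs appear in the output exactly as in the sample above.

# Print every http(s) URL found on stdin, one per line.
extract_urls() {
    grep -oE 'https?://[^[:space:]]+'
}

# Print only the URL of the downloadable results archive (.tar.gz).
archive_url() {
    extract_urls | grep -E '\.tar\.gz$'
}

# Example usage (hypothetical log file name; curl is assumed to be available):
#   run-diagnosis | tee /tmp/run-diagnosis.log
#   curl -fSLO "$(archive_url < /tmp/run-diagnosis.log)"
```

Keeping the parsing in small functions makes it possible to run the same extraction against a saved log later, for example on a workstation that can reach the report service.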

Results

The diagnosis generates a summary HTML report that covers the management cluster and its workload clusters. Open the report URL in a browser, and click a cluster name to view the details for that cluster.

Figure 1. Cluster diagnosis report

What to do next

To delete all diagnosis reports and uninstall the test controller service, run run-diagnosis with the -u option.

capv@mc2-master-control-plane-9967g [ ~ ]$ run-diagnosis -u
[INFO] Start to uninstall test controller service...
tcatestsuite.testnf.telco.vmware.com "all-cluster-diagnosis" deleted
release "test-controller" uninstalled
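
After the cleanup, you might want to confirm that the Helm release is gone. The following is a minimal sketch, assuming the helm CLI is available on the management cluster node and that the release lives in the tca-system namespace shown in the installation output. The helper name release_listed is hypothetical.

```shell
#!/bin/sh
# Sketch: check whether a Helm release name appears in a list of releases.

# Succeed if the release name given as $1 matches a whole line on stdin.
# Input is expected to be one release name per line, as printed by "helm list -q".
release_listed() {
    grep -qxF -- "$1"
}

# Example usage on the management cluster (assumes helm is on the PATH):
#   if helm list -n tca-system -q | release_listed test-controller; then
#       echo "test-controller is still installed"
#   else
#       echo "test-controller has been removed"
#   fi
```

The whole-line, fixed-string match avoids a false positive from a release whose name merely contains "test-controller".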