VMware Telco Cloud Service Assurance installation is triggered by running the tcx_app_deployment.sh script. This script executes two main stages: initialization and installation.

  • Troubleshooting initialization issues:
    1. VMware Telco Cloud Service Assurance initialization (pushing artifacts and deploying core components) is executed by a Python script called tcx_install.zip.
    2. If the tcx_app_deployment.sh script exits with a failure and the failure message includes a pattern similar to the following example, the deployment failed during the initialization stage:
      Traceback (most recent call last):
      04:09:08   File "/tmp/Bazel.runfiles_405hciz4/runfiles/tcx/scripts/tcx_install.py", line 215, in <module>
      04:09:08     main()
      In this case, attach the initialization logs to the ticket, and contact your IT administrator for resolution.
    3. To get the product initialization logs: errors during the execution of the tcx_install.zip Python script are logged to a file named tcx_installer_log.log under the scripts directory of the unpacked deployer bundle on the deployment host. Attach this log when filing a support ticket.
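The check described in the steps above can be sketched as a small shell helper that looks for a Python traceback in the installer log. This is a minimal sketch, not part of the product: the function name check_init_log is hypothetical, and the location of tcx_installer_log.log depends on where you unpacked the deployer bundle, so the path is passed in as an argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the product): report whether the
# installer log contains a Python traceback, which indicates a failure
# during the initialization stage.
#
# Exit status: 0 = traceback found, 1 = no traceback, 2 = log file missing.
check_init_log() {
  local log_file="$1"

  if [ ! -f "$log_file" ]; then
    echo "log file not found: $log_file" >&2
    return 2
  fi

  # -F treats the pattern as a fixed string, so the parentheses are literal.
  if grep -q -F 'Traceback (most recent call last)' "$log_file"; then
    # Print each traceback header with line numbers and 20 lines of context.
    grep -n -F -A 20 'Traceback (most recent call last)' "$log_file"
    return 0
  fi

  echo "no traceback found in $log_file"
  return 1
}
```

For example, if the bundle was unpacked to ~/tcx-deployer (an assumption; adjust to your environment), you could run check_init_log ~/tcx-deployer/scripts/tcx_installer_log.log and attach the printed lines, along with the full log, to the support ticket.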
  • Troubleshooting VMware Telco Cloud Service Assurance installation issues:
    If any of the apps is not reconciled, the deployment fails. For example:
    root [ ~/tcx-deployer/scripts/deployment ]# kubectl get tcxproduct
    NAME   STATUS            READY     MESSAGE                                                      AGE
    tcsa   updateCompleted   Unknown   The following App CRs are still reconciling: kafka-strimzi   168m
    Follow this procedure:
    1. Launch the deployment container so that you can use kubectl. For instructions, see the VMware Telco Cloud Service Assurance Deployment Guide.
    2. Set the KUBECONFIG variable to your cluster's kubeconfig file using the following command:
      export KUBECONFIG=/root/.kube/<your-kubernetes-cluster-kubeconfig-file>
    3. Use the following kubectl commands in the deployment container to narrow down the issue. You can also attach the output of each kubectl command to your support request:
      1. Get the current product status:
        kubectl get tcxproduct tcsa
      2. If the message contains "The following App CRs are still reconciling" or "The following App CRs failed", check the status of each App listed in the message by running the following command:
        kubectl describe app <app-name>
      3. In the output of the above command, look for the Useful Error Message at the bottom. This message identifies the exact resource (Deployment, StatefulSet, ReplicaSet, Job, and so on) that is failing.
      4. Depending on the resource that is failing or stuck in Reconciling, run the matching kubectl describe command for that resource to get more information. Use one of the following:
        kubectl describe deployment <deployment-name>
        kubectl describe service <service-name>
        kubectl describe statefulset <statefulset-name>
        kubectl describe daemonset <daemonset-name>
        kubectl describe job <job-name>
      5. After you have narrowed the issue down to the appropriate resource, if the above commands do not provide adequate information, get information from the pods owned by the resource:
        kubectl get pods -A | grep <app-name>
        # <pod-name> and <pod-namespace> are the name and namespace of the pod obtained from the previous command
        kubectl describe pod <pod-name> -n <pod-namespace>
        kubectl logs <pod-name> -n <pod-namespace>
      6. Get the product installation logs. Product installation is executed by a service called Admin Operator. To obtain the installation logs from the admin-operator pod, run the following commands:
        kubectl get pods | grep admin-operator
        kubectl logs <admin-operator-pod-name> # where <admin-operator-pod-name> is the pod name from the previous command

      Attach these logs when filing a support request.
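
The procedure above can be sketched as a pair of shell functions that gather the same outputs into one directory for a support request. This is a minimal sketch under stated assumptions, not a product tool: the function names extract_apps and collect_app_diagnostics are hypothetical, and the parser assumes that when several apps are listed in the message they are separated by commas or spaces (the example above shows only a single app). The script uses only the kubectl commands shown in the procedure, plus the standard --no-headers flag.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with the product).

# Print one app name per line from a tcxproduct MESSAGE such as
# "The following App CRs are still reconciling: kafka-strimzi".
# Assumption: multiple apps are separated by commas and/or spaces.
extract_apps() {
  printf '%s\n' "$1" \
    | sed -n 's/^.*App CRs [^:]*: *//p' \
    | tr ', ' '\n\n' \
    | sed '/^$/d'
}

# Save the outputs from the procedure into a directory.
# Usage: collect_app_diagnostics <output-dir> <app-name>...
# Requires kubectl and KUBECONFIG to be set up as described above.
collect_app_diagnostics() {
  local out_dir="$1"; shift
  local app ns pod
  mkdir -p "$out_dir"

  # Current product status.
  kubectl get tcxproduct tcsa > "$out_dir/tcxproduct.txt" 2>&1

  for app in "$@"; do
    # Status of the App, including the Useful Error Message.
    kubectl describe app "$app" > "$out_dir/app-$app.txt" 2>&1

    # Describe and fetch logs for pods whose name contains the app name.
    # Columns of "get pods -A" are: NAMESPACE NAME READY STATUS ...
    kubectl get pods -A --no-headers | grep -- "$app" \
      | while read -r ns pod _; do
          kubectl describe pod "$pod" -n "$ns" > "$out_dir/pod-$ns-$pod.txt" 2>&1
          kubectl logs "$pod" -n "$ns" > "$out_dir/pod-$ns-$pod.log" 2>&1
        done
  done

  # Installation logs from the admin-operator pod.
  kubectl get pods --no-headers | grep admin-operator \
    | while read -r pod _; do
        kubectl logs "$pod" > "$out_dir/admin-operator-$pod.log" 2>&1
      done
}
```

For example, after copying the MESSAGE text from kubectl get tcxproduct tcsa, you could run collect_app_diagnostics ./tcsa-diagnostics $(extract_apps "<message-text>") and attach the resulting directory. The resource-level describe commands from step 4 are left out on purpose, because which one applies depends on the Useful Error Message.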