Common issues after installation

Unable to log in to the TMC console after installation

Problem: After a successful installation of TMC Self-Managed, users are unable to log in to or access the TMC user interface.

Cause: This is typically the result of previous failed installs leaving behind stale secrets.

Fix: To enable access to the TMC console, delete the stale components, such as the secrets and the dynamic OIDC client.

  1. Delete residual artifacts. Run the following commands to delete stale components.
    kubectl -n tmc-local delete oidcclient/client.oauth.pinniped.dev-auth-manager-pinniped-oidc-client secret/client.oauth.pinniped.dev-auth-manager-pinniped-oidc-client-client-secret-generated  
    kubectl -n tmc-local delete po -lapp=authenticator  
    kubectl delete lease authenticator-leader-elect
    
  2. Wait a few minutes (typically about 5 minutes) to allow the pods to restart and reconcile the dynamic OIDC client.
  3. You can use the following command to verify that the reconciliation is complete.
    kubectl -n tmc-local get oidcclient/client.oauth.pinniped.dev-auth-manager-pinniped-oidc-client secret/client.oauth.pinniped.dev-auth-manager-pinniped-oidc-client-client-secret-generated
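In addition to checking the OIDC client and secret, you can confirm that the authenticator pods themselves came back up. This is a sketch; the `app=authenticator` label matches the pods deleted in step 1:

```shell
# Check that the authenticator pods restarted and are Running and Ready.
kubectl -n tmc-local get po -l app=authenticator
```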
    

no healthy upstream

Problem: A "no healthy upstream" error appears in API responses or in the Envoy logs.

Cause: Pods are not healthy.

Verify that all the pods in the tmc-local namespace are in the Running or Completed state.
- If they are not in one of these states, examine the pods for specific information on the failure.
- If all pods look healthy, this error might be because the stack-http-proxy is not Ready.
- If the stack-http-proxy is Ready, there might be an intermittent issue in the ingress.
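The checks described above can be run with commands like the following (a sketch; the field selector filters out healthy pods, and HTTPProxy is the Contour resource backing the ingress):

```shell
# List any pods in tmc-local that are not Running or Succeeded (Completed).
kubectl -n tmc-local get pods --field-selector=status.phase!=Running,status.phase!=Succeeded

# Check whether the Contour HTTPProxy objects, including stack-http-proxy, are valid.
kubectl -n tmc-local get httpproxy
```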

Fix: Try restarting the contour and envoy pods using the following command:

kubectl -n tmc-local delete po -l app.kubernetes.io/name=contour
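After deleting the pods, you can watch them come back up before retrying the request (a sketch using the same label selector):

```shell
# Watch the contour pods restart; press Ctrl-C once all are Running and Ready.
kubectl -n tmc-local get po -l app.kubernetes.io/name=contour -w
```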

Permission Denied

Problem: After logging in to the Tanzu Mission Control console for the first time, you might see Permission Denied error messages when trying to perform operations in the UI, for example, when attempting to register a management cluster.

Cause: The most common cause for this issue is a group misconfiguration in the backend IDP.

Fix: Make sure your user is part of the "tmc:admin" group in your IDP.
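If you are unsure which groups your IDP actually puts in the token, one way to check is to decode the payload segment of an ID token and inspect its groups claim. This is a generic sketch, not part of the product: the ID_TOKEN variable and the "groups" claim name are assumptions, and the padding loop is needed because JWT segments are base64url-encoded without padding:

```shell
# Decode the second (payload) segment of a JWT held in $ID_TOKEN and print it
# as JSON, so you can inspect the groups claim for "tmc:admin".
payload=$(printf '%s' "$ID_TOKEN" | cut -d. -f2 | tr '_-' '/+')
# Restore the base64 padding stripped by base64url encoding.
while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
printf '%s' "$payload" | base64 -d
```

Pipe the output through a JSON tool such as jq to extract the claim directly.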

Authentication service failure: CrashLoopBackOff state

Problem: The authentication service indicates an error accessing postgres.

Failure log

[ERROR] Could not connect to acquire lock: failed to connect to `host=postgres-postgresql user=<long-id> database=<long-id>`: failed SASL auth (FATAL: password authentication failed for user "<long-id>" (SQLSTATE 28P01))

Fix: Delete the deployment and the postgres credentials, and wait for kapp-controller to reconcile the deleted objects.

Delete:

kubectl -n tmc-local delete postgresendpoints --all
kubectl -n tmc-local delete deployment.apps --all
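You can then watch kapp-controller recreate the deleted objects (a sketch):

```shell
# Watch the deployments get recreated and become available again.
kubectl -n tmc-local get deployments -w
```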

No valid authentication credentials

Cause: When running Tanzu Mission Control Self-Managed in an air-gapped environment, the clocks of the nodes running the host cluster can drift out of sync, causing clock skew.

To accommodate minor clock skew, Tanzu Mission Control Self-Managed implements a clock tolerance of 60 seconds. When the clock skew is 60 seconds or less, tokens are accepted at the api-gateway and global services. Larger skews (more than 60 seconds) cause this error to surface.

Fix: Resynchronize the clocks of the cluster nodes, for example by using NTP, so that the skew stays within the 60-second tolerance.
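To check whether a node's clock is outside the 60-second tolerance, compare its time against a trusted reference. A minimal sketch of the comparison (here node_time is stubbed; in practice, read it from the node, for example by running `date +%s` over SSH):

```shell
#!/bin/bash
# Compare two epoch timestamps against the 60-second clock tolerance.
local_time=$(date +%s)
node_time=$local_time          # stub: replace with the node's clock reading
skew=$(( local_time > node_time ? local_time - node_time : node_time - local_time ))
if [ "$skew" -gt 60 ]; then
  echo "clock skew of ${skew}s exceeds the 60s tolerance"
else
  echo "clock skew of ${skew}s is within tolerance"
fi
```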

Adding Git kustomization through the Tanzu Mission Control console is unresponsive

Problem: When adding a kustomization to a cluster using the Tanzu Mission Control console, the Create kustomization button is unresponsive, and no failure or success message is displayed.

Fix: Use the tmc CLI to add the kustomization. The command looks something like this:

tmc cluster fluxcd kustomization create --cluster-name <cluster-name> --management-cluster-name <mgmt-cluster-name> --provisioner-name <provisioner-name> --source-name <git-repo-name> --path <path-within-repo>

Inspection Failures

Inspection scans can fail due to a variety of reasons. Some of the more common reasons are explained below.

Problem: Conformance or Lite inspections fail to complete or return an error.

The Conformance and Lite inspection types are dependent on public images.

Cause: Missing images

If the scan images for these inspection types are not available in your private local image registry, the inspection can take a very long time to progress and, in most cases, does not complete and returns an error.

Fix: Make sure you have copied the latest version of these third-party images into your private local image registry, as described in Copy inspection scan images.

Problem: Conformance inspection fails due to insufficient resources.

Cause: Insufficient resources or resource constraints

A Conformance inspection can fail due to insufficient resources. Check the logs of the Sonobuoy pod in the vmware-system-tmc namespace in the workload cluster. These logs typically reveal the insufficient resource problem.
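A sketch of the log check described above; the Sonobuoy pod name is an assumption, so locate it first and look for scheduling failures such as "Insufficient cpu" or "Insufficient memory":

```shell
# Find the Sonobuoy pod in the workload cluster and read its logs.
kubectl -n vmware-system-tmc get pods | grep -i sonobuoy
kubectl -n vmware-system-tmc logs sonobuoy        # pod name assumed

# Recent events often surface the resource constraint directly.
kubectl -n vmware-system-tmc get events --sort-by=.lastTimestamp
```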

Fix: To resolve this issue, increase the number of nodes for the workload cluster.
