Common issues after installation

Management cluster cannot be displayed

Problem: After you register a management cluster from the TMC console, the management cluster is not displayed in the management cluster list view, and the TMC cluster agent extensions are not installed on the management cluster.

Fix: Restart the pods of cluster-agent-service in the tmc-local namespace using the following command:

kubectl rollout restart deployment cluster-agent-service-server -n tmc-local
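
After the restart, it can help to wait until the new pods are up before checking the management cluster list again. The following is a minimal sketch, not part of the documented procedure: the function name wait_for_cluster_agent_service, the KUBECTL override, and the 180s default timeout are assumptions made for illustration.

```shell
# Assumption: KUBECTL may be overridden (for example, with a wrapper script);
# it defaults to the kubectl binary on the PATH.
KUBECTL="${KUBECTL:-kubectl}"

# Block until the restarted deployment reports a successful rollout,
# or fail once the timeout (first argument, default 180s) expires.
wait_for_cluster_agent_service() {
  "$KUBECTL" rollout status deployment cluster-agent-service-server \
    -n tmc-local --timeout="${1:-180s}"
}

# Usage:
#   wait_for_cluster_agent_service        # wait up to 180s
#   wait_for_cluster_agent_service 60s    # wait up to 60s
```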

no healthy upstream

Problem: A no healthy upstream error occurs in API responses or in the envoy logs.

Cause: Pods are not healthy. Verify that all the pods in the tmc-local namespace are in the Running or Completed state.

  • If they are not in one of these states, examine the pods for specific information on failure states.
  • If all pods look healthy, this error might be because the stack-http-proxy is not Ready.
  • If the stack-http-proxy is Ready, then there might be some intermittent issue in the ingress.
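
The pod check described above can be sketched as a small filter. This is an illustrative sketch only: the function name unhealthy_pods is hypothetical, and it assumes the default column layout of kubectl get pods --no-headers (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Print the name and status of every pod whose STATUS column (field 3)
# is neither Running nor Completed. Reads the output of
# `kubectl get pods --no-headers` on stdin.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage against a live cluster:
#   kubectl -n tmc-local get pods --no-headers | unhealthy_pods
#
# Assumption: stack-http-proxy is a Contour HTTPProxy resource, in which
# case its state can be inspected with:
#   kubectl -n tmc-local get httpproxy stack-http-proxy
```

An empty result means every pod is in one of the two expected states, which points toward the proxy or ingress checks in the list above.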

Fix: Try restarting the contour and envoy pods using the following command:

kubectl -n tmc-local delete po -l app.kubernetes.io/name=contour

Permission Denied

Problem: After logging into the Tanzu Mission Control console for the first time, you might experience Permission Denied error messages when trying to perform operations in the UI (for example, when attempting to register a management cluster).

Cause: The most common cause for this issue is a group misconfiguration in the backend IDP.

Fix: Make sure your user is part of the tmc:admin group in your IDP.

Authentication service failure: CrashLoopBackOff state

Problem: The authentication service indicates an error accessing postgres.

Failure log

[ERROR] Could not connect to acquire lock: failed to connect to `host=postgres-postgresql user=<long-id> database=<long-id>`: failed SASL auth (FATAL: password authentication failed for user "<long-id>" (SQLSTATE 28P01))

Fix: Delete the postgres credentials and the deployments, then wait for kapp-controller to reconcile the deleted objects:

kubectl -n tmc-local delete postgresendpoints --all
kubectl -n tmc-local delete deployment.apps --all

No valid authentication credentials

Cause: When running Tanzu Mission Control Self-Managed in an air-gapped environment, the clock of node(s) running the host cluster might get out of sync, causing clock skew.

To accommodate a minor clock skew, TMC Self-Managed implements a clock tolerance of 60 seconds. When the clock skew is 60 seconds or less, tokens are accepted at the api-gateway and global services. Skews larger than 60 seconds cause this error.
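
The tolerance rule can be expressed as a simple comparison of two Unix timestamps. This is a minimal sketch for checking a suspected skew by hand; the function name clock_skew_ok and the variable names are hypothetical, and how you collect the timestamps from your nodes is left to your environment.

```shell
# Tolerance described above: tokens are accepted with up to 60 seconds of skew.
TOLERANCE_SECONDS=60

# Succeeds (exit status 0) when two Unix timestamps, given in seconds,
# differ by at most TOLERANCE_SECONDS; fails otherwise.
clock_skew_ok() {
  local diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( -diff ))   # absolute value of the difference
  fi
  [ "$diff" -le "$TOLERANCE_SECONDS" ]
}

# Usage: collect `date -u +%s` on a node and on a trusted reference host
# at nearly the same moment, then compare:
#   clock_skew_ok "$node_epoch" "$reference_epoch" \
#     || echo "clock skew exceeds ${TOLERANCE_SECONDS}s"
```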

Adding Git kustomization through the Tanzu Mission Control console is unresponsive

Problem: When adding a kustomization to a cluster using the TMC console, the Create kustomization button is unresponsive, and no failure or success message is displayed.

Fix: Use the tmc CLI to add the kustomization. The command has the following general form:

tmc cluster fluxcd kustomization create --cluster-name <cluster-name> --management-cluster-name <mgmt-cluster-name> --provisioner-name <provisioner-name> --source-name <git-repo-name> --path <path-within-repo>

Inspection Failures

Inspection scans can fail for a variety of reasons. Some of the more common ones are explained below.

Problem: Conformance or Lite inspections fail to complete or return an error.

The Conformance and Lite inspection types are dependent on public images.

Cause: Missing images

If the scan images for these inspection types are not available in your private local image registry, the inspection can take a very long time to progress and, in most cases, does not complete and returns an error.

Fix: Make sure you have copied the latest version of these third-party images into your private local image registry, as described in Copy inspection scan images.

Problem: Conformance inspection fails due to insufficient resources.

Cause: Insufficient resources or resource constraints

To confirm, check the logs of the Sonobuoy pod in the vmware-system-tmc namespace in the workload cluster. These logs typically reveal the insufficient resource problem.
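
One way to narrow down the Sonobuoy logs is to filter them for resource-related messages. The sketch below is illustrative only: the function name find_resource_errors is hypothetical, <sonobuoy-pod-name> is a placeholder, and it assumes the relevant lines contain the word "insufficient", as Kubernetes scheduling messages such as "Insufficient cpu" do.

```shell
# Print only the lines on stdin that mention insufficient resources.
# The match is case-insensitive; the exit status is non-zero when
# nothing matches.
find_resource_errors() {
  grep -i 'insufficient'
}

# Usage against a live workload cluster:
#   kubectl -n vmware-system-tmc get pods
#   kubectl -n vmware-system-tmc logs <sonobuoy-pod-name> | find_resource_errors
```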

Fix: To resolve this issue, increase the number of nodes for the workload cluster.
