management cluster cannot be displayed
Problem: After you register a management cluster from the TMC console, the management cluster is not displayed in the management cluster list view, and the TMC cluster agent extensions are not installed on the management cluster.
Fix: Restart the pods of cluster-agent-service in the tmc-local namespace using the following command:
kubectl rollout restart deployment cluster-agent-service-server -n tmc-local
no healthy upstream
Problem: An error occurs in APIs or in the envoy logs.
Cause: Pods are not healthy. Verify that all the pods in the tmc-local namespace are in Running or Completed state.
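The pod health check above can be scripted. The snippet below filters a sample `kubectl get pods -n tmc-local` listing (the pod names and statuses are hypothetical, for illustration only) for anything that is neither Running nor Completed:

```shell
# Simulated output of `kubectl get pods -n tmc-local`; pod names are hypothetical.
kubectl_output='NAME                         READY   STATUS             RESTARTS   AGE
auth-manager-7f9c8d-x2x3k    1/1     Running            0          2d
stack-install-job-abcde      0/1     Completed          0          2d
contour-6b9f5c-9qkrw         0/1     CrashLoopBackOff   5          2d'

# Print any pod whose STATUS column is neither Running nor Completed.
echo "$kubectl_output" | awk 'NR > 1 && $3 != "Running" && $3 != "Completed" {print $1, $3}'
```

Against a live cluster, pipe the real `kubectl get pods -n tmc-local` output into the same awk filter.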
Check whether stack-http-proxy is not Ready. If stack-http-proxy is Ready, there might be an intermittent issue in the ingress.
Fix: Try restarting the contour and envoy pods using the following command:
kubectl -n tmc-local delete po -l app.kubernetes.io/name=contour
Permission Denied
Problem: After logging into the Tanzu Mission Control console for the first time, you might experience Permission Denied error messages when trying to perform operations in the UI (for example, when attempting to register a management cluster).
Cause: The most common cause for this issue is a group misconfiguration in the backend IDP.
Fix: Make sure your user is part of the tmc:admin group in your IDP.
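One quick way to confirm what the IDP is actually sending is to inspect the groups claim in an issued token. The sketch below decodes a JWT payload and checks for tmc:admin; the token here is fabricated for illustration, the decode skips signature verification, and the claim name `groups` is an assumption that depends on how your IDP is configured:

```python
import base64
import json

def groups_from_jwt(token: str) -> list:
    """Decode the (unverified) payload segment of a JWT and return its groups claim."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWT encoding strips.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims.get("groups", [])

# Build a fake token whose only meaningful segment is the payload.
payload = base64.urlsafe_b64encode(
    json.dumps({"sub": "alice", "groups": ["tmc:admin", "developers"]}).encode()
).rstrip(b"=").decode()
fake_token = f"header.{payload}.signature"

print("tmc:admin" in groups_from_jwt(fake_token))  # True when the group is present
```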
CrashLoopBackOff state
Problem: The authentication service indicates an error accessing postgres.
Failure log:
[ERROR] Could not connect to acquire lock: failed to connect to `host=postgres-postgresql user=<long-id> database=<long-id>`: failed SASL auth (FATAL: password authentication failed for user "<long-id>" (SQLSTATE 28P01))
Fix: Delete the deployment and the postgres credentials, and wait for kapp-controller to reconcile the deleted objects.
Delete:
kubectl -n tmc-local delete postgresendpoints --all
kubectl -n tmc-local delete deployment.apps --all
No valid authentication credentials
Cause: When running Tanzu Mission Control Self-Managed in an air-gapped environment, the clock of node(s) running the host cluster might get out of sync, causing clock skew.
To accommodate a minor clock skew, TMC Self-Managed implements a clock tolerance of 60 seconds. When the clock skew is up to 60 seconds, tokens are accepted at the api-gateway and global services. Larger skews (greater than 60 seconds) cause this issue to surface.
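The 60-second tolerance described above can be illustrated with a small validation sketch. The leeway value comes from the text; the function itself is illustrative and is not TMC's implementation:

```python
from datetime import datetime, timedelta, timezone

CLOCK_TOLERANCE = timedelta(seconds=60)  # the leeway described above

def token_time_valid(iat: datetime, exp: datetime, now: datetime) -> bool:
    """Accept a token if its timestamps fall within the allowed leeway of 'now'."""
    # 'iat' may sit up to 60s in the future if the issuer's clock runs fast;
    # 'exp' may sit up to 60s in the past if the issuer's clock runs slow.
    return iat - CLOCK_TOLERANCE <= now <= exp + CLOCK_TOLERANCE

now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
exp = now + timedelta(minutes=10)
skewed_iat = now + timedelta(seconds=45)      # issuer clock 45s ahead: accepted
far_skewed_iat = now + timedelta(seconds=90)  # issuer clock 90s ahead: rejected

print(token_time_valid(skewed_iat, exp, now))      # True
print(token_time_valid(far_skewed_iat, exp, now))  # False
```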
Create kustomization through the Tanzu Mission Control console is unresponsive
Problem: When adding a kustomization to a cluster using the TMC console, the Create kustomization button is unresponsive, and no failure or success message is displayed.
Fix: Use the tmc CLI to add the kustomization. The command looks something like this:
tmc cluster fluxcd kustomization create --cluster-name <cluster-name> --management-cluster-name <mgmt-cluster-name> --provisioner-name <provisioner-name> --source-name <git-repo-name> --path <path-within-repo>
Inspection scans can fail due to a variety of reasons. Some of the more common reasons are explained below.
Cause: Missing images
The Conformance and Lite inspection types depend on public images. If the scan images for these inspection types are not available in your private local image registry, the inspection can take a very long time to progress; in most cases the scan does not complete and returns an error.
Fix: Make sure you have copied the latest version of these third-party images into your private local image registry, as described in Copy inspection scan images.
Cause: Insufficient resources or resource constraints
A Conformance inspection can fail due to insufficient resources. Check the logs of the Sonobuoy pod in the vmware-system-tmc namespace in the workload cluster. These logs typically reveal the insufficient resource problem.
Fix: To resolve this issue, increase the number of nodes for the workload cluster.
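In practice, a resource shortfall usually surfaces as unschedulable pods. The snippet below scans a sample event excerpt (the messages are hypothetical, in the style of Kubernetes scheduler events) for the Insufficient keyword that indicates the problem:

```shell
# Simulated excerpt of pod events from the inspection run; messages are hypothetical.
logs='Warning  FailedScheduling  0/3 nodes are available: 3 Insufficient cpu.
Normal   Scheduled         Successfully assigned sonobuoy/systemd-logs to node-1
Warning  FailedScheduling  0/3 nodes are available: 2 Insufficient memory.'

# Surface only the lines that indicate a resource shortfall.
echo "$logs" | grep 'Insufficient'
```

Against a live cluster, run the same grep over the Sonobuoy pod logs or the events in the vmware-system-tmc namespace.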