no healthy upstream
Problem: This error appears in API responses or in the envoy logs.
Cause: Pods are not healthy. Verify that all the pods in the tmc-local namespace are in the Running or Completed state.
- If they are not in one of these states, examine the pods for specific information about the failure.
- If all pods look healthy, this error might be because the stack-http-proxy is not Ready.
- If the stack-http-proxy is Ready, there might be an intermittent issue in the ingress.
Fix: Try restarting the contour and envoy pods using the following command:
kubectl -n tmc-local delete po -l app.kubernetes.io/name=contour
Permission Denied
Problem: After logging in to the Tanzu Mission Control console for the first time, you might see Permission Denied error messages when trying to perform operations in the UI, for example, when attempting to register a management cluster.
Cause: The most common cause for this issue is a group misconfiguration in the backend IDP.
Fix: Make sure your user is part of the 'tmc:admin' group in your IDP.
CrashLoopBackOff state
Problem: The authentication service indicates an error accessing postgres.
Failure log
[ERROR] Could not connect to acquire lock: failed to connect to `host=postgres-postgresql user=<long-id> database=<long-id>`: failed SASL auth (FATAL: password authentication failed for user "<long-id>" (SQLSTATE 28P01))
Fix: Delete the deployment and the postgres credentials, and wait for kapp-controller to reconcile the deleted objects.
Delete:
kubectl -n tmc-local delete postgresendpoints --all
kubectl -n tmc-local delete deployment.app --all
No valid authentication credentials
Cause: When running Tanzu Mission Control Self-Managed in an air-gapped environment, the clocks of the nodes running the host cluster can drift out of sync, causing clock skew.
To accommodate minor clock skew, Tanzu Mission Control Self-Managed implements a clock tolerance of 60 seconds. When the skew is up to 60 seconds, tokens are accepted at the api-gateway and global services. Larger skews (more than 60 seconds) cause this issue to surface.
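The effect of the 60-second tolerance can be illustrated with a small sketch. This is illustrative only, not the actual gateway code: the function name and timestamps are hypothetical, and the point is simply that a token whose timestamp runs ahead of the local clock by at most 60 seconds is still accepted.

```python
import time

CLOCK_TOLERANCE = 60  # seconds, matching the documented tolerance

def within_tolerance(token_issued_at, local_now=None, tolerance=CLOCK_TOLERANCE):
    """Illustrative check: accept a token whose issued-at timestamp does not
    run ahead of the local clock by more than `tolerance` seconds."""
    local_now = time.time() if local_now is None else local_now
    return token_issued_at - local_now <= tolerance

# Node clock 30 seconds behind the token issuer: accepted
print(within_tolerance(1000 + 30, local_now=1000))  # True
# Node clock 90 seconds behind the token issuer: rejected
print(within_tolerance(1000 + 90, local_now=1000))  # False
```

A skew larger than the tolerance makes every freshly issued token look like it comes from the future, so the api-gateway rejects it and the "No valid authentication credentials" error surfaces.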
Adding a kustomization through the Tanzu Mission Control console is unresponsive
Problem: When adding a kustomization to a cluster using the Tanzu Mission Control console, the Create kustomization button is unresponsive, and no failure or success message is displayed.
Fix: Use the tmc CLI to add the kustomization. The command looks something like this:
tmc cluster fluxcd kustomization create --cluster-name <cluster-name> --management-cluster-name <mgmt-cluster-name> --provisioner-name <provisioner-name> --source-name <git-repo-name> --path <path-within-repo>
Inspection scans can fail due to a variety of reasons. Some of the more common reasons are explained below.
The Conformance and Lite inspection types are dependent on public images.
Cause: Missing images
If the scan images for these inspection types are not available in your private local image registry, the inspection can take a very long time to progress; in most cases, the scan does not complete and returns an error.
Fix: Make sure you have copied the latest version of these third-party images into your private local image registry, as described in Copy inspection scan images.
Cause: Insufficient resources or resource constraints
A Conformance inspection can fail due to insufficient resources. Check the logs of the Sonobuoy pod in the vmware-system-tmc namespace in the workload cluster. These logs typically reveal the insufficient resource problem.
Fix: To resolve this issue, increase the number of nodes for the workload cluster.