This topic provides troubleshooting information for problems that can occur when you install or configure Cloud Native Runtimes for Tanzu.

Cannot connect to app on AWS

Symptom

On AWS, you see the following error when connecting to your app:

curl: (6) Could not resolve host: a***********************7.us-west-2.elb.amazonaws.com

Solution

Wait about 5 minutes, and then try connecting to your app again. The DNS name of a newly created AWS LoadBalancer takes several minutes to propagate.
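
Rather than retrying by hand, you can poll until the host name resolves. The following is a minimal sketch, not part of the product: the function name `wait_for_dns` and its parameters are hypothetical, and it assumes `getent` is available (typical on Linux; substitute `nslookup` or `dig` elsewhere).

```shell
# Poll until HOST resolves, or give up after TIMEOUT seconds.
# wait_for_dns, HOST, TIMEOUT, and INTERVAL are illustrative names only.
wait_for_dns() {
  host="$1"
  timeout="${2:-300}"   # default: 5 minutes, matching the guidance above
  interval="${3:-15}"   # seconds between attempts
  # Guard against a zero interval, which would otherwise never time out.
  [ "$interval" -gt 0 ] || interval=1
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    if getent hosts "$host" >/dev/null 2>&1; then
      echo "resolved: $host"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "still unresolved after ${timeout}s: $host" >&2
  return 1
}
```

Call it with the ELB host name from the error message, for example `wait_for_dns "$ELB_HOST" && curl "http://$ELB_HOST"`, where `ELB_HOST` is a variable you set yourself.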

minikube pods fail to start

Symptom

On minikube, you see the following error when installing Cloud Native Runtimes:

3:03:59PM: error: reconcile job/contour-certgen-v1.10.0 (batch/v1) namespace: contour-internal
Pod watching error: Creating Pod watcher: Get "https://192.168.64.17:8443/api/v1/pods?labelSelector=kapp.k14s.io%2Fapp%3D1618232545704878000&watch=true": dial tcp 192.168.64.17:8443: connect: connection refused
kapp: Error: waiting on reconcile job/contour-certgen-v1.10.0 (batch/v1) namespace: contour-internal:
  Errored:
   Listing schema.GroupVersionResource{Group:"", Version:"v1", Resource:"pods"}, namespaced: true:
    Get "https://192.168.64.17:8443/api/v1/pods?labelSelector=kapp.k14s.io%2Fassociation%3Dv1.572a543d96e0723f858367fcf8c6af4e": unexpected EOF

Solution

Increase your available system RAM to at least 4 GB.

Installation fails with kapp-controller v0.16

Symptom

When installing Cloud Native Runtimes, you see the following error:

kapp: Error: waiting on reconcile app/cloud-native-runtimes (kappctrl.k14s.io/v1alpha1) namespace: cloud-native-runtimes:
  Finished unsuccessfully (Reconcile failed:  (message: Fetching (0): Unsupported way to fetch templates))

Solution

Install kapp-controller v0.17.0 or later on your cluster. Cloud Native Runtimes requires kapp-controller support for the imgpkgBundle fetcher, which was introduced in kapp-controller v0.17.0.

On some Kubernetes versions and cloud providers, Tanzu Kubernetes Grid v1.3.1 installs kapp-controller v0.16.0, which is incompatible with Cloud Native Runtimes. For more information about Tanzu Kubernetes Grid kapp-controller versions, see the TKG Release Notes.
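
To check an installed version against the v0.17.0 minimum, a comparison like the following sketch can help. The helper name `version_ge` is hypothetical, and the sketch assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). How you read the installed version depends on how kapp-controller was deployed on your cluster, so the function takes the version string as an argument instead of querying the cluster.

```shell
# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# version_ge is an illustrative helper; it relies on `sort -V`.
version_ge() {
  a="${1#v}"   # strip an optional leading "v"
  b="${2#v}"
  # The lower of the two versions sorts first; $1 >= $2 exactly when that is $2.
  lowest="$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)"
  [ "$lowest" = "$b" ]
}
```

For example, `version_ge "$installed" v0.17.0 || echo "upgrade kapp-controller"`, where `installed` is a variable holding the version string you obtained from your cluster.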

Installation fails to reconcile app/cloud-native-runtimes

Symptom

When installing Cloud Native Runtimes, you see one of the following errors:

11:41:16AM: ongoing: reconcile app/cloud-native-runtimes (kappctrl.k14s.io/v1alpha1) namespace: cloud-native-runtime
11:41:16AM:  ^ Waiting for generation 1 to be observed
kapp: Error: Timed out waiting after 15m0s

Or,

3:15:34PM:  ^ Reconciling
3:16:09PM: fail: reconcile app/cloud-native-runtimes (kappctrl.k14s.io/v1alpha1) namespace: cloud-native-runtimes
3:16:09PM:  ^ Reconcile failed:  (message: Deploying: Error (see .status.usefulErrorMessage for details))

kapp: Error: waiting on reconcile app/cloud-native-runtimes (kappctrl.k14s.io/v1alpha1) namespace: cloud-native-runtimes:
  Finished unsuccessfully (Reconcile failed:  (message: Deploying: Error (see .status.usefulErrorMessage for details)))

Explanation

The cloud-native-runtimes deployment app installs the subcomponents of Cloud Native Runtimes.

Error messages about reconciling indicate that one or more subcomponents have failed to install.

Solution

To resolve the issue, confirm that your cloud provider supports the Service type LoadBalancer, and then examine the logs to find the subcomponent that failed:

  1. If you see the following kapp timeout error, ensure that your cloud provider supports the creation of Services of type LoadBalancer. For more information about the Service type LoadBalancer, see Prerequisites.

    kapp: Error: Timed out waiting after 15m0s
    
  2. Get the logs from the cloud-native-runtimes app. Run:

    kubectl get app/cloud-native-runtimes -n cloud-native-runtimes -o jsonpath="{.status.deploy.stdout}"
    

    For example,

    $ kubectl get app/cloud-native-runtimes -n cloud-native-runtimes -o jsonpath="{.status.deploy.stdout}"
    10:51:58PM: ok: reconcile customresourcedefinition/httpproxies.projectcontour.io (apiextensions.k8s.io/v1) cluster
    10:51:58PM: fail: reconcile deployment/webhook (apps/v1) namespace: vmware-sources
    10:51:58PM:  ^ Deployment is not progressing: ProgressDeadlineExceeded (message: ReplicaSet "webhook-6f5d979b7d" has timed out progressing.)
    

    Note: If the command does not return log messages, then kapp-controller is not installed or is not running correctly.

  3. Review the output for subcomponent deployments that have failed or are still ongoing.

    In the example above, the webhook deployment in the vmware-sources namespace failed.

  4. Run kubectl get pods to find the name of the pod:

    kubectl get pods --show-labels -n NAMESPACE
    

    Where NAMESPACE is the namespace associated with the reconcile error, for example, vmware-sources.

    For example,

    $ kubectl get pods --show-labels -n vmware-sources
    NAME                       READY   STATUS    RESTARTS   AGE   LABELS
    webhook-6f5d979b7d-cxr9k   0/1     Pending   0          44h   app=webhook,kapp.k14s.io/app=1626302357703846007,kapp.k14s.io/association=v1.9621e0a793b4e925077dd557acedbcfe,pod-template-hash=6f5d979b7d,role=webhook,sources.tanzu.vmware.com/release=v0.23.0
    
  5. Run kubectl logs and kubectl describe:

    kubectl logs PODNAME -n NAMESPACE
    kubectl describe pod PODNAME -n NAMESPACE
    

    Where:

    • PODNAME is found in the output of step 4, for example, webhook-6f5d979b7d-cxr9k.
    • NAMESPACE is the namespace associated with the reconcile error, for example, vmware-sources.

    For example:

    $ kubectl logs webhook-6f5d979b7d-cxr9k -n vmware-sources
    
    $ kubectl describe pod webhook-6f5d979b7d-cxr9k  -n vmware-sources
    Events:
    Type     Reason            Age                 From               Message
    ----     ------            ----                ----               -------
    Warning  FailedScheduling  80s (x14 over 14m)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu.
    
  6. Review the output from the kubectl logs and kubectl describe commands and take further action.

    In the example of the webhook deployment, the output indicates that the scheduler cannot place the pod because no node has enough available CPU. In this case, the solution is to add nodes or CPU cores to the cluster. If you are using Tanzu Mission Control (TMC), increase the number of workers in the node pool to three or more through the TMC UI. See Edit a Node Pool in the TMC documentation.
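
Parts of the procedure above can be scripted. The following is a rough sketch under stated assumptions, not an official tool: the function names `pending_loadbalancers` and `failed_resources` are hypothetical. The first reads the default table output of `kubectl get svc -A` and assumes that a LoadBalancer Service without an assigned address shows `<pending>` in the EXTERNAL-IP column. The second assumes the log line layout shown in the step 2 example.

```shell
# List LoadBalancer Services that have no external address yet.
# Reads `kubectl get svc -A` table output on stdin; prints NAMESPACE/NAME.
pending_loadbalancers() {
  awk '$3 == "LoadBalancer" && $5 == "<pending>" { print $1 "/" $2 }'
}

# List resources whose reconcile failed.
# Reads the deploy log from step 2 on stdin; prints "KIND/NAME NAMESPACE".
failed_resources() {
  awk '$2 == "fail:" && $3 == "reconcile" {
    # Namespaced resources end in "namespace: NAME"; others end in "cluster".
    ns = ($6 == "namespace:") ? $7 : "(cluster)"
    print $4, ns
  }'
}
```

For example, pipe the command from step 2 into `failed_resources`, or run `kubectl get svc -A | pending_loadbalancers`. Each namespace that `failed_resources` prints is a candidate for the `kubectl get pods`, `kubectl logs`, and `kubectl describe` commands in steps 4 and 5.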

Cloud Native Runtimes installation fails with existing Contour installation

Symptom

You see the following error message when you run the install script:

Could not proceed with installation. Refer to Cloud Native Runtimes documentation for details on how to utilize an existing Contour installation. Another app owns the custom resource definitions listed below.

Solution

Follow the procedure in Install Cloud Native Runtimes on a Cluster with Your Existing Contour Instances to resolve the issue.
