This topic tells you how to troubleshoot Tanzu Build Service when used with Tanzu Application Platform (commonly known as TAP).
After you install Tanzu Application Platform on an Amazon Elastic Kubernetes Service (EKS) cluster running Kubernetes v1.23, or upgrade an existing EKS cluster to Kubernetes v1.23, build pods show:
'running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition'
This is due to the CSIMigrationAWS feature in this Kubernetes version, which requires users to install the Amazon EBS CSI driver to use AWS Elastic Block Store (EBS) volumes. See the Amazon documentation. For more information about EKS support for Kubernetes v1.23, see the Amazon blog post.
Tanzu Application Platform uses the cluster's default storage class, which on EKS is backed by EBS volumes by default.
Follow the AWS documentation to install the Amazon EBS CSI driver before installing Tanzu Application Platform, or before upgrading to Kubernetes v1.23.
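Before installing or upgrading, it can help to confirm that the driver is actually registered with the cluster. The following is a minimal sketch, assuming `kubectl` is pointed at the EKS cluster. The function names are hypothetical helpers. `ebs.csi.aws.com` is the CSIDriver name that the Amazon EBS CSI driver registers.

```shell
# Sketch: report whether the Amazon EBS CSI driver is registered with the cluster.
# Assumes kubectl is configured for the target EKS cluster.
check_ebs_csi_driver() {
  if kubectl get csidriver ebs.csi.aws.com >/dev/null 2>&1; then
    echo "EBS CSI driver is registered"
  else
    echo "EBS CSI driver not found"
    return 1
  fi
}

# List the storage classes so that you can see which one is the default
# and which provisioner it uses.
list_storage_classes() {
  kubectl get storageclass
}
```

For example, run `check_ebs_csi_driver && list_storage_classes` and review the provisioner of the storage class marked as default.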
When using dockerd as the cluster’s container runtime, you might see the smart-warmer-image-fetcher pods report a status of ErrImagePull.
This error might be due to dockerd’s layer depth limitation, in which the maximum supported image layer depth is 125.
To verify that the ErrImagePull status is due to dockerd’s maximum supported image layer depth, check for event messages containing the words max depth exceeded. For example:
$ kubectl get events -A | grep "max depth exceeded"
build-service 73s Warning Failed pod/smart-warmer-image-fetcher-wxtr8 Failed to pull image "harbor.somewhere.com/aws-repo/build-service:clusterbuilder-full@sha256:065bb361fd914a3970ad3dd93c603241e69cca214707feaa6d8617019e20b65e": rpc error: code = Unknown desc = failed to register layer: max depth exceeded
To work around this issue, configure your cluster to use containerd or CRI-O as its default container runtime. For instructions, see the following documentation for your Kubernetes cluster provider.
For AWS, see:
For AKS, see:
For GKE, see:
For OpenShift, see:
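To find out which nodes still run Docker, you can read the runtime that each node reports. The following is a minimal sketch, assuming `kubectl` access to the cluster. The function names are hypothetical. The `containerRuntimeVersion` field in the node status holds a value such as `docker://...` or `containerd://...`.

```shell
# Print each node name with the container runtime it reports.
list_node_runtimes() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,RUNTIME:.status.nodeInfo.containerRuntimeVersion' \
    --no-headers
}

# Print only the names of nodes whose runtime string starts with "docker:".
docker_nodes() {
  list_node_runtimes | awk '$2 ~ /^docker:/ {print $1}'
}
```

If `docker_nodes` prints nothing, no node reports Docker as its runtime.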
You see the following error, or similar, in a node status:
Warning ContainerGCFailed 119s (x2523 over 42h) kubelet rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (16779959 vs. 16777216)
This is due to the way that the container runtime interface (CRI) handles garbage collection for unused images and containers.
Do not use Docker as the CRI because it is not supported. Some versions of EKS default to Docker as the runtime.
While upgrading apps to a later stack, you might encounter the build platform erroneously reusing the old build cache.
If you encounter this issue, delete and recreate the workload in Tanzu Application Platform, or delete and recreate the image in Tanzu Build Service.
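The workload route can be sketched with the Apps plug-in for the Tanzu CLI. This is an illustration only: the helper name `recreate_workload` and the file name `workload.yaml` are hypothetical, and the sketch assumes that you have the original workload definition saved in a file.

```shell
# Sketch: delete a workload and re-apply it from its saved definition.
# Arguments: WORKLOAD-NAME DEVELOPER-NAMESPACE WORKLOAD-FILE
# Deletion can take a moment to finish. If the apply step reports that the
# workload still exists, wait and run the apply command again.
recreate_workload() {
  name="$1"
  namespace="$2"
  file="$3"
  tanzu apps workload delete "$name" --namespace "$namespace" --yes &&
    tanzu apps workload apply --file "$file" --namespace "$namespace" --yes
}
```

For example: `recreate_workload WORKLOAD-NAME DEVELOPER-NAMESPACE workload.yaml`.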
Switching from buildservice.kp_default_repository to shared.image_registry
After switching to using the shared.image_registry fields in tap-values.yaml, your workloads might start failing with a TemplateRejectedByAPIServer error, with the error message: admission webhook "validation.webhook.kpack.io" denied the request: validation failed: Immutable field changed: spec.tag.
Tanzu Application Platform automatically appends /buildservice to the end of the repository specified in shared.image_registry.project_path. This updates the existing workload image tags, which is not allowed by Tanzu Build Service.
Delete the images.kpack.io resource, which has the same name as the workload. The workload then recreates it with valid values.
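The deletion can be sketched with `kubectl`. The helper name `delete_kpack_image` is hypothetical. The sketch prints the current `spec.tag` first so that you can compare it with the tag of the recreated resource.

```shell
# Sketch: show the current tag of a kpack image resource, then delete it.
# Arguments: WORKLOAD-NAME DEVELOPER-NAMESPACE
# The image resource has the same name as the workload.
delete_kpack_image() {
  name="$1"
  namespace="$2"
  echo "Current tag: $(kubectl get images.kpack.io "$name" -n "$namespace" -o jsonpath='{.spec.tag}')"
  kubectl delete images.kpack.io "$name" -n "$namespace"
}
```

For example: `delete_kpack_image WORKLOAD-NAME DEVELOPER-NAMESPACE`.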
Alternatively, re-add the buildservice.kp_default_repository_* fields in tap-values.yaml. You must set both the repository and the authentication fields to override the shared values. Set kp_default_repository, kp_default_repository_secret.name, and kp_default_repository_secret.namespace.
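As an illustration, the three fields nest under `buildservice` as shown in the following sketch, which prints the fragment so that you can review it before merging it into tap-values.yaml by hand. `REPOSITORY`, `SECRET-NAME`, and `SECRET-NAMESPACE` are placeholders, and the helper name is hypothetical.

```shell
# Sketch: print the buildservice override fragment for tap-values.yaml.
# Replace the upper-case placeholders with your own values.
print_buildservice_overrides() {
  cat <<'EOF'
buildservice:
  kp_default_repository: "REPOSITORY"
  kp_default_repository_secret:
    name: "SECRET-NAME"
    namespace: "SECRET-NAMESPACE"
EOF
}
```

The function only writes to standard output and does not modify tap-values.yaml.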
During upgrades, a large number of builds might be created because of buildpack and stack updates. Some of these builds might fail, leaving the workload in an unhealthy state.
Builds fail due to transient network issues.
This resolves itself on subsequent builds after a code change and does not affect the running application.
If you do not want to wait for subsequent builds to run, you can use either the Tanzu Build Service plug-in for the Tanzu CLI or the open source kpack CLI to trigger a build manually.
If using the Tanzu CLI, manually trigger a build as follows:
List the image resources in the developer namespace by running:
tanzu build-service image list -n DEVELOPER-NAMESPACE
Manually trigger the image resources to re-run builds for each failing image by running:
tanzu build-service image trigger IMAGE-NAME -n DEVELOPER-NAMESPACE
If using the kpack CLI, manually trigger a build as follows:
List the image resources in the developer namespace by running:
kp image list -n DEVELOPER-NAMESPACE
Manually trigger the image resources to re-run builds for each failing image by running:
kp image trigger IMAGE-NAME -n DEVELOPER-NAMESPACE
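When several images are failing, the trigger step can be repeated in a loop. The following is a minimal sketch using the kpack CLI command shown above. The helper name `trigger_images` is hypothetical, and you supply the failing image names yourself after reviewing the list output.

```shell
# Sketch: trigger a new build for each named image resource.
# Arguments: DEVELOPER-NAMESPACE IMAGE-NAME [IMAGE-NAME...]
trigger_images() {
  namespace="$1"
  shift
  for image in "$@"; do
    kp image trigger "$image" -n "$namespace"
  done
}
```

For example: `trigger_images DEVELOPER-NAMESPACE IMAGE-NAME-1 IMAGE-NAME-2`.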