Troubleshooting Supply Chain Security Tools - Store

This topic describes how to troubleshoot known issues for Supply Chain Security Tools (SCST) - Store.

Persistent volume retains data

Symptom

If you deploy SCST - Store, delete it, and then redeploy it with a different database password, the metadata-store-db pod fails to start. This happens because the persistent volume used by PostgreSQL retains the old data, even though the retention policy is set to DELETE.

Solution

Caution

Changing the database password deletes your SCST - Store data.

To redeploy the app, either use the same database password or follow these steps to erase the data on the volume:

  1. Deploy the metadata-store app by using kapp.
  2. Verify that the metadata-store-db-* pod fails.
  3. Run:

    kubectl exec -it metadata-store-db-<some-id> -n metadata-store -- /bin/bash
    

    Where <some-id> is the ID generated by Kubernetes and appended to the pod name.

  4. Run rm -rf /var/lib/postgresql/data/* to delete all database data.

    Where /var/lib/postgresql/data/* is the path found in postgres-db-deployment.yaml.

  5. Delete the metadata-store app by using kapp.

  6. Deploy the metadata-store app by using kapp.
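
Before erasing the volume, it can help to confirm which persistent volume backs the database claim and what reclaim policy it carries. The following is a minimal sketch and not part of the documented procedure. `store_pv_policy` is a hypothetical helper name, and the default claim name `postgres-db-pv-claim` is assumed from the "Missing persistent volume" section of this topic.

```shell
#!/bin/sh
# Hypothetical helper: reads "NAME POLICY CLAIM" rows on stdin and prints the
# name and reclaim policy of each volume bound to the given claim.
# The default claim name is an assumption taken from this topic.
store_pv_policy() {
  claim="${1:-postgres-db-pv-claim}"
  awk -v claim="$claim" '$3 == claim { print $1, $2 }'
}

# Example usage against a live cluster (requires kubectl access):
# kubectl get pv --no-headers \
#   -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name \
#   | store_pv_policy
```

If the policy printed for the volume is not Delete, the old data is expected to survive a redeploy, which matches the symptom described above.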

Missing persistent volume

Symptom

After SCST - Store is deployed, the metadata-store-db pod might fail because of a missing volume while the postgres-db-pv-claim PVC is in the PENDING state.

This happens when the cluster where SCST - Store is deployed has no storage class defined. The storage class's provisioner is responsible for creating the persistent volume after metadata-store-db attaches postgres-db-pv-claim.

Solution

To solve:

  1. Check whether your cluster has a storage class by running kubectl get storageclass.
  2. If none is listed, create a storage class in your cluster before deploying SCST - Store. For example:

    # This is the storageclass that Kind uses
    kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
    
    # set the storage class as default
    kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
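
To confirm that the patch took effect, you can list the storage classes that carry the default annotation. This is a minimal sketch. `default_storage_classes` is a hypothetical helper name, and it assumes the input is one tab-separated row per storage class, as produced by the `jsonpath` expression in the usage comment.

```shell
#!/bin/sh
# Hypothetical helper: reads "name<TAB>is-default-annotation" rows on stdin and
# prints the names of the storage classes annotated as default.
default_storage_classes() {
  awk -F '\t' '$2 == "true" { print $1 }'
}

# Example usage (requires kubectl access):
# kubectl get storageclass \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}' \
#   | default_storage_classes
```

Empty output suggests that no storage class is marked as default, in which case postgres-db-pv-claim is likely to stay in the PENDING state.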
    

Builds fail due to volume errors on EKS running Kubernetes v1.23

Symptom

When you install SCST - Store on an EKS cluster running Kubernetes v1.23, or upgrade an existing EKS cluster to Kubernetes v1.23, the database pod shows:

running PreBind plugin "VolumeBinding": binding volumes: provisioning failed for PVC "postgres-db-pv-claim"

Explanation

This is caused by the CSIMigrationAWS feature in this Kubernetes version, which requires you to install the Amazon Elastic Block Store (EBS) CSI Driver to use EBS volumes. For more information, see the AWS documentation.

SCST - Store uses the default storage class which uses EBS volumes by default on EKS.

Solution

Follow the AWS documentation to install the Amazon EBS CSI Driver before installing SCST - Store or before upgrading to Kubernetes v1.23.
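
One way to check whether the driver is already registered in the cluster is to look for its CSIDriver object. This is a minimal sketch rather than an official verification step. `has_ebs_csi_driver` is a hypothetical helper name, and it assumes the driver registers under the name `ebs.csi.aws.com`.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get csidriver -o name` output on stdin and
# succeeds only if the Amazon EBS CSI driver (assumed name: ebs.csi.aws.com)
# is listed.
has_ebs_csi_driver() {
  grep -q '/ebs\.csi\.aws\.com$'
}

# Example usage (requires kubectl access):
# if kubectl get csidriver -o name | has_ebs_csi_driver; then
#   echo "EBS CSI driver is registered"
# else
#   echo "EBS CSI driver not found; install it before proceeding"
# fi
```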

CA Cert expires

Symptom

The Scan Controller fails to communicate with the SCST - Store and receives an error message containing the following text:

tls: failed to verify certificate: x509: certificate has expired or is not yet valid

Explanation

The CA certificate expired before the app certificate, which causes the error even though the app certificate is still valid. cert-manager rotates the expired CA certificate, but it does not rotate the certificates that the expired CA certificate previously issued. As a result, Contour is left in a bad state and the certificates are not refreshed properly.

Solution

To solve:

  1. Delete the expired cacert secret, together with the contourcert and envoycert secrets, by running:

    kubectl delete secret cacert contourcert envoycert -n projectcontour
    
  2. Delete the contour-certgen job by running:

    kubectl delete job contour-certgen -n projectcontour
    
  3. Trigger reconciliation for contour by running:

    kctrl package installed kick --package-install contour -n tap-install
    
  4. Restart envoy pods. First find the name of the envoy pod by running:

    kubectl get pods -n projectcontour
    
  5. Find the pod that is named in the format envoy-<some-random-id>. Use that name and restart the pods by running:

    kubectl delete pod <envoy-pod-name> -n projectcontour
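
When several envoy pods are running, steps 4 and 5 can be combined by filtering the pod list instead of copying names by hand. This is a minimal sketch. `envoy_pod_names` is a hypothetical helper name, and the `-r` flag of `xargs` in the usage comment (skip the command when there is no input) is assumed to be available, as it is in GNU xargs.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods -o name` output on stdin and
# prints only the entries whose pod name starts with "envoy-".
# `|| true` keeps the exit status at zero when nothing matches.
envoy_pod_names() {
  grep '^pod/envoy-' || true
}

# Example usage (requires kubectl access; deletes every matching pod):
# kubectl get pods -n projectcontour -o name \
#   | envoy_pod_names \
#   | xargs -r kubectl delete -n projectcontour
```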
    

Certificate expires

Symptom

The Scan Controller fails to connect to SCST - Store.

The logs of the metadata-store-app pod show the following error:

$ kubectl logs deployment/metadata-store-app -c metadata-store-app -n metadata-store
...
2022/09/12 21:22:07 http: TLS handshake error from 127.0.0.1:35678: write tcp 127.0.0.1:9443->127.0.0.1:35678: write: broken pipe
...

or

The logs of metadata-store-db show the following error:

$ kubectl logs statefulset/metadata-store-db -n metadata-store
...
2022-07-20 20:02:51.206 UTC [1] LOG:  database system is ready to accept connections
2022-09-19 18:05:26.576 UTC [13097] LOG:  could not accept SSL connection: sslv3 alert bad certificate
...

Explanation

cert-manager rotates the certificates, but the metadata-store app and the PostgreSQL database are unaware of the change and continue to use the old certificates.

Solution

If you see TLS handshake error in the metadata-store-app logs, delete the metadata-store-app pod and wait for it to come back up.

kubectl delete pod metadata-store-app-xxxx -n metadata-store

If you see could not accept SSL connection in the metadata-store-db logs, delete the metadata-store-db pod and wait for it to come back up.

kubectl delete pod metadata-store-db-0 -n metadata-store
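
The two log checks above can be sketched as a small filter that reads logs and names the workload to restart. This is an illustration only. `suggest_restart` is a hypothetical helper name, and the matched strings are the error fragments quoted in this section.

```shell
#!/bin/sh
# Hypothetical helper: reads pod logs on stdin and prints the workload that the
# steps above suggest restarting, based on the error text found in the logs.
suggest_restart() {
  logs=$(cat)
  case "$logs" in
    *"TLS handshake error"*) echo "metadata-store-app" ;;
  esac
  case "$logs" in
    *"could not accept SSL connection"*) echo "metadata-store-db" ;;
  esac
}

# Example usage (requires kubectl access):
# kubectl logs deployment/metadata-store-app -c metadata-store-app -n metadata-store | suggest_restart
# kubectl logs statefulset/metadata-store-db -n metadata-store | suggest_restart
```

No output means that neither error fragment appears in the logs that were read.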

Database index corruption issue in SCST - Store

Metadata Store is unable to reconcile because the metadata-store-app pod reports a potential database index corruption issue:

kubectl logs metadata-store-app-pod_name -n metadata-store
{"level":"error","ts":"2023-08-15T16:38:31.528115988Z","logger":"MetadataStore","msg":"unable to check index corruption since user is not a superuser to perform \"CREATE EXTENSION amcheck\". Please create this extension and check for index corruption using following sql command \"SELECT bt_index_check(oid) FROM pg_class WHERE relname in (SELECT indexrelid::regclass::text FROM (SELECT indexrelid, indrelid, indcollation[i] coll FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s JOIN pg_collation c ON coll=c.oid WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX'));\"","hostname":"metadata-store-app-77c9fb59c8-qplxt"}
{"level":"error","ts":"2023-08-15T16:38:31.528139637Z","logger":"MetadataStore","msg":"Found corrupted database indexes but unable to fix them","hostname":"metadata-store-app-77c9fb59c8-qplxt","error":"unable to check index corruption since user is not a superuser to perform \"CREATE EXTENSION amcheck\". Please create this extension and check for index corruption using following sql command \"SELECT bt_index_check(oid) FROM pg_class WHERE relname in (SELECT indexrelid::regclass::text FROM (SELECT indexrelid, indrelid, indcollation[i] coll FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s JOIN pg_collation c ON coll=c.oid WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX'));\""}

For information about the solution, see Postgres Database Index Corruption.

Errors from Tanzu Developer Portal related to SCST - Store

Different Tanzu Developer Portal plug-ins use SCST - Store to display information about vulnerabilities and packages. Some errors visible in Tanzu Developer Portal are related to this connection.

An error occurred while loading data from the Metadata Store

Symptom

In the Supply Chain Choreographer plug-in, you see the error message An error occurred while loading data from the Metadata Store.

Screenshot of Tanzu Developer Portal displaying the error message about loading data from the metadata store.

Cause

There are multiple potential causes. The most common cause is tap-values.yaml missing the configuration that enables Tanzu Developer Portal to communicate with Supply Chain Security Tools - Store.

Solution

See Supply Chain Choreographer - Enable CVE scan results for the necessary configuration to add to tap-values.yaml. After adding the configuration, update your Tanzu Application Platform deployment or Tanzu Developer Portal deployment with the new values.
