Improve application security and interoperability with VMware Tanzu Application Catalog metadata

This section provides an overview of the metadata available in the VMware Tanzu Application Catalog (Tanzu Application Catalog), explains how to access it and describes common usage scenarios.

Tanzu Application Catalog is a curated collection of production-ready open source application components, databases and runtimes. All the containers and charts in the catalog are tested on multiple platforms and packaged according to industry best practices. They are also continuously maintained and updated to ensure that users always have access to the latest and most secure versions.

To give developers and operators confidence in the catalog, the Tanzu Application Catalog is designed to meet the stringent security and transparency requirements of enterprise IT. Users can access critical details about the open-source libraries and binaries used in every container and chart through a comprehensive metadata system. This metadata allows developers and operators to independently verify that the containers and charts sourced from the Tanzu Application Catalog conform to enterprise licensing and compliance policies.

Overview

Every asset (container or Helm chart) in the Tanzu Application Catalog comes with supplementary metadata. This metadata consists of a JSON file that serves as a complete “bill of materials” for the asset. The JSON file contains information on how to consume the asset, its digest, its build and release dates, and a complete list of included sub-components or libraries with license information. This metadata is automatically generated every time the asset is released and is digitally signed to protect it from tampering.

The screenshot below illustrates an example of the metadata supplied with the Apache Helm chart from the Tanzu Application Catalog.

Metadata example

Usage scenarios

The metadata exposed in the Tanzu Application Catalog can be directly queried and retrieved by authorized third-party applications. This facilitates a number of key enterprise use cases.

Compliance with enterprise licensing policies

Enterprises that use open source software (OSS) typically define policies stating which OSS licenses are permitted for use by enterprise development teams, and the scope of usage for each license. For example, enterprises may only permit use of components which are licensed under the MIT license. Enforcement of these OSS licensing policies is key to ensure that the final software delivered by the enterprise is compliant with the licensing terms of each OSS component used, and to ensure that the enterprise’s intellectual property rights are protected.

Tanzu Application Catalog metadata provides a complete list of all the sub-components and libraries used in the asset, together with the associated license for each. This information can be used to enforce enterprise policies at catalog level - for example, allowing developers to launch only those applications from the catalog which have a non-GPL license, or only permitting developers to use runtime containers which have an MIT license.
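
A minimal sketch of such a catalog-level check follows. It assumes the component names and licenses have already been extracted from the metadata into a plain "component license" listing; that input format, the check_licenses helper and the allowlist are illustrative assumptions, not part of the Tanzu Application Catalog.

```shell
#!/bin/bash
# Hypothetical helper: reads "component license" pairs on stdin and reports
# every component whose license is not in the allowlist passed as arguments.
# Returns 0 when all components comply, 1 otherwise.
check_licenses() {
    local allowed=" $* "
    local component license status=0
    while read -r component license; do
        [ -z "${component}" ] && continue
        if [[ "${allowed}" != *" ${license} "* ]]; then
            echo "DENIED: ${component} (${license})"
            status=1
        fi
    done
    return "${status}"
}

# Illustrative data only: permit MIT and Apache-2.0 components.
printf '%s\n' "libfoo MIT" "libbar GPL-3.0-only" "libbaz Apache-2.0" \
    | check_licenses MIT Apache-2.0 || echo "Policy violations found"
```

With the sample data, only libbar is reported, and the non-zero return status can be used to block the asset in a pipeline.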

Better interoperability with enterprise applications

Tanzu Application Catalog users already have access to detailed release and test information for each asset through the Tanzu Application Catalog Web interface. However, many enterprises already have a predefined and approved toolchain for their developer teams and prefer to list and integrate assets from the Tanzu Application Catalog directly in this toolchain (in some cases, with additional filters applied).

For example, an enterprise might already maintain an internal service catalog for its development teams. Using Tanzu Application Catalog metadata, it can create a custom view or developer portal that displays a filtered list of assets from the Tanzu Application Catalog in this catalog - for example, only those assets using a specific “guaranteed compatible” version of a library or runtime. Tanzu Application Catalog metadata thus provides an amplified data tree for each asset that increases visibility and enables custom user experiences for internal tools.

Improved DevSecOps

Tanzu Application Catalog metadata includes a complete “bill of materials” for each asset. This information is extremely useful to enterprise security teams, as it gives them greater visibility into applications running in production environments. It allows them to understand how each application is built and more easily identify security vulnerabilities arising from specific application components or component versions.

Tanzu Application Catalog metadata also gives enterprise DevOps teams the information they need to make better upgrade decisions. Teams can use catalog metadata to calculate a delta between what is currently running in production and what is new, and then take an informed decision about the necessity of upgrading an application. Catalog metadata can also be combined with upstream project metrics (such as forks, activity or CVE history) to calculate risk scores for enterprise deployments.
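
The delta calculation can be sketched as follows, assuming the component lists for the running release and the candidate release have each been reduced to "name version" lines. The component_delta helper and the sample data are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints the lines of the candidate listing that do not
# appear in the current listing, that is, new or changed components.
component_delta() {
    local current_file=$1 candidate_file=$2
    # -F fixed strings, -x whole-line match, -v invert the match:
    # keep candidate lines that are absent from the current listing.
    grep -F -x -v -f "${current_file}" "${candidate_file}"
}

# Illustrative data only; real listings would be derived from the metadata.
workdir=$(mktemp -d)
printf '%s\n' "openssl 3.0.11" "zlib 1.2.13" > "${workdir}/production.txt"
printf '%s\n' "openssl 3.0.13" "zlib 1.2.13" "curl 8.5.0" > "${workdir}/candidate.txt"
component_delta "${workdir}/production.txt" "${workdir}/candidate.txt"
rm -rf "${workdir}"
```

For the sample listings, the output is the changed openssl line and the new curl line; zlib is unchanged and therefore omitted.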

Functional validation

The metadata in the Tanzu Application Catalog includes test results. Enterprise development teams can use this metadata to confirm that the assets in the catalog have been functionally validated on different platforms. This gives them the confidence to immediately start using catalog assets in their own development and thereby reduce the time-to-market for new projects.

Metadata components and access

The asset metadata contains the following:

  • Release information, such as the name, version and release date
  • Release tags
  • Asset digest (SHA)
  • List of sub-components included in the asset with license information
  • CVE and virus scan results
  • Verification and functional test results
  • Digital signature for integrity checks

Depending on the asset (container or Helm chart), the metadata will be different. Here is the list of possible files:

  • asset-spec.json: Detailed information about the content of the asset. Available for both containers and Helm charts.
  • clamav-antivirus-scan-results.txt: Information about the antivirus scan process. Available only for containers.
  • test-results.tar.gzip: Results of the test suite performed on this asset. Available for both containers and Helm charts.
  • trigger-info.json: Information about what triggered the current release of the asset, in JUnit report format. Available for both containers and Helm charts.
  • spdx.json: Software bill of materials in SPDX, an open standard for communicating bill of materials information, including components, licenses, copyrights, and security references. Available for containers and single VMs.
  • cve-trivy-scanner-output.json: CVEs detected in the system packages. It is generated only at build time. Available only for containers.
  • vulnerability-cvrf-report.xml: Same information as the CVEs Scan report using the CVRF standard format. Available only for containers.
  • source-container.tar.gz: Source code used to build the asset, including the Dockerfile. Available only for containers.

For provenance verification, a JSON Web Signature (JWS) is attached to every metadata file. The signature is created using RSA keys that are only available in the Tanzu Application Catalog pipeline. The signature header includes information (including the JWKS discovery endpoint) that the verifier can use to validate the signature.
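
As a rough illustration of how a verifier could read that header, the sketch below decodes the protected header of a JWS. It assumes the signature uses the JWS compact serialization (HEADER.PAYLOAD.SIGNATURE, each part base64url-encoded), which the text above does not confirm. The jws_header helper and the sample token are hypothetical, and decoding the header does not by itself verify anything.

```shell
#!/bin/bash
# Hypothetical helper: prints the decoded protected header of a JWS in
# compact serialization (assumed format: HEADER.PAYLOAD.SIGNATURE).
jws_header() {
    # Keep only the first dot-separated part, the protected header.
    local header=${1%%.*}
    # base64url -> base64: swap the URL-safe alphabet and restore padding.
    header=$(printf '%s' "${header}" | tr '_-' '/+')
    while [ $(( ${#header} % 4 )) -ne 0 ]; do
        header="${header}="
    done
    printf '%s' "${header}" | base64 --decode
}

# Sample token with a dummy payload and signature (illustrative only).
jws_header "eyJhbGciOiJSUzI1NiJ9.payload.signature"
echo
# prints {"alg":"RS256"}
```

A real verifier would then fetch the public keys from the endpoint named in the header and check the RSA signature with a JWS-capable library.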

This asset metadata can be accessed in three different ways.

  • Most users will interact with the Tanzu Application Catalog through the Web interface. In this case, the metadata is available for each asset as a downloadable, JSON-formatted file. This JSON file contains internal references to supplementary downloads such as antivirus scans and logs for the specific release of that asset.

  • Users can access Tanzu Application Catalog metadata using tac, the Tanzu Application Catalog command-line tool. This tool allows users to list and view details of assets in the catalog from the command line, enabling easy integration with shell scripts or other utilities.

  • Users without access to the Tanzu Application Catalog Web interface can access Tanzu Application Catalog metadata using the OCI registry. This access method is currently only supported for federal customers.

Method 1: Obtain metadata using the Tanzu Application Catalog Web interface

What to do first

Ensure:

Retrieve metadata for a single artifact

As an example, follow the steps below to access the metadata for the Apache Helm chart using the Tanzu Application Catalog Web interface:

  1. Log in to the Tanzu Application Catalog.
  2. Select your organization.
  3. Navigate to the detail page for the Apache Helm chart.
  4. In the “Build Time Reports” section, find and download the “Asset Specification” report.

    Metadata download

The report is a JSON-formatted file containing multiple sections. It can be read using any text editor or JSON-compatible client library, making it immediately usable in other applications.

Method 2: Obtain metadata from the OCI registry

Note

This method is currently supported only for federal customers.

Tanzu Application Catalog metadata is pushed along with its containers and charts to a remote registry, using the Open Container Initiative (OCI) specification for registries.

charts-index

The charts-index artifact is a special JSON file, bundled as an OCI artifact, that mimics the behavior of a Helm chart repository index.

The Helm CLI is not able to list charts stored in an OCI registry, and the OCI specification does not provide a way to list all assets stored in a registry. Therefore, the Tanzu Application Catalog charts-index provides a way to retrieve the list of available assets in a registry. This allows federal customers without access to the Tanzu Application Catalog Web interface to perform asset discovery.

The index lives under the charts-index repository and is tagged with the latest tag, using the pattern REGISTRY-NAME/PROJECT-NAME/charts-index:latest.

Artifact metadata

Tanzu Application Catalog artifact metadata is stored alongside the asset it belongs to, in the same repository. Tags are used to identify and mark the metadata, by appending a -metadata suffix to all published tags.

For example, for a regular container image or Helm chart with tags latest, 6.0.10 and 6.0-ubuntu-18, there is also a custom OCI artifact containing metadata with corresponding tags latest-metadata, 6.0.10-metadata and 6.0-ubuntu-18-metadata.
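
The naming convention can be expressed as a small helper. This is only a sketch: the metadata_refs function is hypothetical, and REGISTRY/apache stands in for a real repository path.

```shell
#!/bin/bash
# Hypothetical helper: given a repository and one or more published tags,
# prints the reference of the matching metadata artifact for each tag.
metadata_refs() {
    local repository=$1 tag
    shift
    for tag in "$@"; do
        echo "${repository}:${tag}-metadata"
    done
}

metadata_refs REGISTRY/apache latest 6.0.10 6.0-ubuntu-18
# prints:
#   REGISTRY/apache:latest-metadata
#   REGISTRY/apache:6.0.10-metadata
#   REGISTRY/apache:6.0-ubuntu-18-metadata
```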

What to do first

Ensure:

  • You have the jq program and OCI Registry As Storage (ORAS) CLI installed.

Retrieve metadata for a single artifact

The ORAS CLI oras is used to consume Tanzu Application Catalog metadata from an OCI registry.

As an example, use the command below to access the metadata for the Apache container. Replace the REGISTRY placeholder with the correct registry URL and the USERNAME and PASSWORD placeholders with your credentials for the Tanzu Application Catalog registry.

$ oras pull -u USERNAME -p PASSWORD REGISTRY/apache:latest-metadata

Following the tag naming convention explained previously, the metadata associated with the latest tag is tagged as latest-metadata. Therefore, the command above retrieves the metadata for the latest release of the Apache container image.
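
Passing the password with -p leaves it in the shell history and the process list. One possible alternative, sketched below, uses the --password-stdin flag of the ORAS CLI to read it from standard input instead. The pull_metadata helper and the TAC_PASSWORD variable are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: pulls the metadata artifact for REPOSITORY:TAG,
# reading the registry password from stdin instead of the command line.
pull_metadata() {
    local repository=$1 tag=$2 username=$3
    oras pull -u "${username}" --password-stdin "${repository}:${tag}-metadata"
}

# Usage (TAC_PASSWORD is an assumed environment variable holding the token):
#   printf '%s' "${TAC_PASSWORD}" | pull_metadata REGISTRY/apache latest USERNAME
```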

Retrieve all metadata

It is also possible to retrieve all Tanzu Application Catalog metadata from a given registry for containers and charts using a Bash script. An example script is shown below.

#!/bin/bash

PWD=$(pwd)
REGISTRY=$1
NAMESPACE=$2
USER=$3
PASSWORD=$4

if [ -z "${USER}" ] || [ -z "${PASSWORD}" ] || [ -z "${NAMESPACE}" ] || [ -z "${REGISTRY}" ]; then
    echo "Registry, namespace, user and password are required"
    echo "./batch-metadata-pull.sh [REGISTRY] [NAMESPACE] [USER] [PASSWORD]"
    exit 1
fi

# Step 1: Download the charts-index from Harbor instance
echo ""
echo "Executing command: docker run -v ${PWD}:/workspace bitnami/oras:1.1.0 pull -u USER -p PASSWORD ${REGISTRY}/${NAMESPACE}/charts-index:latest"
docker run -v ${PWD}:/workspace bitnami/oras:1.1.0 pull -u "${USER}" -p "${PASSWORD}" ${REGISTRY}/${NAMESPACE}/charts-index:latest
echo ""

# Step 2: Iterate over output file, downloading metadata
INDEX_FILE="asset-index.json"

# Iterate over each container
for row in $(cat "${INDEX_FILE}" | jq -r '.containers[] | @base64'); do
    # Helper: decode the current base64-encoded row and apply a jq filter to it
    _jq() {
        echo "${row}" | base64 --decode | jq -r "${1}"
    }

    NAME=$(_jq '.name')
    VERSIONS=$(_jq '.versions')
    echo "---------- ${NAME} container metadata ----------"
    for row in $(echo $VERSIONS | jq -r '.[] | @base64'); do
        VERSION=$(_jq '.version')
        DIGEST=$(echo $(_jq '.digest') | awk -F '@' '{print $2}')
        TAGS=$(echo $(_jq '.tags'))
        tag=''
        foundTag='no'
        for row in $(echo $TAGS | jq -r '.[] | @base64'); do
            fulltag=$(echo $row | base64 --decode)
            tag=$(echo $fulltag | awk -F ':' '{print $2}')
            # A simpler regex, -r[0-9]+$, would match any tag ending in '-rX'
            if [[ $tag =~ ^[0-9]+(\.[0-9]+\.[0-9]+)?(-[0-9]+)?-[a-z]+(-[0-9]+)?-r[0-9]+$ ]]; then
                foundTag='yes'
                break
            fi
        done
        if [[ $foundTag == 'no' ]]; then
            echo "ERROR: no immutable tag found for $NAME"
            echo "These are the tags:"
            echo $TAGS
            echo ""
            # Skip this version: without an immutable tag there is no metadata tag to pull
            continue
        fi
        # Create the same directory that is mounted into the container below
        TARGET_DIR="${PWD}/metadata/containers/${NAME}/${VERSION}"
        mkdir -p "${TARGET_DIR}"
        METADATA_ENDPOINT="${REGISTRY}/${NAMESPACE}/containers/${NAME}:${tag}-metadata"
        echo "Executing command: docker run -v ${TARGET_DIR}:/workspace bitnami/oras:1.1.0 pull -u USER -p PASSWORD ${METADATA_ENDPOINT}"
        docker run -v "${TARGET_DIR}":/workspace bitnami/oras:1.1.0 pull -u "${USER}" -p "${PASSWORD}" "${METADATA_ENDPOINT}"
    done
    echo ""
done


# Iterate over each chart
for row in $(cat "${INDEX_FILE}" | jq -r '.charts[] | @base64'); do
    # Helper: decode the current base64-encoded row and apply a jq filter to it
    _jq() {
        echo "${row}" | base64 --decode | jq -r "${1}"
    }

    NAME=$(_jq '.name')
    VERSIONS=$(_jq '.versions')
    echo "---------- ${NAME} chart metadata ----------"
    for row in $(echo $VERSIONS | jq -r '.[] | @base64'); do
        VERSION=$(_jq '.version')
        DIGEST=$(_jq '.digest')
        # Create the same directory that is mounted into the container below
        TARGET_DIR="${PWD}/metadata/charts/${NAME}/${VERSION}"
        mkdir -p "${TARGET_DIR}"
        METADATA_ENDPOINT="${REGISTRY}/${NAMESPACE}/charts/${NAME}:${VERSION}-metadata"
        echo "Executing command: docker run -v ${TARGET_DIR}:/workspace bitnami/oras:1.1.0 pull -u USER -p PASSWORD ${METADATA_ENDPOINT}"
        docker run -v "${TARGET_DIR}":/workspace bitnami/oras:1.1.0 pull -u "${USER}" -p "${PASSWORD}" "${METADATA_ENDPOINT}"
    done
    echo ""
done

This script can be used as follows, replacing the REGISTRY, PROJECT, USERNAME and PASSWORD placeholders with your Tanzu Application Catalog registry name, project (namespace) name, username and token:

$ ./batch-metadata-pull.sh REGISTRY PROJECT USERNAME PASSWORD

Note

The script performs a batch download and, therefore, its execution time depends on the number of assets in the registry.
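
The immutable-tag check that the script applies to container tags can be tried in isolation. The sketch below wraps the same pattern in a hypothetical is_immutable_tag function; the sample tags are assumptions chosen to show one match and two non-matches.

```shell
#!/bin/bash
# Returns 0 when TAG matches the immutable tag pattern used by the script,
# for example a tag ending in a '-rN' revision such as 2.4.58-debian-11-r3.
is_immutable_tag() {
    local tag=$1
    local pattern='^[0-9]+(\.[0-9]+\.[0-9]+)?(-[0-9]+)?-[a-z]+(-[0-9]+)?-r[0-9]+$'
    [[ ${tag} =~ ${pattern} ]]
}

# Illustrative tags only.
for tag in latest 2.4 2.4.58-debian-11-r3; do
    if is_immutable_tag "${tag}"; then
        echo "${tag}: immutable"
    else
        echo "${tag}: not immutable"
    fi
done
```

Only the last sample tag matches, so it is the one the script would use to build the metadata tag.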

Useful links
