Linux Custom Machine Images

This procedure walks you through building a Linux custom machine image to use when creating clusters on AWS, Azure, or vSphere. It is divided into the sections below. For more information about cluster types in Tanzu Kubernetes Grid, see Workload Cluster Types.

As noted within the procedures, some steps differ depending on whether you are building an image for a class-based or plan-based (legacy) cluster.

Linux Image Prerequisites

To build a Linux custom machine image, you need:

  • An account on your target infrastructure: AWS, Azure, or vSphere.
  • A macOS or Linux workstation with the following installed:
    • Docker Desktop
    • For AWS: The aws command-line interface (CLI)
    • For Azure: The az CLI
    • For vSphere: To build a RHEL 8 image, you need a Linux workstation; macOS is not supported for RHEL image builds.

(Class-Based, vSphere) Retrieve the OS Image Version

Before building an image to use for class-based clusters on vSphere, you must retrieve the OS image version that is associated with the default Ubuntu OVA for the Kubernetes version that you want to use for your custom image. You will assign this OS image version to your custom image in the Build a Linux Image step below.

To retrieve the OS image version, do one of the following depending on your use case:

  • If you have a running management cluster that was created using the default Kubernetes version for the current Tanzu Kubernetes Grid version, you can retrieve the OS image version from the cluster:

    1. Set your kubectl context to the management cluster.

    2. From the list of available TKrs, choose the Tanzu Kubernetes Release (TKr) for the Kubernetes version that you want to use for your custom image. For example, v1.26.8---vmware.2-tkg.1. To list available TKrs, run:

      kubectl get tkr
      
    3. Open the TKr and record the osImages property. This property specifies the names of OSImage objects associated with the TKr.

    4. List OSImage objects in Kubernetes:

      kubectl get osimages
      
    5. In the output, find the OSImage object listing that matches the TKr osImages name for the default Ubuntu OVA.

    6. Record the NAME property of the OSImage listing and replace its embedded --- with a + character. For example, v1.26.8+vmware.1-tkg.1-0edd4dafbefbdb503f64d5472e500cf8.

  • If you do not have a running management cluster that was created using the default Kubernetes version for the current Tanzu Kubernetes Grid version, you can retrieve the OS image version directly from the default Ubuntu OVA, either locally or from vSphere:

    • To retrieve the OS image version locally:

      1. Download the default Ubuntu OVA for your target Kubernetes version:

        1. Go to the Broadcom Support Portal and log in with your VMware customer credentials.
        2. Go to the Tanzu Kubernetes Grid downloads page.
        3. In the version drop-down, select 2.3.1.
        4. Download the OVA. For example, Ubuntu 2004 Kubernetes v1.26.8 OVA.
      2. Unpack the downloaded OVA using the extraction tool of your choice.
      3. In the unpacked directory, locate the .ovf file.
      4. In the .ovf file, search for the OVA VERSION property and record its value. For example, v1.26.8+vmware.1-tkg.1-0edd4dafbefbdb503f64d5472e500cf8. The property looks similar to the following:

        <Property ovf:key="VERSION" ovf:type="string" ovf:userConfigurable="false" ovf:value="v1.26.8+vmware.1-tkg.1-0edd4dafbefbdb503f64d5472e500cf8"/>
        
    • If you already uploaded the default Ubuntu OVA for your target Kubernetes version to vSphere, you can alternatively retrieve the OS image version by inspecting the OVA VM properties in the vSphere UI or by using the govc CLI. To use this method, retrieve the OS image version before converting the OVA VM to a template.

      • To retrieve the OS image version from the vSphere UI:

        1. Locate the OVA VM and open the Configure tab on the OVA VM summary page.
        2. Go to Settings > vApp Options.
        3. In the Properties table, locate the VERSION key and record its Default Value. For example, v1.26.8+vmware.1-tkg.1-0edd4dafbefbdb503f64d5472e500cf8.
      • To retrieve the OS image version using the govc CLI, run the govc vm.info command. For example:

        govc vm.info -json /dc0/vm/ubuntu-2004-kube-v1.26.8+vmware.1-tkg.1 | jq
        

        In the output, search for "Id": "VERSION" and record the value of the "DefaultValue" property. For example:

        {
        "Key": 10,
        "ClassId": "",
        "InstanceId": "",
        "Id": "VERSION",
        "Category": "Cluster API Provider (CAPI)",
        "Label": "VERSION",
        "Type": "string",
        "TypeReference": "",
        "UserConfigurable": false,
        "DefaultValue": "v1.26.8+vmware.1-tkg.1-0edd4dafbefbdb503f64d5472e500cf8",
        "Value": "",
        "Description": ""
        }
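If you script this lookup, the following shell helpers (a sketch, not part of the product tooling) extract the version string from either source: the first reads an unpacked .ovf file with grep/sed, and the second reads saved `govc vm.info -json` output with jq. The jq filter checks both `Id` and `id` because govc releases differ in how they capitalize JSON field names.

```shell
# Sketch: pull the OS image VERSION out of an unpacked .ovf file.
extract_version_from_ovf() {
  grep -o 'ovf:key="VERSION"[^>]*' "$1" \
    | sed -n 's/.*ovf:value="\([^"]*\)".*/\1/p'
}

# Sketch: pull the VERSION vApp property out of saved govc JSON output.
# Recursive descent (..) finds the property wherever it sits in the tree.
extract_version_from_govc_json() {
  jq -r '.. | objects
         | select((.Id? // .id? // "") == "VERSION")
         | .DefaultValue // .defaultValue' "$1"
}

# Usage:
#   extract_version_from_ovf UNPACKED-OVA-DIR/NAME.ovf
#   govc vm.info -json /dc0/vm/ubuntu-2004-kube-v1.26.8+vmware.1-tkg.1 > vm-info.json
#   extract_version_from_govc_json vm-info.json
```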
        

Build a Linux Image

  1. Set up authentication for your infrastructure:

    • vSphere: Create a credentials JSON file and fill in its values:

      {
      "cluster": "",
      "convert_to_template": "false",
      "create_snapshot": "true",
      "datacenter": "",
      "datastore": "",
      "folder": "",
      "insecure_connection": "false",
      "linked_clone": "true",
      "network": "",
      "password": "",
      "resource_pool": "",
      "template": "",
      "username": "",
      "vcenter_server": ""
      }
      
    • AWS: Log in to the aws CLI. Then authenticate and specify your region, if prompted:

      aws configure
      
    • Azure: Log in to the az CLI. Then create a configuration JSON file azure-sig.json and fill in the Azure-specific information. An example of such a file can be found here.

  2. Download the Linux resource bundle container from projects.registry.vmware.com:

    1. Ensure that your workstation can access the VMware image registry projects.registry.vmware.com.

    2. Download and run the container with the Kubernetes Linux binaries that Image Builder needs to build a Linux OVA:

      docker pull projects.registry.vmware.com/tkg/linux-resource-bundle:v1.26.8_vmware.1-tkg.1
      
      docker run -d -p 3000:3000 projects.registry.vmware.com/tkg/linux-resource-bundle:v1.26.8_vmware.1-tkg.1
      
  3. Download the Image Builder configuration directory:

    1. Determine the Image Builder configuration version that you want to build from.

      • Search the Broadcom Communities for TKG Image Builder to list the available versions.
      • Each Image Builder version corresponds to its compatible Kubernetes and Tanzu Kubernetes Grid versions. For example, TKG-Image-Builder-for-Kubernetes-v1_26_8---vmware_1-tkg_v2_3_1.zip builds a Kubernetes v1.26.8 image for Tanzu Kubernetes Grid v2.3.1.
      • If you need to create a management cluster, which you must do when you first install Tanzu Kubernetes Grid, choose the default Kubernetes version of your Tanzu Kubernetes Grid version. For example, in Tanzu Kubernetes Grid v2.3.1, the default Kubernetes version is v1.26.8. For workload clusters, you can also build a Kubernetes v1.25.13 or v1.24.17 image, in addition to v1.26.8.

      The steps below explain how to build a Kubernetes v1.26.8 image for Tanzu Kubernetes Grid v2.3.1.

    2. Download the configuration code zip file, and unpack its contents.

    3. cd into the TKG-Image-Builder- directory, so that the tkg.json file is in your current directory.

  4. (vSphere) Create a metadata.json file in the image builder directory that sets a version string to match what you list in your custom TKr in the later steps:

    • Class-based: Use the value that you retrieved in the Retrieve the OS Image Version step above, for example:

      {
      "VERSION": "v1.26.8+vmware.1-tkg.1-0edd4dafbefbdb503f64d5472e500cf8"
      }
      
    • Plan-based: The image-builder gives the OVAs that it creates a version string identical to that of the VMware-published OVAs, like v1.26.8+vmware.1-tkg.1. For custom images, VMware recommends replacing the -tkg.1 with a string meaningful to your organization, for example:

      {
      "VERSION": "v1.26.8+vmware.1-myorg.0"
      }
      
  5. Edit the tkg.json file to fill in <IP> and <PORT> settings and customizations for containerd_url and kubernetes_http_source, where:

    • IP corresponds to the IP of the machine running the Docker container.
    • PORT associates an unused port on the Docker host with port 3000 on the container, for example 3001:3000. The container exposes the artifacts over port 3000.
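If you prefer not to hand-edit the file, the placeholder substitution can be scripted. The helper below is a sketch (GNU sed syntax; on macOS use `sed -i ''` instead of `sed -i`), and the IP and port values in the example are hypothetical.

```shell
# Sketch: substitute the <IP> and <PORT> placeholders in tkg.json.
# GNU sed shown; on macOS, replace `sed -i` with `sed -i ''`.
fill_endpoint_placeholders() {
  local ip="$1" port="$2" file="${3:-tkg.json}"
  sed -i "s/<IP>/${ip}/g; s/<PORT>/${port}/g" "$file"
}

# Example: if you published the container with -p 3001:3000 and the
# Docker host is 192.168.1.10 (hypothetical values):
#   fill_endpoint_placeholders 192.168.1.10 3001
```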
  6. To include the following options, continue editing the tkg.json file:

    • Photon: If building a Photon-3 OVA, edit "extra_rpms" in tkg.json to reflect additional custom packages supported:

      "extra_rpms": "sysstat nfs-utils ethtool apparmor-parser"
      
    • STIG and CIS Hardening: To harden your custom Ubuntu image above default levels:

      1. Add a line that sets some or all of the following ansible_user_vars variables to true. They all default to false:

      2. STIG:

        • install_aide - Activate AIDE (Advanced Intrusion Detection Environment)
        • install_sshd_login_banner - Install DoD login banner
        • remove_existing_ca_certs - Keep DoD PKI Infrastructure
        • install_audispd_plugins - Install event multiplexor (audispd) plugins
      3. CIS:

        • install_aide - Activate AIDE (Advanced Intrusion Detection Environment)
        • install_clamav - Activate ClamAV AntiVirus
        • install_systemd_timesyncd - Use timesyncd instead of chrony
        • install_protect_kernel_defaults - Set kernel-protect defaults upstream
      4. Modify the custom_role_names setting by adding /home/imagebuilder/stig_ubuntu_2004 for STIG or /home/imagebuilder/cis_ubuntu_2004 for CIS.

      For example, for additional CIS hardening:

        "ansible_user_vars": "install_aide=true install_clamav=true install_systemd_timesyncd=true install_protect_kernel_defaults=true",
        "custom_role_names": "/home/imagebuilder/tkg /home/imagebuilder/cis_ubuntu_2004",
      
      Note

      Custom Photon images are not supported for additional hardening via ansible_user_vars.

    • FIPS: To build a FIPS-enabled image, remove the following line in tkg.json, if present:

      "ansible_user_vars": "install_fips=no"
      
    • Internet-Restricted: To build images for an internet-restricted environment that accesses the internet via an HTTP proxy server, add the following:

      "http_proxy": "http://proxy.acme.com:80",
      "https_proxy": "http://proxy.acme.com:80",
      "no_proxy": "localhost, 127.0.0.1, acme.com, 10.0.0.0/8"
      
    • GPU-Enabled Clusters: To build an image for a GPU-enabled cluster, add the following:

      "vmx_version": "17"
      

    You can add additional customizations to tkg.json or put them in a separate file customizations.json.
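Edits like the CIS hardening settings above can also be merged into tkg.json programmatically. The helper below is an illustrative sketch using jq; it prints the updated JSON to stdout so you can review it before overwriting the original file.

```shell
# Sketch: merge the CIS hardening settings into a tkg.json file with jq
# and print the result to stdout.
add_cis_hardening() {
  jq '.ansible_user_vars = "install_aide=true install_clamav=true install_systemd_timesyncd=true install_protect_kernel_defaults=true"
      | .custom_role_names = "/home/imagebuilder/tkg /home/imagebuilder/cis_ubuntu_2004"' "$1"
}

# Usage: add_cis_hardening tkg.json > tkg-cis.json
```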

  7. Collect the following parameter strings to plug into the docker command in the next step. Many of these specify docker run -v parameters that copy your current working directories into the /home/imagebuilder directory of the container used to build the image:

    • AUTHENTICATION: Copies your local CLI directory. Use:
      • vSphere: /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json
      • AWS: ~/.aws:/home/imagebuilder/.aws
      • Azure: ~/.azure:/home/imagebuilder/.azure
    • SOURCES: Copies the repo’s tkg.json file, which lists download sources for versioned OS, Kubernetes, and container network interface (CNI) images:
      • Use /PATH/TO/tkg.json:/home/imagebuilder/tkg.json
    • ROLES: Copies the repo’s tkg directory, which contains Ansible roles required by Image Builder:
      • Use /PATH/TO/tkg:/home/imagebuilder/tkg
    • TESTS: Copies a goss test directory designed for the image’s target infrastructure, OS, and Kubernetes version:
      • Use the filename of a file in the repo’s goss directory.
      • Example: amazon-ubuntu-1.26.8+vmware.1-goss-spec.yaml
    • CUSTOMIZATIONS: Copies a customizations file in JSON format.
      • See Customization in the Image Builder documentation.
      • Before making any modifications, consult with VMware Customer Reliability Engineering (CRE) for best practices and recommendations.
    • PACKER_VAR_FILES: A space-delimited list of the JSON files above that contain variables for Packer.
    • (Azure) AZURE-CREDS: Path to an Azure credentials file, as described in the Image Builder documentation.
    • COMMAND: Use a command like one of the following, based on the custom image OS. For vSphere and Azure images, the commands start with build-node-ova- and build-azure-sig-, respectively:
      • build-ami-ubuntu-2004: Ubuntu v20.04
      • build-ami-ubuntu-1804: Ubuntu v18.04
      • build-ami-amazon-2: Amazon Linux 2
      • build-node-ova-vsphere-ubuntu-2004: GPU-enabled clusters
  8. Using the strings above, run the Image Builder in a Docker container pulled from the VMware registry projects.registry.vmware.com.

    Omit the metadata.json line if you are not building an image for vSphere, and the --env-file line if you are not building an image for Azure:

    export ROLES="... the value for roles you created above"
    export SOURCES="... ..." 
    docker run -it --rm \
        -v $AUTHENTICATION \
        -v $SOURCES \
        -v $ROLES \
        -v /PATH/TO/goss/TESTS.yaml:/home/imagebuilder/goss/goss.yaml \
        -v /PATH/TO/metadata.json:/home/imagebuilder/metadata.json \
        -v /PATH/TO/CUSTOMIZATIONS.json:/home/imagebuilder/CUSTOMIZATIONS.json \
        --env PACKER_VAR_FILES="tkg.json CUSTOMIZATIONS.json" \
        --env-file AZURE-CREDS \
        --env IB_OVFTOOL=1 \
        projects.registry.vmware.com/tkg/image-builder:v0.1.13_vmware.2 \
        COMMAND
    
    Note

    This command may take several minutes to complete.

    Examples

    vSphere: The .ova file is saved to the local filesystem of your workstation. Mount the folder in which you want the OVA saved to /home/imagebuilder/output within the container. Then create the OVA using the container image:

    docker run -it --rm \
      -v /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json \
      -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
      -v $(pwd)/tkg:/home/imagebuilder/tkg \
      -v $(pwd)/goss/vsphere-ubuntu-1.26.8+vmware.1-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
      -v $(pwd)/metadata.json:/home/imagebuilder/metadata.json \
      -v /PATH/TO/OVA/DIR:/home/imagebuilder/output \
      --env PACKER_VAR_FILES="tkg.json vsphere.json" \
      --env OVF_CUSTOM_PROPERTIES=/home/imagebuilder/metadata.json \
      --env IB_OVFTOOL=1 \
      projects.registry.vmware.com/tkg/image-builder:v0.1.13_vmware.2 \
      build-node-ova-vsphere-ubuntu-2004
    

    GPU-Enabled clusters: include the customizations.json file created in the steps above when running the command to create the OVA:

    docker run -it --rm \
      -v /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json \
      -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
      -v $(pwd)/tkg:/home/imagebuilder/tkg \
      -v $(pwd)/goss/vsphere-ubuntu-1.26.8+vmware.1-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
      -v $(pwd)/metadata.json:/home/imagebuilder/metadata.json \
      -v $(pwd)/customizations.json:/home/imagebuilder/customizations.json \
      -v /PATH/TO/OVA/DIR:/home/imagebuilder/output \
      --env PACKER_VAR_FILES="tkg.json vsphere.json customizations.json" \
      --env OVF_CUSTOM_PROPERTIES=/home/imagebuilder/metadata.json \
      --env IB_OVFTOOL=1 \
      projects.registry.vmware.com/tkg/image-builder:v0.1.13_vmware.2 \
      build-node-ova-vsphere-ubuntu-2004
    

    RHEL: To build a RHEL OVA, you need to use a Linux machine, not macOS, because Docker on macOS does not support the --network host option.
    You must also register the OS as licensed with Red Hat and sign up for updates by adding the following to the docker run command above:

      -v $(pwd)/isos/rhel-8.4-x86_64-dvd.iso:/rhel-8.4-x86_64-dvd.iso \
      --network host \
      --env RHSM_USER=USER --env RHSM_PASS=PASS
    

    Where:

    • RHSM_USER and RHSM_PASS are the username and password for your Red Hat Subscription Manager account.
    • You map your local RHEL ISO path, $(pwd)/isos/rhel-8.4-x86_64-dvd.iso in the example above, as an additional volume.

    AWS: To create a custom image with Ubuntu v20.04 and Kubernetes v1.26.8 to run on AWS, run the following from the directory that contains tkg.json:

    docker run -it --rm \
        -v ~/.aws:/home/imagebuilder/.aws \
        -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
        -v $(pwd)/tkg:/home/imagebuilder/tkg \
        -v $(pwd)/goss/amazon-ubuntu-1.26.8+vmware.1-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
        -v /PATH/TO/CUSTOMIZATIONS.json:/home/imagebuilder/aws.json \
        --env PACKER_VAR_FILES="tkg.json aws.json" \
        projects.registry.vmware.com/tkg/image-builder:v0.1.13_vmware.2 \
        build-ami-ubuntu-2004
    
  9. Upload the image to your cloud provider.

    • For vSphere instructions, see Import the Base Image Template into vSphere in Prepare to Deploy Management Clusters to vSphere.
    • If you uploaded the default Ubuntu OVA for your target Kubernetes version to vSphere, delete the default OVA before uploading your custom OVA.

Create a TKr for the Linux Image

To make your Linux image the default for future Kubernetes versions, create a TKr based on it. Otherwise, skip to Use a Linux Image for a Workload Cluster.

The diagram below provides a high-level overview of how to create a TKr for a custom Linux image on vSphere.

Create a TKr

To create a TKr:

  1. From your ~/.config/tanzu/tkg/bom/ directory, open the TKr BoM corresponding to your custom image’s Kubernetes version. For example, for Kubernetes v1.26.8, the filename looks like tkr-bom-v1.26.8+vmware.1-tkg.1.yaml.

    If the directory lacks the TKr BoM file that you need, you can bring it in by deploying a cluster with the desired Kubernetes version, as described in Deploy a Cluster with a Non-Default Kubernetes Version.

    1. In the BoM file, find the image definition blocks for your infrastructure: ova for vSphere, ami for AWS, and azure for Azure. Each image definition block contains osinfo.name, osinfo.version, and osinfo.arch, where:

      • osinfo.name is the OS name. For example, ubuntu. To view the list of supported OSes, see Target Operating Systems.
      • osinfo.version is the OS version. For example, 20.04. To view the list of supported versions, see Target Operating Systems.
      • osinfo.arch is the OS arch. Supported value is amd64.
    2. To add a reference to your new OS image, add an image definition block under ova, ami, or azure, depending on your target infrastructure. Your image definition block must contain osinfo.name, osinfo.version, and osinfo.arch, as described above. Additionally, when adding an image definition block on:

      • vSphere:

        • name: Is a unique name for your OVA that includes the OS version, for example, my-ova-ubuntu-2004.
        • version: Use the unique VERSION assigned in metadata.json when you created the OVA, for example, v1.26.8+vmware.1-myorg.0.
        Note

        The version must exactly match the same VERSION in the metadata.json.

      • AWS: For each region that you plan to use the custom image in, follow the existing id value format, but use a unique hex string at the end, for example, ami-693a5e2348b25e428.

      If the BoM file defines images under regions, your custom image definition block must be listed first in its region. Within each region, the cluster creation process picks the first suitable image listed.

    3. In the release.version value, set a custom version by adding a suffix. Do not customize the version by adding a prefix. For example, change v1.26.8+vmware.1-tkg.1 to v1.26.8+vmware.1-tkg.1-mycustomtkr.

    4. Save the BoM file with the same custom suffix as you specified for release.version in the previous step.

      If the filename includes a plus (+) character, replace the + with a triple dash (---).

      For example, save the BOM file as tkr-bom-v1.26.8---vmware.2-tkg.1-mycustomtkr.yaml.

  2. base64-encode the file contents into a single-line string, for example:

    cat tkr-bom-v1.26.8---vmware.2-tkg.1-mycustomtkr.yaml | base64 -w 0
    
  3. Create a ConfigMap YAML file, for example named configmap-v1.26.8---vmware.2-tkg.1-mycustomtkr.yaml, with values as shown:

    apiVersion: v1
    kind: ConfigMap
    metadata:
     name: CUSTOM-TKG-BOM
     labels:
       tanzuKubernetesRelease: CUSTOM-TKR
    binaryData:
     bomContent: "BOM-BINARY-CONTENT"
    

    Where:

    • CUSTOM-TKG-BOM is the name of the ConfigMap, which must include the TKr release.version value that you specified in the BoM file, with any + symbols replaced by a triple dash (---). For example, set v1.26.8---vmware.2-tkg.1-mycustomtkr.
    • CUSTOM-TKR is a name for your TKr, which must match the value you specify for CUSTOM-TKG-BOM. For example, v1.26.8---vmware.2-tkg.1-mycustomtkr.
    • BOM-BINARY-CONTENT is the base64-encoded content of your customized BoM file that you generated in the previous step.

    For example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
     name: v1.26.8---vmware.2-tkg.1-mycustomtkr
     labels:
       tanzuKubernetesRelease: v1.26.8---vmware.2-tkg.1-mycustomtkr
    binaryData:
     bomContent: "YXBpVmVyc2lvbjogcnVuLnRhbnp1...."
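Steps 2 and 3 can be combined: the manifest above can be generated directly from the BoM file. The helper below is a sketch (it assumes GNU base64, whose -w 0 flag disables line wrapping; adjust on macOS) and is not part of the tanzu CLI.

```shell
# Sketch: emit a TKr ConfigMap manifest for a customized BoM file.
# TKR_NAME must be the release.version with "+" replaced by "---".
make_tkr_configmap() {
  local bom_file="$1" tkr_name="$2"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: ${tkr_name}
  labels:
    tanzuKubernetesRelease: ${tkr_name}
binaryData:
  bomContent: "$(base64 -w 0 < "${bom_file}")"
EOF
}

# Usage:
#   make_tkr_configmap tkr-bom-v1.26.8---vmware.2-tkg.1-mycustomtkr.yaml \
#     v1.26.8---vmware.2-tkg.1-mycustomtkr \
#     > configmap-v1.26.8---vmware.2-tkg.1-mycustomtkr.yaml
```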
    
  4. Save the ConfigMap file, set the kubectl context to the management cluster that you want to add the TKr to, and apply the file to the cluster, for example:

    kubectl -n tkr-system apply -f configmap-v1.26.8---vmware.2-tkg.1-mycustomtkr.yaml
    

    The TKr Controller reconciles the new ConfigMap object by creating a TanzuKubernetesRelease. The default reconciliation period is 600 seconds. You can avoid this delay by deleting the TKr Controller pod, which causes the restarted pod to reconcile immediately:

    1. List pods in the tkr-system namespace:

      kubectl get pod -n tkr-system
      
    2. Retrieve the name of the TKr Controller pod, which looks like tkr-controller-manager-f7bbb4bd4-d5lfd.

    3. Delete the pod:

      kubectl delete pod -n tkr-system TKG-CONTROLLER
      

      Where TKG-CONTROLLER is the name of the TKr Controller pod.

  5. To check that the custom TKr was added, run tanzu kubernetes-release get or kubectl get tkr and look for the CUSTOM-TKR value set above in the output.

Once your custom TKr is listed by the kubectl and tanzu CLIs, you can use it to create management or workload clusters as described below.
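The + / --- substitution used throughout these steps (Kubernetes object names cannot contain a + character) is easy to get wrong by hand. These illustrative helpers, not part of the tanzu CLI, convert between the two forms:

```shell
# Kubernetes object names cannot contain "+", so TKr names replace it
# with "---". These helpers convert between the two forms.
tkr_name_from_version() { printf '%s\n' "$1" | sed 's/+/---/g'; }
version_from_tkr_name() { printf '%s\n' "$1" | sed 's/---/+/g'; }

# Example:
#   tkr_name_from_version "v1.26.8+vmware.2-tkg.1-mycustomtkr"
#   # -> v1.26.8---vmware.2-tkg.1-mycustomtkr
```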

Use a Linux Image for a Management Cluster

To create a management cluster that uses your custom image as the base OS for its nodes:

  1. When you run the installer interface, select the custom image in the OS Image pane, as described in Select the Base OS Image.

For more information, see How Base OS Image Choices are Generated.

Use a Linux Image for a Workload Cluster

The procedure for creating a workload cluster from your Linux image differs depending on whether you created a TKr in Create a TKr for the Linux Image above.

  • If you created a TKr, pass the TKr name as listed by tanzu kubernetes-release get to the --tkr option of tanzu cluster create.

  • If you did not create a TKr, follow these steps:

    1. Copy your management cluster configuration file and save it with a new name by following the procedure in Configuration Files and Object Specs.

    2. In the new configuration file, add or modify the following:

      VSPHERE_TEMPLATE: LINUX-IMAGE
      

      Where LINUX-IMAGE is the name of the Linux image you created in Build a Linux Image.

      Remove CLUSTER_NAME and its setting, if it exists.

    3. Deploy a workload cluster as described in Create Workload Clusters.
