Linux Custom Machine Images

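This procedure walks you through building a Linux custom machine image to use when creating clusters on AWS, Azure, or vSphere. It is divided into the following sections:

  • Linux Image Prerequisites
  • Build a Linux Image
  • Create a TKr for the Linux Image
  • Use a Linux Image for a Management Cluster
  • Use a Linux Image for a Workload Cluster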

Linux Image Prerequisites

To build a Linux custom machine image, you need:

  • An account on your target infrastructure: AWS, Azure, or vSphere.
  • A macOS or Linux workstation with the following installed:
    • Docker Desktop
    • For AWS: The aws command-line interface (CLI)
    • For Azure: The az CLI
    • For vSphere: To build a RHEL 8 image you need a Linux workstation, not macOS.

Build a Linux Image

  1. On AWS and Azure, log in to your infrastructure CLI. Authenticate and specify your region, if prompted:

    • AWS: Run aws configure.
    • Azure: Run az login.
  2. On vSphere, create a credentials JSON file and fill in its values:

    {
    "cluster": "",
    "convert_to_template": "false",
    "create_snapshot": "true",
    "datacenter": "",
    "datastore": "",
    "folder": "",
    "insecure_connection": "false",
    "linked_clone": "true",
    "network": "",
    "password": "",
    "resource_pool": "",
    "template": "",
    "username": "",
    "vcenter_server": ""
    }
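
Before running a build, it can help to confirm that no value in the credentials file was left blank. The following is a minimal sketch, not part of Image Builder: empty_fields is a hypothetical helper name, and it assumes the file keeps one "key": "value" pair per line, as in the template above. Some fields may legitimately stay empty in your environment, so treat the output as a checklist rather than an error report.

```shell
# Print the keys whose value is still an empty string in a JSON file
# that has one "key": "value" pair per line (an assumption of this sketch).
empty_fields() {
  # $1 = path to the credentials JSON file
  sed -n 's/^[[:space:]]*"\([^"]*\)":[[:space:]]*"",\{0,1\}[[:space:]]*$/\1/p' "$1"
}

# Usage (path is a placeholder):
#   empty_fields /PATH/TO/CREDENTIALS.json
```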
    
  3. Determine the Image Builder configuration version that you want to build from.

    • Search the VMware {code} Sample Exchange for TKG Image Builder to list the available versions.
    • Each Image Builder version corresponds to its compatible Kubernetes and Tanzu Kubernetes Grid versions. For example, TKG-Image-Builder-for-Kubernetes-v1_23_8---vmware_1-tkg-v1_6_0.zip builds a Kubernetes v1.23.8 image for Tanzu Kubernetes Grid v1.6.0.
    • If you need to create a management cluster, which you must do when you first install Tanzu Kubernetes Grid, choose the default Kubernetes version of your Tanzu Kubernetes Grid version. For example, in Tanzu Kubernetes Grid v1.6, the default Kubernetes version is v1.23.8.
  4. Download the configuration code zip file, and unpack its contents.

  5. cd into the TKG-Image-Builder- directory, so that the tkg.json file is in your current directory.

  6. Ensure that your workstation can access the VMware image registry projects.registry.vmware.com.

  7. Download and run the desired artifact container from projects.registry.vmware.com.

    docker pull projects.registry.vmware.com/tkg/linux-resource-bundle:v1.23.8_vmware.2-tkg.1
    
    docker run -d -p 3000:3000 projects.registry.vmware.com/tkg/linux-resource-bundle:v1.23.8_vmware.2-tkg.1
    
  8. Edit tkg.json to populate <IP> and <PORT>, where:

    • IP corresponds to the IP of the machine running the Docker container.
    • PORT associates an unused port on the Docker host with port 3000 on the container, for example 3001:3000. The container exposes the artifacts over port 3000.
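
As a rough sketch of the substitution in this step, the helper below replaces the literal <IP> and <PORT> placeholders with sed. The function name fill_placeholders and the address 10.0.0.5 are hypothetical, and the sketch assumes the placeholders appear in the file exactly as <IP> and <PORT>.

```shell
# Replace the literal <IP> and <PORT> placeholders in a file and
# write the result to stdout, leaving the original file untouched.
fill_placeholders() {
  # $1 = file to read (for example tkg.json)
  # $2 = IP of the machine running the Docker container
  # $3 = host port mapped to port 3000 on the container
  sed -e "s/<IP>/$2/g" -e "s/<PORT>/$3/g" "$1"
}

# Usage (hypothetical values); review the result before replacing tkg.json:
#   fill_placeholders tkg.json 10.0.0.5 3001 > tkg.json.new
```

If you pick a host port other than 3000, the artifact container needs to be started with the matching mapping, for example -p 3001:3000.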
  9. Node hardening is supported for Ubuntu and Photon images (CIS for Ubuntu, STIG for Photon). Add custom roles in the "custom_role_names" field in tkg.json, for example:

    "custom_role_names": "/home/imagebuilder/tkg, /home/imagebuilder/<cis/stig>"
    
  10. Internet-Restricted: To build images for an internet-restricted environment that accesses the internet through an HTTP proxy server, add the following to the tkg.json file:

    {
      "http_proxy": "http://proxy.acme.com:80",
      "https_proxy": "http://proxy.acme.com:80",
      "no_proxy": "localhost, 127.0.0.1, acme.com, 10.0.0.0/8"
    }
    
  11. GPU-Enabled Clusters: To build an image for a GPU-enabled cluster for vSphere, create a file named customizations.json and add the following:

    {
    "vmx_version": "17"
    }
    
  12. Save the customizations.json in the same directory as tkg.json, which you edited in a previous step.

  13. Collect the following parameter strings to plug into the command in the next step. Many of these specify docker run -v parameters that copy your current working directories into the /home/imagebuilder directory of the container used to build the image.

    • AUTHENTICATION: Copies your local CLI directory:
      • AWS: Use ~/.aws:/home/imagebuilder/.aws
      • Azure: Use ~/.azure:/home/imagebuilder/.azure
      • vSphere: /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json
    • SOURCES: Copies the repo’s tkg.json file, which lists download sources for versioned OS, Kubernetes, and container network interface (CNI) images:
      • Use /PATH/TO/tkg.json:/home/imagebuilder/tkg.json
    • ROLES: Copies the repo’s tkg directory, which contains Ansible roles required by Image Builder.
      • Use /PATH/TO/tkg:/home/imagebuilder/tkg
      • To add custom Ansible roles, edit the tkg.json file to reformat the custom_role_names setting with escaped quotes (\"), in order to make it a list with multiple roles. For example:
        "custom_role_names": "\"/home/imagebuilder/tkg /home/imagebuilder/mycustomrole\"",
    • TESTS: Copies a goss test directory designed for the image’s target infrastructure, OS, and Kubernetes version:
      • Use the filename of a file in the repo’s goss directory, for example amazon-ubuntu-1.23.8+vmware.2-goss-spec.yaml.
    • CUSTOMIZATIONS: Copies a customizations file in JSON format. See Customization in the Image Builder documentation. Before making any modifications, consult with VMware Customer Reliability Engineering (CRE) for best practices and recommendations.
    • PACKER_VAR_FILES: A space-delimited list of the JSON files above that contain variables for Packer.
    • (Azure) AZURE-CREDS: Path to an Azure credentials file, as described in the Image Builder documentation.
    • COMMAND: Use a command like one of the following, based on the custom image OS. For vSphere and Azure images, the commands start with build-node-ova- and build-azure-sig-:
      • build-ami-ubuntu-2004: Ubuntu v20.04
      • build-ami-ubuntu-1804: Ubuntu v18.04
      • build-ami-amazon-2: Amazon Linux 2
      • build-node-ova-vsphere-ubuntu-2004-efi: GPU-enabled clusters
  14. Using the strings above, run the Image Builder in a Docker container pulled from the VMware registry projects.registry.vmware.com:

    docker run -it --rm \
        -v AUTHENTICATION \
        -v SOURCES \
        -v ROLES \
        -v /PATH/TO/goss/TESTS.yaml:/home/imagebuilder/goss/goss.yaml \
        -v /PATH/TO/CUSTOMIZATIONS.json:/home/imagebuilder/CUSTOMIZATIONS.json \
        --env PACKER_VAR_FILES="tkg.json CUSTOMIZATIONS.json" \
        --env-file AZURE-CREDS \
        --env IB_OVFTOOL=1 \
        projects.registry.vmware.com/tkg/image-builder:v0.1.11_vmware.3 \
        COMMAND
    

    Notes:

    • Omit --env-file if you are not building an image for Azure.
    • This command may take several minutes to complete.

    For example, to create a custom image with Ubuntu v20.04 and Kubernetes v1.23.8 to run on AWS, running from the directory that contains tkg.json:

    docker run -it --rm \
        -v ~/.aws:/home/imagebuilder/.aws \
        -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
        -v $(pwd)/tkg:/home/imagebuilder/tkg \
        -v $(pwd)/cis:/home/imagebuilder/cis \
        -v $(pwd)/goss/amazon-ubuntu-1.23.8+vmware.2-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
        -v /PATH/TO/CUSTOMIZATIONS.json:/home/imagebuilder/aws.json \
        --env PACKER_VAR_FILES="tkg.json aws.json" \
        --env IB_OVFTOOL=1 \
        projects.registry.vmware.com/tkg/image-builder:v0.1.11_vmware.3 \
        build-ami-ubuntu-2004
    

    For vSphere, you must use the custom container image created above. You must also set a version string that matches the version you pass in your custom TKr in later steps. VMware-published OVAs have a version string like v1.23.8+vmware.2-tkg.1; VMware recommends replacing the -tkg.1 suffix with a string that is meaningful to your organization. To set this version string, define it in a metadata.json file like the following:

    {
      "VERSION": "v1.23.8+vmware.2-myorg.0"
    }
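
One way to derive the custom version string and write metadata.json is sketched below. The helper names custom_version and write_metadata are hypothetical, and the sketch assumes the published version string ends in -tkg.N, as in the example above.

```shell
# Swap the trailing -tkg.N suffix of a version string for a custom suffix.
custom_version() {
  # $1 = published version string, $2 = replacement suffix without the leading dash
  printf '%s\n' "$1" | sed "s/-tkg\.[0-9][0-9]*\$/-$2/"
}

# Write a metadata.json file that carries the version string.
write_metadata() {
  # $1 = version string, $2 = output path
  printf '{\n  "VERSION": "%s"\n}\n' "$1" > "$2"
}

# Usage (suffix is an example value):
#   write_metadata "$(custom_version 'v1.23.8+vmware.2-tkg.1' 'myorg.0')" metadata.json
```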
    

    When you build an OVA, the .ova file is saved to the local filesystem of your workstation. Mount the folder where you want the OVA saved to /home/imagebuilder/output within the container. Then, create the OVA using the container image:

    docker run -it --rm \
      -v /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json \
      -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
      -v $(pwd)/tkg:/home/imagebuilder/tkg \
      -v $(pwd)/cis:/home/imagebuilder/cis \
      -v $(pwd)/goss/vsphere-ubuntu-1.23.8+vmware.2-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
      -v $(pwd)/metadata.json:/home/imagebuilder/metadata.json \
      -v /PATH/TO/OVA/DIR:/home/imagebuilder/output \
      --env PACKER_VAR_FILES="tkg.json vsphere.json" \
      --env OVF_CUSTOM_PROPERTIES=/home/imagebuilder/metadata.json \
      --env IB_OVFTOOL=1 \
      projects.registry.vmware.com/tkg/image-builder:v0.1.11_vmware.3 \
      build-node-ova-vsphere-ubuntu-2004-efi
    

    For GPU-enabled clusters, include the customizations.json file created in the steps above when you run the command to create the OVA:

    docker run -it --rm \
      -v /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json \
      -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
      -v $(pwd)/tkg:/home/imagebuilder/tkg \
      -v $(pwd)/cis:/home/imagebuilder/cis \
      -v $(pwd)/goss/vsphere-ubuntu-1.23.8+vmware.2-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
      -v $(pwd)/metadata.json:/home/imagebuilder/metadata.json \
      -v $(pwd)/customizations.json:/home/imagebuilder/customizations.json \
      -v /PATH/TO/OVA/DIR:/home/imagebuilder/output \
      --env PACKER_VAR_FILES="tkg.json vsphere.json customizations.json" \
      --env OVF_CUSTOM_PROPERTIES=/home/imagebuilder/metadata.json \
      --env IB_OVFTOOL=1 \
      projects.registry.vmware.com/tkg/image-builder:v0.1.11_vmware.3 \
      build-node-ova-vsphere-ubuntu-2004-efi
    

    RHEL: To build a RHEL OVA you need to use a Linux machine, not macOS, because Docker on macOS does not support the --network host option.
    You must also include additional flags in the docker run command above, so that the container mounts your RHEL ISO rather than pulling from a public URL, and so that it can access Red Hat Subscription Manager credentials to connect to vCenter:

      -v $(pwd)/isos/rhel-8.4-x86_64-dvd.iso:/rhel-8.4-x86_64-dvd.iso \
      --network host \
      --env RHSM_USER=USER --env RHSM_PASS=PASS
    

    Where:

    • RHSM_USER and RHSM_PASS are the username and password that register licensed usage of the OS with Red Hat Subscription Manager to gain temporary access to RPM repositories.
    • You map your local RHEL ISO path, $(pwd)/isos/rhel-8.4-x86_64-dvd.iso in the example above, as an additional volume.

Create a TKr for the Linux Image

To make your Linux image the default for future Kubernetes versions and manage it using all the options detailed in Deploy Workload Clusters with Different Kubernetes Versions, create a TKr based on it. Otherwise, skip to Use a Linux Image for a Workload Cluster below.

To create a TKr, you add it to the Bill of Materials (BoM) of the TKr for the image’s Kubernetes version. For example, to add a custom image that you built with Kubernetes v1.23.8, you modify the current ~/.config/tanzu/tkg/bom/tkr-bom-v1.23.8+vmware.2-tkg.1.yaml file.

  1. From your ~/.config/tanzu/tkg/bom/ directory, open the TKr BoM corresponding to your custom image’s Kubernetes version. For example, for Kubernetes v1.23.8, open a file with a name like tkr-bom-v1.23.8+vmware.2-tkg.1.yaml.

  2. In the BoM file, find the image definition blocks for your infrastructure: ova for vSphere, ami for AWS, and azure for Azure.

  3. Determine whether an existing definition block applies to your image’s OS, as listed by osinfo.name, .version, and .arch.

  4. If no existing block applies to your image’s osinfo, add a new block as follows. If an existing block does apply, replace its values as follows:

    • vSphere:
      • name: a unique name for your OVA that includes the OS version, like my-ubuntu-2004
      • version: follow existing version value format, but use the unique VERSION assigned in metadata.json when you created the OVA, for example v1.23.8+vmware.2-myorg.0.
    • AWS - for each region that you plan to use the custom image in:
      • id: follow existing id value format, but use a unique hex string at the end, for example ami-693a5e2348b25e428
    • Azure:
      • sku: a unique SKU for your image that includes the OS version, like my-k8s-1dot23dot8-ubuntu-2004

    If the BoM file defines images under regions, your new or modified custom image definition block must be listed first in its region. Within each region, the cluster creation process picks the first suitable image listed.

  5. In the release.version value, set a custom version by adding a suffix.

    For example, change v1.23.8+vmware.2-tkg.1 to v1.23.8+vmware.2-tkg.1-mycustomtkr. Do not customize the version by adding a prefix.

  6. Save the BoM file with the same custom suffix as you specified for release.version in the previous step.

    If the filename includes a plus (+) character, replace the + with a triple dash (---).

    For example, save the BoM file as tkr-bom-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml.
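
The naming rule above can be sketched as a small helper. The function name bom_filename is hypothetical; it only applies the plus-to-triple-dash replacement and the tkr-bom- prefix shown in the example.

```shell
# Build the BoM filename for a release.version value:
# every + becomes ---, then the tkr-bom- prefix and .yaml extension are added.
bom_filename() {
  # $1 = release.version value, including the custom suffix
  printf 'tkr-bom-%s.yaml\n' "$(printf '%s' "$1" | sed 's/+/---/g')"
}

# Usage:
#   bom_filename 'v1.23.8+vmware.2-tkg.1-mycustomtkr'
```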

  7. base64-encode the file contents into a binary string, for example:

    cat tkr-bom-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml | base64 -w 0
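
If the base64 on your workstation does not accept the -w option, stripping newlines with tr gives an equivalent single-line result. The sketch below is an assumption-labelled alternative, not the documented command: encode_bom is a hypothetical helper name, and the decode check in the usage note assumes your base64 accepts -d.

```shell
# Encode a file as a single-line base64 string without relying on -w.
encode_bom() {
  # $1 = path to the BoM file
  base64 < "$1" | tr -d '\n'
}

# Usage: encode, then confirm that the string decodes back to the same file.
#   encoded=$(encode_bom tkr-bom-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml)
#   printf '%s' "$encoded" | base64 -d | cmp -s - tkr-bom-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml && echo "round trip ok"
```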
    
  8. Create a ConfigMap YAML file, for example named configmap-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml, with values as shown:

    apiVersion: v1
    kind: ConfigMap
    metadata:
     name: CUSTOM-TKG-BOM
     labels:
       tanzuKubernetesRelease: CUSTOM-TKR
    binaryData:
     bomContent: BOM-BINARY-CONTENT
    

    Where:

    • CUSTOM-TKG-BOM is the name of the ConfigMap, which must include the TKr release.version value that you specified in the BoM file, with any + symbols replaced by a triple dash (---). For example, set v1.23.8---vmware.2-tkg.1-mycustomtkr.
    • CUSTOM-TKR is a name for your TKr, which must match the value you specify for CUSTOM-TKG-BOM. For example, v1.23.8---vmware.2-tkg.1-mycustomtkr.

    • BOM-BINARY-CONTENT is the base64-encoded content of your customized BoM file, which you generated in the previous step.

    For example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
     name: v1.23.8---vmware.2-tkg.1-mycustomtkr
     labels:
       tanzuKubernetesRelease: v1.23.8---vmware.2-tkg.1-mycustomtkr
    binaryData:
     bomContent: BOM-BINARY-CONTENT
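
To avoid pasting the encoded string by hand, the ConfigMap file can be generated from a template. This is a minimal sketch: write_configmap is a hypothetical helper, and it assumes you pass a TKr name in which + is already replaced by --- together with the single-line base64 string from the previous step.

```shell
# Write a ConfigMap YAML file for a custom TKr BoM.
write_configmap() {
  # $1 = TKr name, used for both metadata.name and the tanzuKubernetesRelease label
  # $2 = base64-encoded BoM content on a single line
  # $3 = output file
  cat > "$3" <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
 name: $1
 labels:
   tanzuKubernetesRelease: $1
binaryData:
 bomContent: $2
EOF
}

# Usage (ENCODED holds the base64 string generated in the previous step):
#   write_configmap v1.23.8---vmware.2-tkg.1-mycustomtkr "$ENCODED" configmap-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml
```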
    
  9. Save the ConfigMap file, set the kubectl context to the management cluster that you want to add the TKr to, and apply the file to the cluster, for example:

    kubectl -n tkr-system apply -f configmap-v1.23.8---vmware.2-tkg.1-mycustomtkr.yaml
    
    
    • Once the ConfigMap is created, the TKr Controller reconciles the new object by creating a TanzuKubernetesRelease.
      The default reconciliation period is 600 seconds. You can avoid this delay by deleting the TKr Controller pod, which causes the pod to be recreated and to reconcile immediately:

      1. List pods in the tkr-system namespace:

        kubectl get pod -n tkr-system
        
      2. Retrieve the name of the TKr Controller pod, which looks like tkr-controller-manager-f7bbb4bd4-d5lfd

      3. Delete the pod:

        kubectl delete pod -n tkr-system TKG-CONTROLLER
        

      Where TKG-CONTROLLER is the name of the TKr Controller pod.
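
The list, retrieve, and delete steps above can be combined in a pipeline. The following is a sketch under the assumption that the pod name starts with tkr-controller-manager-, as in the example name shown; tkr_controller_pod is a hypothetical helper that only filters text and does not contact the cluster itself.

```shell
# Read `kubectl get pod -o name` output on stdin and print the first
# entry whose name starts with tkr-controller-manager-.
tkr_controller_pod() {
  grep '^pod/tkr-controller-manager-' | head -n 1
}

# Usage against a live management cluster (not run here):
#   kubectl get pod -n tkr-system -o name | tkr_controller_pod | xargs kubectl delete -n tkr-system
```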

  10. To check that the custom TKr was added, run tanzu kubernetes-release get or kubectl get tkr, and look for the CUSTOM-TKR value set above in the output.

Once your custom TKr is listed by the kubectl and tanzu CLIs, you can use it to create management or workload clusters as described below.

Use a Linux Image for a Management Cluster

To create a management cluster that uses your custom image as the base OS for its nodes:

  1. Upload the image to your cloud provider.

  2. When you run the installer interface, select the custom image in the OS Image pane, as described in Select the Base OS Image.

For more information, see How Base OS Image Choices are Generated.

Use a Linux Image for a Workload Cluster

The procedure for creating a workload cluster from your Linux image differs depending on whether you created a TKr in Create a TKr for the Linux Image above.

  • If you created a TKr, pass the TKr name as listed by tanzu kubernetes-release get to the --tkr option of tanzu cluster create.

  • If you did not create a TKr, follow these steps:

    1. Copy your management cluster configuration file and save it with a new name by following the procedure in Create a Workload Cluster Configuration File.

    2. In the new configuration file, add or modify the following:

      VSPHERE_TEMPLATE: LINUX-IMAGE
      

      Where LINUX-IMAGE is the name of the Linux image you created in Build a Linux Image.

      Remove CLUSTER_NAME and its setting, if it exists.

    3. Deploy a workload cluster as described in Deploy Workload Clusters.
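
The configuration edits in step 2 can be sketched as a copy that drops CLUSTER_NAME and sets VSPHERE_TEMPLATE. The helper name make_workload_config is hypothetical, and the sketch assumes a flat configuration file with one KEY: value setting per line; it does not replace the procedure in Create a Workload Cluster Configuration File.

```shell
# Copy a cluster configuration file, removing CLUSTER_NAME and any existing
# VSPHERE_TEMPLATE setting, then append the custom image name.
make_workload_config() {
  # $1 = management cluster configuration file
  # $2 = new workload cluster configuration file
  # $3 = name of the custom Linux image (LINUX-IMAGE)
  grep -v -e '^CLUSTER_NAME:' -e '^VSPHERE_TEMPLATE:' "$1" > "$2"
  printf 'VSPHERE_TEMPLATE: %s\n' "$3" >> "$2"
}

# Usage (file names and image name are placeholders):
#   make_workload_config mgmt-config.yaml workload-config.yaml my-ubuntu-2004
```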
