Copyright 2021 VMware, Inc

SPDX-License-Identifier: BSD-2-Clause


Tanzu Kubernetes Grid Multicloud Private Deployment on Azure Playbook

This deployment is based on the Reference Architecture, Tanzu for Kubernetes Operations on Azure Hybrid-Cloud.

Quick Install

Configure your Tanzu Mission Control instance with Tanzu Observability and Tanzu Service Mesh integrations before running the quick install.


az login

sh <subscription_id> <tenant_id>
cd 3_bootstrap
ssh -o IdentitiesOnly=yes -i ./bootstrap.pem azureuser@xxx

Run the ssh command provided to access the jumpbox. On the jumpbox, run:

cd tkg-install
export TO_URL="http://your-wavefront-url"
export TO_TOKEN="your-wavefront-token"
export TMC_TOKEN="your-tmc-token"
chmod a+x ./finish-install.sh
./finish-install.sh <myvmw_username> <myvmw_password>

If you set export SKIP_TSM=true, the installation will skip installing Tanzu Service Mesh. If you do not set TO_TOKEN, the installation will skip installing Tanzu Observability. If you do not set TMC_TOKEN, the installation will skip Tanzu Mission Control, Tanzu Observability, and Tanzu Service Mesh.
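The skip rules above can be sketched as a small shell function. This is an illustration only: planned_integrations is a hypothetical helper, not part of finish-install.sh, and the labels tmc, to, and tsm are shorthand chosen here.

```shell
# Hypothetical helper (not part of finish-install.sh): prints which
# integrations would be installed for the current environment variables.
planned_integrations() {
  # Without TMC_TOKEN, TMC, Tanzu Observability, and Tanzu Service Mesh
  # are all skipped.
  if [ -z "${TMC_TOKEN:-}" ]; then
    echo "none"
    return 0
  fi
  plan="tmc"
  # Tanzu Observability is installed only when TO_TOKEN is set.
  if [ -n "${TO_TOKEN:-}" ]; then
    plan="$plan to"
  fi
  # Tanzu Service Mesh is installed unless SKIP_TSM=true.
  if [ "${SKIP_TSM:-}" != "true" ]; then
    plan="$plan tsm"
  fi
  echo "$plan"
}

planned_integrations
```

Running it before finish-install.sh is a quick way to confirm that the exported variables match what you intend to install.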

Repository Contents

This repository is broken into four components:

  • Keepers - defines long-lasting components that are used by terraform
  • Netsec - defines subnets, network security groups, and VNETs
  • DNS - used for Hybrid Azure deployments (not the quick install)
  • Bootstrap - used to create the jumpbox and prepopulate the jumpbox with relevant configurations

Each section is documented with its own README. The sections are broken out to align with separation of duties, so that each team can apply the Terraform for its own responsibilities.

If you are unable to use the wrapper to install Tanzu Kubernetes Grid Multicloud, you can follow this playbook to achieve the same results, or modify source code as necessary to achieve the changes you need.

  1. Apply Terraform code in 0_keepers

    • terraform apply -var="sub_id=..." -var="tenant_id=..."
    • where … above represent your respective values
  2. Execute the run_cmd output instructions from 0_keepers

    • OS requirements will vary
    • For example, in Bash: export ARM_ACCESS_KEY="$(terraform output -raw access_key)"
    • For example, in PowerShell: $env:ARM_ACCESS_KEY=(terraform output -raw access_key)
  3. Apply Terraform code in 1_netsec

    • terraform apply -var="sub_id=..."
    • where … above represent your respective values
  4. Apply Terraform code in 2_dns (as-needed)

    • This option only applies if you need a non-Azure source to resolve Azure Private DNS
    • terraform apply -var="sub_id=..."
    • where … above represent your respective values
  5. Apply Terraform code in 3_bootstrap

    • terraform apply -var="sub_id=..."
    • where … above represent your respective values
    • ssh_cmd output will vary by OS, so mind your rules (if you’re on Windows, you’ll probably have to fix the ACLs on this file)
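The manual steps above can be strung together as in the following sketch. It is not the repository's wrapper script. DRY_RUN, INCLUDE_DNS, SUB_ID, and TENANT_ID are names assumed here for illustration, and the script only prints the commands unless DRY_RUN=0 is set. It relies on the -chdir global option of the terraform CLI and assumes it is run from the repository root.

```shell
# Sketch only: prints the terraform commands by default (DRY_RUN=1).
DRY_RUN="${DRY_RUN:-1}"
SUB_ID="${SUB_ID:-...}"            # placeholder, as in the steps above
TENANT_ID="${TENANT_ID:-...}"      # placeholder, as in the steps above
INCLUDE_DNS="${INCLUDE_DNS:-0}"    # hypothetical switch for the optional 2_dns step

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# apply_dir <directory> [terraform apply arguments...]
apply_dir() {
  dir="$1"; shift
  run terraform -chdir="$dir" init &&
    run terraform -chdir="$dir" validate &&
    run terraform -chdir="$dir" apply "$@"
}

deploy_all() {
  apply_dir 0_keepers -var="sub_id=$SUB_ID" -var="tenant_id=$TENANT_ID" || return 1
  # Step 2: expose the state storage access key to the later directories.
  if [ "$DRY_RUN" = "1" ]; then
    echo '+ export ARM_ACCESS_KEY="$(terraform -chdir=0_keepers output -raw access_key)"'
  else
    ARM_ACCESS_KEY="$(terraform -chdir=0_keepers output -raw access_key)" || return 1
    export ARM_ACCESS_KEY
  fi
  apply_dir 1_netsec -var="sub_id=$SUB_ID" || return 1
  if [ "$INCLUDE_DNS" = "1" ]; then
    apply_dir 2_dns -var="sub_id=$SUB_ID" || return 1
  fi
  apply_dir 3_bootstrap -var="sub_id=$SUB_ID"
}

deploy_all
```

Review the printed commands first. With DRY_RUN=0, each terraform apply still prompts for confirmation before changing anything.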


Modify terraform.tfvars as necessary for each Terraform config directory. Anything uncommented or otherwise added to the tfvars file will override defaults and effectively escape the validation deployment that is the default. If the architecture of the code is maintained, then changes to 0_keepers will carry forward into subsequent steps.

Terraform, Infrastructure as Code

The examples given were designed to use an Azure Storage Account as a remote backend for Terraform’s state management. “Keepers” (below) is a prerequisite and is not stored in remote state; in fact, it establishes the place where remote state can be stored.

The following components are divided so that a step can be skipped when what it provides is already supplied elsewhere, either by a pre-existing service or by Central IT. Each component supplies a set of resources that are intended to be passed forward, ideally through secret storage within a secure vault.

Terraform runtime:

  • Terraform v1.1.7*
  • hashicorp/azurerm v2.98.0*

* See individual component directories for updates to these versions.

The components are as follows:

  • Keepers
  • Network and Network Security
  • DNS (Intermediate)
  • Deployment Prerequisites
  • Tanzu Bootstrap

The code makes the following assumptions on the consumer’s behalf. Modify them if your preferences differ.


Tag Names - In addition to those listed within the terraform.tfvars files, “StartDate” is in use within the code as an origin date in case it’s important to track that for resources. It’s set once when the resource is created, and it should never be changed thereafter (by Terraform). Additional tags can be added to the map var type in terraform.tfvars.

Azure Cloud - This has never been built for anything outside of the standard “AzureCloud.” Your mileage may vary on China or Government types.

Naming - The naming practice used herein is based on the published Microsoft Cloud Adoption Framework.


You will likely have to modify this to fit your customer’s needs. The liberties I’ve taken over this framework are as follows:

<resource-type>-<bu>-<app>-<env>-<region-abbrv>-### where ### is useful where multiples are generated (automatic). Otherwise, it’s not used. What’s more, the naming standard is entirely based upon the various prefix vars collected in terraform.tfvars. You are allowed to format those prefixes however you like, so the rules above are just suggestions. The only enforcement takes place at the resource level where <resource-type> is prepended to your prefix per Microsoft’s guidelines where applicable, and suffixes are added in situations to maintain uniqueness.

  • Resource-Type is aligned to Microsoft published guidelines where possible
  • region-abbrv can (and should) be an abbreviation. These examples are country-first and 4 characters:

East US 2 = use2
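The pattern <resource-type>-<bu>-<app>-<env>-<region-abbrv>-### can be illustrated with a short shell function. This is a sketch, not code from the repository. build_name is a hypothetical helper, and the sample values (vmw, tkg, dev) are placeholders chosen for the example.

```shell
# build_name <resource-type> <bu> <app> <env> <region-abbrv> [index]
# Hypothetical helper that assembles a name following the pattern above.
build_name() {
  name="$1-$2-$3-$4-$5"
  # The zero-padded ### suffix is only added when an index is supplied,
  # that is, when multiples of a resource are generated.
  if [ -n "${6:-}" ]; then
    name="$name-$(printf '%03d' "$6")"
  fi
  echo "$name"
}

build_name vnet vmw tkg dev use2     # prints vnet-vmw-tkg-dev-use2
build_name vm vmw tkg dev use2 1     # prints vm-vmw-tkg-dev-use2-001
```

In the Terraform code itself, everything after the resource type comes from the prefix variables in terraform.tfvars, so the function only mirrors the suggested layout.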

Modules - Modules used herein are the epitome of assumptions. These modules have been constructed to perform a set of tasks against categorical resources to produce standardization. This is because they represent those parts of an organization that may perform work on the Tanzu Kubernetes Grid Multicloud platform owner’s behalf. For instance, the subnet modules can create route tables and associate Network Security Groups as well. The important part of these modules is ultimately the output, and therefore you may arrive at these outputs in any number of ways.


“Keepers” are those resources that preempt the state-managed resources deployed by Terraform for this solution. They do not need to be dedicated to the Tanzu Kubernetes Grid Multicloud solution! Keepers currently include a Storage Account for state and an Azure Key Vault for secret storage.

IMPORTANT Update terraform.tfvars for your environment.

keepers - providers

Providers are maintained for all deployment directories out of the keepers directory. provider.tftpl is used to construct those files in future directories, so it’s important to understand that relationship.

keepers - terraform.tfvars

  • sub_id: Azure Subscription ID
  • location: Azure Region (For example, eastus2 or East US 2)
  • prefix: A prefix to resource names using your naming standards (For example, vmw-use2-svcname)
  • prefix_short: Some resources are limited in size and characters - this prefix solves for those (For example, vmwuse2svc). Can include 4 digits of randomized hexadecimal at the end
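One way to derive a prefix_short candidate from a longer prefix is sketched below. short_prefix is a hypothetical helper, not something the repository provides. It lowercases the input, strips everything except letters and digits (Azure storage account names, for instance, allow only lowercase letters and numbers), and appends four random hexadecimal digits.

```shell
# short_prefix <prefix>
# Hypothetical helper: derives a prefix_short candidate from a longer prefix.
short_prefix() {
  # Lowercase the input and keep only letters and digits.
  base="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cd 'a-z0-9')"
  # Two random bytes rendered as four hexadecimal digits.
  suffix="$(od -An -N2 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s%s\n' "$base" "$suffix"
}

short_prefix "vmw-use2-svc"   # for example vmwuse2svc3fa9 (the suffix varies)
```

Check the result against the length limit of the most restrictive resource you deploy before placing it in terraform.tfvars.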

Tag values default to the tags at the Subscription level, but are designed to be overridden by anything provided here.

  • ServiceName: Free text to name or describe your application, solution, or service
  • BusinessUnit: Should align with a predetermined list of known BUs
  • Environment: Should align with a predetermined list of environments
  • OwnerEmail: A valid email for a responsible person or group of the resource(s)
  • <Optional Tags>: Such as RequestorEmail

**from the 0_keepers sub-directory**
terraform init
terraform validate
terraform apply

Network and Security

NetSec should be replaced by a solution wherein the Central IT team provides these details where necessary. Specifically, Central IT should build the VNET to be in compliance with ExpressRoute requirements and allow the development team to add their own subnets and Network Security Groups (see Azure Landing Zones).

IMPORTANT Update terraform.tfvars for your environment.

  • storage_account_name: Storage account name pulled from the keepers.json where Terraform state will be stored in perpetuity
  • container_name: Like a folder - generally “terraform-state”
  • key: Further pathing for the storage and filename of your terraform state - must be unique (For example, bu/product/use2-s-net.tfstate)
  • access_key: This can be found in your keepers.json and is the access_key credential used to read and write against the keeper storage account - SENSITIVE
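These four values correspond to the settings of Terraform's azurerm backend. If you ever need to point a directory at the state storage by hand, one option is partial backend configuration via terraform init -backend-config. The sketch below only prints such a command. backend_init_cmd and the storage account name examplestate01 are hypothetical, and whether this repository's generated files expect partial configuration is an assumption to verify against each directory's README. The access key is deliberately left out of the command line; the backend can read it from the ARM_ACCESS_KEY environment variable.

```shell
# backend_init_cmd <storage_account_name> <container_name> <key>
# Hypothetical helper: prints a terraform init command that uses partial
# backend configuration. The access key is expected in ARM_ACCESS_KEY.
backend_init_cmd() {
  printf 'terraform init -backend-config="storage_account_name=%s" -backend-config="container_name=%s" -backend-config="key=%s"\n' \
    "$1" "$2" "$3"
}

# examplestate01 is a placeholder; use the value from your keepers.json.
backend_init_cmd examplestate01 terraform-state bu/product/use2-s-net.tfstate
```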

netsec - terraform.tfvars (In addition to others listed previously…)

  • tkg_cluster_name: The cluster name used wherever naming pertains to the Tanzu CLI
  • core_address_space: The VNET address space - it’s the largest CIDR block specified for a network
  • boot_diag_sa_name: This name is passed to a storage account that is used for boot diagnostics - it should conform to Azure’s naming requirements for storage accounts
  • vault_resource_group_name: A Resource Group name provided by the output of 0_keepers
  • vault_name: The AKV name provided by the output of 0_keepers

netsec - user_subnets.tf

This file is used to define the subnets used for Tanzu Kubernetes Grid Multicloud and configure the subnets within Azure. Examples are provided, but the results are as follows (as defined within the associated modules):


Subnets are modified via the large local map that is passed into the subnet module. Maps provided this way can either be passed in directly as a single map answering the argument requirement of the module, or the module can be looped through while reading each key from the map. In the former case, all subnets will get the same Network Security Group. In the latter, each subnet gets its own Network Security Group. Network Security Groups are named within the map itself, so care should be given to that parameter.

**from the 1_netsec sub-directory**
terraform init
terraform validate
terraform apply



DNS, in this solution, represents a BIND9 forwarder for Azure Private DNS. In order for on-prem resources to resolve Private DNS resources, conditional or zone forwarding must be in place on-prem to point to these DNS servers.
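To make the forwarder role concrete, the sketch below writes a minimal BIND9 options file that forwards every query to 168.63.129.16, the Azure-provided resolver address that answers for Private DNS zones linked to the VNET. It is an illustration, not the configuration deployed by 2_dns. write_bind_options and the output path are hypothetical, and access controls (for example allow-recursion for your on-prem ranges) are omitted and would be needed in practice.

```shell
# write_bind_options <output-file>
# Hypothetical helper: writes a minimal forwarding-only BIND9 options file.
write_bind_options() {
  cat > "$1" <<'EOF'
options {
    directory "/var/cache/bind";
    recursion yes;
    // Forward every query to the Azure-provided resolver.
    forwarders { 168.63.129.16; };
    forward only;
};
EOF
}

# Written to a scratch location here; on a BIND9 host the file is typically
# /etc/bind/named.conf.options.
out="${TMPDIR:-/tmp}/named.conf.options.example"
write_bind_options "$out"
cat "$out"
```

The on-prem DNS servers then need a conditional forwarder for the private zone that points at the internal IPs of the BIND9 VMs.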

Update terraform.tfvars for your environment.

dns - terraform.tfvars (In addition to others listed above…)

  • subnet_name: Subnet name where DNS Forwarders will allocate internal IP(s) (output from 1_netsec)
  • vnet_name: The VNET name (pre-existing - is output from 1_netsec)
  • netsec_resource_group: The resource group name where the pre-existing VNET lives
  • bindvms: Count of VMs to deploy to host BIND9
  • boot_diag_sa_name: Pre-generated boot diagnostics storage account name (output from 1_netsec)

**from the 2_dns sub-directory**
terraform init
terraform validate
terraform apply

Tanzu Kubernetes Grid Automation

Bootstrap VM (3_bootstrap)


The Bootstrap VM is used for Tanzu Kubernetes Grid Multicloud deployment activities and is set up from the start with the Tanzu CLI and related binaries.

NOTE The bootstrap VM should be provided outbound access to the Internet during initial deployment to perform updates and pull software packages necessary for its role. The default deployment in this guide makes use of a NAT Gateway for the VNET and associated subnets to extend it.

IMPORTANT Update terraform.tfvars for your environment.

boot - terraform.tfvars (In addition to others listed above…)

  • subnet_name: Subnet name where the bootstrap VM will allocate an internal IP (output from 1_netsec)
  • vnet_name: The VNET name (pre-existing - is output from 1_netsec)
  • netsec_resource_group: The resource group name where the pre-existing VNET lives
  • boot_diag_sa_name: Pre-generated boot diagnostics storage account name (output from 1_netsec)

**from the 3_bootstrap sub-directory**
terraform init
terraform validate
terraform apply

The first management cluster is created through “kind” on the bootstrap VM, and the outputs from the IaC above (captured in Azure Key Vault) should be compiled into the resulting answer files.

Final Steps

Shell environment variables will need to be set for proxy configuration, and may be passed via the tfvars file as well:

export HTTP_PROXY="http://PROXY:PORT"
export HTTPS_PROXY="http://PROXY:PORT"

Docker proxy config will need to be set. Add the following section to /etc/systemd/system/docker.service.d/http-proxy.conf:
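A minimal sketch of such a drop-in, assuming the format from Docker's systemd proxy documentation (a [Service] section with Environment= lines). write_docker_proxy_conf is a hypothetical helper, and it writes to a scratch directory here so that it can be tried without root. On the bootstrap VM the target directory is /etc/systemd/system/docker.service.d.

```shell
# write_docker_proxy_conf <directory> <proxy-url>
# Hypothetical helper: writes http-proxy.conf into the given directory.
write_docker_proxy_conf() {
  mkdir -p "$1" || return 1
  cat > "$1/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$2"
Environment="HTTPS_PROXY=$2"
EOF
}

# Scratch location for a trial run; replace PROXY:PORT with your proxy.
conf_dir="${TMPDIR:-/tmp}/docker.service.d.example"
write_docker_proxy_conf "$conf_dir" "http://PROXY:PORT"
cat "$conf_dir/http-proxy.conf"

# After writing the real file, reload systemd and restart Docker:
#   sudo systemctl daemon-reload
#   sudo systemctl restart docker
```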


Docker will need to be restarted for this setting to take effect.

Apt does not use environmental proxy configurations, and instead uses its own file. You will need to modify (create as-needed) the file /etc/apt/apt.conf.d/proxy.conf with the following:

Acquire {
  HTTP::proxy "http://PROXY:PORT";
  HTTPS::proxy "http://PROXY:PORT";
}

Through Terraform, all other configuration files are written to the bootstrap VM with configuration values obtained through the Azure deployment. Sample configuration values are also provided for packages added to the cluster, should you wish to exercise those and add the packages. The default deployment uses one of tkgm-azure .ps1 or .sh to handle the cluster deployment automation. This script should have taken values from the deployment to do its work, but you may want to review the script(s) for accuracy depending on the changes you’ve made to the code.
