Management Cluster Configuration for AWS

To create a cluster configuration file, you can copy an existing configuration file for a previous deployment to Amazon Web Services (AWS) and update it. Alternatively, you can create a file from scratch by using an empty template.
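
For example, assuming that you keep cluster configuration files in the default ~/.config/tanzu/tkg/clusterconfigs/ directory, a copy-and-rename along the following lines gives you a starting point. The file names are illustrative only:

    # Illustrative paths and file names; adjust them for your environment.
    cp ~/.config/tanzu/tkg/clusterconfigs/previous-aws-deployment.yaml \
       ~/.config/tanzu/tkg/clusterconfigs/aws-mgmt-cluster-config.yaml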

Important

Tanzu Kubernetes Grid v2.4.x is the last version of TKG that supports the creation of standalone TKG management clusters on AWS. The ability to create standalone TKG management clusters on AWS will be removed in the Tanzu Kubernetes Grid v2.5 release.

VMware now recommends that you use Tanzu Mission Control to create native AWS EKS clusters instead of creating new TKG management clusters on AWS. For information about how to create native AWS EKS clusters with Tanzu Mission Control, see Managing the Lifecycle of AWS EKS Clusters in the Tanzu Mission Control documentation.

For more information, see Deprecation of TKG Management and Workload Clusters on AWS and Azure in the VMware Tanzu Kubernetes Grid v2.4 Release Notes.

Management Cluster Configuration Template

The template below includes all of the options that are relevant to deploying management clusters on AWS. You can copy this template and use it to deploy management clusters to AWS.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.

#! ---------------------------------------------------------------------
#! Basic cluster creation configuration
#! ---------------------------------------------------------------------

CLUSTER_NAME:
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: aws
# CLUSTER_API_SERVER_PORT:
ENABLE_CEIP_PARTICIPATION: true
ENABLE_AUDIT_LOGGING: true
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13
# CAPBK_BOOTSTRAP_TOKEN_TTL: 30m

#! ---------------------------------------------------------------------
#! Node configuration
#! AWS-only MACHINE_TYPE settings override cloud-agnostic SIZE settings.
#! ---------------------------------------------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:
CONTROL_PLANE_MACHINE_TYPE: t3.large
NODE_MACHINE_TYPE: m5.large
# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""

#! ---------------------------------------------------------------------
#! AWS configuration
#! ---------------------------------------------------------------------

AWS_REGION:
AWS_NODE_AZ: ""
AWS_ACCESS_KEY_ID:
AWS_SECRET_ACCESS_KEY:
AWS_SSH_KEY_NAME:
BASTION_HOST_ENABLED: true
# AWS_NODE_AZ_1: ""
# AWS_NODE_AZ_2: ""
# AWS_VPC_ID: ""
# AWS_PRIVATE_SUBNET_ID: ""
# AWS_PUBLIC_SUBNET_ID: ""
# AWS_PUBLIC_SUBNET_ID_1: ""
# AWS_PRIVATE_SUBNET_ID_1: ""
# AWS_PUBLIC_SUBNET_ID_2: ""
# AWS_PRIVATE_SUBNET_ID_2: ""
# AWS_PRIVATE_NODE_CIDR: 10.0.0.0/24
# AWS_PUBLIC_NODE_CIDR: 10.0.1.0/24
# AWS_PRIVATE_NODE_CIDR_1: 10.0.2.0/24
# AWS_PUBLIC_NODE_CIDR_1: 10.0.3.0/24
# AWS_PRIVATE_NODE_CIDR_2: 10.0.4.0/24
# AWS_PUBLIC_NODE_CIDR_2: 10.0.5.0/24
# AWS_SECURITY_GROUP_BASTION: sg-12345
# AWS_SECURITY_GROUP_CONTROLPLANE: sg-12346
# AWS_SECURITY_GROUP_APISERVER_LB: sg-12347
# AWS_SECURITY_GROUP_NODE: sg-12348
# AWS_SECURITY_GROUP_LB: sg-12349
# DISABLE_TMC_CLOUD_PERMISSIONS: false # Deactivates IAM permissions required for TMC enablement

#! ---------------------------------------------------------------------
#! Image repository configuration
#! ---------------------------------------------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

#! ---------------------------------------------------------------------
#! Proxy configuration
#! ---------------------------------------------------------------------

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

#! ---------------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------------

ENABLE_MHC:
ENABLE_MHC_CONTROL_PLANE: true
ENABLE_MHC_WORKER_NODE: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ---------------------------------------------------------------------
#! Identity management configuration
#! ---------------------------------------------------------------------

IDENTITY_MANAGEMENT_TYPE: "none"

#! Settings for IDENTITY_MANAGEMENT_TYPE: "oidc"
# CERT_DURATION: 2160h
# CERT_RENEW_BEFORE: 360h
# OIDC_IDENTITY_PROVIDER_CLIENT_ID:
# OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
# OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: groups
# OIDC_IDENTITY_PROVIDER_ISSUER_URL:
# OIDC_IDENTITY_PROVIDER_SCOPES: "email,profile,groups,offline_access"
# OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: email

#! The following two variables are used to configure Pinniped JWTAuthenticator for workload clusters
# SUPERVISOR_ISSUER_URL:
# SUPERVISOR_ISSUER_CA_BUNDLE_DATA:

#! Settings for IDENTITY_MANAGEMENT_TYPE: "ldap"
# LDAP_BIND_DN:
# LDAP_BIND_PASSWORD:
# LDAP_HOST:
# LDAP_USER_SEARCH_BASE_DN:
# LDAP_USER_SEARCH_FILTER:
# LDAP_USER_SEARCH_ID_ATTRIBUTE: dn
# LDAP_USER_SEARCH_NAME_ATTRIBUTE:
# LDAP_GROUP_SEARCH_BASE_DN:
# LDAP_GROUP_SEARCH_FILTER:
# LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: dn
# LDAP_GROUP_SEARCH_USER_ATTRIBUTE: dn
# LDAP_ROOT_CA_DATA_B64:

#! ---------------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------------

# ANTREA_NO_SNAT: true
# ANTREA_NODEPORTLOCAL: true
# ANTREA_NODEPORTLOCAL_ENABLED: true
# ANTREA_NODEPORTLOCAL_PORTRANGE: 61000-62000
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: true
# ANTREA_PROXY_ALL: false
# ANTREA_PROXY_LOAD_BALANCER_IPS: false
# ANTREA_PROXY_NODEPORT_ADDRS:
# ANTREA_PROXY_SKIP_SERVICES: ""
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: true
# ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD: false
# ANTREA_ENABLE_USAGE_REPORTING: false
# ANTREA_EGRESS: true
# ANTREA_EGRESS_EXCEPT_CIDRS: ""
# ANTREA_FLOWEXPORTER: false
# ANTREA_FLOWEXPORTER_COLLECTOR_ADDRESS: "flow-aggregator.flow-aggregator.svc:4739:tls"
# ANTREA_FLOWEXPORTER_POLL_INTERVAL: "5s"
# ANTREA_FLOWEXPORTER_ACTIVE_TIMEOUT: "5s"
# ANTREA_FLOWEXPORTER_IDLE_TIMEOUT: "15s"
# ANTREA_IPAM: false
# ANTREA_KUBE_APISERVER_OVERRIDE: ""
# ANTREA_MULTICAST: false
# ANTREA_MULTICAST_INTERFACES: ""
# ANTREA_NETWORKPOLICY_STATS: true
# ANTREA_SERVICE_EXTERNALIP: true
# ANTREA_TRANSPORT_INTERFACE: ""
# ANTREA_TRANSPORT_INTERFACE_CIDRS: ""

AWS Connection Settings

To furnish information about your AWS account and the region and availability zone in which you want to deploy the cluster, do one of the following:

  • (Recommended) Configure an AWS credential profile with the AWS CLI and set the AWS_PROFILE environment variable to the profile name on your bootstrap machine, as shown in the sketch after this list.

  • Include the account credentials and other information in the cluster configuration file. For example:

    AWS_REGION: eu-west-1
    AWS_NODE_AZ: "eu-west-1a"
    # Only use AWS_PROFILE OR combination of AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, but not both.
    AWS_PROFILE: tkg
    # AWS_ACCESS_KEY_ID:  <encoded:QUtJQVQ[...]SU82TTM=>
    # AWS_SECRET_ACCESS_KEY: <encoded:eGN4RHJmLzZ[...]SR08yY2ticQ==>
    AWS_SSH_KEY_NAME: default
    BASTION_HOST_ENABLED: true
    
    • Only use AWS_PROFILE or a combination of AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, but not both.
    • The values for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be base64-encoded.
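
For the recommended profile-based approach, a minimal sketch might look like the following. The profile name tkg is an example; aws configure prompts you for the access key, secret key, and default region:

    # Create a named AWS credential profile (prompts for keys and region).
    aws configure --profile tkg
    # Make the profile available to the Tanzu CLI on the bootstrap machine.
    export AWS_PROFILE=tkg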

Use an Internal Load Balancer for API Server

By default, Tanzu Kubernetes Grid on AWS creates a public-facing load balancer for the management cluster’s Kubernetes API Server.

For internet-restricted environments, such as airgapped or proxied environments, you can avoid creating a public-facing load balancer by setting AWS_LOAD_BALANCER_SCHEME_INTERNAL to true in the cluster configuration file:

AWS_LOAD_BALANCER_SCHEME_INTERNAL: true

This setting customizes the management cluster’s load balancer to use an internal scheme, which means that its Kubernetes API server is not accessible or routed over the internet.

Configure Node Sizes

The Tanzu CLI creates the individual nodes of workload clusters according to settings that you provide in the configuration file. On AWS, you can configure all node VMs to have the same predefined configurations or set different predefined configurations for control plane and worker nodes. By using these settings, you can create Tanzu Kubernetes clusters that have nodes with different configurations from the management cluster nodes. You can also create clusters in which the control plane nodes and worker nodes have different configurations.

When you create the management cluster, the instance types for the node machines are set in the CONTROL_PLANE_MACHINE_TYPE and NODE_MACHINE_TYPE options. By default, these settings are also used for workload clusters. The minimum configuration is 2 CPUs and 8 GB of memory. The list of compatible instance types varies by region.

CONTROL_PLANE_MACHINE_TYPE: "t3.large"
NODE_MACHINE_TYPE: "m5.large"

You can override these settings by using the SIZE, CONTROLPLANE_SIZE and WORKER_SIZE options. To create a Tanzu Kubernetes cluster in which all of the control plane and worker node VMs are the same size, specify the SIZE variable. If you set the SIZE variable, all nodes will be created with the configuration that you set. For information about the configurations of the different sizes of node instances for Amazon EC2, see Amazon EC2 Instance Types.

SIZE: "t3.large"

To create a workload cluster in which the control plane and worker node VMs are different sizes, specify the CONTROLPLANE_SIZE and WORKER_SIZE options.

CONTROLPLANE_SIZE: "t3.large"
WORKER_SIZE: "m5.xlarge"

You can combine the CONTROLPLANE_SIZE and WORKER_SIZE options with the SIZE option. For example, if you specify SIZE: "t3.large" with WORKER_SIZE: "m5.xlarge", the control plane nodes will be set to t3.large and worker nodes will be set to m5.xlarge.

SIZE: "t3.large"
WORKER_SIZE: "m5.xlarge"

Configure VPC

Uncomment and update the following rows to specify the VPC and other AWS infrastructure that will host and be used by the standalone management cluster.

AWS_REGION:
AWS_NODE_AZ:
AWS_PRIVATE_SUBNET_ID:
AWS_PUBLIC_SUBNET_ID:
AWS_SSH_KEY_NAME:
AWS_VPC_ID:
BASTION_HOST_ENABLED:
CONTROL_PLANE_MACHINE_TYPE:
NODE_MACHINE_TYPE:
SERVICE_CIDR:
CLUSTER_CIDR:

If you are deploying a production management cluster, also uncomment and fill in the following settings for the two additional control plane nodes:

AWS_NODE_AZ_1:
AWS_NODE_AZ_2:
AWS_PRIVATE_SUBNET_ID_1:
AWS_PRIVATE_SUBNET_ID_2:
AWS_PUBLIC_SUBNET_ID_1:
AWS_PUBLIC_SUBNET_ID_2:

For example, the configuration of a production management cluster on an existing VPC might look like this:

AWS_REGION: us-west-2
AWS_NODE_AZ: us-west-2a
AWS_NODE_AZ_1: us-west-2b
AWS_NODE_AZ_2: us-west-2c
AWS_PRIVATE_SUBNET_ID: subnet-ID
AWS_PRIVATE_SUBNET_ID_1: subnet-ID
AWS_PRIVATE_SUBNET_ID_2: subnet-ID
AWS_PUBLIC_SUBNET_ID: subnet-ID
AWS_PUBLIC_SUBNET_ID_1: subnet-ID
AWS_PUBLIC_SUBNET_ID_2: subnet-ID
AWS_SSH_KEY_NAME: tkg
AWS_VPC_ID: vpc-ID
BASTION_HOST_ENABLED: "true"
CONTROL_PLANE_MACHINE_TYPE: m5.large
NODE_MACHINE_TYPE: m5.large
SERVICE_CIDR: 100.64.0.0/13
CLUSTER_CIDR: 100.96.0.0/11

By default, Tanzu Kubernetes Grid creates new security groups for connecting the control plane, worker nodes, and load balancers. If you require custom rules, you can pre-provision the security groups, add the rulesets, and configure clusters to use the custom security groups as described below.

Configure Custom Security Groups

By default, Tanzu Kubernetes Grid creates five security groups within a VPC.

To prevent Tanzu Kubernetes Grid from creating new security groups, and to use existing, pre-provisioned security groups with custom rulesets instead, do the following:

  • Create security groups with custom rulesets, matching the default rule sets listed below as closely as possible (an AWS CLI sketch follows the group descriptions).
  • Specify the custom security groups in the cluster configuration file by setting the AWS_SECURITY_GROUP_* variables to the IDs of the security groups. For example:

    AWS_SECURITY_GROUP_BASTION: sg-12345
    

The five security groups, their default rules, and their corresponding cluster configuration variables are listed below:

  • Group: CLUSTER-NAME-bastion

    Set with cluster configuration variable AWS_SECURITY_GROUP_BASTION.

    Rules:

    Description | Protocol | From Port | To Port | Allow Ingress From | Mandatory
    SSH         | TCP      | 22        | 22      | 0.0.0.0/0          | No
  • Group: CLUSTER-NAME-node

    Set with cluster configuration variable AWS_SECURITY_GROUP_NODE.

    Rules:

    Description        | Protocol | From Port   | To Port     | Allow Ingress From                                                  | Mandatory
    SSH                | TCP      | 22          | 22          | Security Group <cluster-name>-bastion                               | No
    Node Port Services | TCP      | 30000       | 32767       | 0.0.0.0/0                                                           | No (see note below)
    Kubelet API        | TCP      | 10250       | 10250       | Security Groups <cluster-name>-controlplane and <cluster-name>-node | Yes
    Antrea CNI         | TCP      | 10349-10351 | 10349-10351 | Security Group <cluster-name>-node                                  | Yes
    GENEVE             | UDP      | 6081        | 6081        | Security Group <cluster-name>-node                                  | Yes
    Note

    The 0.0.0.0/0 is inbound only from within the VPC, peered VPCs, and any networks connected via VPN or DirectConnect. The 0.0.0.0/0 should not be interpreted as internet accessible. Administrators can change both the port range and the ingress rule for node port services; they are not used for the functioning of the cluster itself.

  • Group: CLUSTER-NAME-controlplane

    Set with cluster configuration variable AWS_SECURITY_GROUP_CONTROLPLANE.

    Rules:

    Description     | Protocol | From Port | To Port | Allow Ingress From                                                                                 | Mandatory
    SSH             | TCP      | 22        | 22      | Security Group <cluster-name>-bastion                                                              | No
    Kubernetes API  | TCP      | 6443*     | 6443*   | Security Groups <cluster-name>-apiserver-lb, <cluster-name>-controlplane, and <cluster-name>-node | Yes
    etcd            | TCP      | 2379      | 2379    | Security Group <cluster-name>-controlplane                                                         | Yes
    etcd peer       | TCP      | 2380      | 2380    | Security Group <cluster-name>-controlplane                                                         | Yes
    addons-manager  | TCP      | 9865      | 9865    | Security Group <cluster-name>-controlplane                                                         | Yes
    kapp-controller | TCP      | 10100     | 10100   | Security Group <cluster-name>-controlplane                                                         | Yes

    * If you set CLUSTER_API_SERVER_PORT, replace 6443 with the port number that you set in the variable.

  • Group: CLUSTER-NAME-apiserver-lb

    Set with cluster configuration variable AWS_SECURITY_GROUP_APISERVER_LB.

    Rules:

    Description    | Protocol | From Port | To Port | Allow Ingress From | Mandatory
    Kubernetes API | TCP      | 6443*     | 6443*   | 0.0.0.0/0          | No (see note below)

    * If you set CLUSTER_API_SERVER_PORT, replace 6443 with the port number that you set in the variable.

    Note

    By default, the 0.0.0.0/0 rule is internet accessible unless you specify that the load balancer should be provisioned internally. If the load balancer is internal, it must be accessible from the management cluster (for workload clusters) or from the bootstrap machine (for the management cluster). You can lock down this rule, but if you do so, you MUST add the following rule:

    Description               | Protocol | From Port | To Port | Allow Ingress From                                                  | Mandatory
    Kubernetes API In Cluster | TCP      | 6443*     | 6443*   | Security Groups <cluster-name>-controlplane and <cluster-name>-node | Yes

    * If you set CLUSTER_API_SERVER_PORT, replace 6443 with the port number that you set in the variable.

  • Group: CLUSTER-NAME-lb

    Set with cluster configuration variable AWS_SECURITY_GROUP_LB.

    This security group is used for workload load balancers. No rules are added to this security group, and it is expected that AWS administrators customize the ruleset as needed for the application workload.
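
As an illustration of pre-provisioning a group with a custom ruleset, the following AWS CLI sketch creates a bastion security group that mirrors the default SSH rule described above; the group name, VPC ID, and group ID are placeholder values:

    # Create a custom bastion security group in your existing VPC (placeholder values).
    aws ec2 create-security-group \
      --group-name my-cluster-bastion \
      --description "Custom bastion security group" \
      --vpc-id vpc-0123456789abcdef0
    # Allow inbound SSH, matching the default bastion rule (TCP 22 from 0.0.0.0/0).
    aws ec2 authorize-security-group-ingress \
      --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 22 --cidr 0.0.0.0/0

Set AWS_SECURITY_GROUP_BASTION to the GroupId that the create-security-group command returns.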

What to Do Next

After you have finished updating the management cluster configuration file, create the management cluster by following the instructions in Deploy Management Clusters from a Configuration File.
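
For example, assuming that your updated configuration file is saved as ~/.config/tanzu/tkg/clusterconfigs/aws-mgmt-cluster-config.yaml (an illustrative path), the deployment command takes the following form:

    tanzu management-cluster create --file ~/.config/tanzu/tkg/clusterconfigs/aws-mgmt-cluster-config.yaml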
