This topic describes how to create your own custom ClusterClass
resource for a Tanzu Kubernetes Grid (TKG) standalone management cluster, use it to create class-based workload clusters, and work with clusters that are based on the custom ClusterClass.
To base a cluster on a custom ClusterClass, you set its spec.topology.class
to the custom ClusterClass name.
These procedures do not apply to TKG with a vSphere with Tanzu Supervisor.
Caution: Custom ClusterClass is an experimental Kubernetes feature per the upstream Cluster API documentation. Due to the range of customizations available with custom ClusterClass, VMware cannot test or validate all possible customizations. Customers are responsible for testing, validating, and troubleshooting their custom ClusterClass clusters. Customers can open support tickets regarding their custom ClusterClass clusters; however, VMware support is limited to a best effort basis only and cannot guarantee resolution to every issue opened for custom ClusterClass clusters. Customers should be aware of these risks before deploying custom ClusterClass clusters in production environments.
To create workload clusters using the default ClusterClass resource, follow the procedures in Steps for Cluster Deployment.
To create a custom ClusterClass you need imgpkg, the Tanzu CLI, and kubectl installed locally. If you are using ytt templates as a source for your customizations, you also need ytt installed. For information about how to download and install ytt and imgpkg, see Install the Carvel Tools.
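If you want to confirm that the tools are available before you begin, the following optional commands print the installed versions:
# Optional: confirm that the required tools are installed and on your PATH
tanzu version
kubectl version --client
imgpkg version
ytt version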
To create a custom ClusterClass, VMware recommends starting with the existing, default ClusterClass manifest or YTT templates described under Create a Base ClusterClass Manifest. When a new version of the default ClusterClass object is published, for example with a new version of TKG, you can then apply the overlay to the new version in order to implement the same customizations.
In v1.x versions of TKG, standalone management clusters used YTT overlays stored on the bootstrap machine to customize the clusters that they created. You can use these same overlays as sources for a shared custom ClusterClass in TKG v2.
The procedures in this topic describe this method of creating a custom ClusterClass object.
To write an entirely new ClusterClass object without using an existing template, follow the procedure in Writing a ClusterClass in the Cluster API documentation.
variables and patches
For cluster values that are neither part of the Cluster object's topology nor supplied by its underlying objects, a ClusterClass can define them in its variables and patches blocks:
- variables blocks define variables that are settable within the Cluster object itself, like workerMachineType, giving each variable a name, type, default setting, and other properties following the OpenAPI Specification v3 standard. For example, if a ClusterClass variables block specifies a required variable named imageRepository of type string and with a default value of k8s.gcr.io, then every Cluster object that the ClusterClass creates will include variable.imageRepository, defaulted and settable as specified in the Cluster object spec.
- patches blocks apply Sprig functions to variable values in the ClusterClass and in the object templates that it references to derive new values, and then use jsonPatch rules to read and set those values. In this way, patches provide a general-purpose mechanism that enables settings within a Cluster object to determine other settings within the cluster itself and its underlying objects.
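For illustration only, the following fragment sketches how the imageRepository example above might look in a ClusterClass spec, following the upstream Cluster API format; the jsonPatch path shown is a generic Cluster API example, not a path taken from the default TKG ClusterClass.
spec:
  variables:
  # Settable per Cluster through spec.topology.variables
  - name: imageRepository
    required: true
    schema:
      openAPIV3Schema:
        type: string
        default: k8s.gcr.io
  patches:
  # Copies the variable value into the referenced control plane template
  - name: imageRepository
    definitions:
    - selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
      jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/imageRepository
        valueFrom:
          variable: imageRepository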
You can create a base ClusterClass manifest based on the default ClusterClass manifest or on YTT templates that Tanzu Kubernetes Grid provides. Any custom clusters that you create should be based on this base ClusterClass manifest. The process has three steps:
1. Create a base ClusterClass manifest.
2. Customize the base manifest with ytt overlays.
3. Install the resulting custom ClusterClass in the management cluster.
There are three methods that you can use to create a base ClusterClass manifest for your clusters.
Important: Methods 2 and 3 are for advanced users who need to satisfy the following use cases:
- You want to generate custom ClusterClass definitions for your CI system, without deploying a standalone management cluster.
- You want workload clusters to use different infrastructure than management clusters.
In Tanzu Kubernetes Grid v2.3.0 and later, after you deploy a management cluster, you can find the default ClusterClass manifest in the ~/.config/tanzu/tkg/clusterclassconfigs folder.
To see the manifest for the target platforms on which you have deployed a management cluster, run the following command:
tree ~/.config/tanzu/tkg/clusterclassconfigs/
For example, if you deployed a management cluster to vSphere, you will see the following YAML file.
.config/tanzu/tkg/clusterclassconfigs/
└── tkg-vsphere-default-v1.1.0.yaml
1 directory, 1 file
The generated manifest contains information about your target platform that has been extracted from your management cluster deployment. You can use this as the base manifest for custom ClusterClass creation directly. For the next steps, see Customize Your Base ClusterClass Manifest.
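For example, to stage the generated vSphere manifest as the base for the customization steps that follow, you can copy it into your working directory:
cp ~/.config/tanzu/tkg/clusterclassconfigs/tkg-vsphere-default-v1.1.0.yaml ./tkg-vsphere-default-v1.1.0.yaml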
After you install the Tanzu CLI v1.0.0 or later, but before you deploy a management cluster, you can find the YTT templates for the default ClusterClass in the ~/.config/tanzu/tkg/providers/infrastructure-<provider name>/<provider version>/cconly folder. You can use these templates to create a base manifest for custom ClusterClass creation.
To find the templates, run the appropriate command for your target platform.
tree ~/.config/tanzu/tkg/providers/infrastructure-vsphere/v1.7.0/cconly
You will see the following YAML files.
.config/tanzu/tkg/providers/infrastructure-vsphere/v1.7.0/cconly
├── base.yaml
├── overlay-kube-apiserver-admission.yaml
└── overlay.yaml
1 directory, 3 files
tree ~/.config/tanzu/tkg/providers/infrastructure-aws/v2.1.3/cconly
You will see the following YAML files.
.config/tanzu/tkg/providers/infrastructure-aws/v2.1.3/cconly
├── base.yaml
├── overlay-kube-apiserver-admission.yaml
└── overlay.yaml
tree ~/.config/tanzu/tkg/providers/infrastructure-azure/v1.9.2/cconly/
You will see the following YAML files.
.config/tanzu/tkg/providers/infrastructure-azure/v1.9.2/cconly/
├── base.yaml
├── overlay-kube-apiserver-admission.yaml
└── overlay.yaml
Create a YTT data-value file named user_input.yaml with the following content.
#@data/values
#@overlay/match-child-defaults missing_ok=True
---
Add configuration values appropriate to your target platform to user_input.yaml.
You can use the configuration values that you used when you deployed a management cluster. For example, a user_input.yaml file for vSphere will resemble the following:
#@data/values
#@overlay/match-child-defaults missing_ok=True
---
ENABLE_MHC: true
ENABLE_MHC_CONTROL_PLANE: true
ENABLE_MHC_WORKER_NODE: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m
MHC_MAX_UNHEALTHY_CONTROL_PLANE: 100%
MHC_MAX_UNHEALTHY_WORKER_NODE: 100%
NODE_STARTUP_TIMEOUT: 20m
VSPHERE_TLS_THUMBPRINT: 0A:B4:8F:2E:E4:34:82:90:D5:6A:F8:77:8C:8C:51:24:D2:49:3B:E8
VSPHERE_DATACENTER: /dc0
VSPHERE_DATASTORE: /dc0/datastore/sharedVmfs-0
VSPHERE_FOLDER: /dc0/vm
VSPHERE_NETWORK: /dc0/network/VM Network
VSPHERE_RESOURCE_POOL: /dc0/host/cluster0/Resources
VSPHERE_SERVER: xxx.xxx.xxx.xxx
VSPHERE_USERNAME: [email protected]
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_MEM_MIB: "8192"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "4"
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "8192"
VSPHERE_WORKER_NUM_CPUS: "4"
VSPHERE_CLUSTER_CLASS_VERSION: "v1.1.0" # change it to the version in TKG BOM
NAMESPACE: "tkg-system" # DO NOT change it
Use ytt to generate the base ClusterClass manifest for your target platform.
ytt -f ~/.config/tanzu/tkg/providers/infrastructure-vsphere/v1.7.0/cconly -f ~/.config/tanzu/tkg/providers/config_default.yaml -f user_input.yaml
ytt -f ~/.config/tanzu/tkg/providers/infrastructure-aws/v2.1.3/cconly -f ~/.config/tanzu/tkg/providers/config_default.yaml -f user_input.yaml
ytt -f ~/.config/tanzu/tkg/providers/infrastructure-azure/v1.9.2/cconly -f ~/.config/tanzu/tkg/providers/config_default.yaml -f user_input.yaml
Now that you have generated a base manifest for your infrastructure, for the next steps, see Customize Your Base ClusterClass Manifest.
The YTT templates for the default ClusterClass are also available for download from the TKG image repository. You can use these templates to create a base manifest for custom ClusterClass creation.
Pull the YTT templates from the providerTemplate image in the official TKG registry.
For TKG v2.3.1, the providerTemplate image tag is v0.30.2. You can find the tag version in the TKG BOM file by searching for providerTemplateImage.
imgpkg pull -i projects.registry.vmware.com/tkg/tanzu_core/provider/provider-templates:v0.30.2 -o providers
You will see output similar to the following:
Pulling image 'projects.registry.vmware.com/tkg/tanzu_core/provider/provider-templates@sha256:b210d26c610800f5da4b3aa55bfbc8ffae9275fa2c4073a2b1332e2045a6e1da'
Extracting layer 'sha256:3ba336232c0e616b2b6c8f263593497c5a059a645f4c6137ea0eb658f4a8538a' (1/1)
Succeeded
The YAML template files are downloaded into a providers folder.
Create a YTT data-value file named user_input.yaml with the following content.
#@data/values
#@overlay/match-child-defaults missing_ok=True
---
Add configuration values appropriate to your target platform to user_input.yaml.
You can use the configuration values that you used when you deployed a management cluster. For example, a user_input.yaml file for vSphere will resemble the following:
#@data/values
#@overlay/match-child-defaults missing_ok=True
---
ENABLE_MHC: true
ENABLE_MHC_CONTROL_PLANE: true
ENABLE_MHC_WORKER_NODE: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m
MHC_MAX_UNHEALTHY_CONTROL_PLANE: 100%
MHC_MAX_UNHEALTHY_WORKER_NODE: 100%
NODE_STARTUP_TIMEOUT: 20m
VSPHERE_TLS_THUMBPRINT: 0A:B4:8F:2E:E4:34:82:90:D5:6A:F8:77:8C:8C:51:24:D2:49:3B:E8
VSPHERE_DATACENTER: /dc0
VSPHERE_DATASTORE: /dc0/datastore/sharedVmfs-0
VSPHERE_FOLDER: /dc0/vm
VSPHERE_NETWORK: /dc0/network/VM Network
VSPHERE_RESOURCE_POOL: /dc0/host/cluster0/Resources
VSPHERE_SERVER: xxx.xxx.xxx.xxx
VSPHERE_USERNAME: [email protected]
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_MEM_MIB: "8192"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "4"
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "8192"
VSPHERE_WORKER_NUM_CPUS: "4"
VSPHERE_CLUSTER_CLASS_VERSION: "v1.1.0" # change it to the version in TKG BOM
NAMESPACE: "tkg-system" # DO NOT change it
Use ytt to generate the base ClusterClass manifest for your target platform.
ytt -f providers/infrastructure-vsphere/v1.7.0/cconly -f providers/config_default.yaml -f user_input.yaml
ytt -f providers/infrastructure-aws/v2.1.3/cconly -f providers/config_default.yaml -f user_input.yaml
ytt -f providers/infrastructure-azure/v1.9.2/cconly -f providers/config_default.yaml -f user_input.yaml
Now that you have generated a base manifest for your infrastructure, for the next steps, see Customize Your Base ClusterClass Manifest.
To customize your ClusterClass manifest, you create ytt overlay files alongside the manifest. The following example shows how to modify a Linux kernel parameter in the ClusterClass definition.
Create a custom folder structured as follows:
tree custom
custom
|-- overlays
    |-- filter.yaml
    |-- kernels.yaml
Edit custom/overlays/kernels.yaml.
For example, add nfConntrackMax as a variable and define a patch for it that adds its value to the kernel parameter net.netfilter.nf_conntrack_max for control plane nodes.
This overlay appends a command to the field preKubeadmCommands to write the configuration to sysctl.conf, and appends the command sysctl -p so that the setting takes effect. Default ClusterClass definitions are immutable, so this overlay also changes the name of your custom ClusterClass and all of its templates by adding -extended.
This overlay also removes the run.tanzu.vmware.com/tkg-managed label if it is present, to ensure that this ClusterClass is treated as a custom object rather than as a default, managed one, during workflows such as upgrade.
cat custom/overlays/kernels.yaml
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind":"ClusterClass"})
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: tkg-vsphere-default-v1.1.0-extended
  labels:
    #@overlay/match missing_ok=True
    #@overlay/remove
    run.tanzu.vmware.com/tkg-managed: ""
spec:
  variables:
  - name: nfConntrackMax
    required: false
    schema:
      openAPIV3Schema:
        type: string
  patches:
  - name: nfConntrackMax
    enabledIf: '{{ not (empty .nfConntrackMax) }}'
    definitions:
    - selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
      jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands/-
        valueFrom:
          template: echo "net.netfilter.nf_conntrack_max={{ .nfConntrackMax }}" >> /etc/sysctl.conf
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands/-
        value: sysctl -p
Remove resources that you do not want to change.
In this example, all templates are intended to be shared between the custom and default ClusterClass, so they are all removed. You can also create a custom template based on the default template in the same way, in which case make sure that its kind is not excluded by the filter; a variant is sketched after the following overlay.
cat custom/overlays/filter.yaml
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.not_op(overlay.subset({"kind": "ClusterClass"})),expects="0+"
---
#@overlay/remove
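For example, if you also wanted to customize the KubeadmControlPlaneTemplate rather than share the default one, a filter along the following lines, shown here as an untested sketch, would retain both kinds:
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.not_op(overlay.or_op(overlay.subset({"kind": "ClusterClass"}), overlay.subset({"kind": "KubeadmControlPlaneTemplate"}))),expects="0+"
---
#@overlay/remove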
Use the filter with the default ClusterClass manifest to generate the default ClusterClass.
ytt -f tkg-vsphere-default-v1.1.0.yaml -f custom/overlays/filter.yaml > default_cc.yaml
Generate the custom ClusterClass.
ytt -f tkg-vsphere-default-v1.1.0.yaml -f custom/ > custom_cc.yaml
(Optional) Check the difference between the default ClusterClass and your custom one, to confirm that the changes have been applied.
diff default_cc.yaml custom_cc.yaml
You should see output similar to the following:
4c4
< name: tkg-vsphere-default-v1.1.0
---
> name: tkg-vsphere-default-v1.1.0-extended
638a639,643
> - name: nfConntrackMax
> required: false
> schema:
> openAPIV3Schema:
> type: string
2607a2613,2628
> - name: nfConntrackMax
> enabledIf: '{{ not (empty .nfConntrackMax) }}'
> definitions:
> - selector:
> apiVersion: controlplane.cluster.x-k8s.io/v1beta1
> kind: KubeadmControlPlaneTemplate
> matchResources:
> controlPlane: true
> jsonPatches:
> - op: add
> path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands/-
> valueFrom:
> template: echo "net.netfilter.nf_conntrack_max={{ .nfConntrackMax }}" >> /etc/sysctl.conf
> - op: add
> path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands/-
> value: sysctl -p
In this example, you can see that -extended has been added to the ClusterClass name.
To enable your management cluster to use your custom ClusterClass, install it by applying the new manifest.
Apply the ClusterClass manifest.
kubectl apply -f custom_cc.yaml
You should see the following output.
clusterclass.cluster.x-k8s.io/tkg-vsphere-default-v1.1.0-extended created
Check if the custom ClusterClass has propagated to the default namespace, for example:
kubectl get clusterclass
You should see the following output.
NAME AGE
tkg-vsphere-default-v1.1.0 2d23h
tkg-vsphere-default-v1.1.0-extended 11s
After creating your custom ClusterClass, you can use it to create a new workload cluster that includes your customization.
Run tanzu cluster create with the --dry-run option to generate a cluster manifest from a standard cluster configuration file.
tanzu cluster create --file workload-1.yaml --dry-run > default_cluster.yaml
Create a ytt overlay or edit the cluster manifest directly.
The recommended option is to create a ytt overlay, for example cluster_overlay.yaml, to do the following:
- Replace the topology.class value with the name of your custom ClusterClass.
- Add your custom variables to the variables block, with a default value.
As with modifying ClusterClass object specs, using an overlay as follows lets you automatically apply the changes to new objects whenever there is a new upstream cluster release.
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind":"Cluster"})
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
spec:
  topology:
    class: tkg-vsphere-default-v1.1.0-extended
    variables:
    - name: nfConntrackMax
      value: "1048576"
Generate the custom_cluster.yaml manifest:
ytt -f default_cluster.yaml -f cluster_overlay.yaml > custom_cluster.yaml
(Optional) As with the ClusterClass above, you can run diff to compare your custom-class cluster manifest with a default class-based cluster, for example:
diff custom_cluster.yaml default_cluster.yaml
You should see output similar to the following:
< class: tkg-vsphere-default-v1.1.0
---
> class: tkg-vsphere-default-v1.1.0-extended
142a143,144
> - name: nfConntrackMax
> value: "1048576"
Create a custom workload cluster based on the custom manifest generated above as follows.
Create the cluster.
tanzu cluster create -f custom_cluster.yaml
Check created object properties.
For example, with the kernel modification above, retrieve the KubeadmControlPlane object and confirm that the kernel configuration is set:
kubectl get kcp workload-1-jgwd9 -o yaml
You should see output similar to the following:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
...
preKubeadmCommands:
- hostname "{{ ds.meta_data.hostname }}"
- echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
- echo "127.0.0.1 localhost" >>/etc/hosts
- echo "127.0.0.1 {{ ds.meta_data.hostname }}" >>/etc/hosts
- echo "{{ ds.meta_data.hostname }}" >/etc/hostname
- echo "net.netfilter.nf_conntrack_max=1048576" >> /etc/sysctl.conf
- sysctl -p
...
Log in to a control plane node and confirm that its sysctl.conf is modified:
capv@workload-1-jgwd9:~$ sudo cat /etc/sysctl.conf
...
net.ipv6.neigh.default.gc_thresh3=16384
fs.file-max=9223372036854775807
net.netfilter.nf_conntrack_max=1048576
If you created custom clusters with a previous release, you can upgrade them to the latest TKG release.
Important: If your custom ClusterClass definition is based on a default ClusterClass, make sure that the label run.tanzu.vmware.com/tkg-managed is not present in its metadata.labels field. Otherwise, upgrading the cluster may replace your custom ClusterClass with the default.
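One way to verify this, assuming the custom ClusterClass name used in the examples in this topic, is to inspect the labels on the object:
kubectl get clusterclass tkg-vsphere-default-v1.1.0-extended -o jsonpath='{.metadata.labels}'
The run.tanzu.vmware.com/tkg-managed label should not appear in the output.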
Before you upgrade clusters, there are preparatory steps to perform.
Before you upgrade the management cluster, create test clusters with the version of the custom manifest that you created for the previous release.
For example, create a custom cluster named test-upgrade.
Get information about the ClusterClass versions available with your TKG 2.2 management cluster.
kubectl get clusterclass
If you created the custom clusters with TKG v2.2.0, the ClusterClass version should be 1.0.0. For example:
NAME AGE
tkg-vsphere-default-extended-v1.0.0 21m
tkg-vsphere-default-v1.0.0 10d
Get the information about the ClusterClass version running in your test-upgrade cluster.
kubectl get cluster test-upgrade -o jsonpath='{.spec.topology.class}'
The output should be tkg-vsphere-default-extended-v1.0.0.
Follow the instructions in Upgrade Standalone Management Clusters to upgrade the management cluster to TKG 2.3.
Get information about the ClusterClass version available after upgrading the management cluster to v2.3.
kubectl get cluster mgmt-cluster-name -n tkg-system -o jsonpath='{.spec.topology.class}'
The output should be tkg-vsphere-default-v1.1.0 if the management cluster is running on vSphere.
Create a new version of the custom ClusterClass based on the new default ClusterClass:
- Name it tkg-vsphere-default-v1.1.0-extended and include the same custom variables as in the old version tkg-vsphere-default-extended-v1.0.0.
- Save the new manifest as custom_cc.yaml.
Install the new custom ClusterClass in the management cluster.
kubectl apply -f custom_cc.yaml
Get information about the ClusterClass versions that are now available with your management cluster.
kubectl get clusterclass
Both the older versions and the newer versions should be displayed.
NAME AGE
tkg-vsphere-default-extended-v1.0.0 61m
tkg-vsphere-default-v1.0.0 10d
tkg-vsphere-default-v1.1.0 25m
tkg-vsphere-default-v1.1.0-extended 15s
After you have performed the preparatory steps, you can test the upgrade on the test cluster before you upgrade your production clusters.
Rebase the cluster test-upgrade from the old version of the custom ClusterClass to the new one.
kubectl patch cluster test-upgrade --type merge -p '{"spec": {"topology": {"class": "tkg-vsphere-default-v1.1.0-extended"}}}'
The output should be cluster.cluster.x-k8s.io/test-upgrade patched.
Get the information about the ClusterClass version now running in your test-upgrade cluster.
kubectl get cluster test-upgrade -o jsonpath='{.spec.topology.class}'
The output should be tkg-vsphere-default-v1.1.0-extended. Previously it was tkg-vsphere-default-extended-v1.0.0.
Wait for several minutes and run kubectl get cluster until you see that UpdatesAvailable is updated to true.
kubectl get cluster test-upgrade -o yaml
When ready, a message similar to the following should appear in the output:
...
status:
  conditions:
  ...
  - lastTransitionTime: "2023-06-19T09:59:21Z"
    message: '[v1.25.9+vmware.1-tkg.1-20230609 v1.26.4+vmware.1-tkg.1-20230609]'
    status: "True"
    type: UpdatesAvailable
  controlPlaneReady: true
  infrastructureReady: true
  observedGeneration: 5
  phase: Provisioned
Upgrade the test-upgrade cluster.
tanzu cluster upgrade test-upgrade
Check created object properties.
For example, with the kernel modification described in the example above, retrieve the KubeadmControlPlane object and confirm that the kernel configuration is set:
kubectl get kcp test-upgrade-nsc6d -o yaml
You should see output similar to the following:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
...
preKubeadmCommands:
- hostname "{{ ds.meta_data.hostname }}"
- echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
- echo "127.0.0.1 localhost" >>/etc/hosts
- echo "127.0.0.1 {{ ds.meta_data.hostname }}" >>/etc/hosts
- echo "{{ ds.meta_data.hostname }}" >/etc/hostname
- sed -i 's|".*/pause|"projects-stg.registry.vmware.com/tkg/pause|' /etc/containerd/config.toml
- systemctl restart containerd
- echo "net.netfilter.nf_conntrack_max=1048576" >> /etc/sysctl.conf
- sysctl -p
If the test upgrade was successful, repeat the steps from Perform the Upgrade on your production clusters.
Note: If you encounter any errors during the upgrade, you can roll back by rebasing the cluster from the new version of the custom ClusterClass to the old version.
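For example, to roll the test cluster back to the previous version of the custom ClusterClass, you could rebase it in the same way as the upgrade, using the names from the examples above:
kubectl patch cluster test-upgrade --type merge -p '{"spec": {"topology": {"class": "tkg-vsphere-default-extended-v1.0.0"}}}'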
If you have created clusters with the default ClusterClass and want to update them to use a custom ClusterClass, use kubectl to edit the Cluster object (a sketch of the edited fields follows this list):
- Change spec.topology.class to the name of your custom ClusterClass.
- Edit spec.topology.variables to append your custom variables.
If you want to revert to a new version of the default ClusterClass:
- Change spec.topology.class to the new version of the default ClusterClass.
- Edit spec.topology.variables to remove your custom variables. You might need to add new variables that are defined in the new version of the default ClusterClass.