As a platform engineer, you might need to take a cluster out of service for maintenance or to fix a problem with it. You can do this by running the cluster cordon, cluster drain, and cluster uncordon Tanzu CLI commands, and by using Space disruption budgets so that applications remain available.
The following descriptions provide a high-level overview of cordon, drain, and uncordon operations and Space disruption budgets.
Cluster cordon:
Cordoning off a cluster is the action of indicating to the Space scheduler that the cluster should not be considered for Space scheduling. Cordoning does not remove any currently scheduled Space replicas from the cluster.
Cluster drain:
Draining is the process of evicting all Space replicas from a given cluster after the Space disruption budget constraint has been satisfied. The drain operation cordons the cluster before it starts evicting Space replicas. After all Space replicas are evicted, the cluster is considered fully drained.
Cluster uncordon:
Uncordoning a cluster indicates to the Space scheduler that the cluster can be considered for Space scheduling again. After you finish cluster maintenance, you can uncordon the cluster to re-enable Space scheduling on that cluster.
Space disruption budget:
A SpaceDisruptionBudget describes a limit on the number of Space replicas that can be down simultaneously while still maintaining availability guarantees. You can use a Space disruption budget to ensure the availability of applications running on the Space while managing voluntary disruptions to the underlying clusters. A SpaceDisruptionBudget is used as an input to the cluster drain process. A SpaceDisruptionBudget cannot prevent involuntary disruptions; however, they do count against the budget.
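To make the budget arithmetic concrete, the following shell sketch computes how many Space replicas could be evicted at once. It assumes that minAvailable behaves like the field of the same name in a Kubernetes PodDisruptionBudget, which this page does not confirm. The function name allowed_evictions is hypothetical and is not part of the Tanzu CLI.

```shell
# Hypothetical helper, not part of the Tanzu CLI.
# Assumption: an eviction is allowed only while the number of available
# Space replicas stays at or above minAvailable.
allowed_evictions() {
  available="$1"      # Space replicas currently available
  min_available="$2"  # minAvailable from the SpaceDisruptionBudget
  allowed=$((available - min_available))
  # Replicas that are already down (including involuntary disruptions)
  # lower "available", so they use up part of the budget.
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_evictions 3 1   # three replicas available, minAvailable: 1; prints 2
```

Under this assumption, a Space with a single available replica and minAvailable: 1 allows no voluntary evictions until another replica becomes available.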
Before you cordon, uncordon, or drain a cluster, make sure that you install the tanzu CLI, log in, and then set the context.
Install the Tanzu CLI (v1.4.0 or later). See Installing the Tanzu CLI.
Install the platform-engineer plugin group.
tanzu plugin install --group vmware-tanzu/platform-engineer
Log in with the Tanzu CLI.
tanzu login
To create a Space disruption budget:
Set the context to your project.
tanzu project use PROJECT-NAME
Create a space-disruption-budget.yaml file for your Space, specifying an availabilityTarget with the following contents:
apiVersion: spaces.tanzu.vmware.com/v1alpha1
kind: SpaceDisruptionBudget
metadata:
  name: my-first-space
  namespace: default
spec:
  availabilityTargets:
  - minAvailable: 1
    name: all-regions.tanzu.vmware.com
Apply the space-disruption-budget.yaml file in the project context.
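The manifest from the steps above can also be generated for any Space with a short script. This is a sketch only: the function name write_sdb and its arguments are hypothetical, the indentation assumes the usual Kubernetes manifest layout with availabilityTargets as a list under spec, and it assumes that metadata.name is the name of the Space, as the example name suggests.

```shell
# Hypothetical helper: writes a SpaceDisruptionBudget manifest for one Space.
# Usage: write_sdb SPACE-NAME MIN-AVAILABLE [OUTPUT-FILE]
# Assumption: metadata.name is the name of the Space that the budget protects.
write_sdb() {
  space_name="$1"
  min_available="$2"
  out_file="${3:-space-disruption-budget.yaml}"
  # The heredoc body and its EOF terminator must start in column 0.
  cat > "$out_file" <<EOF
apiVersion: spaces.tanzu.vmware.com/v1alpha1
kind: SpaceDisruptionBudget
metadata:
  name: ${space_name}
  namespace: default
spec:
  availabilityTargets:
  - minAvailable: ${min_available}
    name: all-regions.tanzu.vmware.com
EOF
}
```

For example, write_sdb my-first-space 1 writes space-disruption-budget.yaml to the current directory with the same contents as the example above.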
To cordon a cluster:
Using the tanzu CLI, set your context to the cluster group.
tanzu operations clustergroup use CLUSTER-GROUP-NAME
Cordon a cluster belonging to the cluster group you specified above.
tanzu cluster cordon CLUSTER-NAME
The output from this command should look something like this:
i successfully cordoned cluster CLUSTER-NAME
i to see the state of the cluster use kubectl get kubernetescluster CLUSTER-NAME
To drain a cluster using the tanzu CLI:
You can simulate draining a cluster by performing a dry run to find out which Space replicas would be evicted.
tanzu cluster drain CLUSTER-NAME --dry-run=true
The output from this command should look something like this:
i successfully cordoned cluster CLUSTER-NAME (dry run)
i evicting 4 Space replicas from cluster CLUSTER-NAME (dry run)
i evicting Space replica <replica-name-1> (dry run)
i evicting Space replica <replica-name-2> (dry run)
i evicting Space replica <replica-name-3> (dry run)
i evicting Space replica <replica-name-4> (dry run)
i cluster CLUSTER-NAME drained successfully (dry run)
Drain the cluster.
tanzu cluster drain CLUSTER-NAME
The output from this command should look something like this:
i successfully cordoned cluster CLUSTER-NAME
i to see the state of the cluster use kubectl get kubernetescluster CLUSTER-NAME
i evicting 4 Space replicas from cluster CLUSTER-NAME
i evicting Space replica <replica-name-1>
i evicting Space replica <replica-name-2>
i evicting Space replica <replica-name-3>
i evicting Space replica <replica-name-4>
i <replica-name-1> evicted
i <replica-name-2> evicted
i <replica-name-3> evicted
i <replica-name-4> evicted
i 4 Space replicas under went eviction
i cluster CLUSTER-NAME drained successfully
After the cluster has finished draining, you can perform any necessary maintenance on it before you put it back into service.
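The dry run and the real drain can be chained so that nothing is evicted until you have reviewed the preview. The following is a minimal sketch, not an official workflow: the function name drain_with_preview and the TANZU override variable are hypothetical, and the sketch assumes that the tanzu CLI exits with a non-zero status when a command fails.

```shell
# Hypothetical wrapper around the drain commands shown above.
# TANZU can be overridden (for example, TANZU=echo) to rehearse the script
# without touching a real cluster.
TANZU="${TANZU:-tanzu}"

drain_with_preview() {
  cluster="$1"
  # Step 1: simulate the drain to list the Space replicas that would be evicted.
  # Assumption: the CLI returns a non-zero exit status on failure.
  "$TANZU" cluster drain "$cluster" --dry-run=true || return 1
  # Step 2: ask for confirmation before evicting anything.
  printf 'Drain %s for real? [y/N] ' "$cluster"
  read -r answer
  case "$answer" in
    y|Y) "$TANZU" cluster drain "$cluster" ;;
    *)   echo "drain of $cluster cancelled"; return 1 ;;
  esac
}
```

Call it as drain_with_preview CLUSTER-NAME after you set the cluster group context.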
To uncordon a cluster by using the Tanzu CLI:
Uncordon a cluster belonging to the cluster group you specified above.
tanzu cluster uncordon CLUSTER-NAME
Verify that the output from this command looks similar to this:
i successfully uncordoned cluster CLUSTER-NAME
i to see the state of the cluster use kubectl get kubernetescluster CLUSTER-NAME
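Taken together, the drain, maintenance, and uncordon steps form one cycle. The sketch below strings them together under stated assumptions: maintain_cluster and the TANZU override variable are hypothetical names, the maintenance step is whatever command you pass in, and the tanzu CLI is assumed to exit with a non-zero status on failure. Only the tanzu cluster drain and tanzu cluster uncordon commands come from this page.

```shell
# Hypothetical end-to-end wrapper; only the tanzu subcommands come from this page.
TANZU="${TANZU:-tanzu}"

# Usage: maintain_cluster CLUSTER-NAME COMMAND [ARGS...]
maintain_cluster() {
  cluster="$1"
  shift
  # Drain cordons the cluster first, then evicts its Space replicas.
  "$TANZU" cluster drain "$cluster" || return 1
  # Run the caller-supplied maintenance command while the cluster is drained.
  if "$@"; then
    # Maintenance succeeded: re-enable Space scheduling on the cluster.
    "$TANZU" cluster uncordon "$cluster"
  else
    echo "maintenance failed; $cluster stays cordoned" >&2
    return 1
  fi
}
```

Leaving the cluster cordoned when maintenance fails is a deliberate choice in this sketch: the scheduler places no Space replicas on a cluster that might still be unhealthy, and you can uncordon it manually after you investigate.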