Cordon, drain, and uncordon a cluster

As a platform engineer, you might need to take a cluster out of service for maintenance or to fix a problem with it. To do so, run the cluster cordon, cluster drain, and cluster uncordon Tanzu CLI commands, and use Space disruption budgets so that applications stay available while the cluster is out of service.

The following descriptions provide a high-level overview of cordon, drain, and uncordon operations and Space disruption budgets.

Cluster cordon:

Cordoning a cluster tells the Space scheduler not to consider that cluster for Space scheduling. Cordoning does not remove any currently scheduled Space replicas from the cluster.

Cluster drain:

Draining is the process of evicting all Space replicas from a given cluster after the Space disruption budget constraint has been satisfied. A drain cordons the cluster before it starts evicting Space replicas. After all Space replicas are evicted, the cluster is considered fully drained.

Cluster uncordon:

Uncordoning a cluster indicates to the Space scheduler that the cluster can again be considered for Space scheduling. After you finish cluster maintenance, uncordon the cluster to re-enable Space scheduling on it.

Space disruption budget:

A SpaceDisruptionBudget limits the number of Space replicas that can be down simultaneously while still maintaining availability guarantees. You can use a Space disruption budget to keep applications running on the Space available while you manage voluntary disruptions to the underlying clusters. A SpaceDisruptionBudget is used as an input to the cluster drain process. A SpaceDisruptionBudget cannot prevent involuntary disruptions, but they do count against the budget.
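
To make the budget's role in a drain concrete, here is a minimal Python sketch of the arithmetic. It assumes that minAvailable behaves like the field of the same name on a Kubernetes PodDisruptionBudget, which this page does not state explicitly. The function name voluntary_evictions_allowed is hypothetical and is not part of any Tanzu API.

```python
def voluntary_evictions_allowed(available: int, min_available: int) -> int:
    """Return how many replicas could be evicted without dropping
    below min_available.

    ASSUMPTION: minAvailable is a floor on the number of available
    Space replicas, as with a Kubernetes PodDisruptionBudget.
    """
    if available < 0 or min_available < 0:
        raise ValueError("counts must not be negative")
    # Never report a negative allowance when the floor is already breached.
    return max(0, available - min_available)


# Three replicas are up and the budget sets minAvailable: 1.
healthy_case = voluntary_evictions_allowed(available=3, min_available=1)

# An involuntary outage has already taken two replicas down. It counts
# against the budget, so no voluntary eviction is left.
degraded_case = voluntary_evictions_allowed(available=1, min_available=1)
```

Under this assumption, healthy_case is 2 and degraded_case is 0, which is why a drain might wait while replicas are already down for other reasons.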

Before you begin

Before you cordon, drain, or uncordon a cluster, install the Tanzu CLI, log in, and set the context.

Install the Tanzu CLI (v1.4.0 or later). See Installing the Tanzu CLI.

Install the platform-engineer plugin group.

tanzu plugin install --group vmware-tanzu/platform-engineer

Create a Space disruption budget

To create a Space disruption budget:

Set the context to your project.
```
tanzu project use PROJECT-NAME
```

Create a space-disruption-budget.yaml file for your Space with the following contents, which specify an availability target:

   apiVersion: spaces.tanzu.vmware.com/v1alpha1
   kind: SpaceDisruptionBudget
   metadata:
     name: my-first-space
     namespace: default
   spec:
     availabilityTargets:
     - minAvailable: 1
       name: all-regions.tanzu.vmware.com

Apply the space-disruption-budget.yaml file in the project context.
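
If you manage budgets for several Spaces, generating the manifest can help avoid indentation mistakes. The following Python sketch renders the same fields as the example above. The function name render_budget and its parameters are hypothetical, and the default values are simply copied from the example.

```python
def render_budget(
    space_name: str,
    min_available: int,
    namespace: str = "default",
    target: str = "all-regions.tanzu.vmware.com",
) -> str:
    """Return a SpaceDisruptionBudget manifest as YAML text.

    The field names mirror the example manifest on this page. No other
    fields are assumed to exist.
    """
    if min_available < 0:
        raise ValueError("min_available must not be negative")
    lines = [
        "apiVersion: spaces.tanzu.vmware.com/v1alpha1",
        "kind: SpaceDisruptionBudget",
        "metadata:",
        f"  name: {space_name}",
        f"  namespace: {namespace}",  # nested under metadata
        "spec:",
        "  availabilityTargets:",
        f"  - minAvailable: {min_available}",
        f"    name: {target}",
    ]
    return "\n".join(lines) + "\n"


manifest = render_budget("my-first-space", min_available=1)
```

You could then write manifest to space-disruption-budget.yaml, for example with pathlib.Path.write_text, and apply the file as described above.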

Cordon a cluster

To cordon a cluster:

Using the Tanzu CLI, set your context to the cluster group.

tanzu operations clustergroup use CLUSTER-GROUP-NAME

Cordon a cluster belonging to the cluster group you specified above.

tanzu cluster cordon CLUSTER-NAME

The output from this command should look something like this:

i  successfully cordoned cluster CLUSTER-NAME
i  to see the state of the cluster use  kubectl get kubernetescluster CLUSTER-NAME

Drain a cluster

To drain a cluster by using the Tanzu CLI:

You can simulate draining a cluster by performing a dry run to find out which Space replicas would be evicted.

tanzu cluster drain CLUSTER-NAME --dry-run=true

The output from this command should look something like this:

i  successfully cordoned cluster CLUSTER-NAME (dry run)
i  evicting 4 Space replicas from cluster CLUSTER-NAME (dry run)
i  evicting Space replica <replica-name-1> (dry run)
i  evicting Space replica <replica-name-2> (dry run)
i  evicting Space replica <replica-name-3> (dry run)
i  evicting Space replica <replica-name-4> (dry run)
i  cluster CLUSTER-NAME drained successfully (dry run)

Drain the cluster.

tanzu cluster drain CLUSTER-NAME

The output from this command should look something like this:

i  successfully cordoned cluster CLUSTER-NAME
i  to see the state of the cluster use  kubectl get kubernetescluster CLUSTER-NAME
i  evicting 4 Space replicas from cluster CLUSTER-NAME
i  evicting Space replica <replica-name-1>
i  evicting Space replica <replica-name-2>
i  evicting Space replica <replica-name-3>
i  evicting Space replica <replica-name-4>
i  <replica-name-1> evicted
i  <replica-name-2> evicted
i  <replica-name-3> evicted
i  <replica-name-4> evicted
i  4 Space replicas under went eviction
i  cluster CLUSTER-NAME drained successfully

After the cluster is fully drained, you can perform any necessary maintenance on the cluster before you put it back into service.
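
Because a dry run lists the replicas that a drain would evict, its output can be checked in a script before you run the real drain. The following Python sketch assumes the output matches the sample dry-run output shown in this section, which is only approximate, so treat the pattern as illustrative. The function name replicas_in_dry_run is hypothetical.

```python
import re

# Matches per-replica lines such as:
#   i  evicting Space replica <replica-name-1> (dry run)
# The summary line ("evicting 4 Space replicas from cluster ...") does
# not match, because a count sits between "evicting" and "Space".
# ASSUMPTION: the real CLI output follows the sample format on this page.
_REPLICA_LINE = re.compile(r"evicting Space replica (\S+) \(dry run\)")


def replicas_in_dry_run(output: str) -> list[str]:
    """Return the replica names listed in dry-run output, in order."""
    return _REPLICA_LINE.findall(output)


sample = """\
i  successfully cordoned cluster CLUSTER-NAME (dry run)
i  evicting 2 Space replicas from cluster CLUSTER-NAME (dry run)
i  evicting Space replica replica-a (dry run)
i  evicting Space replica replica-b (dry run)
i  cluster CLUSTER-NAME drained successfully (dry run)
"""
names = replicas_in_dry_run(sample)
```

Here names holds the two replica names from the sample text. A script could compare that list against the replicas you expect to move and stop if anything unexpected appears.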

Uncordon a cluster

To uncordon a cluster by using the Tanzu CLI:

Uncordon a cluster belonging to the cluster group you specified above.
```
tanzu cluster uncordon CLUSTER-NAME
```

Verify that the output from this command looks similar to this:

i  successfully uncordoned cluster CLUSTER-NAME
i  to see the state of the cluster use  kubectl get kubernetescluster CLUSTER-NAME
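
The full maintenance sequence on this page can be summarized as two groups of commands, one to run before maintenance and one after. The Python sketch below only builds the command lines. It uses no flags beyond those shown on this page, and the function name maintenance_commands is hypothetical. A separate cordon step is omitted because a drain cordons the cluster first.

```python
import shlex


def maintenance_commands(
    cluster_group: str, cluster: str
) -> tuple[list[list[str]], list[list[str]]]:
    """Return (before, after) command lists for a maintenance window."""
    before = [
        # Select the cluster group that owns the cluster.
        ["tanzu", "operations", "clustergroup", "use", cluster_group],
        # Preview which Space replicas would be evicted.
        ["tanzu", "cluster", "drain", cluster, "--dry-run=true"],
        # Cordon the cluster and evict its Space replicas.
        ["tanzu", "cluster", "drain", cluster],
    ]
    after = [
        # Re-enable Space scheduling once maintenance is finished.
        ["tanzu", "cluster", "uncordon", cluster],
    ]
    return before, after


before, after = maintenance_commands("my-cluster-group", "my-cluster")
for command in before + after:
    # Print each command as a shell-safe string for review.
    print(shlex.join(command))
```

To execute the commands instead of printing them, each list could be passed to subprocess.run with check=True so that a failed step stops the sequence. Perform the maintenance itself between the two groups.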