If an NSX Controller Cluster is unrecoverable, or if you need to replace one or more controllers due to changes to cluster membership, you should restore the entire cluster of controllers.

About this task

Before restoring a cluster of controllers, first determine whether the control cluster membership known to the management plane differs from the actual membership known to the controllers themselves. The two can differ if changes were made after the backup was taken.

  • If the entire cluster is unrecoverable, see "Redeploy the NSX Controller Cluster."

  • Otherwise, follow the steps below to determine whether cluster membership has changed and, if it has, restore from a backup.
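
The membership check amounts to comparing two sets of IP addresses: the ones reported by the management plane and the ones reported by the controllers. The following is a minimal sketch of that comparison, not part of the product. It assumes you have copied the addresses by hand from the output of get management-cluster status and get control-cluster status; the function name membership_changed and the sample addresses are hypothetical, and the sketch makes no attempt to parse CLI output.

```python
def membership_changed(manager_ips, controller_ips):
    """Compare the controller IPs known to the manager with those known to the controllers.

    Returns a tuple (changed, only_on_manager, only_on_controllers).
    """
    manager_set = set(manager_ips)
    controller_set = set(controller_ips)

    # Addresses the management plane lists but the control cluster does not.
    only_on_manager = sorted(manager_set - controller_set)
    # Addresses the control cluster lists but the management plane does not.
    only_on_controllers = sorted(controller_set - manager_set)

    changed = bool(only_on_manager or only_on_controllers)
    return changed, only_on_manager, only_on_controllers


# Hypothetical addresses copied by hand from the two command outputs.
from_manager = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
from_controllers = ["192.0.2.11", "192.0.2.12", "192.0.2.14"]

changed, stale, unknown = membership_changed(from_manager, from_controllers)
print("membership changed:", changed)
print("only on manager:", stale)
print("only on controllers:", unknown)
```

If the first value is False, the two sets match and no restore of the controller cluster is needed. Any difference in either direction means the remaining steps of the procedure apply.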

Prerequisites

  • Verify that you have a recent backup.

  • Perform a restore. See Restore a Backup.

Procedure

  1. Log in to the CLI of an NSX Manager and then run the get management-cluster status command.
  2. Log in to the CLI of an NSX Controller and then run the get managers command to ensure that the controller is registered with the Manager.
  3. Run the get control-cluster status command.
  4. To determine if there are membership changes, compare the IP addresses from the output of the get management-cluster status command to the output from the get control-cluster status command.

    If both commands report the same set of IP addresses, no action is needed. If any IP address differs, continue with the remaining steps to restore the entire controller cluster.

  5. Log in to the CLI of each NSX Controller and run the get control-cluster status command to determine which one is the master controller.

    The output on the master controller shows is master: true.

  6. Run the stop service <controller> command on one non-master controller.
  7. Log in to the master controller and then run the detach control-cluster <ip-address[:port]> command to detach the non-master controller from the previous step.
  8. (Optional) Run the detach controller <uuid> command on the NSX Manager to detach this controller only if the get management-cluster status command shows this controller on the NSX Manager.
  9. Log in to the CLI of the non-master NSX Controller from step 6 and then run the deactivate control-cluster command.
  10. Remove the bootstrap file and the uuid file with the following commands: rm -r /opt/vmware/etc/bootstrap-config and rm -r /config/vmware/node-uuid
  11. Repeat steps 6-10 for each remaining non-master controller.
  12. Log in to the CLI of the master controller and then run the stop service <controller> command.
  13. Run the detach controller <uuid> command on the NSX Manager to detach this controller.
  14. Log in to the CLI of the master controller and then run the deactivate control-cluster command.
  15. Remove the bootstrap file and the uuid file with the following commands: rm -r /opt/vmware/etc/bootstrap-config and rm -r /config/vmware/node-uuid
  16. Run the get management-cluster status command from the NSX Manager. If there are still controllers shown in the output, run the detach controller <uuid> command to detach any that remain.
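
Steps 6-16 follow a fixed order: every non-master controller is taken down first, and the master controller is taken down last. The sketch below only prints that order as a checklist; it does not connect to any node or run any command. The Controller record, the function names, and the sample addresses and UUIDs are hypothetical, and the command strings are copied unchanged from the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    ip: str          # address passed to detach control-cluster
    uuid: str        # identifier passed to detach controller
    is_master: bool  # True if get control-cluster status shows "is master: true"


REMOVE_FILES = [
    "rm -r /opt/vmware/etc/bootstrap-config",
    "rm -r /config/vmware/node-uuid",
]


def teardown_order(controllers):
    """Return the controllers with all non-masters first and the master last."""
    masters = [c for c in controllers if c.is_master]
    if len(masters) != 1:
        raise ValueError(f"expected exactly one master, found {len(masters)}")
    return [c for c in controllers if not c.is_master] + masters


def checklist(controllers):
    """Build (where to run, command) pairs that mirror steps 6-16."""
    steps = []
    for c in teardown_order(controllers):
        node = f"controller {c.ip}"
        steps.append((node, "stop service <controller>"))
        if c.is_master:
            steps.append(("manager", f"detach controller {c.uuid}"))
        else:
            steps.append(("master controller", f"detach control-cluster {c.ip}"))
            # Optional step: only needed if the manager still lists this controller.
            steps.append(("manager", f"detach controller {c.uuid} (only if still listed)"))
        steps.append((node, "deactivate control-cluster"))
        steps.extend((node, command) for command in REMOVE_FILES)
    # Final check: detach anything the manager still reports.
    steps.append(("manager", "get management-cluster status"))
    return steps


# Hypothetical three-node cluster; the master is deliberately not listed last.
cluster = [
    Controller("192.0.2.11", "uuid-a", False),
    Controller("192.0.2.12", "uuid-b", True),
    Controller("192.0.2.13", "uuid-c", False),
]

for where, command in checklist(cluster):
    print(f"{where}: {command}")
```

Rejecting input that does not contain exactly one master is a deliberate choice in this sketch: the order of the procedure depends on knowing which controller is the master, so it is better to stop than to guess.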

What to do next

Complete the following tasks in the listed order.

  1. Complete a restore.

  2. Join the NSX Controllers with the Management Plane, as documented in the NSX-T Installation Guide.

  3. Redeploy the NSX Controller cluster, as documented in the NSX-T Installation Guide.