To perform troubleshooting or maintenance on a vSphere Bitfusion server, you must remove the server from the vSphere Bitfusion cluster.

When you power off a vSphere Bitfusion server for maintenance or troubleshooting, the health status of the vSphere Bitfusion cluster changes. While the cluster is not in a healthy state, you cannot add new vSphere Bitfusion servers or perform a backup operation. If half of the servers are powered off, the cluster is inoperable. If you plan to keep a server powered off for an extended period, you can avoid these risks by removing the server from the cluster.
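The operability rule above can be sketched as a small Python check. This is an illustration only: the function name cluster_is_operable is hypothetical, it is not part of any vSphere Bitfusion API, and it assumes that "half" means "half or more" of the servers.

```python
def cluster_is_operable(total_servers: int, powered_off: int) -> bool:
    """Return False when half or more of the servers are powered off.

    Assumption: "half" in the text above is read as "half or more".
    """
    if total_servers <= 0:
        raise ValueError("total_servers must be positive")
    if not 0 <= powered_off <= total_servers:
        raise ValueError("powered_off must be between 0 and total_servers")
    # The cluster is treated as inoperable once the powered-off
    # servers reach half of the cluster size.
    return 2 * powered_off < total_servers
```

Under this assumption, a four-server cluster stays operable with one server powered off but not with two. This is why a server that will stay off for a long time is better removed from the cluster.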

Performing the following procedure immediately removes the server from the vSphere Bitfusion cluster. Any running applications that are using the GPUs receive an immediate GPU failure and usually return an error condition.

Prerequisites

  • In the server settings, prevent new client connections to the server that you want to remove.

  • Verify that there are no running applications on the server.

Procedure

  1. In the vSphere Client, select Menu > Bitfusion.
  2. On the Servers tab, select a server from the list.
  3. From the Actions drop-down menu, select Delete.
  4. In the confirmation dialog box, click Delete.

Results

You have removed the selected server from the vSphere Bitfusion cluster.

What to do next

After you remove the vSphere Bitfusion server, let the VM run for at least 10 minutes before you power it off. During this time, the backing storage rebalances.
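If you script your maintenance window, the waiting period can be tracked with a short Python sketch such as the following. The names REBALANCE_WAIT, earliest_power_off, and safe_to_power_off are hypothetical helpers. The sketch tracks only the 10-minute minimum stated above and does not check whether the storage rebalance has actually finished.

```python
from datetime import datetime, timedelta

# Minimum time to keep the VM running after removal, as stated above.
REBALANCE_WAIT = timedelta(minutes=10)


def earliest_power_off(removed_at: datetime) -> datetime:
    """Return the earliest time at which the VM may be powered off."""
    return removed_at + REBALANCE_WAIT


def safe_to_power_off(removed_at: datetime, now: datetime) -> bool:
    """Return True once the minimum wait has elapsed since removal."""
    return now >= earliest_power_off(removed_at)
```

For example, call safe_to_power_off with the time at which you confirmed the delete operation and the current time. The result is a lower bound only, because the text allows for a longer wait.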

Note:

To reuse the underlying hardware as a vSphere Bitfusion server after the delete operation finishes, you must delete the vSphere Bitfusion server virtual machine (VM) and redeploy the vSphere Bitfusion Server Appliance.