If a host is dead but its disks (SATADOM, capacity, and cache drives) are still working, you can move the disks to a new host.

Prerequisites

The new host should have the necessary firmware and BIOS settings.

Procedure

  1. Note the name and IP details of the dead host.
    1. On the Dashboard page, click VIEW DETAILS for Physical Resources and click the affected rack.

    2. Scroll down to the Hosts section.

    3. In the HOST column, click the host name that shows a critical status.

      The Host Details page displays the details for this host.

    4. Note the host name and the Management IP, Network One, and Network Two IP addresses.

  2. Decommission the dead host.
    1. If you are decommissioning a qualified vSAN Ready Node (that is, you did not purchase a fully integrated system from a partner), note the BMC password for the host by navigating to the /home/vrack/bin/ directory in the SDDC Manager Controller VM and running the lookup-password command.
    2. On the Dashboard page, click VIEW DETAILS for Workload Domain and click the affected domain.
    3. In the PHYSICAL RACKS column, click the physical rack that contains the affected server.
    4. Scroll down to the Hosts section.
    5. In the HOST column, click the host name that shows a critical status (for example, N1).

      The Host Details page displays the details for this host.

    6. Note the IP addresses displayed in the NETWORK TWO and MANAGEMENT IP ADDRESS fields.
    7. Click Decommission.

      If this host belongs to a workload domain, the domain must include at least 4 hosts. If the domain has fewer than 4 hosts, you must expand the domain before decommissioning the host. If the domain contains only 4 hosts and one of them is dead, click Force decommission to decommission the host.

    8. Click CONFIRM.

      During the host decommissioning task, the host is removed from the workload domain to which it was allocated and the environment's available capacity is updated to reflect the reduced capacity. The ports that were being used by the server are marked unused and the network configuration is updated.

    9. Monitor the progress of the decommissioning task.
      1. On the SDDC Manager Dashboard, click STATUS in the left navigation pane.

      2. In the Workflow Tasks section, click View Details.

      3. Look for the VI Resource Pool - Decommission of hosts task.

      4. After about 10 minutes, refresh this page and wait until the task status changes to Successful.

    10. For qualified vSAN Ready Nodes, change the password on the host to the common password for ESXi hosts. Log in to the BMC console using the password that you noted in the first substep of this step and change the OOB password to D3c0mm1ss10n3d!.

      This step is automated for hosts in an integrated system.

  3. SSH to the management switch (IP address 192.168.100.1) and back up the dhcpd.leases file with the following command.

    cp /var/lib/dhcp/dhcpd.leases /var/lib/dhcp/dhcpd.leases.bk

  4. SSH to the SDDC Manager Controller VM and back up the hms_ib_inventory.json and prm-manifest.json files with the following commands:

    cp /home/vrack/VMware/vRack/hms_ib_inventory.json /home/vrack/VMware/vRack/hms_ib_inventory.json.bk

    cp /home/vrack/VMware/vRack/prm-manifest.json /home/vrack/VMware/vRack/prm-manifest.json.bkp

  5. Power off the dead host and note the ports on the management and ToR switches to which it is connected. Disconnect all cables from the host and remove it from the rack.
    Note:

    In vSphere Web Client, the dead host will not be responsive. Do not remove the dead host from the inventory. After the disks are moved to the new host, the new host will automatically reconnect.

  6. Remove the SATADOM, SSDs, and HDDs from the dead host and install them in the new host in the appropriate order and slots.
  7. Mount the new host in the rack and connect it to the same ports of the management and ToR switches as the dead host. Refer to your notes from step 5.
  8. Power on the new host.
  9. Retrieve the password of the root account. See Look Up Account Credentials.
  10. Log in to the vCenter Web Client with the root account and confirm that the new host is connected to the vCenter Server. If it is not connected, right-click the disconnected host, click Connection, and then click Connect.
  11. If the dead host belonged to a workload domain, ensure that resynchronization is in progress.
    1. In vCenter Web Client, click the cluster name.

    2. Click Monitor > vSAN > Resyncing Components.

    3. Check for any reported issues.

  12. On the SDDC Manager Dashboard, confirm that the replacement host has the same host name and the same Management and Network IP addresses as the dead host that you removed from the rack. Refer to your notes from step 1.
  13. If the Management IP address of the new server is different from the one assigned to the dead host, update the IP address by following the steps below.
    1. Note the OOB MAC address of the new host. For details, refer to the vendor documentation.
    2. SSH to the management switch (IP address 192.168.100.1) and type the following command.

      cat /var/lib/dhcp/dhcpd.leases

      In the output, look for the OOB MAC address of the new host that you noted in the previous substep. Note the IP address 192.168.0.x next to lease.

    3. SSH to the SDDC Manager Controller VM and type vi /home/vrack/VMware/vRack/hms_ib_inventory.json.
    4. Search for the 192.168.100.x (Network Two) IP address noted in substep 2 of this step.
    5. Press the Insert key, update the managementIP record with this new IP address, and save the file.
    6. On the SDDC Manager Dashboard, confirm that the Management IP address of the new host has been updated.
  14. SSH to the management switch and reboot it by typing the command sudo reboot.
  15. SSH to the SDDC Manager Controller VM and reboot it by typing the command sudo reboot.
  16. Check the vSAN status, health, and disk groups to confirm that they are healthy and operational.
    1. In vCenter Web Client, click the vRack-Cluster, and select Manage > Settings > Disk Management.

    2. Click the host that you added and check the State and vSAN columns.

  17. Perform vSAN proactive tests to confirm that the vSAN disks are healthy.
    1. In vCenter Web Client, click the cluster and select Monitor > vSAN > Proactive Tests.

    2. Click each test and press the green Play button.

    3. After the tests are complete, log out of vCenter Web Client.
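
The file backups in steps 3 and 4 and the lease lookup in step 13 can be sketched as two small shell helpers. This is only a minimal sketch and not part of the product tooling: the function names backup_file and lease_ip_for_mac are hypothetical, and the lookup assumes the usual ISC dhcpd.leases layout, in which each entry starts with a lease line, contains a hardware ethernet line, and the last matching entry in the file is the one in effect.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; these names are assumptions,
# not commands shipped on the management switch or SDDC Manager Controller VM.

# backup_file FILE [SUFFIX]
# Copies FILE to FILE.SUFFIX (default suffix: bk), preserving mode and timestamps.
# Refuses to overwrite an existing backup so an earlier copy is not lost.
backup_file() {
    src=$1
    suffix=${2:-bk}
    dest="$src.$suffix"
    if [ ! -f "$src" ]; then
        echo "backup_file: $src not found" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "backup_file: $dest already exists" >&2
        return 1
    fi
    cp -p "$src" "$dest"
}

# lease_ip_for_mac LEASES_FILE MAC
# Prints the IP address of the last lease entry whose "hardware ethernet"
# line matches MAC (case-insensitive). Returns non-zero if no entry matches.
lease_ip_for_mac() {
    awk -v mac="$2" '
        BEGIN { mac = tolower(mac) }
        # A "lease <ip> {" line opens a new entry; remember its address.
        $1 == "lease" { ip = $2 }
        $1 == "hardware" && $2 == "ethernet" {
            m = tolower($3)
            sub(/;$/, "", m)          # drop the trailing semicolon
            if (m == mac) found = ip  # later entries override earlier ones
        }
        END { if (found != "") print found; else exit 1 }
    ' "$1"
}

# Example calls (paths taken from the procedure; the MAC is a placeholder):
#   backup_file /var/lib/dhcp/dhcpd.leases
#   backup_file /home/vrack/VMware/vRack/prm-manifest.json bkp
#   lease_ip_for_mac /var/lib/dhcp/dhcpd.leases aa:bb:cc:dd:ee:ff
```

Refusing to overwrite an existing backup is a deliberate choice here: if the procedure is repeated, the first copy of each file is the one taken before any changes were made.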