Troubleshooting the vSAN Network

vSAN allows you to examine and troubleshoot the different types of issues that arise from a misconfigured vSAN network.

vSAN operations depend on the network configuration, reliability, and performance. Many support requests stem from an incorrect network configuration, or the network not performing as expected.

Use the vSAN health service to resolve network issues. Network health checks can direct you to an appropriate Knowledge Base article, depending on the results of the health check. The Knowledge Base article provides instructions to solve the network problem.

Network Health Checks

The health service includes a category for networking health checks.

Each health check has an Ask VMware link. If a health check fails, click Ask VMware and read the associated VMware Knowledge Base article for further details, and guidance on how to address the issue at hand.

The following networking health checks provide useful information about your vSAN environment.

vSAN: Basic (unicast) connectivity check. This check verifies that IP connectivity exists among all ESXi hosts in the vSAN cluster, by pinging each ESXi host on the vSAN network from each other ESXi host.
vMotion: Basic (unicast) connectivity check. This check verifies that IP connectivity exists among all ESXi hosts in the vSAN cluster that have vMotion configured. Each ESXi host on the vMotion network pings all other ESXi hosts.
All hosts have a vSAN vmknic configured. This check ensures each ESXi host in the vSAN cluster has a VMkernel NIC configured for vSAN traffic.
All hosts have matching multicast settings. This check ensures that each host has a properly configured multicast address.
All hosts have matching subnets. This check tests that all ESXi hosts in a vSAN cluster have been configured so that all vSAN VMkernel NICs are on the same IP subnet.
Hosts disconnected from VC. This check verifies that the vCenter Server has an active connection to all ESXi hosts in the vSAN cluster.
Hosts with connectivity issues. This check refers to situations where vCenter Server lists the host as connected, but API calls from vCenter to the host are failing. It can highlight connectivity issues between a host and the vCenter Server.
Network latency. This check performs a network latency check of vSAN hosts. If the latency exceeds 100 ms, a warning is displayed. If the latency exceeds 200 ms, an error is raised.
vMotion: MTU checks (ping with large packet size). This check complements the basic vMotion ping connectivity check. Maximum Transmission Unit size is increased to improve network performance. Incorrectly configured MTUs might not appear as a network configuration issue, but can cause performance issues.
vSAN cluster partition. This health check examines the cluster to see how many partitions exist. It displays an error if there is more than a single partition in the vSAN cluster.
Multicast assessment based on other checks. This health check aggregates data from all network health checks. If this check fails, it indicates that multicast is likely the root cause of a network partition.
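
The warning and error thresholds described for the Network latency check can be sketched as a small classifier. This is an illustration only, not how the health service is implemented: the function name and the sample host names and measurements are hypothetical, and only the 100 ms and 200 ms thresholds come from the check description.

```python
# Thresholds taken from the "Network latency" check description.
WARNING_MS = 100.0
ERROR_MS = 200.0


def classify_latency(latency_ms: float) -> str:
    """Return 'error', 'warning', or 'ok' for a latency in milliseconds."""
    if latency_ms > ERROR_MS:
        return "error"
    if latency_ms > WARNING_MS:
        return "warning"
    return "ok"


# Hypothetical host names and measurements, for illustration only.
samples = {"esxi-a": 0.4, "esxi-b": 130.0, "esxi-c": 250.0}
results = {host: classify_latency(ms) for host, ms in samples.items()}
print(results)
```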

Commands to Check the Network

When the vSAN network has been configured, use these commands to check its state. You can check which VMkernel Adapter (vmknic) is used for vSAN, and what attributes it contains.

Use ESXCLI and RVC commands to verify that the network is fully functional, and to troubleshoot any network issues with vSAN.

You can verify that the vmknic used for the vSAN network is configured correctly and consistently across all hosts, check that multicast is functional, and verify that hosts participating in the vSAN cluster can successfully communicate with one another.

esxcli vsan network list

This command enables you to identify the VMkernel interface used by the vSAN network.

The output below shows that the vSAN network is using vmk1. This command continues to work even if vSAN has been disabled on the cluster and the hosts no longer participate in vSAN.

The Agent Group Multicast and Master Group Multicast addresses and ports are also important to check.

[root@esxi-dell-m:~] esxcli vsan network list
Interface
   VmkNic Name: vmk1
   IP Protocol: IP
   Interface UUID: 32efc758-9ca0-57b9-c7e3-246e962c24d0
   Agent Group Multicast Address: 224.2.3.4
   Agent Group IPv6 Multicast Address: ff19::2:3:4
   Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3
   Master Group IPv6 Multicast Address: ff19::1:2:3
   Master Group Multicast Port: 12345
   Host Unicast Channel Bound Port: 12321
   Multicast TTL: 5
   Traffic Type: vsan

This output provides useful information, such as which VMkernel interface is being used for vSAN traffic. In this case, it is vmk1. The output also shows the multicast addresses and ports, which might be displayed even when the cluster is running in unicast mode. Port 23451 (the agent group port) is used for the heartbeat, sent every second by the primary, and is visible on every other host in the cluster. Port 12345 (the master group port) is used for the CMMDS updates between the primary and backup.
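
When the same values need to be compared across several hosts, the output can be parsed into key/value pairs. The following is a minimal sketch, assuming the output has been saved as text and describes a single interface. The function name parse_vsan_network is hypothetical, and the sample is an excerpt of the output shown above.

```python
SAMPLE = """\
Interface
   VmkNic Name: vmk1
   IP Protocol: IP
   Agent Group Multicast Address: 224.2.3.4
   Agent Group IPv6 Multicast Address: ff19::2:3:4
   Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3
   Master Group Multicast Port: 12345
   Traffic Type: vsan
"""


def parse_vsan_network(text: str) -> dict:
    """Collect 'Key: Value' pairs from `esxcli vsan network list` output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ": " so IPv6 values such as ff19::2:3:4 stay whole.
        key, sep, value = line.strip().partition(": ")
        if sep:  # lines such as "Interface" carry no value and are skipped
            fields[key] = value
    return fields


info = parse_vsan_network(SAMPLE)
print(info["VmkNic Name"], info["Agent Group Multicast Address"],
      int(info["Agent Group Multicast Port"]))
```

Collecting this dictionary from every host and comparing the multicast fields is one way to spot a host whose settings differ from the rest.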

esxcli network ip interface list

This command enables you to verify items such as vSwitch or distributed switch.

Use this command to check which vSwitch or distributed switch the VMkernel interface is attached to, and its MTU size, which can be useful if jumbo frames have been configured in the environment. In this case, vmk1 is attached to a distributed switch named vDS, and its MTU is 9000.

[root@esxi-dell-m:~] esxcli network ip interface list
vmk0
   Name: vmk0
   <<truncated>>
vmk1
   Name: vmk1
   MAC Address: 00:50:56:69:96:f0
   Enabled: true
   Portset: DvsPortset-0
   Portgroup: N/A
   Netstack Instance: defaultTcpipStack
   VDS Name: vDS
   VDS UUID: 50 1e 5b ad e3 b4 af 25-18 f3 1c 4c fa 98 3d bb
   VDS Port: 16
   VDS Connection: 1123658315
   Opaque Network ID: N/A
   Opaque Network Type: N/A
   External ID: N/A
   MTU: 9000
   TSO MSS: 65535
   Port ID: 50331814

The Maximum Transmission Unit size is shown as 9000, so this VMkernel port is configured for jumbo frames, which require an MTU of about 9,000. VMware does not make any recommendation around the use of jumbo frames. However, jumbo frames are supported for use with vSAN.
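
To read the MTU of each VMkernel interface from saved output, the stanza layout (an unindented interface name followed by indented fields) can be parsed as follows. This is a sketch under the assumption that the layout matches the excerpt above. The function name parse_interfaces is hypothetical, and the vmk0 stanza stays truncated exactly as in the original output.

```python
SAMPLE = """\
vmk0
   Name: vmk0
vmk1
   Name: vmk1
   Portset: DvsPortset-0
   VDS Name: vDS
   MTU: 9000
"""


def parse_interfaces(text: str) -> dict:
    """Map each interface name to a dict of its 'Key: Value' fields."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new interface stanza.
            current = interfaces.setdefault(line.strip(), {})
        elif current is not None:
            key, sep, value = line.strip().partition(": ")
            if sep:
                current[key] = value
    return interfaces


interfaces = parse_interfaces(SAMPLE)
for name, fields in interfaces.items():
    print(name, "MTU:", fields.get("MTU", "not present in excerpt"))
```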

esxcli network ip interface ipv4 get -i vmk1

This command displays information such as the IP address and netmask of the vSAN VMkernel interface.

With this information, an administrator can now begin to use other commands available at the command line to check that the vSAN network is working correctly.

[root@esxi-dell-m:~] esxcli network ip interface ipv4 get -i vmk1
Name  IPv4 Address  IPv4 Netmask   IPv4 Broadcast  Address Type  Gateway  DHCP DNS
----  ------------  -------------  --------------  ------------  -------  --------
vmk1  172.40.0.9    255.255.255.0  172.40.0.255    STATIC        0.0.0.0  false
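
The address and netmask reported here determine the subnet that every vSAN VMkernel interface in the cluster is expected to share. As a rough sketch, Python's standard ipaddress module can derive that subnet and test whether other addresses fall inside it. The function name vsan_network is hypothetical, and the peer addresses are the ones that appear in the neighbor list example later in this section.

```python
import ipaddress


def vsan_network(address: str, netmask: str) -> ipaddress.IPv4Network:
    """Return the subnet that contains the given address/netmask pair."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


# Values from the `esxcli network ip interface ipv4 get` output above.
local = vsan_network("172.40.0.9", "255.255.255.0")

# Peer vSAN addresses taken from the neighbor list example in this section.
peers = ["172.40.0.10", "172.40.0.11", "172.40.0.12"]
outside = [p for p in peers if ipaddress.ip_address(p) not in local]

print("subnet:", local)
print("broadcast:", local.broadcast_address)
print("peers outside the subnet:", outside)
```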

vmkping

The vmkping command verifies whether all the other ESXi hosts on the network are responding to your ping requests.

~ # vmkping -I vmk2 172.32.0.3
 PING 172.32.0.3 (172.32.0.3): 56 data bytes
 64 bytes from 172.32.0.3: icmp_seq=0 ttl=64 time=0.186 ms
 64 bytes from 172.32.0.3: icmp_seq=1 ttl=64 time=2.690 ms
 64 bytes from 172.32.0.3: icmp_seq=2 ttl=64 time=0.139 ms
 
 --- 172.32.0.3 ping statistics ---
 3 packets transmitted, 3 packets received, 0% packet loss
 round-trip min/avg/max = 0.139/1.005/2.690 ms

While it does not verify multicast functionality, it can help identify a rogue ESXi host that has network issues. You can also examine the response times to see if there is any abnormal latency on the vSAN network.

If jumbo frames are configured, a default vmkping does not report any issues when the jumbo frame MTU size is incorrect, because the default packet is small enough to fit within a standard MTU of 1500. To verify that jumbo frames are working end-to-end, use vmkping with a larger packet size (-s) option. For an MTU of 9000, the packet size is 8972, which leaves 28 bytes for the IP and ICMP headers:

 ~ # vmkping -I vmk2 172.32.0.3 -s 8972 -d
 PING 172.32.0.3 (172.32.0.3): 8972 data bytes
 9008 bytes from 172.32.0.3: icmp_seq=0 ttl=64 time=0.554 ms
 9008 bytes from 172.32.0.3: icmp_seq=1 ttl=64 time=0.638 ms
 9008 bytes from 172.32.0.3: icmp_seq=2 ttl=64 time=0.533 ms
 
 --- 172.32.0.3 ping statistics ---
 3 packets transmitted, 3 packets received, 0% packet loss
 round-trip min/avg/max = 0.533/0.575/0.638 ms
 ~ #

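Include the -d option in the vmkping command to test whether packets can be sent without fragmentation. Without it, an oversized packet can be fragmented along the path, and the ping can succeed even though the MTU is misconfigured.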
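
The packet size passed to -s is the ICMP payload, so it has to leave room for the IPv4 and ICMP headers inside the MTU. The arithmetic can be sketched as follows, assuming a 20-byte IPv4 header without options and an 8-byte ICMP echo header. The function name max_ping_payload is hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest -s value that fits in one unfragmented packet for this MTU."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


for mtu in (1500, 9000):
    print(f"MTU {mtu}: vmkping -s {max_ping_payload(mtu)} -d")
```

For an MTU of 9000 this gives 8972, the size used in the jumbo frame test above.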

esxcli network ip neighbor list

This command helps to verify if all vSAN hosts are on the same network segment.

In this configuration, we have a four-host cluster, and this command returns the ARP (Address Resolution Protocol) entries of the other three hosts, including their IP addresses and their vmknic (vSAN is configured to use vmk1 on all hosts in this cluster).

[root@esxi-dell-m:~] esxcli network ip neighbor list -i vmk1
Neighbor     Mac Address        Vmknic   Expiry  State  Type   
-----------  -----------------  ------  -------  -----  -------
172.40.0.12  00:50:56:61:ce:22  vmk1    164 sec         Unknown
172.40.0.10  00:50:56:67:1d:b2  vmk1    338 sec         Unknown
172.40.0.11  00:50:56:6c:fe:c5  vmk1    162 sec         Unknown
[root@esxi-dell-m:~]
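
To confirm that every expected peer appears in the neighbor list, the table can be parsed and compared against the addresses you expect. This is a minimal sketch: the function name parse_neighbors is hypothetical, and the expected set is simply the three peers from the output above. A missing entry is not conclusive on its own, because ARP entries expire when there has been no recent traffic.

```python
import ipaddress

SAMPLE = """\
Neighbor     Mac Address        Vmknic   Expiry  State  Type
-----------  -----------------  ------  -------  -----  -------
172.40.0.12  00:50:56:61:ce:22  vmk1    164 sec         Unknown
172.40.0.10  00:50:56:67:1d:b2  vmk1    338 sec         Unknown
172.40.0.11  00:50:56:6c:fe:c5  vmk1    162 sec         Unknown
"""


def parse_neighbors(text: str) -> dict:
    """Map neighbor IP address to its MAC address and vmknic."""
    neighbors = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            ipaddress.ip_address(parts[0])
        except ValueError:
            continue  # header, separator, or shell prompt line
        neighbors[parts[0]] = {"mac": parts[1], "vmknic": parts[2]}
    return neighbors


expected = {"172.40.0.10", "172.40.0.11", "172.40.0.12"}
neighbors = parse_neighbors(SAMPLE)
missing = expected - set(neighbors)
print("missing peers:", sorted(missing))
```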

esxcli network diag ping

This command checks for duplicates on the network, and round-trip times.

To get even more detail regarding the vSAN network connectivity between the various hosts, ESXCLI provides a powerful network diagnostic command. The following example shows its output, where the VMkernel interface is vmk1 and the remote vSAN network IP of another host on the network is 172.40.0.10.

[root@esxi-dell-m:~] esxcli network diag ping -I vmk1 -H 172.40.0.10
   Trace: 
         Received Bytes: 64
         Host: 172.40.0.10
         ICMP Seq: 0
         TTL: 64
         Round-trip Time: 1864 us
         Dup: false
         Detail: 
      
         Received Bytes: 64
         Host: 172.40.0.10
         ICMP Seq: 1
         TTL: 64
         Round-trip Time: 1834 us
         Dup: false
         Detail: 
      
         Received Bytes: 64
         Host: 172.40.0.10
         ICMP Seq: 2
         TTL: 64
         Round-trip Time: 1824 us
         Dup: false
         Detail: 
   Summary: 
         Host Addr: 172.40.0.10
         Transmitted: 3
         Recieved: 3
         Duplicated: 0
         Packet Lost: 0
         Round-trip Min: 1824 us
         Round-trip Avg: 1840 us
         Round-trip Max: 1864 us
[root@esxi-dell-m:~]
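
The Summary section carries the values this command is used for: duplicates, packet loss, and round-trip times. A sketch of extracting them from saved output follows. The function name parse_summary is hypothetical, and the key "Recieved" is spelled as it appears in the output above.

```python
SAMPLE = """\
   Trace:
         Received Bytes: 64
         Host: 172.40.0.10
         Dup: false
         Detail:
   Summary:
         Host Addr: 172.40.0.10
         Transmitted: 3
         Recieved: 3
         Duplicated: 0
         Packet Lost: 0
         Round-trip Min: 1824 us
         Round-trip Avg: 1840 us
         Round-trip Max: 1864 us
"""


def parse_summary(text: str) -> dict:
    """Return the 'Key: Value' pairs that follow the 'Summary:' line."""
    summary = {}
    in_summary = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Summary:":
            in_summary = True
            continue
        if in_summary:
            key, sep, value = stripped.partition(": ")
            if sep:
                summary[key] = value
    return summary


summary = parse_summary(SAMPLE)
avg_ms = int(summary["Round-trip Avg"].split()[0]) / 1000  # us to ms
clean = summary["Duplicated"] == "0" and summary["Packet Lost"] == "0"
print(f"average round trip: {avg_ms} ms, no duplicates or loss: {clean}")
```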

vsan.lldpnetmap

This RVC command displays uplink port information.

If there are non-Cisco switches with Link Layer Discovery Protocol (LLDP) enabled in the environment, there is an RVC command to display uplink <-> switch <-> switch port information. For more information on RVC, refer to the RVC Command Guide.

This helps you determine which hosts are attached to which switches when the vSAN cluster is spanning multiple switches. It can help isolate a problem to a particular switch when only a subset of the hosts in the cluster is impacted.

> vsan.lldpnetmap 0
2013-08-15 19:34:18 -0700: This operation will take 30-60 seconds ...
+---------------+---------------------------+
| Host          | LLDP info                 |
+---------------+---------------------------+
| 10.143.188.54 | w2r13-vsan-x650-2: vmnic7 |
|               | w2r13-vsan-x650-1: vmnic5 |
+---------------+---------------------------+

This is only available with switches that support LLDP. To configure it, log in to the switch and run the following:

switch# config t
switch(config)# feature lldp

To verify that LLDP is enabled:

switch(config)# do show running-config lldp

Note:

LLDP operates in both send and receive mode, by default. Check the settings of your vDS properties if the physical switch information is not being discovered. By default, vDS is created with discovery protocol set to CDP, Cisco Discovery Protocol. To resolve this, set the discovery protocol to LLDP, and set operation to both on the vDS.

Checking Multicast Communications

Multicast configurations can cause issues for initial vSAN deployment.

One of the simplest ways to verify if multicast is working correctly in your vSAN environment is by using the tcpdump-uw command. This command is available from the command line of the ESXi hosts.

This tcpdump-uw command shows if the primary is correctly sending multicast packets (port and IP info) and if all other hosts in the cluster are receiving them.

On the primary, this command shows the packets being sent out to the multicast address. On all other hosts, the same packets are visible (from the primary to the multicast address). If they are not visible, multicast is not working correctly. Run the tcpdump-uw command shown here on any host in the cluster, and the heartbeats from the primary are visible. In this case, the primary is at IP address 172.32.0.2. The -v for verbosity is optional.

[root@esxi-hp-02:~] tcpdump-uw -i vmk2 multicast -v 
tcpdump-uw: listening on vmk2, link-type EN10MB (Ethernet), capture size 96 bytes 
11:04:21.800575 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 34917, offset 0, flags [none], proto UDP (17), length 228) 
    172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200 
11:04:22.252369 IP truncated-ip - 234 bytes missing! (tos 0x0, ttl 5, id 15011, offset 0, flags [none], proto UDP (17), length 316) 
    172.32.0.2.38170 > 224.2.3.4.23451: UDP, length 288 
11:04:22.262099 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 3359, offset 0, flags [none], proto UDP (17), length 228) 
    172.32.0.3.41220 > 224.2.3.4.23451: UDP, length 200 
11:04:22.324496 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 20914, offset 0, flags [none], proto UDP (17), length 228) 
    172.32.0.5.60460 > 224.1.2.3.12345: UDP, length 200 
11:04:22.800782 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 35010, offset 0, flags [none], proto UDP (17), length 228) 
    172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200 
11:04:23.252390 IP truncated-ip - 234 bytes missing! (tos 0x0, ttl 5, id 15083, offset 0, flags [none], proto UDP (17), length 316) 
    172.32.0.2.38170 > 224.2.3.4.23451: UDP, length 288 
11:04:23.262141 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 3442, offset 0, flags [none], proto UDP (17), length 228) 
    172.32.0.3.41220 > 224.2.3.4.23451: UDP, length 200

Although this output can look confusing, it indicates that the four hosts in the cluster are getting a heartbeat from the primary. Run this tcpdump-uw command on every host to verify that each one receives the heartbeat. If the primary is sending the heartbeats and every other host in the cluster is receiving them, multicast is working.

If some of the vSAN hosts are not able to pick up the one-second heartbeats from the primary, the network administrator needs to check the multicast configuration of their switches.

To avoid the truncated-ip - 146 bytes missing! message, add the -s0 option to the same command to stop the truncation of packets:

[root@esxi-hp-02:~] tcpdump-uw -i vmk2 multicast -v -s0
tcpdump-uw: listening on vmk2, link-type EN10MB (Ethernet), capture size 65535 bytes
11:18:29.823622 IP (tos 0x0, ttl 5, id 56621, offset 0, flags [none], proto UDP (17), length 228)
    172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200
11:18:30.251078 IP (tos 0x0, ttl 5, id 52095, offset 0, flags [none], proto UDP (17), length 228)
    172.32.0.3.41220 > 224.2.3.4.23451: UDP, length 200
11:18:30.267177 IP (tos 0x0, ttl 5, id 8228, offset 0, flags [none], proto UDP (17), length 316)
    172.32.0.2.38170 > 224.2.3.4.23451: UDP, length 288
11:18:30.336480 IP (tos 0x0, ttl 5, id 28606, offset 0, flags [none], proto UDP (17), length 228)
    172.32.0.5.60460 > 224.1.2.3.12345: UDP, length 200
11:18:30.823669 IP (tos 0x0, ttl 5, id 56679, offset 0, flags [none], proto UDP (17), length 228)
    172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200

The tcpdump-uw command can also be used to examine IGMP (Internet Group Management Protocol) membership. Hosts (and network devices) use IGMP to establish multicast group membership.

Each ESXi host in the vSAN cluster sends out regular IGMP membership reports (Join).

The tcpdump command shows IGMP member reports from a host:

[root@esxi-dell-m:~] tcpdump-uw -i vmk1 igmp
tcpdump-uw: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmk1, link-type EN10MB (Ethernet), capture size 262144 bytes
15:49:23.134458 IP 172.40.0.9 > igmp.mcast.net: igmp v3 report, 1 group record(s)
15:50:22.994461 IP 172.40.0.9 > igmp.mcast.net: igmp v3 report, 1 group record(s)

The output shows that IGMP v3 reports are taking place, indicating that the ESXi host is regularly updating its membership. If a network administrator has any doubts about whether the vSAN ESXi hosts are handling IGMP correctly, run this command on each ESXi host in the cluster and use the trace to verify.

If you have multicast communications, use IGMP v3.

In fact, the following command can be used to look at multicast and IGMP traffic at the same time:

[root@esxi-hp-02:~] tcpdump-uw -i vmk2 multicast or igmp -v -s0

A common issue is that the vSAN cluster is configured across multiple physical switches, and while multicast has been enabled on one switch, it has not been enabled across switches. In this case, the cluster forms with two ESXi hosts in one partition, and another ESXi host (connected to the other switch) is unable to join this cluster. Instead it forms its own vSAN cluster in another partition. The vsan.lldpnetmap command seen earlier can help you determine network configuration, and which hosts are attached to which switch.

While a vSAN cluster forms, there are indicators that show multicast might be an issue.

Assume that the checklist for subnet, VLAN, MTU has been followed, and each host in the cluster can vmkping every other host in the cluster.

If there is a multicast issue when the cluster is created, a common symptom is that each ESXi host forms its own vSAN cluster, with itself as the primary. If each host has a unique network partition ID, this symptom suggests that there is no multicast between any of the hosts.

However, if a subset of the ESXi hosts forms one cluster and another subset forms another, each in its own partition with its own primary, backup, and perhaps even agent hosts, then multicast is probably enabled within each switch but not across switches. vSAN shows hosts on the first physical switch forming their own cluster partition, and hosts on the second physical switch forming their own cluster partition, each with its own primary. If you can verify which switches the hosts in the cluster connect to, and the hosts in each partition are connected to the same switch, then this is probably the issue.
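
The two symptoms described above can be told apart mechanically once you know which partition each host reports. The sketch below assumes you have already collected a mapping from host name to partition ID by whatever means is convenient. The function name, the return labels, and the host names are all hypothetical, and the result is a hint about where to look, not a definitive diagnosis.

```python
from collections import defaultdict


def diagnose_partitions(partition_of: dict) -> str:
    """Classify a host-to-partition mapping into one of three symptoms."""
    members = defaultdict(list)
    for host, partition in partition_of.items():
        members[partition].append(host)

    if len(members) <= 1:
        return "single_partition"      # no multicast symptom
    if all(len(hosts) == 1 for hosts in members.values()):
        return "all_hosts_isolated"    # suggests no multicast between any hosts
    return "split_across_groups"       # suggests multicast works per switch only


# Hypothetical example: two hosts share a partition, a third is on its own.
print(diagnose_partitions({"host-a": 1, "host-b": 1, "host-c": 2}))
```

In the split case, comparing the partition membership with the vsan.lldpnetmap output shows whether each partition lines up with one physical switch.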

Checking vSAN Network Performance

Make sure that there is sufficient bandwidth between your ESXi hosts. The iperf tool can assist you in testing whether your vSAN network is performing optimally.

To check the performance of the vSAN network, you can use the iperf tool to measure maximum TCP bandwidth and latency. It is located in /usr/lib/vmware/vsan/bin/iperf.copy. Run it with --help to see the various options. Use this tool to check network bandwidth and latency between ESXi hosts participating in a vSAN cluster.

VMware KB 2001003 can assist with setup and testing.

This is most useful when a vSAN cluster is being commissioned. Running iperf tests on the vSAN network when the cluster is already in production can impact the performance of the virtual machines running on the cluster.

Checking vSAN Network Limits

The vsan.check_limits command verifies that none of the vSAN thresholds are being breached.

> ls
0 /
1 vcsa-04.rainpole.com/
> cd 1
/vcsa-04.rainpole.com> ls
0 Datacenter (datacenter)
/vcsa-04.rainpole.com> cd 0
/vcsa-04.rainpole.com/Datacenter> ls
0 storage/
1 computers [host]/
2 networks [network]/
3 datastores [datastore]/
4 vms [vm]/
/vcsa-04.rainpole.com/Datacenter> cd 1
/vcsa-04.rainpole.com/Datacenter/computers> ls
0 Cluster (cluster): cpu 155 GHz, memory 400 GB
1 esxi-dell-e.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
2 esxi-dell-f.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
3 esxi-dell-g.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
4 esxi-dell-h.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
/vcsa-04.rainpole.com/Datacenter/computers> vsan.check_limits 0
2017-03-14 16:09:32 +0000: Querying limit stats from all hosts ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-m.rainpole.com (may take a moment) ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-n.rainpole.com (may take a moment) ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-o.rainpole.com (may take a moment) ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-p.rainpole.com (may take a moment) ...
2017-03-14 16:09:39 +0000: Done fetching vSAN disk infos
+--------------------------+--------------------+-----------------------------------------------------------------+
| Host                     | RDT                | Disks                                                           |
+--------------------------+--------------------+-----------------------------------------------------------------+
| esxi-dell-m.rainpole.com | Assocs: 1309/45000 | Components: 485/9000                                            |
|                          | Sockets: 89/10000  | naa.500a075113019b33: 0% Components: 0/0                        |
|                          | Clients: 136       | naa.500a075113019b37: 40% Components: 81/47661                  |
|                          | Owners: 138        | t10.ATA_____Micron_P420m2DMTFDGAR1T4MAX_____ 0% Components: 0/0 |
|                          |                    | naa.500a075113019b41: 37% Components: 80/47661                  |
|                          |                    | naa.500a07511301a1eb: 38% Components: 81/47661                  |
|                          |                    | naa.500a075113019b39: 39% Components: 79/47661                  |
|                          |                    | naa.500a07511301a1ec: 41% Components: 79/47661                  |
<<truncated>>

From a network perspective, the RDT associations (Assocs) and sockets counts are the important values. An RDT association is used to track peer-to-peer network state within vSAN. The limit is 45,000 associations per host in vSAN 6.0 and later, and vSAN is sized so that it never runs out of RDT associations. vSAN also limits how many TCP sockets it is allowed to use, and is sized so that it never runs out of its allocation of TCP sockets. The limit is 10,000 sockets per host.
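
To see how close a host is to these limits, the used/limit pairs in the RDT column can be turned into utilization figures. This is a sketch that handles the rows for a single host, using an excerpt of the table above as sample input. The function name rdt_usage is hypothetical.

```python
import re

SAMPLE = """\
| esxi-dell-m.rainpole.com | Assocs: 1309/45000 | Components: 485/9000                     |
|                          | Sockets: 89/10000  | naa.500a075113019b33: 0% Components: 0/0 |
|                          | Clients: 136       | naa.500a075113019b37: 40% Components: 81/47661 |
"""

# Matches "Assocs: used/limit" and "Sockets: used/limit" in the RDT column.
USAGE = re.compile(r"\b(Assocs|Sockets): (\d+)/(\d+)")


def rdt_usage(text: str) -> dict:
    """Return the fraction of the limit in use for Assocs and Sockets."""
    usage = {}
    for name, used, limit in USAGE.findall(text):
        usage[name] = int(used) / int(limit)
    return usage


usage = rdt_usage(SAMPLE)
for name, fraction in usage.items():
    print(f"{name}: {fraction:.2%} of limit in use")
```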

A vSAN client represents an object's access in the vSAN cluster. The client typically represents a virtual machine running on a host. The client and the object might not be on the same host. There is no hard defined limit, but this metric is shown to help understand how clients balance across hosts.

There is only one vSAN owner for a given vSAN object, typically co-located with the vSAN client accessing this object. vSAN owners coordinate all access to the vSAN object and implement functionality, such as mirroring and striping. There is no hard defined limit, but this metric is once again shown to help understand how owners balance across hosts.