There is a collection of CLI commands that allow an operator to examine the running state of various parts of the NSX routing subsystem.

Due to the distributed nature of the the NSX routing subsystem, there are a number of CLIs available, accessible on various components of NSX. Starting in NSX version 6.2, NSX also has a centralized CLI that helps reduce the “travel time” required to access and log in to various distributed components. It provides access to most of the information from a single location: the NSX Manager shell.

Checking the Prerequisites

There are two major prerequisites that must be satisfied for each ESXi host:

  • Any logical switches connected to the DLR are healthy.

  • The ESXi host has been successfully prepared for VXLAN.

Logical Switch Health Check

NSX Routing works in conjunction with NSX logical switching. To verify that the logical switches connected to a DLR are healthy:

  • Find the segment ID (VXLAN VNI) for each logical switch connected to the DLR in question (for example, 5004..5007).

  • On the ESXi hosts where VMs served by this DLR are running, check the state of the VXLAN control plane for the logical switches connected to this DLR.

# esxcli network vswitch dvs vmware vxlan network list --vds-name=Compute_VDS
VXLAN ID  Multicast IP               Control Plane                        Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
--------  -------------------------  -----------------------------------  ---------------------  ----------  ---------------  ---------------
    5004  N/A (headend replication)  Enabled (multicast proxy,ARP proxy) (up)            2                2                0
    5005  N/A (headend replication)  Enabled (multicast proxy,ARP proxy) (up)            1                0                0
    5006  N/A (headend replication)  Enabled (multicast proxy,ARP proxy) (up)            1                1                0
    5007  N/A (headend replication)  Enabled (multicast proxy,ARP proxy) (up)            1                0                0

Check the following for each relevant VXLAN:

  • For logical switches in hybrid or unicast mode:

    • Control Plane is “Enabled.”

    • “multicast proxy” and “ARP proxy” are listed; “ARP proxy” will be listed even if you disabled IP Discovery.

    • A valid Controller IP address is listed under “Controller,” and “Connection” is “up.”

  • “Port Count” looks right – there will be at least 1, even if there are no VMs on that host connected to the logical switch in question. This one port is the vdrPort, a special dvPort connected to the DLR kernel module on the ESXi host.

  • Run the following command to make sure that the vdrPort is connected to each of the relevant VXLANs.

~ # esxcli network vswitch dvs vmware vxlan network port list --vds-name=Compute_VDS --vxlan-id=5004
Switch Port ID  VDS Port ID  VMKNIC ID
--------------  -----------  ---------
      50331656  53                   0
      50331650  vdrPort              0

~ # esxcli network vswitch dvs vmware vxlan network port list --vds-name=Compute_VDS --vxlan-id=5005
Switch Port ID  VDS Port ID  VMKNIC ID
--------------  -----------  ---------
      50331650  vdrPort              0

  • In the example above, VXLAN 5004 has one VM and one DLR connection, while VXLAN 5005 only has a DLR connection.

  • Check whether the appropriate VMs have been properly wired to their corresponding VXLANs, for example web-sv-01a on VXLAN 5004.

~ # esxcfg-vswitch -l
DVS Name         Num Ports   Used Ports  Configured Ports  MTU     Uplinks
Compute_VDS      1536        10          512               1600    vmnic0

  DVPort ID           In Use      Client
  53                  1           web-sv-01a.eth0

VXLAN Preparation Check

As part of VXLAN configuration of an ESXi host, the DLR kernel module is also installed, configured, and connected to a dvPort on a DVS prepared for VXLAN.

  1. Run show cluster all to get the cluster ID.

  2. Run show cluster cluster-id to get the host ID.

  3. Run show logical-router host hostID connection to get the status information.

nsxmgr-01a# show logical-router host <hostID> connection

Connection Information:

DvsName           VdrPort           NumLifs  VdrVmac
-------           -------           -------  -------
Compute_VDS       vdrPort           4        02:50:56:56:44:52
    Teaming Policy: Default Teaming
    Uplink   : dvUplink1(50331650): 00:50:56:eb:41:d7(Team member)

   Stats : Pkt Dropped      Pkt Replaced     Pkt Skipped
   Input : 0                0                1968734458
  Output : 303              7799             31891126

  • A DVS enabled with VXLAN will have one vdrPort created, shared by all DLR instances on that ESXi host.

  • “NumLifs” refers to the number that is the sum of LIFs from all DLR instances that exist on this host.

  • “VdrVmac” is the vMAC that the DLR uses on all LIFs across all instances. This MAC is the same on all hosts. It is never seen in any frames that travel the physical network outside of ESXi hosts.

  • For each dvUplink of DVS enabled with VXLAN, there is a matching VTEP; except in cases where LACP / Etherchannel teaming mode is used, when only one VTEP is created irrespective of the number of dvUplinks.

    • Traffic routed by the DLR (SRC MAC = vMAC) when leaving the host will get the SRC MAC changed to pMAC of a corresponding dvUplink.

    • Note that the original VM’s source port or source MAC is used to determine the dvUplink (it is preserved for each packet in its DVS's metadata).

    • When there are multiple VTEPs on the host and one of dvUplinks fails, the VTEP associated with the failed dvUplink will be moved to one of the remaining dvUplinks, along with all VMs that were pinned to that VTEP. This is done to avoid flooding control plane changes that would be associated with moving VMs to a different VTEP.

  • The number in “()” next to each “dvUplinkX” is the dvPort number. It is useful for packet capture on the individual uplink.

  • The MAC address shown for each “dvUplinkX” is a “pMAC” associated with that dvUplink. This MAC address is used for traffic sourced from the DLR, such as ARP queries generated by the DLR and any packets that have been routed by the DLR when these packets leave the ESXi host. This MAC address can be seen on the physical network (directly, if DLR LIF is VLAN type, or inside VXLAN packets for VXLAN LIFs).

  • Pkt Dropped / Replaced / Skipped refer to counters related to internal implementation details of the DLR, and are not typically used for troubleshooting or monitoring.