The NSX dashboard simplifies troubleshooting by providing visibility into the overall health of NSX components in one central view.

You can access the dashboard from vSphere Web Client > Networking & Security > Dashboard.

The dashboard checks the following states:

  • NSX infrastructure—NSX Manager status

    • Component status is monitored for the following services:

      • Database service (vPostgres)

      • Message bus service (RabbitMQ)

      • Replicator service: also monitors for replication errors (if Cross-vCenter NSX is enabled)

    • NSX Manager disk usage:

      • Yellow (disk usage >80%)

      • Red (disk usage >90%)
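
    The disk usage thresholds above can be read as a simple classification rule. The following is a minimal sketch, not NSX code: the function name `disk_usage_status` is hypothetical, and the "green" label for usage at or below 80% is an assumption, since the list above names only the yellow and red states.

```python
def disk_usage_status(percent_used: float) -> str:
    """Map a disk usage percentage to a dashboard-style color.

    Thresholds follow the list above: yellow above 80%, red above 90%.
    "green" for everything else is an assumption made for illustration.
    """
    if not 0 <= percent_used <= 100:
        raise ValueError("percent_used must be between 0 and 100")
    if percent_used > 90:
        return "red"
    if percent_used > 80:
        return "yellow"
    return "green"  # assumed label for the normal state


if __name__ == "__main__":
    for value in (42.0, 80.0, 85.5, 90.0, 97.0):
        print(f"{value:5.1f}% -> {disk_usage_status(value)}")
```

    Both comparisons are strict, so exactly 80% is still treated as normal and exactly 90% is still yellow, which matches the ">80%" and ">90%" wording.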

  • NSX Manager backup status:

    • Backup schedule

    • Last backup status (failed/successful/not scheduled, with date and time)

    • Last backup attempt (date and time with details)

    • Last successful backup (date and time with details)

  • NSX infrastructure—NSX Controller status

    • Controller node status (up/down/running/deploying/removing/failed/unknown)

    • Controller peer connectivity status (if a controller is down and shown as red, its peer controllers are displayed as yellow)

    • Controller VM status (powered off/deleted)

    • Controller disk latency alerts

  • NSX infrastructure—Host status

    • Deployment related:

      • Number of clusters with installation failed status

      • Number of clusters that need upgrade

      • Number of clusters where installation is in progress

      • Number of unprepared clusters

    • Firewall:

      • Number of clusters with firewall disabled

      • Number of clusters where firewall status is yellow or red, where:

        • Yellow means the distributed firewall is disabled on any of the clusters

        • Red means the distributed firewall could not be installed on any of the hosts or clusters

    • VXLAN:

      • Number of clusters with VXLAN not configured

      • Number of clusters where VXLAN status is green, yellow, or red, where:

        • Green means the feature was successfully configured

        • Yellow means busy, that is, VXLAN configuration is in progress

        • Red (error) means that VTEP creation failed, the VTEP could not find an IP address, the VTEP was assigned a link-local IP address, and so on

  • NSX infrastructure—Service deployment status

    • Deployment failures—installation status for the failed deployments

    • Service status—for all the failed services

  • NSX infrastructure—Edge notifications

    The Edge notifications dashboard highlights active alarms for certain services. It monitors the critical events listed below and tracks them until the issue is resolved. Alarms are automatically resolved when a recovery event is reported, or when the edge is force synced, redeployed, or upgraded.

    • Load balancer (edge load balancer server status)

      • Edge load balancer back end server is down

      • Edge load balancer back end server warning status

    • VPN (IPsec tunnel / IPsec channel status)

      • Edge IPsec channel is down

      • Edge IPsec tunnel is down

    • Appliance (status reports for edge VM, edge gateway, edge file system, NSX Manager, and edge services gateway)

      • Edge services gateway missing health check pulse

      • Edge VM got powered off

      • Edge VM missing health check pulse

      • NSX Edge reports bad state

      • NSX Manager reports that edge services gateway is in bad state

      • Edge VM is not present in VC inventory

      • HA split brain detected

    Note:

    Load balancer and VPN alarms are not automatically cleared on a configuration update. After the issue is resolved, you must clear the alarms manually through the API, using the alarm ID. The following example shows API calls that you can use to look up and clear the alarms. For details, refer to the NSX API Guide.

    GET https://<<NSX-IP>>/api/2.0/services/alarms/{source-Id} 
    POST https://<<NSX-IP>>/api/2.0/services/alarms?action=resolve
    
    GET https://<<NSX-IP>>/api/2.0/services/systemalarms/<alarmId>
    POST https://<<NSX-IP>>/api/2.0/services/systemalarms/<alarmId>?action=resolve
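
    As an illustration, the two systemalarms calls above could be scripted roughly as follows. This is a sketch under stated assumptions rather than a reference client: the helper names, the host `nsx.example.com`, the alarm ID, and the credentials are all hypothetical, and HTTP basic authentication is assumed. Check the NSX API Guide for the authentication method and any required request body before using it.

```python
import urllib.parse
import urllib.request
from base64 import b64encode

# Base path taken from the example URLs above; the host is supplied by the caller.
API_ROOT = "https://{host}/api/2.0/services"


def _basic_auth(user: str, password: str) -> str:
    """Build an HTTP basic auth header value (assumed authentication scheme)."""
    token = b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_get_alarm_request(host, alarm_id, user, password):
    """Prepare GET .../systemalarms/<alarmId> to inspect a single alarm."""
    alarm = urllib.parse.quote(str(alarm_id), safe="")
    url = f"{API_ROOT.format(host=host)}/systemalarms/{alarm}"
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", _basic_auth(user, password))
    return req


def build_resolve_alarm_request(host, alarm_id, user, password):
    """Prepare POST .../systemalarms/<alarmId>?action=resolve to clear an alarm."""
    alarm = urllib.parse.quote(str(alarm_id), safe="")
    url = f"{API_ROOT.format(host=host)}/systemalarms/{alarm}?action=resolve"
    # An empty body is an assumption; the API may expect a payload.
    req = urllib.request.Request(url, data=b"", method="POST")
    req.add_header("Authorization", _basic_auth(user, password))
    return req


def send(req, timeout=10.0):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical values; nothing is sent here, the request is only printed.
    req = build_resolve_alarm_request("nsx.example.com", "1234", "admin", "secret")
    print(req.get_method(), req.full_url)
```

    Building the request separately from sending it makes it possible to review the exact URL and method before anything reaches NSX Manager. The sketch does not handle self-signed certificates or error responses.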
    

  • NSX services—Firewall Publish status

    • Number of hosts with Firewall Publish status as failed. The status is red when any host does not successfully apply the published distributed firewall configuration

  • NSX services—Logical Networking status

    • Number of logical switches with status Error or Warning

    • Flags when the backing DVPG port group is deleted from vCenter Server