This topic describes common NSX load balancer issues and how to resolve them.

The following issues are common when using NSX load balancing:
  • Load balancing on a TCP port (for example, port 443) does not work.
  • A member of the load balancing pool is not utilized.
    • Verify that the server is in the pool and enabled, and check its monitor health status.
  • Edge traffic is not load balanced.
    • Verify the pool and persistence configuration. If you have persistence configured and you are using a small number of clients, you may not see even distribution of connections to backend pool members.
  • Layer 7 load balancing engine is stopped.
  • Health monitor engine is stopped.
    • Enable load balancer service. Refer to the NSX Administration Guide.
  • Pool member monitor status is WARNING/CRITICAL.
    • Verify the application server is reachable from the load balancer.
    • Verify the application server firewall or DFW is allowing traffic.
    • Ensure the application server is able to respond to the specified health probe.
  • Pool member has the INACTIVE status.
    • Verify the pool member is enabled in the pool configuration.
  • Layer 7 sticky table is not synchronized with the standby Edge.
    • Ensure that HA is configured.
  • Client connects, but cannot complete an application transaction.
    • Verify that the proper persistence is configured in the application profile.
    • If the application works with only one server in the pool (and not two), it is most likely a persistence problem.
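The uneven distribution that persistence can cause with a small number of clients can be illustrated with a toy model. The sketch below is not how the NSX Edge implements scheduling or persistence; it assumes plain round-robin and a hypothetical source-IP sticky table, and the client addresses are made up for illustration.

```python
from collections import Counter
from itertools import cycle


def distribute(connections, members, persistence):
    """Toy model: round-robin scheduling, optionally pinned by client IP."""
    next_member = cycle(members)
    sticky = {}  # hypothetical sticky table: client IP -> pool member
    counts = Counter({m: 0 for m in members})
    for client_ip in connections:
        if persistence and client_ip in sticky:
            # A known client always returns to the member it was pinned to.
            member = sticky[client_ip]
        else:
            member = next(next_member)
            if persistence:
                sticky[client_ip] = member
        counts[member] += 1
    return dict(counts)


members = ["web-01a", "web-02a"]
# Three clients; one of them opens most of the connections.
connections = ["10.0.0.1"] * 8 + ["10.0.0.2"] + ["10.0.0.3"]

balanced = distribute(connections, members, persistence=False)
pinned = distribute(connections, members, persistence=True)
print("no persistence:", balanced)
print("persistence:   ", pinned)
```

In this model the ten connections split evenly without persistence, while with persistence nearly all of them land on the member that the busiest client was pinned to. A skewed connection count with few clients is therefore not necessarily a fault.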

Basic Troubleshooting

  1. Check the load balancer configuration status in the vSphere Web Client:
    1. Click Networking & Security > NSX Edges.
    2. Double-click an NSX Edge.
    3. Click Manage, and then click the Load Balancer tab.
    4. Check the load balancer status and logging level configured.
  2. Before troubleshooting the load balancer service, run the following command on the NSX Manager to ensure that the service is up and running:
    nsxmgr> show edge edge-4 service loadbalancer
    haIndex:              0
    -----------------------------------------------------------------------
    Loadbalancer Services Status:
    
    L7 Loadbalancer     : running
    -----------------------------------------------------------------------
    L7 Loadbalancer Statistics:
    STATUS     PID        MAX_MEM_MB MAX_SOCK   MAX_CONN   MAX_PIPE   CUR_CONN   CONN_RATE  CONN_RATE_LIMIT MAX_CONN_RATE
    running    1580       0          2081       1024       0          0          0          0               0
    -----------------------------------------------------------------------
    L4 Loadbalancer Statistics:
    MAX_CONN   ACT_CONN   INACT_CONN TOTAL_CONN
    0          0          0          0
    
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    
    Note: You can run show edge all to look up the names of the NSX Edges.
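If you capture the command output as text, the service state and the L7 statistics row can be checked with a short script. The following is a minimal sketch that assumes the output layout shown above; the function names are hypothetical, and the sample text is copied from the example output rather than read from a live NSX Manager.

```python
import re

# Excerpt of the example output above, pasted as text.
SAMPLE_OUTPUT = """\
haIndex:              0
-----------------------------------------------------------------------
Loadbalancer Services Status:

L7 Loadbalancer     : running
-----------------------------------------------------------------------
L7 Loadbalancer Statistics:
STATUS     PID        MAX_MEM_MB MAX_SOCK   MAX_CONN   MAX_PIPE   CUR_CONN   CONN_RATE  CONN_RATE_LIMIT MAX_CONN_RATE
running    1580       0          2081       1024       0          0          0          0               0
"""


def l7_status(cli_output):
    """Return the value of the 'L7 Loadbalancer : <state>' line, or None."""
    match = re.search(r"^\s*L7 Loadbalancer\s*:\s*(\S+)", cli_output, re.MULTILINE)
    return match.group(1) if match else None


def l7_statistics(cli_output):
    """Pair the STATUS header row with the values on the next line."""
    lines = cli_output.splitlines()
    for i, line in enumerate(lines):
        if line.split()[:1] == ["STATUS"] and i + 1 < len(lines):
            return dict(zip(line.split(), lines[i + 1].split()))
    return {}


status = l7_status(SAMPLE_OUTPUT)
stats = l7_statistics(SAMPLE_OUTPUT)
print("L7 engine:", status)
print("MAX_CONN:", stats.get("MAX_CONN"), "CUR_CONN:", stats.get("CUR_CONN"))
```

A state other than running, or no match at all, suggests that the load balancer service should be checked before any further troubleshooting.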

Troubleshooting Configuration Issues

A configuration issue is one in which the NSX user interface or a REST API call rejects a load balancer configuration operation.

Troubleshooting Data Plane Issues

A data plane issue is one in which NSX Manager accepts the load balancer configuration, but there are connectivity or performance issues on the path between the client, the Edge load balancer, and the server. Data plane issues also include load balancer runtime CLI issues and load balancer system event issues.

  1. Change the Edge logging level in NSX Manager from INFO to TRACE or DEBUG using the following REST API call:
    URL: https://NSX_Manager_IP/api/1.0/services/debug/loglevel/com.vmware.vshield.edge?level=TRACE
    Method: POST
  2. Check the pool member status in the vSphere Web Client.
    1. Click Networking & Security > NSX Edges.
    2. Double-click an NSX Edge.
    3. Click Manage, and then click the Load Balancer tab.
    4. Click Pools to see a summary of the configured load balancer pools.
    5. Select your load balancer pool, click Show Pool Statistics, and verify that the pool state is UP.
  3. You can get more detailed load balancer pool configuration statistics from the NSX Manager using the following REST API call:
    URL: https://NSX_Manager_IP/api/4.0/edges/{edgeId}/loadbalancer/statistics 
    Method: GET
    
    <?xml version="1.0" encoding="UTF-8"?>
    <loadBalancerStatusAndStats>
        <timeStamp>1463507779</timeStamp>
        <pool>
            <poolId>pool-1</poolId>
            <name>Web-Tier-Pool-01</name>
            <member>
                <memberId>member-1</memberId>
                <name>web-01a</name>
                <ipAddress>172.16.10.11</ipAddress>
                <status>UP</status>
                <lastStateChangeTime>2016-05-16 07:02:00</lastStateChangeTime>
                <bytesIn>0</bytesIn>
                <bytesOut>0</bytesOut>
                <curSessions>0</curSessions>
                <httpReqTotal>0</httpReqTotal>
                <httpReqRate>0</httpReqRate>
                <httpReqRateMax>0</httpReqRateMax>
                <maxSessions>0</maxSessions>
                <rate>0</rate>
                <rateLimit>0</rateLimit>
                <rateMax>0</rateMax>
                <totalSessions>0</totalSessions>
            </member>
            <member>
                <memberId>member-2</memberId>
                <name>web-02a</name>
                <ipAddress>172.16.10.12</ipAddress>
                <status>UP</status>
                <lastStateChangeTime>2016-05-16 07:02:01</lastStateChangeTime>
                <bytesIn>0</bytesIn>
                <bytesOut>0</bytesOut>
                <curSessions>0</curSessions>
                <httpReqTotal>0</httpReqTotal>
                <httpReqRate>0</httpReqRate>
                <httpReqRateMax>0</httpReqRateMax>
                <maxSessions>0</maxSessions>
                <rate>0</rate>
                <rateLimit>0</rateLimit>
                <rateMax>0</rateMax>
                <totalSessions>0</totalSessions>
            </member>
            <status>UP</status>
            <bytesIn>0</bytesIn>
            <bytesOut>0</bytesOut>
            <curSessions>0</curSessions>
            <httpReqTotal>0</httpReqTotal>
            <httpReqRate>0</httpReqRate>
            <httpReqRateMax>0</httpReqRateMax>
            <maxSessions>0</maxSessions>
            <rate>0</rate>
            <rateLimit>0</rateLimit>
            <rateMax>0</rateMax>
            <totalSessions>0</totalSessions>
        </pool>
        <virtualServer>
            <virtualServerId>virtualServer-1</virtualServerId>
            <name>Web-Tier-VIP-01</name>
            <ipAddress>172.16.10.10</ipAddress>
            <status>OPEN</status>
            <bytesIn>0</bytesIn>
            <bytesOut>0</bytesOut>
            <curSessions>0</curSessions>
            <httpReqTotal>0</httpReqTotal>
            <httpReqRate>0</httpReqRate>
            <httpReqRateMax>0</httpReqRateMax>
            <maxSessions>0</maxSessions>
            <rate>0</rate>
            <rateLimit>0</rateLimit>
            <rateMax>0</rateMax>
            <totalSessions>0</totalSessions>
        </virtualServer>
    </loadBalancerStatusAndStats>
    
  4. To check load balancer statistics from the command line, run the following commands on the NSX Edge.

    For a particular virtual server: First run show service loadbalancer virtual to get the virtual server name. Then run show statistics loadbalancer virtual <virtual-server-name>.

    For a particular TCP pool: First run show service loadbalancer pool to get the pool name. Then run show statistics loadbalancer pool <pool-name>.

  5. Review the load balancer statistics for signs of failure.
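Reviewing the statistics response for failed members can be scripted. The following is a minimal sketch that assumes the XML layout shown in the example response above. The function names are hypothetical, the sample is trimmed, and the DOWN status of web-02a is an assumed value inserted for illustration. Fetching the response from NSX Manager (authentication, certificates) is left out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example response; the DOWN status is an
# assumption added so that the check has something to report.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<loadBalancerStatusAndStats>
    <timeStamp>1463507779</timeStamp>
    <pool>
        <poolId>pool-1</poolId>
        <name>Web-Tier-Pool-01</name>
        <member>
            <memberId>member-1</memberId>
            <name>web-01a</name>
            <ipAddress>172.16.10.11</ipAddress>
            <status>UP</status>
        </member>
        <member>
            <memberId>member-2</memberId>
            <name>web-02a</name>
            <ipAddress>172.16.10.12</ipAddress>
            <status>DOWN</status>
        </member>
        <status>UP</status>
    </pool>
</loadBalancerStatusAndStats>
"""


def statistics_url(manager, edge_id):
    """Build the statistics URL from the path given in step 3."""
    return f"https://{manager}/api/4.0/edges/{edge_id}/loadbalancer/statistics"


def unhealthy_members(xml_text):
    """Return (pool name, member name, status) for every member not UP."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    problems = []
    for pool in root.findall("pool"):
        pool_name = pool.findtext("name")
        for member in pool.findall("member"):
            status = member.findtext("status")
            if status != "UP":
                problems.append((pool_name, member.findtext("name"), status))
    return problems


print(statistics_url("NSX_Manager_IP", "edge-4"))
for pool_name, member_name, status in unhealthy_members(SAMPLE_RESPONSE):
    print(f"{pool_name}: {member_name} is {status}")
```

The script only flags members whose status is not UP. Counters such as curSessions and totalSessions are in the same response and can be read with findtext in the same way.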