This topic lists common load balancer issues and the steps to resolve them.
- Load balancing on a TCP port (for example, port 443) does not work.
  - Verify the topology. For details, refer to the NSX Administration Guide.
  - Verify that the virtual server IP address is reachable with ping, or check the upstream router to ensure that its ARP table is populated.
  - Verify the configuration in the UI.
  - Verify the configuration in the CLI.
  - Capture packets.
- A member of the load balancing pool is not utilized.
  - Verify that the server is in the pool and enabled, and check its monitor health status.
- Edge traffic is not load balanced.
  - Verify the pool and persistence configuration. If persistence is configured and only a small number of clients are connecting, connections may not be distributed evenly across the backend pool members.
- Layer 7 load balancing engine is stopped.
- Health monitor engine is stopped.
  - Enable the load balancer service. Refer to the NSX Administration Guide.
- Pool member monitor status is WARNING/CRITICAL.
  - Verify that the application server is reachable from the load balancer.
  - Verify that the application server firewall or DFW is allowing traffic.
  - Ensure that the application server is able to respond to the specified health probe.
- Pool member has the INACTIVE status.
  - Verify that the pool member is enabled in the pool configuration.
- Layer 7 sticky table is not synchronized with the standby Edge.
  - Ensure that HA is configured.
- Client connects, but cannot complete an application transaction.
  - Verify that the proper persistence is configured in the application profile.
  - If the application works with only one server in the pool (and not two), it is most likely a persistence problem.
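The uneven distribution that persistence causes with few clients can be illustrated with a small model. This is only a sketch: `pick_member` is a hypothetical stand-in that hashes the client IP to choose a member, which is not how the Edge implements persistence, and the client addresses and the members `web-03a` and `web-04a` are invented for the example.

```python
import hashlib
from collections import Counter

def pick_member(client_ip, members):
    """Hypothetical model of source-IP persistence: one client always maps to one member."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return members[int.from_bytes(digest[:4], "big") % len(members)]

# Four pool members, but only two distinct clients (illustrative values).
members = ["web-01a", "web-02a", "web-03a", "web-04a"]
clients = ["10.0.0.5", "10.0.0.6"]

# Each client opens 50 connections; persistence pins all of them to a single member.
connections = Counter(pick_member(ip, members) for ip in clients for _ in range(50))
idle = [m for m in members if m not in connections]

print("connections per member:", dict(connections))
print("members that received no traffic:", idle)
```

With two clients, at most two members can ever receive traffic, so at least two of the four stay idle regardless of the balancing algorithm. That pattern alone does not indicate a fault.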
Basic Troubleshooting
- Check the load balancer configuration status in the vSphere Web Client:
  - Click Networking & Security > NSX Edges.
  - Double-click an NSX Edge.
  - Click Manage, and then click the Load Balancer tab.
  - Check the load balancer status and the configured logging level.
- Before troubleshooting the load balancer service, run the following command on the NSX Manager to ensure that the service is up and running:
nsxmgr> show edge edge-4 service loadbalancer
haIndex: 0
-----------------------------------------------------------------------
Loadbalancer Services Status:
L7 Loadbalancer : running
-----------------------------------------------------------------------
L7 Loadbalancer Statistics:
STATUS     PID        MAX_MEM_MB MAX_SOCK   MAX_CONN   MAX_PIPE   CUR_CONN   CONN_RATE  CONN_RATE_LIMIT MAX_CONN_RATE
running    1580       0          2081       1024       0          0          0          0               0
-----------------------------------------------------------------------
L4 Loadbalancer Statistics:
MAX_CONN   ACT_CONN   INACT_CONN TOTAL_CONN
0          0          0          0
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
Note: You can run show edge all to look up the names of the NSX Edges.
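If the command output is captured as text, the service check can be scripted. The following is a minimal sketch that assumes the output arrives with one field per line, as in the sample above; `l7_engine_status` is a hypothetical helper name, and how the output is collected (SSH session, saved log) is left open.

```python
import re

# Excerpt of the command output shown above.
SAMPLE_OUTPUT = """\
haIndex: 0
-----------------------------------------------------------------------
Loadbalancer Services Status:
L7 Loadbalancer : running
-----------------------------------------------------------------------
L7 Loadbalancer Statistics:
"""

def l7_engine_status(cli_output):
    """Return the state reported on the 'L7 Loadbalancer :' line, or None if it is absent."""
    match = re.search(r"^\s*L7 Loadbalancer\s*:\s*(\S+)", cli_output, re.MULTILINE)
    return match.group(1) if match else None

status = l7_engine_status(SAMPLE_OUTPUT)
if status != "running":
    print("L7 load balancer engine is not running:", status)
else:
    print("L7 load balancer engine is running")
```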
Troubleshooting Configuration Issues
When the load balancer configuration operation is rejected by the NSX user interface or REST API call, this is classified as a configuration issue.
Troubleshooting Data Plane Issues
The load balancer configuration is accepted by NSX Manager, but there are connectivity or performance issues between the client, the Edge load balancer, and the server. Data plane issues also include load balancer runtime CLI issues and load balancer system event issues.
- Change the Edge logging level in NSX Manager from INFO to TRACE or DEBUG using this REST API call.
URL: https://NSX_Manager_IP/api/1.0/services/debug/loglevel/com.vmware.vshield.edge?level=TRACE
Method: POST
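The call above can be prepared from a script. This sketch only builds the request and does not send it. It assumes that NSX Manager accepts HTTP basic authentication; the host name, credentials, and the helper name `build_loglevel_request` are placeholders invented for the example.

```python
import base64
import urllib.request
from urllib.parse import urlencode

ALLOWED_LEVELS = ("INFO", "DEBUG", "TRACE")

def build_loglevel_request(manager, user, password, level="TRACE"):
    """Build (but do not send) the POST request that sets the Edge logging level."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported level: {level}")
    url = (
        f"https://{manager}/api/1.0/services/debug/loglevel/"
        f"com.vmware.vshield.edge?{urlencode({'level': level})}"
    )
    request = urllib.request.Request(url, method="POST")
    # Assumption: basic authentication with an NSX Manager account.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    return request

# Placeholder host and credentials, for illustration only.
req = build_loglevel_request("nsxmgr.example.com", "admin", "secret", level="DEBUG")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Certificate verification for the NSX Manager endpoint is not handled in this sketch.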
- Check the pool member status in the vSphere Web Client.
  - Click Networking & Security > NSX Edges.
  - Double-click an NSX Edge.
  - Click Manage, and then click the Load Balancer tab.
  - Click Pools to see a summary of the configured load balancer pools.
  - Select your load balancer pool, click Show Pool Statistics, and verify that the pool state is UP.
- You can get more detailed load balancer pool configuration statistics from the NSX Manager using the following REST API call:
URL: https://NSX_Manager_IP/api/4.0/edges/{edgeId}/loadbalancer/statistics
Method: GET
<?xml version="1.0" encoding="UTF-8"?>
<loadBalancerStatusAndStats>
  <timeStamp>1463507779</timeStamp>
  <pool>
    <poolId>pool-1</poolId>
    <name>Web-Tier-Pool-01</name>
    <member>
      <memberId>member-1</memberId>
      <name>web-01a</name>
      <ipAddress>172.16.10.11</ipAddress>
      <status>UP</status>
      <lastStateChangeTime>2016-05-16 07:02:00</lastStateChangeTime>
      <bytesIn>0</bytesIn>
      <bytesOut>0</bytesOut>
      <curSessions>0</curSessions>
      <httpReqTotal>0</httpReqTotal>
      <httpReqRate>0</httpReqRate>
      <httpReqRateMax>0</httpReqRateMax>
      <maxSessions>0</maxSessions>
      <rate>0</rate>
      <rateLimit>0</rateLimit>
      <rateMax>0</rateMax>
      <totalSessions>0</totalSessions>
    </member>
    <member>
      <memberId>member-2</memberId>
      <name>web-02a</name>
      <ipAddress>172.16.10.12</ipAddress>
      <status>UP</status>
      <lastStateChangeTime>2016-05-16 07:02:01</lastStateChangeTime>
      <bytesIn>0</bytesIn>
      <bytesOut>0</bytesOut>
      <curSessions>0</curSessions>
      <httpReqTotal>0</httpReqTotal>
      <httpReqRate>0</httpReqRate>
      <httpReqRateMax>0</httpReqRateMax>
      <maxSessions>0</maxSessions>
      <rate>0</rate>
      <rateLimit>0</rateLimit>
      <rateMax>0</rateMax>
      <totalSessions>0</totalSessions>
    </member>
    <status>UP</status>
    <bytesIn>0</bytesIn>
    <bytesOut>0</bytesOut>
    <curSessions>0</curSessions>
    <httpReqTotal>0</httpReqTotal>
    <httpReqRate>0</httpReqRate>
    <httpReqRateMax>0</httpReqRateMax>
    <maxSessions>0</maxSessions>
    <rate>0</rate>
    <rateLimit>0</rateLimit>
    <rateMax>0</rateMax>
    <totalSessions>0</totalSessions>
  </pool>
  <virtualServer>
    <virtualServerId>virtualServer-1</virtualServerId>
    <name>Web-Tier-VIP-01</name>
    <ipAddress>172.16.10.10</ipAddress>
    <status>OPEN</status>
    <bytesIn>0</bytesIn>
    <bytesOut>0</bytesOut>
    <curSessions>0</curSessions>
    <httpReqTotal>0</httpReqTotal>
    <httpReqRate>0</httpReqRate>
    <httpReqRateMax>0</httpReqRateMax>
    <maxSessions>0</maxSessions>
    <rate>0</rate>
    <rateLimit>0</rateLimit>
    <rateMax>0</rateMax>
    <totalSessions>0</totalSessions>
  </virtualServer>
</loadBalancerStatusAndStats>
- To check load balancer statistics from the command line, run the following commands on the NSX Edge.
  - For a particular virtual server: First run show service loadbalancer virtual to get the virtual server name. Then run show statistics loadbalancer virtual <virtual-server-name>.
  - For a particular TCP pool: First run show service loadbalancer pool to get the pool name. Then run show statistics loadbalancer pool <pool-name>.
- Review the load balancer statistics for signs of failure.
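One way to review the REST statistics response for signs of failure is to list every pool member whose status is not UP. The sketch below assumes the element names shown in the sample response earlier in this topic. The trimmed sample marks web-02a as DOWN purely for illustration (DOWN is an assumed status value), and `members_not_up` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed response; web-02a is set to DOWN only to demonstrate the check.
SAMPLE_RESPONSE = """\
<loadBalancerStatusAndStats>
  <pool>
    <poolId>pool-1</poolId>
    <name>Web-Tier-Pool-01</name>
    <member>
      <memberId>member-1</memberId>
      <name>web-01a</name>
      <status>UP</status>
    </member>
    <member>
      <memberId>member-2</memberId>
      <name>web-02a</name>
      <status>DOWN</status>
    </member>
    <status>UP</status>
  </pool>
</loadBalancerStatusAndStats>
"""

def members_not_up(xml_text):
    """Return (pool name, member name, status) for every member whose status is not UP."""
    root = ET.fromstring(xml_text)
    problems = []
    for pool in root.findall("pool"):
        pool_name = pool.findtext("name", default="?")
        for member in pool.findall("member"):
            status = member.findtext("status", default="UNKNOWN")
            if status != "UP":
                problems.append((pool_name, member.findtext("name", default="?"), status))
    return problems

for pool_name, member_name, status in members_not_up(SAMPLE_RESPONSE):
    print(f"{pool_name}: member {member_name} is {status}")
```

A member missing its status element is reported as UNKNOWN rather than skipped, so an incomplete response still draws attention.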