The following are common reasons for a server to be marked DOWN:
- ARP Unresolved: The Service Engine cannot resolve the MAC address for the server's IP address (when the server is in the same layer 2 domain) or cannot initiate a TCP connection (when the server is a layer 3 hop away).
- Payload Mismatch: The health monitor expects specific content to be returned in the body of the response (HTTP or TCP). In the example, an excerpt of the server's response is shown. This type of error often occurs when the server's first response is a redirect. The expected content appears in a client browser because the browser follows the redirect, but from NSX Advanced Load Balancer's perspective, the response to the health check is the redirect itself, which does not contain the expected content.
- Response Code Mismatch: HTTP health checks can be configured to expect a specific response code, such as 2xx. Meanwhile, the server can be sending back a different code, such as 404.
- Response Timeout with a Threshold Violation: Each health monitor waits up to a timeout period for a response, and every health monitor can be assigned its own threshold and timeout period. If a valid response is not received within the timeout period for N consecutive attempts, where N equals the threshold, the server is marked DOWN.
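The failure reasons above can be sketched as a small classifier plus a consecutive-failure counter. This is an illustrative model only, not NSX Advanced Load Balancer code: `ProbeResult`, `classify`, and `ServerState` are hypothetical names, and applying the threshold to every failure type (not only timeouts) and returning to UP after a single success are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one health check attempt (hypothetical structure)."""
    status: Optional[int]  # HTTP status code, or None if nothing arrived in time
    body: str = ""


def classify(result: ProbeResult,
             expected_class: Optional[int] = 2,
             expected_text: str = "") -> str:
    """Return "OK" or the failure reason for a single probe."""
    if result.status is None:
        return "Response Timeout"
    # expected_class=2 means "any 2xx"; None disables the code check.
    if expected_class is not None and result.status // 100 != expected_class:
        return "Response Code Mismatch"
    if expected_text and expected_text not in result.body:
        return "Payload Mismatch"
    return "OK"


class ServerState:
    """Marks a server DOWN after `threshold` consecutive failed probes."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0
        self.up = True

    def record(self, reason: str) -> None:
        if reason == "OK":
            # Assumption: one good response resets the counter and restores UP.
            self.consecutive_failures = 0
            self.up = True
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            self.up = False


# A redirect with no body fails a payload-only check, as described above.
redirect = ProbeResult(status=302, body="")
print(classify(redirect, expected_class=None, expected_text="Welcome"))
```

In this model, a server that times out twice and then answers correctly stays UP with a threshold of 3, while three timeouts in a row mark it DOWN.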
Although NSX Advanced Load Balancer is engineered for easy troubleshooting, some problems require more advanced tools. In such cases, you can capture a trace of the conversation between the SE and the server.
For more information on traffic capture, see Packet Capture in the VMware NSX Advanced Load Balancer Administration Guide.
You can use tools such as ping and curl when they are launched from a client machine to the server. However, these tools are not reliable when administrators run them from an SE, because the SE uses separate network stacks for the data plane and for management. For instance, ping run from the SE's Linux shell uses the SE management IP and network, so its results can differ from those of the health check, which the SE sends through its data NICs and networks. For instance, use ping -I to specify the interface used.
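The effect of the source interface on a connectivity test can be illustrated with a minimal sketch for a generic multi-homed Linux host. It uses the standard `socket.create_connection` call with its `source_address` parameter; `tcp_probe` is a hypothetical helper, and the addresses in the usage note are documentation placeholders. The sketch runs in the host's ordinary network stack, so it does not reproduce the SE data plane.

```python
import socket
from typing import Optional


def tcp_probe(host: str, port: int,
              source_ip: Optional[str] = None,
              timeout: float = 2.0) -> Optional[str]:
    """Attempt a TCP connect and return the local IP used, or None on failure."""
    # Port 0 lets the operating system choose an ephemeral source port.
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source) as sock:
            return sock.getsockname()[0]
    except OSError:
        # Covers refused connections, timeouts, and unusable source addresses.
        return None
```

Calling `tcp_probe("192.0.2.10", 80)` and then `tcp_probe("192.0.2.10", 80, source_ip="198.51.100.5")` on such a host shows whether the server is reachable from the default route and from a specific local address, which can give different answers for the same server.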