Controller Metrics

The NSX Advanced Load Balancer system monitors the Controller nodes in a Cluster. Analytics information about CPU, memory, and disk usage is collected across all Controller nodes.

Background

This section explains how to troubleshoot crashes, restarts, and similar issues on the NSX Advanced Load Balancer Controller using the REST API. To troubleshoot these issues, it is essential to know which processes were running before and after a particular incident. Troubleshooting license issues also requires various metrics associated with the Controller and Service Engines (SEs).

The NSX Advanced Load Balancer Controller collects two types of analytics data. They are:

Controller Metrics: This includes CPU, memory, and disk usage information of all Controller nodes in a Cluster, tracked every minute.
Process Metrics: This includes CPU, memory, and disk usage, the number of context switches, swap usage, IO-read bytes, IO-write bytes, the number of threads, and the number of open files for each process running on Controller nodes.

Instructions

The data for Controller metrics and process metrics is captured every minute and sent to the metrics manager process running on the Controller. The metrics manager performs five-minute aggregations and writes them into the database. The aggregated metrics are then available to authorized users through the REST API or SDK.

Example - Finding the vm_uuid

Use the following REST API call to find the vm_uuid for the desired Controller node:

https://10.10.24.102/api/cluster

In the above REST API call, 10.10.24.102 is the Controller IP. Below is the output of the REST API call.

{
  "nodes" : [
    {
      "ip" : {
        "type" : "V4",
        "addr" : "10.10.24.102"
      },
      "vm_hostname" : "node1.controller.local",
      "vm_uuid" : "005056b015e3",
      "name" : "10.10.24.102",
      "vm_mor" : "vm_147457"
    }
  ],
  "tenant_uuid" : "admin",
  "uuid" : "cluster-63087542-5f3b-44ec-8249-bf6428bed78f",
  "name" : "cluster-0-1"
}

In the output, 005056b015e3 is the vm_uuid of the Controller node whose name and IP address are 10.10.24.102. Use this vm_uuid in the REST API calls that collect Controller metrics and process metrics.
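
The lookup can also be scripted. The following Python sketch is illustrative only: it parses a trimmed copy of the sample response shown in this section and returns the vm_uuid of the node whose ip.addr matches a given Controller IP. The function name find_vm_uuid is hypothetical and not part of any SDK, and fetching the response from the Controller (including authentication) is left out.

```python
import json

# Trimmed copy of the sample /api/cluster response from this section.
CLUSTER_RESPONSE = """
{
  "nodes": [
    {
      "ip": {"type": "V4", "addr": "10.10.24.102"},
      "vm_hostname": "node1.controller.local",
      "vm_uuid": "005056b015e3",
      "name": "10.10.24.102"
    }
  ],
  "name": "cluster-0-1"
}
"""


def find_vm_uuid(cluster, controller_ip):
    """Return the vm_uuid of the node whose ip.addr matches, or None.

    Hypothetical helper for illustration; not part of the product SDK.
    """
    for node in cluster.get("nodes", []):
        if node.get("ip", {}).get("addr") == controller_ip:
            return node.get("vm_uuid")
    return None


cluster = json.loads(CLUSTER_RESPONSE)
print(find_vm_uuid(cluster, "10.10.24.102"))  # prints 005056b015e3
```

The helper returns None when no node has the requested IP address, so the caller can tell a missing node apart from a valid vm_uuid.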

Collecting Controller Metrics

Use the following REST API call to collect Controller metrics.

/api/analytics/metrics/controller/<vm-uuid>/?metric_id=controller_stats.avg_cpu_usage&pad_missing_data=false&limit=3&step=300

Replace <vm-uuid> with the vm_uuid value of the Controller node, as returned by the /api/cluster call.
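
As a rough sketch, the request path can be assembled in Python from the vm_uuid and the query parameters used in the call above. The function name controller_metrics_path and its default values are assumptions made for illustration; the sketch only builds the path and does not send the request or handle authentication.

```python
from urllib.parse import urlencode


def controller_metrics_path(vm_uuid,
                            metric_id="controller_stats.avg_cpu_usage",
                            limit=3,
                            step=300):
    """Build the Controller metrics request path for one Controller node.

    Hypothetical helper; the defaults mirror the example call in this section.
    """
    query = urlencode({
        "metric_id": metric_id,
        "pad_missing_data": "false",
        "limit": limit,
        "step": step,
    })
    return f"/api/analytics/metrics/controller/{vm_uuid}/?{query}"


print(controller_metrics_path("005056b015e3"))
```

Prefix the resulting path with https:// and the Controller IP address to obtain the full URL.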

Fetching Process Metrics

Use the following API call to collect data and logs for all processes running on a Controller node.

/api/analytics/metrics/controller/<vm-uuid>/?metric_id=process_stats.avg_fds&pad_missing_data=false&limit=3&step=300&obj_id=*

Use the following API call for a particular process named avi-health with PID (process ID) 14257.

/api/analytics/metrics/controller/<vm-uuid>/?metric_id=process_stats.avg_fds&pad_missing_data=false&limit=3&step=300&obj_id=avi-health:14257

Note:

To collect metrics for a particular process, set obj_id to the process name and the process ID, separated by a colon (for example, avi-health:14257), instead of *, as shown in the above REST API calls.
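
The two forms of obj_id can be illustrated with a short Python sketch that builds the process metrics path either for all processes or for a single process. The function name process_metrics_path and its arguments are hypothetical; the parameter values are taken from the example calls above, and no request is actually sent.

```python
from urllib.parse import urlencode


def process_metrics_path(vm_uuid, process_name=None, pid=None,
                         metric_id="process_stats.avg_fds",
                         limit=3, step=300):
    """Build the process metrics request path for one Controller node.

    With no process given, obj_id is "*" (all processes). Otherwise both
    the process name and the PID are required, joined as name:pid.
    Hypothetical helper for illustration only.
    """
    if process_name is None and pid is None:
        obj_id = "*"
    elif process_name is not None and pid is not None:
        obj_id = f"{process_name}:{pid}"
    else:
        raise ValueError("provide both process_name and pid, or neither")

    # Keep "*" and ":" unescaped so the path matches the calls shown above.
    query = urlencode({
        "metric_id": metric_id,
        "pad_missing_data": "false",
        "limit": limit,
        "step": step,
        "obj_id": obj_id,
    }, safe="*:")
    return f"/api/analytics/metrics/controller/{vm_uuid}/?{query}"


print(process_metrics_path("005056b015e3"))
print(process_metrics_path("005056b015e3", "avi-health", 14257))
```

Passing only a process name or only a PID raises ValueError, which reflects the requirement in the note that both values must be supplied together.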