VMware Aria Operations for Networks collects health metrics from the Collector, Platform, and system to diagnose and monitor health problems related to the VMware Aria Operations for Networks instance.
Health metrics are of the following types:
- Node metrics
- Service metrics
- System metrics
Viewing Health Metrics
You can view the node and service metrics on the VMware Aria Operations for Networks Platform and Collector pages. You can view the system metrics only on the System Dashboard page.
To view node and service metrics:
- Enter Platform or Collector in the search bar.
- On the search results page, select a Platform or Collector to view the health metrics available for that entity.
To view the system metrics, enter System Dashboard in the search bar.
Node Metrics
Node metrics provide information about the memory, disk IO, and CPU utilization of a node.
Metric Name | Metric API Name | Description |
---|---|---|
Memory Usage | CUSTOM_METRIC[level:node]vRNI.used.memory.percentage.rate.average.number | Percentage of memory used out of the total configured memory in the given node. |
Data Disk Usage | CUSTOM_METRIC[level:node]vRNI.used.disk.percentage.rate.average.number | Percentage of disk input/output used in the given node. |
CPU Usage | CUSTOM_METRIC[level:node]vRNI.cpu.utilization.percentage.rate.average.number | Percentage of CPU used out of the total available CPU for a given node. |
Service Metrics
Service metrics indicate if a service is running. In the metric API name, you must replace <service_name> with your service name (such as IpfixProcessor, ElasticSearch, Kafka) to view the state of the service.
Metric Name | Metric API Name | Description |
---|---|---|
Service Uptime | CUSTOM_METRIC[service.name:<service_name>]vRNI.service.uptime.rate.average.number | Binary indicator to check if the VMware Aria Operations for Networks service is running and is healthy. |
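
The placeholder substitution described above can be sketched in a few lines of Python. This is only an illustration: the template string is copied from the Service Metrics table, while the helper name `service_uptime_metric` is hypothetical and not part of the product.

```python
# Template copied from the Service Metrics table; <service_name> is the placeholder.
SERVICE_UPTIME_TEMPLATE = (
    "CUSTOM_METRIC[service.name:<service_name>]"
    "vRNI.service.uptime.rate.average.number"
)


def service_uptime_metric(service_name: str) -> str:
    """Return the Service Uptime metric API name for one service.

    Hypothetical helper for illustration; it only performs the text
    substitution and does not query the product.
    """
    name = service_name.strip()
    if not name:
        raise ValueError("service_name must not be empty")
    return SERVICE_UPTIME_TEMPLATE.replace("<service_name>", name)


if __name__ == "__main__":
    # Service names taken from the examples in the text above.
    for service in ("IpfixProcessor", "ElasticSearch", "Kafka"):
        print(service_uptime_metric(service))
```

The resulting string is the value you would use as the metric API name for that service.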
System Metrics
System metrics provide information about the performance and usage of the overall VMware Aria Operations for Networks system.
Metric Name | Metric API Name | Description |
---|---|---|
Processing Lag | grid.messageAge.absolute.latest.millisecond | The lag time of the system's data processing grid. High processing lag may result in the system showing stale data. |
Grid Usage | grid.busy.absolute.latest.percent | The capacity utilization of the VMware Aria Operations for Networks processing grid. The utilization percentage stays high when the processing load increases, which raises the risk of high processing lag. Common causes of increased processing load include adding many data sources, frequent addition and deletion of VMs, and an increase in flow count. |
Indexer Lag | grid.indexerLag.absolute.latest.millisecond | The lag time of the system's indexer. High indexer lag may result in the system showing stale data. |
VM Count | vRNI.internal.tenants.usage.vm.count.absolute.maximum.number | Total count of discovered VMs in the system. |
Host Count | vRNI.internal.tenants.usage.host.count.absolute.maximum.numb | Total count of discovered hosts in the system. |
Application Count | vRNI.internal.tenants.usage.application.count.absolute.maximum.number | Total count of saved applications in the system. |
Daily Flow Count | vRNI.internal.tenants.usage.flow.daily.count.absolute.maximum.number | Total count of unique flows in the last 24 hours. |
Weekly Flow Count | vRNI.internal.tenants.usage.flow.weekly.count.absolute.maximum.number | Total count of unique flows in the last seven days. |
Firewall Rule Count | vRNI.internal.tenants.usage.firewallRule.count.absolute.maximum.number | Total count of discovered firewall rules. |
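
The system metric API names in the table appear to end with a unit token (millisecond, percent, or number). The following Python sketch relies on that observation, which is an assumption drawn from the table rather than a documented convention, to read the unit from a name and to render a lag value in a more readable form. The function names `metric_unit` and `format_lag` are hypothetical.

```python
from datetime import timedelta


def metric_unit(api_name: str) -> str:
    """Return the last dot-separated token of a metric API name.

    Assumption: the final token is the unit, as it appears to be
    for the names listed in the System Metrics table.
    """
    return api_name.rsplit(".", 1)[-1]


def format_lag(milliseconds: float) -> str:
    """Render a lag value given in milliseconds as H:MM:SS."""
    return str(timedelta(milliseconds=milliseconds))


if __name__ == "__main__":
    # Metric API names copied from the System Metrics table.
    for name in (
        "grid.messageAge.absolute.latest.millisecond",
        "grid.busy.absolute.latest.percent",
        "vRNI.internal.tenants.usage.vm.count.absolute.maximum.number",
    ):
        print(name, "->", metric_unit(name))

    # Illustrative value only, not real product data.
    print(format_lag(90_000))
```

Converting Processing Lag and Indexer Lag values this way can make it easier to judge whether the displayed data is likely to be stale.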