Viewing NSX Application Platform Metrics Within the NSX Manager UI

From the NSX Manager UI, you can view NSX Application Platform point-in-time and time-series metrics.

To view the metrics, log in to the NSX Manager UI from a browser and navigate to System > NSX Application Platform. Click the Metrics tab. The following metrics are displayed:

Metric	Description
Infra Classifier	A mix of monitor-based reporting and custom gRPC metrics. The monitor-based metrics are reported once daily and capture any uncaught generic failure status or Spark-operator status. The possible status values are: COMPLETED, FAILED, SUBMISSION_FAILED, FAILING, INVALIDATING, PENDING_RERUN, RUNNING, SUBMITTED, SUCCEEDING, UNKNOWN, NOT_INITIATED. The gRPC metrics show the reason for a graceful shutdown or a failure to run a task or service. The possible reasons are: INSUFFICIENT_MEMORY, INSUFFICIENT_FLOWS, INSUFFICIENT_DAYS, FAILED.
Recommendation Monitoring Job	A Spark job runs hourly and monitors the READY_TO_PUBLISH jobs. It reports changes in the recommendation to run a job and, if necessary, suggests a rerun of the job. The possible status values are: COMPLETED, FAILED, SUBMISSION_FAILED, FAILING, INVALIDATING, NOT_AVAILABLE, PENDING_RERUN, RUNNING, SUBMITTED, SUCCEEDING, UNKNOWN, NOT_INITIATED.
Flow Clustering Job	The status of the flow clustering job that runs every hour. The possible status values are: RUNNING - The clustering job is currently running. SUCCEEDED - The last running job completed successfully. FAILED - The last running job failed.
Flow Ingestion	This metric indicates whether flow ingestion is paused or enabled depending on the disk usage. The possible status values are: DISABLED - Ingestion is disabled. ENABLED - Ingestion is enabled.
Suspicious Traffic Detectors	After every run, each Security Intelligence detector has one of the following statuses. The status is displayed only for enabled detectors and not for detectors that are in the NOT_STARTED state. The possible status values are: NOT_STARTED - The NTA detector is not enabled and has never been run on the onboarded site. SUCCESS - The detector successfully completed execution. NOT_ENOUGH_BASELINE - The baseline detectors (VERTICAL_PORT_SCAN, LLMNR_NBTNS, REMOTE_SERVICES and UNCOMMONPORT) finished execution successfully but could not report events because the baseline size was insufficient for event detection. FAILURE - The detector failed to execute.
Kafka Message Lag	The average message delay for each Kafka topic.
Druid Task Failures	The number of failed Druid tasks. A task can be a reindex task, a Kafka ingestion task, or a compaction task on the flow table and the configuration table.
Intelligence Configuration Updates	The number of new Security Intelligence configurations per config-type, identified hourly.
Average CPU Usage (%) on Node	The average CPU usage on all NSX Application Platform Kubernetes nodes.
Druid Average Retention Days	The Druid retention days for the table correlated_flow_viz. The default is 30 days.
Total Flows and Unique Flows	The total and unique flow counts in the entire Druid database at the time of the query. One month of data is available. This job runs once a day.
Kafka Average Message Input Rate	The average incoming message rate across all Kafka topics.
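
The Suspicious Traffic Detectors statuses in the table can be modeled in a short script, for example when reviewing values copied from the Metrics tab. The following Python sketch is illustrative only: the `DetectorStatus` enum, the `detectors_needing_attention` helper, and the input mapping are hypothetical names that are not part of any NSX API, and the rule that FAILURE and NOT_ENOUGH_BASELINE deserve follow-up is an assumption, not product guidance.

```python
from enum import Enum
from typing import List, Mapping


class DetectorStatus(Enum):
    """Status values listed for the Suspicious Traffic Detectors metric."""
    NOT_STARTED = "Detector is not enabled and has never been run on the onboarded site."
    SUCCESS = "Detector successfully completed execution."
    NOT_ENOUGH_BASELINE = "Baseline detector ran, but the baseline was too small to report events."
    FAILURE = "Detector failed to execute."


# Assumption: these two statuses are the ones worth a follow-up.
ATTENTION_STATUSES = frozenset(
    {DetectorStatus.FAILURE, DetectorStatus.NOT_ENOUGH_BASELINE}
)


def detectors_needing_attention(statuses: Mapping[str, str]) -> List[str]:
    """Return "NAME: STATUS" entries, sorted by detector name, for flagged detectors.

    `statuses` maps a detector name to its status string (hypothetical input,
    for example transcribed by hand from the UI). An unrecognized status
    string raises KeyError.
    """
    flagged = []
    for name in sorted(statuses):
        status = DetectorStatus[statuses[name]]  # look up the enum member by name
        if status in ATTENTION_STATUSES:
            flagged.append(f"{name}: {status.name}")
    return flagged


# Example input using the baseline detector names from the table.
sample = {
    "VERTICAL_PORT_SCAN": "NOT_ENOUGH_BASELINE",
    "LLMNR_NBTNS": "SUCCESS",
    "REMOTE_SERVICES": "FAILURE",
    "UNCOMMONPORT": "NOT_STARTED",
}
flagged_sample = detectors_needing_attention(sample)
```

Looking up statuses through the enum means a misspelled or unexpected value fails loudly instead of being silently ignored.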