This section describes System Metrics Monitoring on the Orchestrator.

Orchestrator System Metrics Monitoring Overview

The Orchestrator comes with a built-in system metrics monitoring stack that includes a metrics collector and a time-series database. You can use the monitoring stack to check the health and the system load of the Orchestrator.

To enable the monitoring stack, run the following command on the Orchestrator:

sudo /opt/vc/scripts/vco_observability_manager.sh enable 

To check the status of the monitoring stack, run:

sudo /opt/vc/scripts/vco_observability_manager.sh status

To deactivate the monitoring stack, run:

sudo /opt/vc/scripts/vco_observability_manager.sh disable
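
The enable and status commands above can be chained so that the status is reported only if enabling succeeded. The following is a minimal sketch, not part of the product: it assumes the script exits with a non-zero code on failure, and RUNNER is a hypothetical variable introduced here so that the commands can be printed (RUNNER=echo) instead of executed.

```shell
#!/bin/sh
# Script path taken from the commands above.
VCO_OBS="/opt/vc/scripts/vco_observability_manager.sh"

# RUNNER is a hypothetical knob for this sketch: it defaults to sudo,
# and RUNNER=echo prints the commands instead of running them.
RUNNER="${RUNNER:-sudo}"

# Enable the monitoring stack, then report its status.
# Assumption: the script returns non-zero when an action fails.
enable_and_check() {
    $RUNNER "$VCO_OBS" enable || return 1
    $RUNNER "$VCO_OBS" status
}
```

After sourcing the file, call enable_and_check to run both steps in order.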

The Metrics Collector

Telegraf serves as the Orchestrator system metrics collector and gathers system metrics through its input plugins. The following metrics, each named after the Telegraf input plugin that collects it, are enabled by default.

Metric Name          Description
inputs.cpu           Metrics about CPU usage.
inputs.mem           Metrics about memory usage.
inputs.net           Metrics about network interfaces.
inputs.system        Metrics about system load and uptime.
inputs.processes     The number of processes grouped by status.
inputs.disk          Metrics about disk usage.
inputs.diskio        Metrics about disk I/O by device.
inputs.procstat      CPU and memory usage for specific processes.
inputs.nginx         Basic Nginx status information (ngx_http_stub_status_module).
inputs.mysql         Statistical data from the MySQL server.
inputs.clickhouse    Metrics from one or more ClickHouse servers.
inputs.redis         Metrics from one or more Redis servers.
inputs.filecount     The number and total size of files in specified directories.
inputs.ntpq          Standard NTP query metrics (requires the ntpq executable).
inputs.x509_cert     Metrics from an SSL certificate.

To activate additional metrics or deactivate enabled ones, edit the Telegraf configuration file on the Orchestrator, and then restart Telegraf:

  • sudo vi /etc/telegraf/telegraf.d/system_metrics_input.conf
  • sudo systemctl restart telegraf
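
Before restarting, it can help to confirm that the edited file still loads. The sketch below uses Telegraf's --test mode, which gathers the configured inputs once, prints the metrics, and exits without sending anything to the outputs. The main configuration path /etc/telegraf/telegraf.conf is an assumption (the usual package default); the directory is the one that holds the file edited above. TELEGRAF_BIN is a hypothetical variable introduced only for this example.

```shell
#!/bin/sh
# Assumed main configuration path; verify it on your Orchestrator.
TELEGRAF_CONF="${TELEGRAF_CONF:-/etc/telegraf/telegraf.conf}"
# Directory containing system_metrics_input.conf, as named above.
TELEGRAF_DIR="${TELEGRAF_DIR:-/etc/telegraf/telegraf.d}"
# Hypothetical override for the binary, useful for dry runs.
TELEGRAF_BIN="${TELEGRAF_BIN:-telegraf}"

# Load the configuration and run every enabled input once.
# Some inputs may need elevated privileges to collect data.
check_telegraf_config() {
    "$TELEGRAF_BIN" --config "$TELEGRAF_CONF" \
        --config-directory "$TELEGRAF_DIR" --test
}
```

One possible use is to restart only when the check passes: check_telegraf_config && sudo systemctl restart telegraf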

The Time-series Database

Prometheus stores the system metrics collected by Telegraf. Metrics data is retained in the database for at most three weeks. By default, Prometheus listens on port 9090. If you have an external monitoring tool, add the Prometheus database as a data source so that you can view the Orchestrator system metrics in your monitoring UI.
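
Before adding the data source to an external tool, you may want to confirm that Prometheus answers on its port. The sketch below uses the standard Prometheus HTTP API (the /-/healthy endpoint and /api/v1/query). The host name localhost is an assumption, and whether port 9090 is reachable from another machine depends on your firewall settings, which this section does not describe.

```shell
#!/bin/sh
# Assumed host; replace with the address of your Orchestrator.
PROM_HOST="${PROM_HOST:-localhost}"
# Default Prometheus port stated in this section.
PROM_PORT="${PROM_PORT:-9090}"

# Build the base URL of the Prometheus server.
prom_base_url() {
    printf 'http://%s:%s' "$PROM_HOST" "$PROM_PORT"
}

# Check the Prometheus health endpoint; -f makes curl fail on HTTP errors.
prom_healthy() {
    curl -fsS "$(prom_base_url)/-/healthy"
}

# Run an instant PromQL query; curl URL-encodes the expression.
prom_query() {
    curl -sS -G --data-urlencode "query=$1" "$(prom_base_url)/api/v1/query"
}
```

For example, prom_query 'up' returns the built-in up series that Prometheus records for each scrape target. The names of the Telegraf-collected series depend on how the Orchestrator exports them and are not listed here.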