Recording rules allow you to precompute frequently needed or computationally expensive PromQL (Prometheus Query Language) expressions and save their result as a new set of time series.

Prometheus evaluates the recording rules that you configure at regular intervals. With recording rules, you can apply PromQL functions such as rate(), round(), and scalar() to existing metrics. For details, see Functions. You can configure recording rules in Prometheus and ingest the resulting metrics into vRealize Operations.

Sample Recording Rule Configuration to Compute Rate

To include rules in Prometheus, create a file containing the necessary rule statements and have Prometheus load the file through the rule_files field in the Prometheus configuration. Rule files use YAML. For details, see Defining Recording Rules.
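As a minimal sketch of how a rule file might be wired into Prometheus, the fragment below sets the evaluation interval and points rule_files at a rule file. The file path is a hypothetical placeholder, not a value from this documentation; adjust it to your deployment.

```yaml
# prometheus.yml (fragment)
global:
  # How often Prometheus evaluates rules; 1m is the Prometheus default.
  evaluation_interval: 1m

rule_files:
  # Hypothetical path to a file that contains recording rules.
  - "rules/recording_rules.yml"
```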

The following sample rule computes the rate for the metric container_cpu_usage_seconds_total.

      - expr: |
          sum(rate(container_cpu_usage_seconds_total{name!=""}[5m])) BY (id,job) * 100
        record: container_cpu_usage_node_container_5m
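In a rule file, a rule entry like this sits inside a rule group. The sketch below shows one way the complete file might look; the file name and group name are assumptions chosen for illustration.

```yaml
# recording_rules.yml (hypothetical file name)
groups:
  - name: container_cpu_rules   # hypothetical group name
    rules:
      # Per-container CPU usage rate over 5 minutes, as a percentage.
      - record: container_cpu_usage_node_container_5m
        expr: |
          sum(rate(container_cpu_usage_seconds_total{name!=""}[5m])) BY (id,job) * 100
```

Before reloading Prometheus, you can check the syntax of a file like this with promtool check rules followed by the file path.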

Label Names for Recording Rules

To consume the computed metrics in vRealize Operations, update the configuration file with the label names that contain the service identifiers (for example, Container ID or Pod Name). For details, see Set up vRealize Operations configuration for third-party Prometheus exporters.

At least one of the labels in the recording rule should contain a service identifier (for example, Container ID or Pod Name) so that vRealize Operations can map the metrics to the appropriate services. To modify existing labels, you can:

  • Use the label_replace() function to copy the value of an existing label into a label with a new name, optionally transforming the value with a regular expression. For details, see label_replace().

  • Specify a 'without' clause that lists only the labels you are aggregating away. This preserves all other labels, such as job, which avoids conflicts and gives you more useful metrics and alerts.
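The two techniques above can be sketched together in a small rule. This is an illustration under stated assumptions, not a configuration from this documentation: the group name, the record name, and the new label name container_id are hypothetical, and it assumes the id label carries the identifier that vRealize Operations should map.

```yaml
groups:
  - name: example_label_rules   # hypothetical group name
    rules:
      - record: container_cpu_usage_rate_5m   # hypothetical record name
        expr: |
          # Drop only the cpu label while aggregating; job, id, and all other labels are kept.
          # Then copy the value of id into a new label named container_id.
          label_replace(
            sum without (cpu) (rate(container_cpu_usage_seconds_total{name!=""}[5m])),
            "container_id", "$1", "id", "(.*)"
          )
```

The arguments of label_replace() are, in order: the input vector, the destination label, the replacement string, the source label, and the regular expression matched against the source label's value.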

Sample Configuration to Modify the Labels

      - expr: |
          sum without (kubernetes_node,app,component,instance,kapp_k14s_io_app,kapp_k14s_io_association,kubernetes_name,kubernetes_namespace)
          (label_replace((rate(node_context_switches_total{job="kubernetes-service-endpoints"}[5m])) / (count without(cpu, mode) (node_cpu_seconds_total{mode="idle"})),
          "nodename", "$1", "kubernetes_node", "(.*)"))
        record: node_context_switch_rate_5m
        labels:
          job: kubernetes-service-endpoints