Deploy Prometheus on Workload Clusters

This topic explains how to deploy Prometheus into a workload cluster. The procedures below apply to vSphere, Amazon Web Services (AWS), and Azure deployments.

Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit. Tanzu Kubernetes Grid includes signed binaries for Prometheus that you can deploy on workload clusters to monitor cluster health and services.

Prerequisites

Important

Support for IPv6 addresses in Tanzu Kubernetes Grid is limited; see Deploy Clusters on IPv6 (vSphere Only). If you are not deploying to an IPv6-only networking environment, you must provide IPv4 addresses in the following steps.

Prepare the Workload Cluster for Prometheus Deployment

To prepare the cluster:

  1. Get the admin credentials of the workload cluster into which you want to deploy Prometheus. For example:

    tanzu cluster kubeconfig get my-cluster --admin
    
  2. Set the context of kubectl to the cluster. For example:

    kubectl config use-context my-cluster-admin@my-cluster
    

(Optional) Enable Ingress for Prometheus

To enable ingress, install the following optional packages:

  1. Install cert-manager. For information, see Install cert-manager for Certificate Management.
  2. Install Contour. For information, see Install Contour for Ingress Control.

Continue to Deploy Prometheus into the Workload Cluster below.

Deploy Prometheus into the Workload Cluster

To install Prometheus:

  1. If the cluster does not have a package repository with the Prometheus package installed, such as the tanzu-standard repository, install one:

    tanzu package repository add PACKAGE-REPO-NAME --url PACKAGE-REPO-ENDPOINT --namespace tkg-system
    

    Where:

    • PACKAGE-REPO-NAME is the name of the package repository, such as tanzu-standard or the name of a private image registry configured with ADDITIONAL_IMAGE_REGISTRY variables.
    • PACKAGE-REPO-ENDPOINT is the URL of the package repository.

      • For this release, the tanzu-standard URL is projects.registry.vmware.com/tkg/packages/standard/repo:v2023.10.16. See List Package Repositories to obtain this value from the Tanzu CLI, or in Tanzu Mission Control see the Addons > Repositories list in the Cluster pane.
  2. Confirm that the Prometheus package is available in your workload cluster:

    tanzu package available list -A
    
  3. Retrieve the version of the available package:

    tanzu package available list prometheus.tanzu.vmware.com -A
    | Retrieving package versions for prometheus.tanzu.vmware.com...
     NAME                           VERSION                          RELEASED-AT           NAMESPACE
     prometheus.tanzu.vmware.com    2.43.0+vmware.1-tkg.4            2020-11-24T18:00:00Z  tanzu-package-repo-global
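
The version lookup in step 3 can also be scripted. The following is a minimal sketch, assuming the CLI prints the table layout shown above. The extract_package_version function is a hypothetical helper, not part of the Tanzu CLI, and the sample text is copied from the example output rather than read from a live cluster:

```shell
# Hypothetical helper: print the VERSION column of the last row whose
# NAME column equals the given package name. The table is read from stdin.
extract_package_version() {
  awk -v pkg="$1" '$1 == pkg { print $2 }' | tail -n 1
}

# Sample text copied from the example output in this topic (not live data).
sample_output='| Retrieving package versions for prometheus.tanzu.vmware.com...
 NAME                           VERSION                          RELEASED-AT           NAMESPACE
 prometheus.tanzu.vmware.com    2.43.0+vmware.1-tkg.4            2020-11-24T18:00:00Z  tanzu-package-repo-global'

PROMETHEUS_VERSION=$(printf '%s\n' "$sample_output" | extract_package_version prometheus.tanzu.vmware.com)
echo "$PROMETHEUS_VERSION"
```

On a real cluster, you might pipe the output of tanzu package available list prometheus.tanzu.vmware.com -A into the helper instead of the sample text.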
    

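When you are ready to deploy Prometheus, you can either deploy it with default configurations or deploy it with custom values. Both procedures are described below.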

vSphere with Tanzu: To deploy the Prometheus package to a workload cluster created by a vSphere Supervisor cluster, you must deploy it with custom values. The Prometheus package has not been validated for workload clusters on vSphere 7.0 U3.

Deploy Prometheus with Default Configurations

After you confirm that the package is available and retrieve its version, you can install the package.

  1. Install the Prometheus package using its default values:

    tanzu package install prometheus \
    --package prometheus.tanzu.vmware.com \
    --version AVAILABLE-PACKAGE-VERSION \
    --namespace TARGET-NAMESPACE
    

    Where:

    • TARGET-NAMESPACE is the namespace in which you want to install the Prometheus package. For example, the my-packages or tanzu-cli-managed-packages namespace.

      • If the --namespace flag is not specified, the Tanzu CLI uses the default namespace. The Prometheus pods and any other resources associated with the Prometheus component are created in the tanzu-system-monitoring namespace; do not install the Prometheus package into this namespace.
      • The specified namespace must already exist, for example from running kubectl create namespace my-packages.
    • AVAILABLE-PACKAGE-VERSION is the version that you retrieved above, for example 2.43.0+vmware.1-tkg.4.

    For example:

    tanzu package install prometheus --package prometheus.tanzu.vmware.com --namespace my-packages --version 2.43.0+vmware.1-tkg.4
    
    \ Installing package 'prometheus.tanzu.vmware.com'
    | Getting package metadata for 'prometheus.tanzu.vmware.com'
    | Creating service account 'prometheus-my-packages-sa'
    | Creating cluster admin role 'prometheus-my-packages-cluster-role'
    | Creating cluster role binding 'prometheus-my-packages-cluster-rolebinding'
    - Creating package resource
    \ Package install status: Reconciling
    
    Added installed package 'prometheus' in namespace 'my-packages'
    
  2. vSphere with Tanzu: On vSphere 8 and vSphere 7.0 U2 with the vSphere with Tanzu feature enabled, the tanzu package install prometheus command may return the error Failed to get final advertise address: No private IP address found, and explicit IP not provided.

    To fix this error, create and apply a package overlay to reconfigure the alertmanager component:

    1. Create a file overlay-alertmanager.yaml containing:

      ---
      #@ load("@ytt:overlay", "overlay")
      
      #@overlay/match by=overlay.and_op(overlay.subset({"kind": "Deployment"}), overlay.subset({"metadata": {"name": "alertmanager"}}))
      ---
      spec:
        template:
          spec:
            containers:
              #@overlay/match by="name",expects="0+"
              - name: alertmanager
                args:
                  - --cluster.listen-address=
      
    2. Create a secret from the overlay:

      kubectl create secret generic alertmanager-overlay -n tanzu-package-repo-global -o yaml --dry-run=client --from-file=overlay-alertmanager.yaml | kubectl apply -f -
      
    3. Annotate the package with the secret:

      kubectl annotate PackageInstall prometheus -n tanzu-package-repo-global ext.packaging.carvel.dev/ytt-paths-from-secret-name.1=alertmanager-overlay
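
Step 1 above places two constraints on TARGET-NAMESPACE: the CLI falls back to the default namespace when the flag is omitted, and the package must not be installed into tanzu-system-monitoring. A minimal sketch of a pre-flight check for those rules follows. The check_target_namespace function is a hypothetical helper, not part of the Tanzu CLI:

```shell
# Namespace in which the package creates its pods, as stated in this topic.
RESERVED_NAMESPACE="tanzu-system-monitoring"

# Hypothetical helper: print the namespace that an install would use,
# or fail if it is the reserved monitoring namespace.
check_target_namespace() {
  ns=${1:-default}   # an omitted --namespace flag means the default namespace
  if [ "$ns" = "$RESERVED_NAMESPACE" ]; then
    echo "do not install the Prometheus package into $RESERVED_NAMESPACE" >&2
    return 1
  fi
  echo "$ns"
}

# Example: validate the namespace before creating it and installing into it.
TARGET_NAMESPACE=$(check_target_namespace my-packages) && echo "installing into $TARGET_NAMESPACE"
```

If the check passes, you could follow it with kubectl create namespace and the tanzu package install command shown in step 1.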
      

Continue to Verify Prometheus Deployment below.

Deploy Prometheus with Custom Values

To install the Prometheus package using user-provided values:

  1. Create a configuration file. This file configures the Prometheus package.

    tanzu package available get prometheus.tanzu.vmware.com/PACKAGE-VERSION --default-values-file-output FILE-PATH
    

    Where PACKAGE-VERSION is the version of the Prometheus package that you want to install and FILE-PATH is the location to which you want to save the configuration file, for example, prometheus-data-values.yaml. With that example path, the command creates a configuration file named prometheus-data-values.yaml that contains the default values. Note that in previous versions, this file was called prometheus-data-values.yaml.

    For information about configuration parameters to use in prometheus-data-values.yaml, see Prometheus Package Configuration Parameters below.

  2. vSphere with Tanzu: If you are deploying Prometheus to a workload cluster created by a vSphere Supervisor cluster, set a non-null value for prometheus.pvc.storageClassName and alertmanager.pvc.storageClassName in the prometheus-data-values.yaml file:

    ingress:
      enabled: true
      virtual_host_fqdn: "prometheus.corp.tanzu"
      prometheus_prefix: "/"
      alertmanager_prefix: "/alertmanager/"
      prometheusServicePort: 80
      alertmanagerServicePort: 80
    prometheus:
      pvc:
        storageClassName: STORAGE-CLASS
    alertmanager:
      pvc:
        storageClassName: STORAGE-CLASS
    

    Where STORAGE-CLASS is the name of the cluster’s storage class, as returned by kubectl get storageclass.

  3. After you make any changes needed to your prometheus-data-values.yaml file, remove all comments in it:

    yq -i eval '... comments=""' prometheus-data-values.yaml
    
  4. Deploy the package:

    tanzu package install prometheus \
    --package prometheus.tanzu.vmware.com \
    --version PACKAGE-VERSION \
    --values-file prometheus-data-values.yaml \
    --namespace TARGET-NAMESPACE
    

    Where:

    • TARGET-NAMESPACE is the namespace in which you want to install the Prometheus package, Prometheus package app, and any other Kubernetes resources that describe the package. For example, the my-packages or tanzu-cli-managed-packages namespace. If the --namespace flag is not specified, the Tanzu CLI uses the default namespace. The Prometheus pods and any other resources associated with the Prometheus component are created in the tanzu-system-monitoring namespace; do not install the Prometheus package into this namespace.
    • PACKAGE-VERSION is the version that you retrieved above, for example 2.43.0+vmware.1-tkg.4.
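
Step 2 above requires non-null storageClassName values for both prometheus.pvc and alertmanager.pvc on vSphere with Tanzu. The following is a rough sketch of a check you could run before step 4. The check_storage_class_names function is a hypothetical helper, and it assumes each storageClassName key and its value sit on one line, as in the example above:

```shell
# Hypothetical helper: succeed only if the values file contains at least two
# storageClassName entries and none of them is empty, null, or the
# unreplaced STORAGE-CLASS placeholder.
check_storage_class_names() {
  awk '
    /^[ \t]*storageClassName:/ {
      seen++
      value = $2
      gsub(/"/, "", value)                      # ignore surrounding quotes
      if (value == "" || value ~ /^#/ || value == "null" || value == "STORAGE-CLASS") bad++
    }
    END { exit (seen < 2 || bad > 0) }          # exit status 1 signals a problem
  ' "$1"
}

# Example usage (the file name matches the one used in this topic):
#   check_storage_class_names prometheus-data-values.yaml || echo "set storageClassName first"
```

This is a line-based text check rather than a YAML parser, so it only covers the simple layout shown in step 2.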

Continue to Verify Prometheus Deployment below.

Verify Prometheus Deployment

After you deploy Prometheus, you can verify that the deployment is successful:

  1. Confirm that the Prometheus package is installed. For example:

    tanzu package installed list -A
    / Retrieving installed packages...
    NAME            PACKAGE-NAME                       PACKAGE-VERSION                STATUS                   NAMESPACE
    cert-manager    cert-manager.tanzu.vmware.com      1.10.1+vmware.1-tkg.2          Reconcile succeeded      my-packages
    prometheus      prometheus.tanzu.vmware.com        2.43.0+vmware.1-tkg.4          Reconcile succeeded      my-packages
    antrea          antrea.tanzu.vmware.com                                           Reconcile succeeded      tkg-system
    metrics-server  metrics-server.tanzu.vmware.com                                   Reconcile succeeded      tkg-system
    vsphere-cpi     vsphere-cpi.tanzu.vmware.com                                      Reconcile succeeded      tkg-system
    vsphere-csi     vsphere-csi.tanzu.vmware.com                                      Reconcile succeeded      tkg-system
    

    The prometheus package and the prometheus app are installed in the namespace that you specify when running the tanzu package install command.

  2. Confirm that the prometheus app is successfully reconciled:

    kubectl get apps -A
    

    For example:

    NAMESPACE     NAME                                DESCRIPTION           SINCE-DEPLOY   AGE
    my-packages   cert-manager                        Reconcile succeeded   74s            29m
    my-packages   prometheus                          Reconcile succeeded   20s            33m
    tkg-system    antrea                              Reconcile succeeded   70s            3h43m
    [...]
    

    If the status is not Reconcile succeeded, view the full status details of the prometheus app. Viewing the full status can help you troubleshoot the problem:

    kubectl get app prometheus --namespace PACKAGE-NAMESPACE -o yaml
    

    Where PACKAGE-NAMESPACE is the namespace in which you installed the package.

  3. Confirm that the new services are running by listing all of the pods that are running in the cluster:

    kubectl get pods -A
    

    In the tanzu-system-monitoring namespace, you should see pods running for the prometheus, alertmanager, node-exporter, pushgateway, cadvisor, and kube-state-metrics services:

    NAMESPACE                 NAME                                                     READY   STATUS    RESTARTS   AGE
    [...]
    tanzu-system-monitoring   alertmanager-d6bb4d94d-7fgmb                             1/1     Running   0          35m
    tanzu-system-monitoring   prometheus-cadvisor-pgfck                                1/1     Running   0          35m
    tanzu-system-monitoring   prometheus-kube-state-metrics-868b5b749d-9w5f2           1/1     Running   0          35m
    tanzu-system-monitoring   prometheus-node-exporter-97x6c                           1/1     Running   0          35m
    tanzu-system-monitoring   prometheus-node-exporter-dnrkk                           1/1     Running   0          35m
    tanzu-system-monitoring   prometheus-pushgateway-84cc9b85c6-tgmv6                  1/1     Running   0          35m
    tanzu-system-monitoring   prometheus-server-6479964fb6-kk9g2                       2/2     Running   0          35m
    [...]
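
The check in step 3 can be automated by counting pods that are not in the Running state. The following is a minimal sketch, assuming the column layout of kubectl get pods -A shown above. The count_pods_not_running function is a hypothetical helper, and the sample rows are copied from the example output:

```shell
# Hypothetical helper: count rows in the given namespace whose STATUS
# column (4th field of `kubectl get pods -A`) is not "Running".
count_pods_not_running() {
  awk -v ns="$1" '$1 == ns && $4 != "Running" { n++ } END { print n + 0 }'
}

# Sample rows copied from the example output in this topic (not live data).
sample_pods='NAMESPACE                 NAME                                   READY   STATUS    RESTARTS   AGE
tanzu-system-monitoring   alertmanager-d6bb4d94d-7fgmb           1/1     Running   0          35m
tanzu-system-monitoring   prometheus-server-6479964fb6-kk9g2     2/2     Running   0          35m'

NOT_RUNNING=$(printf '%s\n' "$sample_pods" | count_pods_not_running tanzu-system-monitoring)
echo "pods not running: $NOT_RUNNING"
```

On a live cluster, you might pipe kubectl get pods -A into the helper and treat any non-zero count as a reason to inspect the prometheus app status as described in step 2.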
    

The Prometheus pods and any other resources associated with the Prometheus component are created in the namespace that you set in prometheus-data-values.yaml. If you did not change the namespace value, they are created in the tanzu-system-monitoring namespace.

Prometheus Package Configuration Parameters

You can view the configuration parameters of the Prometheus package in two ways: retrieve the package schema, or review the parameter table below.

Review the Package Schema

To retrieve the package schema:

tanzu package available get prometheus.tanzu.vmware.com/2.43.0+vmware.1-tkg.4 -n AVAILABLE-PACKAGE-NAMESPACE --values-schema

This command lists configuration parameters of the Prometheus package and their default values. You can use the output to update the prometheus-data-values.yaml file that you created in Deploy Prometheus with Custom Values above.

Review Configuration Parameters

The following table lists configuration parameters of the Prometheus package and describes their default values.

You can set the following configuration values in the prometheus-data-values.yaml file that you created in Deploy Prometheus with Custom Values above.

Parameter | Description | Type | Default
namespace | Namespace where Prometheus will be deployed. | String | tanzu-system-monitoring
prometheus.deployment.replicas | Number of Prometheus replicas. | Integer | 1
prometheus.deployment.containers.args | Prometheus container arguments. You can configure this parameter to change retention time. For information about configuring Prometheus storage parameters, see the Prometheus documentation. Note: Longer retention times require more storage capacity than shorter retention times. It might be necessary to increase the persistent volume claim size if you are significantly increasing the retention time. | List | n/a
prometheus.deployment.containers.resources | Prometheus container resource requests and limits. | Map | {}
prometheus.deployment.podAnnotations | The Prometheus deployment's pod annotations. | Map | {}
prometheus.deployment.podLabels | The Prometheus deployment's pod labels. | Map | {}
prometheus.deployment.configMapReload.containers.args | Configmap-reload container arguments. | List | n/a
prometheus.deployment.configMapReload.containers.resources | Configmap-reload container resource requests and limits. | Map | {}
prometheus.service.type | Type of service to expose Prometheus. Supported values: ClusterIP. | String | ClusterIP
prometheus.service.port | Prometheus service port. | Integer | 80
prometheus.service.targetPort | Prometheus service target port. | Integer | 9090
prometheus.service.labels | Prometheus service labels. | Map | {}
prometheus.service.annotations | Prometheus service annotations. | Map | {}
prometheus.pvc.annotations | Storage class annotations. | Map | {}
prometheus.pvc.storageClassName | Storage class to use for persistent volume claim. By default this is null and the default provisioner is used. | String | null
prometheus.pvc.accessMode | Access mode for persistent volume claim. Supported values: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | String | ReadWriteOnce
prometheus.pvc.storage | Storage size for persistent volume claim. | String | 150Gi
prometheus.config.prometheus_yml | For information about the global Prometheus configuration, see the Prometheus documentation. | YAML file | prometheus.yaml
prometheus.config.alerting_rules_yml | For information about the Prometheus alerting rules, see the Prometheus documentation. | YAML file | alerting_rules.yaml
prometheus.config.recording_rules_yml | For information about the Prometheus recording rules, see the Prometheus documentation. | YAML file | recording_rules.yaml
prometheus.config.alerts_yml | Additional Prometheus alerting rules are configured here. | YAML file | alerts_yml.yaml
prometheus.config.rules_yml | Additional Prometheus recording rules are configured here. | YAML file | rules_yml.yaml
alertmanager.deployment.replicas | Number of Alertmanager replicas. | Integer | 1
alertmanager.deployment.containers.resources | Alertmanager container resource requests and limits. | Map | {}
alertmanager.deployment.podAnnotations | The Alertmanager deployment's pod annotations. | Map | {}
alertmanager.deployment.podLabels | The Alertmanager deployment's pod labels. | Map | {}
alertmanager.service.type | Type of service to expose Alertmanager. Supported values: ClusterIP. | String | ClusterIP
alertmanager.service.port | Alertmanager service port. | Integer | 80
alertmanager.service.targetPort | Alertmanager service target port. | Integer | 9093
alertmanager.service.labels | Alertmanager service labels. | Map | {}
alertmanager.service.annotations | Alertmanager service annotations. | Map | {}
alertmanager.pvc.annotations | Storage class annotations. | Map | {}
alertmanager.pvc.storageClassName | Storage class to use for persistent volume claim. By default this is null and the default provisioner is used. | String | null
alertmanager.pvc.accessMode | Access mode for persistent volume claim. Supported values: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | String | ReadWriteOnce
alertmanager.pvc.storage | Storage size for persistent volume claim. | String | 2Gi
alertmanager.config.alertmanager_yml | For information about the global YAML configuration for Alertmanager, see the Prometheus documentation. | YAML file | alertmanager_yml
kube_state_metrics.deployment.replicas | Number of kube-state-metrics replicas. | Integer | 1
kube_state_metrics.deployment.containers.resources | kube-state-metrics container resource requests and limits. | Map | {}
kube_state_metrics.deployment.podAnnotations | The kube-state-metrics deployment's pod annotations. | Map | {}
kube_state_metrics.deployment.podLabels | The kube-state-metrics deployment's pod labels. | Map | {}
kube_state_metrics.service.type | Type of service to expose kube-state-metrics. Supported values: ClusterIP. | String | ClusterIP
kube_state_metrics.service.port | kube-state-metrics service port. | Integer | 80
kube_state_metrics.service.targetPort | kube-state-metrics service target port. | Integer | 8080
kube_state_metrics.service.telemetryPort | kube-state-metrics service telemetry port. | Integer | 81
kube_state_metrics.service.telemetryTargetPort | kube-state-metrics service target telemetry port. | Integer | 8081
kube_state_metrics.service.labels | kube-state-metrics service labels. | Map | {}
kube_state_metrics.service.annotations | kube-state-metrics service annotations. | Map | {}
node_exporter.daemonset.replicas | Number of node-exporter replicas. | Integer | 1
node_exporter.daemonset.containers.resources | node-exporter container resource requests and limits. | Map | {}
node_exporter.daemonset.hostNetwork | Host networking requested for this pod. | Boolean | false
node_exporter.daemonset.podAnnotations | The node-exporter daemonset's pod annotations. | Map | {}
node_exporter.daemonset.podLabels | The node-exporter daemonset's pod labels. | Map | {}
node_exporter.service.type | Type of service to expose node-exporter. Supported values: ClusterIP. | String | ClusterIP
node_exporter.service.port | node-exporter service port. | Integer | 9100
node_exporter.service.targetPort | node-exporter service target port. | Integer | 9100
node_exporter.service.labels | node-exporter service labels. | Map | {}
node_exporter.service.annotations | node-exporter service annotations. | Map | {}
pushgateway.deployment.replicas | Number of pushgateway replicas. | Integer | 1
pushgateway.deployment.containers.resources | pushgateway container resource requests and limits. | Map | {}
pushgateway.deployment.podAnnotations | The pushgateway deployment's pod annotations. | Map | {}
pushgateway.deployment.podLabels | The pushgateway deployment's pod labels. | Map | {}
pushgateway.service.type | Type of service to expose pushgateway. Supported values: ClusterIP. | String | ClusterIP
pushgateway.service.port | pushgateway service port. | Integer | 9091
pushgateway.service.targetPort | pushgateway service target port. | Integer | 9091
pushgateway.service.labels | pushgateway service labels. | Map | {}
pushgateway.service.annotations | pushgateway service annotations. | Map | {}
cadvisor.daemonset.replicas | Number of cadvisor replicas. | Integer | 1
cadvisor.daemonset.containers.resources | cadvisor container resource requests and limits. | Map | {}
cadvisor.daemonset.podAnnotations | The cadvisor daemonset's pod annotations. | Map | {}
cadvisor.daemonset.podLabels | The cadvisor daemonset's pod labels. | Map | {}
ingress.enabled | Activate or deactivate ingress for Prometheus and Alertmanager. | Boolean | false
ingress.virtual_host_fqdn | Hostname for accessing Prometheus and Alertmanager. | String | prometheus.system.tanzu
ingress.prometheus_prefix | Path prefix for Prometheus. | String | /
ingress.alertmanager_prefix | Path prefix for Alertmanager. | String | /alertmanager/
ingress.prometheusServicePort | Prometheus service port to proxy traffic to. | Integer | 80
ingress.alertmanagerServicePort | Alertmanager service port to proxy traffic to. | Integer | 80
ingress.tlsCertificate.tls.crt | Optional certificate for ingress if you want to use your own TLS certificate. A self-signed certificate is generated by default. Note: tls.crt is a key and not nested. | String | Generated cert
ingress.tlsCertificate.tls.key | Optional certificate private key for ingress if you want to use your own TLS certificate. Note: tls.key is a key and not nested. | String | Generated cert key
ingress.tlsCertificate.ca.crt | Optional CA certificate. Note: ca.crt is a key and not nested. | String | CA certificate


Update a Running Prometheus Deployment

To make changes to the configuration of the Prometheus package after deployment, update your deployed Prometheus package:

  1. Update the Prometheus configuration in the prometheus-data-values.yaml file.

  2. Update the installed package:

    tanzu package installed update prometheus \
    --version 2.43.0+vmware.1-tkg.4 \
    --values-file prometheus-data-values.yaml \
    --namespace my-packages
    

    Expected output:

    | Updating package 'prometheus'
    - Getting package install for 'prometheus'
    | Updating secret 'prometheus-my-packages-values'
    | Updating package install for 'prometheus'
    
     Updated package install 'prometheus' in namespace 'my-packages'
    

The Prometheus package is reconciled using the new value or values that you added. It can take up to five minutes for kapp-controller to apply the changes.
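
Because reconciliation can take several minutes, a script that updates the package may need to wait before it continues. The following is a minimal polling sketch. The wait_for_reconcile function is a hypothetical helper, and the kubectl command in the usage comment assumes that the app status is exposed in the status.friendlyDescription field, which you should confirm on your cluster:

```shell
# Hypothetical helper: run a status command repeatedly until it prints a
# line starting with "Reconcile succeeded" or the timeout expires.
# Usage: wait_for_reconcile TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for_reconcile() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  status=""
  while [ "$elapsed" -le "$timeout_s" ]; do
    status=$("$@" 2>/dev/null)
    case "$status" in
      "Reconcile succeeded"*)
        echo "reconciled after ${elapsed}s"
        return 0
        ;;
    esac
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  echo "timed out after ${timeout_s}s; last status: ${status:-unknown}" >&2
  return 1
}

# Example usage (assumed field name; five minutes, checked every 15 seconds):
#   wait_for_reconcile 300 15 kubectl get app prometheus -n my-packages \
#     -o jsonpath='{.status.friendlyDescription}'
```

The status command is passed as arguments so that the same loop can be reused with any command that prints the reconcile state.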

For information about updating, see Update a Package.

Delete a Prometheus Deployment

To remove the Prometheus package on your cluster, run:

tanzu package installed delete prometheus --namespace my-packages

For information about deleting, see Delete a Package.

Configure Notifications in Alert Manager

To configure notifications for Alertmanager, edit the alertmanager.config.alertmanager_yml section in your prometheus-data-values.yaml file.

For information about configuring notifications, such as Slack or email, see Configuration in the Prometheus documentation.

Access the Prometheus Dashboard

By default, ingress is not enabled on Prometheus. This is because access to the Prometheus dashboard is not authenticated. To access the Prometheus dashboard:

  1. Deploy Contour on the cluster.

    For information about deploying Contour, see Install Contour for Ingress Control.

  2. Copy the ingress section below into prometheus-data-values.yaml.

    ingress:
      enabled: false
      virtual_host_fqdn: "prometheus.system.tanzu"
      prometheus_prefix: "/"
      alertmanager_prefix: "/alertmanager/"
      prometheusServicePort: 80
      alertmanagerServicePort: 80
      #! [Optional] The certificate for the ingress if you want to use your own TLS certificate.
      #! We will issue the certificate by cert-manager when it's empty.
      tlsCertificate:
        #! [Required] the certificate
        tls.crt:
        #! [Required] the private key
        tls.key:
        #! [Optional] the CA certificate
        ca.crt:
    
  3. Update ingress.enabled from false to true.

  4. Create a DNS record to map prometheus.system.tanzu to the address of the Envoy load balancer.

    To obtain the address of the Envoy load balancer, see Install Contour for Ingress Control.

  5. Access the Prometheus dashboard by navigating to https://prometheus.system.tanzu in a browser.
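
Before the DNS record from step 4 exists, a local hosts-file entry can stand in for it during a quick test. The following is a rough sketch of how such an entry could be generated. Both functions are hypothetical helpers, the address 10.0.0.5 is a made-up example rather than a real Envoy address, and the sketch handles IPv4 addresses only:

```shell
# Hypothetical helper: succeed if the argument is a dotted-quad IPv4 address
# with four decimal fields in the range 0..255.
is_ipv4() {
  case "$1" in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Hypothetical helper: print a hosts-file line that maps the Envoy load
# balancer address ($1) to the ingress.virtual_host_fqdn value ($2).
hosts_entry() {
  is_ipv4 "$1" || { echo "not an IPv4 address: $1" >&2; return 1; }
  printf '%s %s\n' "$1" "$2"
}

# Example with a made-up address:
hosts_entry 10.0.0.5 prometheus.system.tanzu
```

The printed line is only a suggestion for local testing; a proper DNS record as described in step 4 is still needed for other users to reach the dashboard.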


What to Do Next

The Prometheus package is now running and scraping data from your cluster. To visualize the data in Grafana dashboards, see Deploy Grafana on Workload Clusters.
