This topic describes how to deploy and manage the TKG Extension v1.3.1 for Grafana. Grafana lets you query, visualize, alert on, and explore metrics no matter where they are stored. Grafana provides tools to form graphs and visualizations from application data. Deploy the TKG Extension for Grafana to generate and view metrics for Tanzu Kubernetes clusters.

Grafana Extension Prerequisites

Adhere to the following prerequisites to deploy the extension.

Grafana Extension Additional Requirements

The TKG Extension v1.3.1 for Grafana Monitoring has additional requirements pre- and post-installation.

  • The Grafana monitoring extension requires a default persistent storage class. You can either create a cluster with a default persistent storage class, or specify one in the Grafana configuration file when deploying the extension. See Review Persistent Storage Requirements for TKG Extensions.
  • Once the Grafana extension is deployed, you access the Grafana Dashboard via HTTP/S using the IP address exposed by one of the following Kubernetes service types: ClusterIP (default), NodePort, or LoadBalancer. To access the Grafana Dashboard from outside the cluster, deploy the Contour extension before you deploy Grafana. To deploy the Contour extension, see Deploy and Manage the TKG Extension for Contour Ingress.

    Grafana supports the following Kubernetes service types:

    Service Type | Description | Accessibility
    ClusterIP | Exposes the Service on a cluster-internal IP. | Service is only accessible from within the cluster.
    NodePort | Exposes the Service on each Node's IP at a static port. | Service is accessible from outside the cluster.
    LoadBalancer | Exposes the Service externally using a load balancer. | Service is accessible from outside the cluster.

    ClusterIP is the default, but it is only accessible from within the cluster. If you are using NSX-T networking for the Supervisor Cluster, create a Contour Envoy service of type LoadBalancer. If you are using vSphere vDS networking for the Supervisor Cluster, create a Contour Envoy service of type LoadBalancer or NodePort, depending on your requirements.
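
The storage requirement above can be checked before deployment. The following is a minimal sketch, assuming kubectl is pointed at the target cluster and that kubectl marks the default class with "(default)" in its table output. The function names are illustrative and not part of the extension.

```shell
# Return success if the current cluster has a default storage class.
# Relies on kubectl appending "(default)" to the name of the default class.
has_default_storage_class() {
  kubectl get storageclass --no-headers 2>/dev/null | grep -q '(default)'
}

# Report whether the storage prerequisite is met (hypothetical helper).
check_storage_prerequisite() {
  if has_default_storage_class; then
    echo "default storage class found"
  else
    echo "no default storage class: set monitoring.grafana.pvc.storage_class"
    return 1
  fi
}
```

If check_storage_prerequisite reports that no default class exists, specify a storage class in the Grafana data values file instead.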

Deploy the Grafana Extension for Visualization and Analytics

The TKG Extension for Grafana deploys a single container. For more information, see https://grafana.com/.
Container | Resource Type | Replicas | Description
Grafana | Deployment | 2 | Data visualization
The extension is configured to pull the containers from the VMware public registry at https://projects.registry.vmware.com/. If you are using a private registry, change the endpoint URL in the data values and extension configurations to match. See Configure the Grafana Extension.
  1. Verify that you have completed each of the Grafana extension prerequisites. See Grafana Extension Prerequisites.
  2. Change directory to the Grafana extension.
    cd /tkg-extensions-v1.3.1+vmware.1/extensions/monitoring/grafana
  3. Create the tanzu-system-monitoring namespace and Grafana service account and role objects.
    kubectl apply -f namespace-role.yaml
  4. Create a Grafana data values file.
    The example data values file provides the minimum required configuration.
    cp grafana-data-values.yaml.example grafana-data-values.yaml
  5. Configure the Grafana extension by updating grafana-data-values.yaml.

    Customize the configuration as needed. See Configure the Grafana Extension.

    The admin_password value should be base64 encoded, although the extension still deploys if it is not. In the example below, the password "admin" is base64 encoded as YWRtaW4=. You can encode your own Grafana password at https://www.base64encode.org/ or with a local base64 utility.

    If the cluster is not provisioned with a default storage class, you can specify it in the data values file. Also, make sure the namespace has sufficient storage for the persistent volume claims.
    monitoring:
      grafana:
        image:
          repository: "projects.registry.vmware.com/tkg/grafana"
        pvc:
          storage_class: vwt-storage-policy
          storage: "8Gi"  
        secret:
          admin_password: "YWRtaW4="
      grafana_init_container:
        image:
          repository: "projects.registry.vmware.com/tkg/grafana"
      grafana_sc_dashboard:
        image:
          repository: "projects.registry.vmware.com/tkg/grafana"
    
    If you deployed Contour with an Envoy service of type LoadBalancer or NodePort, specify that in the configuration file as shown. See Configure the Grafana Extension for more information.
    monitoring:
      grafana:
        service:
          type: LoadBalancer  # or NodePort
    

    By default, the Grafana extension creates the Fully Qualified Domain Name (FQDN) grafana.system.tanzu for accessing the Grafana Dashboard. You can customize this FQDN by specifying the desired hostname in the configuration file at monitoring.grafana.ingress.virtual_host_fqdn. See Configure the Grafana Extension for more information.

  6. Create the Grafana secret from the grafana-data-values.yaml file.
    kubectl create secret generic grafana-data-values --from-file=values.yaml=grafana-data-values.yaml -n tanzu-system-monitoring

    The grafana-data-values secret is created in the tanzu-system-monitoring namespace. Verify using kubectl get secrets -n tanzu-system-monitoring.

  7. Deploy the Grafana extension.
    kubectl apply -f grafana-extension.yaml

    On success the Grafana app is created: app.kappctrl.k14s.io/grafana created.

  8. Check the status of the Grafana app.
    kubectl get app grafana -n tanzu-system-monitoring
    The status should change from Reconciling to Reconcile succeeded. If the status is Reconcile failed, see Troubleshoot Grafana Deployment.
  9. View detailed status on the Grafana app.
    kubectl get app grafana -n tanzu-system-monitoring -o yaml
  10. Verify the Grafana Deployment.
    kubectl get deployments -n tanzu-system-monitoring
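
The admin_password value used in step 5 can also be encoded on the local machine rather than in a web-based encoder. The following is a minimal sketch, assuming a base64 utility is installed. The function names encode_password and decode_password are illustrative and not part of the extension.

```shell
# Base64-encode a Grafana admin password locally.
# printf avoids the trailing newline that echo would add to the encoded value.
encode_password() {
  printf '%s' "$1" | base64
}

# Decode a value to double-check what was stored in the data values file.
decode_password() {
  printf '%s' "$1" | base64 --decode
}
```

For example, encode_password admin prints YWRtaW4=, which matches the value shown in the sample data values file.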

Access the Grafana Dashboard Using a Contour Envoy Service of Type LoadBalancer

If the prerequisite Contour Envoy service of type LoadBalancer is deployed, and you specified this in the Grafana configuration file, obtain the external IP address of the load balancer and create DNS records for the Grafana FQDN.
  1. Get the External-IP address for the Envoy service of type LoadBalancer.
    kubectl get service envoy -n tanzu-system-ingress
    You should see the External-IP address returned, for example:
    NAME    TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
    envoy   LoadBalancer   10.99.25.220   10.195.141.17   80:30437/TCP,443:30589/TCP   3h27m
    Alternatively, you can get only the External-IP address using the following command.
    kubectl get svc envoy -n tanzu-system-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
  2. To verify the installation of the Grafana extension, update your local /etc/hosts file with the Grafana FQDN mapped to the External-IP address of the load balancer, for example:
    127.0.0.1 localhost
    127.0.1.1 ubuntu
    # TKG Grafana Extension with Envoy Load Balancer
    10.195.141.17 grafana.system.tanzu
    
  3. Access the Grafana Dashboard by navigating to https://grafana.system.tanzu.

    Because the site uses self-signed certificates, you might need to navigate through a browser-specific security warning before you are able to access the dashboard.

  4. For production access, create a DNS record on a DNS server that maps the Grafana FQDN to the External-IP address of the Envoy service load balancer. Because the target is an IP address, this is typically an A record rather than a CNAME record.
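
The hosts file entry from step 2 can be generated from the Envoy service rather than typed by hand. The following is a minimal sketch, assuming the load balancer reports an IP address (not a hostname) in its status. The function name grafana_hosts_entry is illustrative.

```shell
# Print an /etc/hosts line mapping the Grafana FQDN to the Envoy External-IP.
# Assumes the load balancer status contains an "ip" field.
grafana_hosts_entry() {
  local fqdn="${1:-grafana.system.tanzu}"
  local ip
  ip="$(kubectl get service envoy -n tanzu-system-ingress \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
  if [ -z "$ip" ]; then
    echo "no External-IP assigned yet" >&2
    return 1
  fi
  printf '%s %s\n' "$ip" "$fqdn"
}
```

The printed line can be reviewed and then appended to /etc/hosts with administrator privileges. Pass a different hostname as the first argument if you customized monitoring.grafana.ingress.virtual_host_fqdn.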

Access the Grafana Dashboard Using a Contour Envoy Service of Type NodePort

If the prerequisite Contour Envoy service of type NodePort is deployed, and you specified this in the Grafana configuration file, obtain the virtual machine IP address of a worker node and create DNS records for the Grafana FQDN.
  1. Switch context to the vSphere Namespace where the cluster is provisioned.
    kubectl config use-context VSPHERE-NAMESPACE
  2. List the nodes in the cluster.
    kubectl get virtualmachines
    You should see the cluster nodes, for example:
    NAME                                            POWERSTATE   AGE
    tkgs-cluster-X-control-plane-6dgln              poweredOn    6h7m
    tkgs-cluster-X-control-plane-j6hq6              poweredOn    6h10m
    tkgs-cluster-X-control-plane-xc25f              poweredOn    6h14m
    tkgs-cluster-X-workers-9twdr-59bc54dc97-kt4cm   poweredOn    6h12m
    tkgs-cluster-X-workers-9twdr-59bc54dc97-pjptr   poweredOn    6h12m
    tkgs-cluster-X-workers-9twdr-59bc54dc97-t45mn   poweredOn    6h12m
  3. Pick one of the worker nodes and describe it using the following command.
    kubectl describe virtualmachines tkgs-cluster-X-workers-9twdr-59bc54dc97-kt4cm
  4. Locate the IP address of the virtual machine, for example Vm Ip: 10.115.22.43.
  5. To verify the installation of the Grafana extension, update your local /etc/hosts file with the Grafana FQDN mapped to a worker node IP address, for example:
    127.0.0.1 localhost
    127.0.1.1 ubuntu
    # TKGS Grafana with Envoy NodePort
    10.115.22.43 grafana.system.tanzu
    
  6. Access the Grafana Dashboard by navigating to https://grafana.system.tanzu.

    Because the site uses self-signed certificates, you might need to navigate through a browser-specific security warning before you are able to access the dashboard.
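
Steps 3 and 4 can be combined by filtering the describe output for the Vm Ip line. The following is a minimal sketch, assuming the current context is the vSphere Namespace where the cluster is provisioned and that the describe output contains a line of the form "Vm Ip: ADDRESS". The function name worker_vm_ip is illustrative.

```shell
# Print the IP address of a worker node VM by parsing the "Vm Ip:" line
# of the describe output. The address is the last field on that line.
worker_vm_ip() {
  kubectl describe virtualmachines "$1" | awk '/Vm Ip:/ { print $NF; exit }'
}
```

For example, worker_vm_ip tkgs-cluster-X-workers-9twdr-59bc54dc97-kt4cm prints the address to map to the Grafana FQDN in the hosts file.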

Troubleshoot Grafana Deployment

If the deployment or reconciliation fails, run kubectl get pods -A to view pod status. The Grafana pods, and the contour and envoy pods if you deployed Contour, should be Running. If a pod status is ImagePullBackOff or ErrImagePull, the container image could not be pulled. Check the registry URL in the data values and the extension YAML files and make sure they are accurate.

Check the container logs, where grafana-XXXX is the unique pod name returned by kubectl get pods -A:
kubectl logs pod/grafana-XXXX -c grafana -n tanzu-system-monitoring
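
To narrow the pod listing to problem pods, the output can be filtered on the STATUS column. The following is a minimal sketch, assuming the default table output of kubectl get pods for a single namespace (NAME, READY, STATUS, RESTARTS, AGE). The function name unhealthy_grafana_pods is illustrative.

```shell
# List pods in the monitoring namespace whose STATUS is not Running or Completed.
# With -n (single namespace), STATUS is the third column of the table output.
unhealthy_grafana_pods() {
  kubectl get pods -n tanzu-system-monitoring --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Each line of output is a pod name followed by its status, which you can pass to kubectl logs or kubectl describe pod for details.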

Update the Grafana Extension

Update the Grafana extension that is deployed to a Tanzu Kubernetes cluster.

  1. Get the current Grafana data values from the grafana-data-values secret.
    kubectl get secret grafana-data-values -n tanzu-system-monitoring -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > grafana-data-values.yaml
    
  2. Update the Grafana data values in grafana-data-values.yaml. See Configure the Grafana Extension.
  3. Update the Grafana data values secret.
    kubectl create secret generic grafana-data-values --from-file=values.yaml=grafana-data-values.yaml -n tanzu-system-monitoring -o yaml --dry-run=client | kubectl replace -f -
    
    The Grafana extension is reconciled with the updated data values.
    Note: By default, kapp-controller syncs apps every 5 minutes, so the update should take effect within 5 minutes. To apply the update sooner, lower the syncPeriod value in grafana-extension.yaml and apply the Grafana extension using kubectl apply -f grafana-extension.yaml.
  4. Check the status of the extension.
    kubectl get app grafana -n tanzu-system-monitoring

    The status should change to Reconcile succeeded once Grafana is updated.

  5. View detailed status and troubleshoot if necessary.
    kubectl get app grafana -n tanzu-system-monitoring -o yaml
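
Instead of rerunning the status command by hand, the check in step 4 can be polled. The following is a minimal sketch that looks for the status strings named in this topic in the kubectl get app output. The function name and the default attempt count and interval are illustrative choices, not values defined by the extension.

```shell
# Poll the Grafana app until it reports "Reconcile succeeded".
# Returns 1 on "Reconcile failed" or when the attempts run out.
# Arguments: number of attempts (default 30), seconds between attempts (default 10).
wait_for_grafana_reconcile() {
  local attempts="${1:-30}" interval="${2:-10}" out
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    out="$(kubectl get app grafana -n tanzu-system-monitoring 2>&1)"
    case "$out" in
      *"Reconcile succeeded"*) echo "Reconcile succeeded"; return 0 ;;
      *"Reconcile failed"*)    echo "Reconcile failed" >&2; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for reconcile" >&2
  return 1
}
```

With the defaults, the function waits up to about 5 minutes, which matches the default kapp-controller sync period described in the note above.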

Delete the Grafana Extension

Delete the Grafana extension from a Tanzu Kubernetes cluster.
Note: Complete the steps in order. Do not delete the namespace, service account, and role objects before the Grafana app is fully deleted. Doing so can lead to system errors.
Note: The Prometheus and Grafana extensions are deployed to the same namespace: tanzu-system-monitoring. If you have deployed both extensions to the same cluster, delete each extension before you delete the namespace.
  1. Change directory to the Grafana extension.
    cd /tkg-extensions-v1.3.1+vmware.1/extensions/monitoring/grafana
  2. Delete the Grafana app.
    kubectl delete app grafana -n tanzu-system-monitoring

    Expected result: app.kappctrl.k14s.io "grafana" deleted.

  3. Verify that the Grafana app is deleted.
    kubectl get app grafana -n tanzu-system-monitoring

    Expected result: apps.kappctrl.k14s.io "grafana" not found.

  4. Delete the tanzu-system-monitoring namespace and the Grafana service account and role objects.
    Warning: Do not perform this step if Prometheus is deployed.
    kubectl delete -f namespace-role.yaml
  5. If you want to redeploy Grafana, remove the secret grafana-data-values.
    kubectl delete secret grafana-data-values -n tanzu-system-monitoring

    Expected result: secret "grafana-data-values" deleted.
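
The ordering notes for this procedure can be enforced with a small guard around the namespace deletion in step 4. The following is a minimal sketch, run from the Grafana extension directory. The function names are illustrative, and the app name prometheus is an assumption about how the Prometheus extension is deployed in your cluster.

```shell
# Return success only when the named app no longer exists in the namespace.
app_is_gone() {
  ! kubectl get app "$1" -n tanzu-system-monitoring >/dev/null 2>&1
}

# Delete the namespace and role objects only if neither app remains.
# The app name "prometheus" is an assumption; adjust it to your deployment.
delete_monitoring_namespace() {
  if app_is_gone grafana && app_is_gone prometheus; then
    kubectl delete -f namespace-role.yaml
  else
    echo "an app still exists in tanzu-system-monitoring; skipping" >&2
    return 1
  fi
}
```

The guard skips the deletion while either app is still present, which avoids removing the service account and role objects before the Grafana app is fully deleted.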

Upgrade the Grafana Extension

If you have an existing Grafana extension deployed, you can upgrade it to use the latest version.
  1. Export the Grafana configmap and save it as a backup.
    kubectl get configmap grafana -n tanzu-system-monitoring -o 'go-template={{ index .data "grafana.yaml" }}' > grafana-configmap.yaml
  2. Delete the existing Grafana extension. See Delete the Grafana Extension.
  3. Deploy the Grafana extension. See Deploy the Grafana Extension for Visualization and Analytics.

Configure the Grafana Extension

The Grafana configuration is set in /tkg-extensions-v1.3.1+vmware.1/extensions/monitoring/grafana/grafana-data-values.yaml.
Table 1. Grafana Configuration Parameters
Parameter | Description | Type | Default
monitoring.namespace | Namespace where Grafana will be deployed | string | tanzu-system-monitoring
monitoring.create_namespace | Flag that indicates whether to create the namespace specified by monitoring.namespace | boolean | false
monitoring.grafana.cluster_role.apiGroups | API group defined for the Grafana clusterrole | list | [""]
monitoring.grafana.cluster_role.resources | Resources defined for the Grafana clusterrole | list | ["configmaps", "secrets"]
monitoring.grafana.cluster_role.verbs | Access permissions defined for the clusterrole | list | ["get", "watch", "list"]
monitoring.grafana.config.grafana_ini | Grafana configuration file details | config file | grafana.ini
monitoring.grafana.config.datasource.type | Grafana datasource type | string | prometheus
monitoring.grafana.config.datasource.access | Access mode: proxy or direct (Server or Browser in the UI) | string | proxy
monitoring.grafana.config.datasource.isDefault | Mark as the default Grafana datasource | boolean | true
monitoring.grafana.config.provider_yaml | Config file that defines the Grafana dashboard provider | yaml file | provider.yaml
monitoring.grafana.service.type | Type of service to expose Grafana. Supported values: ClusterIP, NodePort, LoadBalancer | string | vSphere: NodePort, aws/azure: LoadBalancer
monitoring.grafana.pvc.storage_class | Define access mode for persistent volume claim. Supported values: ReadWriteOnce, ReadOnlyMany, ReadWriteMany | string | ReadWriteOnce
monitoring.grafana.pvc.storage | Define storage size for persistent volume claim | string | 2Gi
monitoring.grafana.deployment.replicas | Number of Grafana replicas | integer | 1
monitoring.grafana.image.repository | Location of the repository with the Grafana image. The default is the public VMware registry. Change this value if you are using a private repository (e.g., air-gapped environment). | string | projects.registry.vmware.com/tkg/grafana
monitoring.grafana.image.name | Name of the Grafana image | string | grafana
monitoring.grafana.image.tag | Grafana image tag. This value may need to be updated if you are upgrading the version. | string | v7.3.5_vmware.1
monitoring.grafana.image.pullPolicy | Grafana image pull policy | string | IfNotPresent
monitoring.grafana.secret.type | Secret type defined for the Grafana dashboard | string | Opaque
monitoring.grafana.secret.admin_user | Username to access the Grafana dashboard | string | YWRtaW4=
monitoring.grafana.secret.admin_password | Password to access the Grafana dashboard | string | null
monitoring.grafana.secret.ldap_toml | If using LDAP authentication, the LDAP configuration file path | string | ""
monitoring.grafana_init_container.image.repository | Repository containing the Grafana init container image. The default is the public VMware registry. Change this value if you are using a private repository (e.g., air-gapped environment). | string | projects.registry.vmware.com/tkg/grafana
monitoring.grafana_init_container.image.name | Name of the Grafana init container image | string | k8s-sidecar
monitoring.grafana_init_container.image.tag | Grafana init container image tag. This value may need to be updated if you are upgrading the version. | string | 0.1.99
monitoring.grafana_init_container.image.pullPolicy | Grafana init container image pull policy | string | IfNotPresent
monitoring.grafana_sc_dashboard.image.repository | Repository containing the Grafana dashboard image. The default is the public VMware registry. Change this value if you are using a private repository (e.g., air-gapped environment). | string | projects.registry.vmware.com/tkg/grafana
monitoring.grafana_sc_dashboard.image.name | Name of the Grafana dashboard image | string | k8s-sidecar
monitoring.grafana_sc_dashboard.image.tag | Grafana dashboard image tag. This value may need to be updated if you are upgrading the version. | string | 0.1.99
monitoring.grafana_sc_dashboard.image.pullPolicy | Grafana dashboard image pull policy | string | IfNotPresent
monitoring.grafana.ingress.enabled | Enable or disable ingress for Grafana | boolean | true
monitoring.grafana.ingress.virtual_host_fqdn | Hostname for accessing Grafana | string | grafana.system.tanzu
monitoring.grafana.ingress.prefix | Path prefix for Grafana | string | /
monitoring.grafana.ingress.tlsCertificate.tls.crt | Optional certificate for ingress if you want to use your own TLS certificate. A self-signed certificate is generated by default. | string | Generated cert
monitoring.grafana.ingress.tlsCertificate.tls.key | Optional certificate private key for ingress if you want to use your own TLS certificate. | string | Generated cert key

Note: In the grafana.ini file (monitoring.grafana.config.grafana_ini), the grafana_net URL is used to integrate with Grafana, for example, to import a dashboard directly from Grafana.com.

Note: The monitoring.grafana.secret.admin_user value is base64 encoded; to decode: echo "xxxxxx" | base64 --decode