This topic describes how to configure Telegraf in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI).

Overview

You can configure Telegraf to collect metrics from TKGI API, control plane node, and worker node VMs and send the metrics to a monitoring service, such as Wavefront or Datadog.

For more information about these metrics, see Metrics: Telegraf in Monitoring TKGI and TKGI-Provisioned Clusters.

Collect Metrics Using Telegraf

To collect metrics using Telegraf:

  1. Create a configuration file for your output plugin. See Create a Configuration File below.
  2. Configure Telegraf in the Tanzu Kubernetes Grid Integrated Edition tile. See Configure Telegraf in the Tile below.

Create a Configuration File

To connect a monitoring service to TKGI, you must create a configuration file for the service. The configuration file is written in TOML and consists of key-value pairs. After you create your configuration file, you can paste its contents into the Tanzu Kubernetes Grid Integrated Edition tile to connect the service.

To create a configuration file for your monitoring service:

  1. Locate the required format for your monitoring service in the README.md file for that service's plugin in the telegraf repository on GitHub. For example, if you want to collect metrics from etcd, the etcd documentation recommends using the open-source Prometheus monitoring service.

  2. Create your configuration file using the required format of your monitoring service. For example, if you want to create a configuration file for an HTTP output plugin, create a file similar to the following:

    [[outputs.http]]
      url = "https://example.com"
      method = "POST"
      data_format = "json"
    [[processors.override]]
      [processors.override.tags]
        director = "bosh-director-1"
    

    Note: You can add tags to your configuration file to label etcd metrics. For example, the snippet above adds a director tag with the value bosh-director-1 to the etcd metrics. If you have multiple BOSH Directors, VMware recommends adding tags so that you can filter your metrics by Director in your monitoring service.
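
As a further illustration of step 2, a configuration file for Datadog might look like the following sketch. It assumes the Telegraf outputs.datadog plugin and its apikey setting; the key shown is a placeholder and the bosh-director-2 tag value is hypothetical. Confirm the exact settings against the plugin's README.md before using it.

```toml
# Sketch only: send collected metrics to Datadog.
[[outputs.datadog]]
  # Placeholder value; replace with your own Datadog API key.
  apikey = "REPLACE_WITH_YOUR_API_KEY"

# Tag metrics with the BOSH Director they came from (hypothetical value).
[[processors.override]]
  [processors.override.tags]
    director = "bosh-director-2"
```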

Configure Telegraf in the Tile

To configure TKGI to use Telegraf for metric collection:

  1. Navigate to the Tanzu Kubernetes Grid Integrated Edition tile > Settings > Host Monitoring.

  2. Under Enable Telegraf Outputs?, select Yes.

  3. Configure the Telegraf checkboxes as described in the table below.

    Components you enable in this step will be visible to TKGI admins only.

    Enable this checkbox…                          …to send these metrics to your monitoring service
    ---------------------------------------------  -------------------------------------------------
    Enable node exporter on TKGI API               Node Exporter metrics from the TKGI API VM
    Enable node exporter on control plane          Node Exporter metrics from Kubernetes control plane nodes
    Include etcd metrics                           etcd server and debugging metrics
    Enable node exporter on worker                 Node Exporter metrics from Kubernetes worker nodes
    Include Kubernetes Controller Manager metrics  Kubernetes controller manager metrics
                                                   • These metrics provide information about the state of each cluster.
    Include Kubernetes API Server metrics          Kubernetes API server metrics
    Include kubelet metrics                        kubelet metrics for all workloads running in all your Kubernetes clusters
                                                   • If you enable Include kubelet metrics, be prepared for a high volume of metrics.

  4. In Setup Telegraf Outputs, replace the default value [[outputs.discard]] with the contents of the configuration file that you created in Create a Configuration File above. See the following example for an HTTP output plugin:

    [[outputs.http]]
      url = "https://example.com"
      method = "POST"
      data_format = "json"
    [[processors.override]]
      [processors.override.tags]
        director = "bosh-director-1"
    
  5. In Setup Telegraf Agent, replace the default Telegraf agent property values with your custom values for the interval, buffering, and debugging properties. For more information about the configurable Telegraf agent properties, see Agent configuration in the Telegraf documentation.
  6. Click Save.

  7. To deploy the Tanzu Kubernetes Grid Integrated Edition tile, return to the Ops Manager Installation Dashboard and click Review Pending Changes > Apply Changes.
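
For the Setup Telegraf Agent step above, custom agent values might look like the following sketch. The property names are standard Telegraf agent settings, but the values are illustrative assumptions, not VMware recommendations. Whether the field expects the [agent] table header or only the keys beneath it is also an assumption here; check the default value shown in the field before replacing it.

```toml
# Sketch only: example Telegraf agent settings with assumed values.
[agent]
  # How often inputs gather metrics.
  interval = "30s"
  # How often outputs flush buffered metrics.
  flush_interval = "30s"
  # Maximum number of metrics sent to an output in a single write.
  metric_batch_size = 1000
  # Maximum number of unwritten metrics buffered per output.
  metric_buffer_limit = 10000
  # Set to true to log at debug level while troubleshooting.
  debug = false
```

A longer interval or a smaller buffer reduces load on the node VMs and the monitoring service, at the cost of coarser or potentially dropped data, which matters most if you enable kubelet metrics.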

Troubleshoot etcd

VMware recommends working with Support to troubleshoot control plane/etcd node VMs. The monitoring and metrics data you retrieve from the control plane/etcd node VMs can help the Support team diagnose and troubleshoot errors.
