This topic describes how to configure Telegraf in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI).

Overview

You can configure Telegraf to collect metrics from TKGI API, control plane node, and worker node VMs and send the metrics to a monitoring service, such as Wavefront or Datadog.

For more information about collected metrics, see Metrics: Telegraf in Monitoring TKGI and TKGI-Provisioned Clusters.

Collect Metrics Using Telegraf

To collect metrics using Telegraf:

  1. Create a configuration file for your output plugin. See Create a Configuration File below.
  2. Configure Telegraf in the Tanzu Kubernetes Grid Integrated Edition tile. See Configure Telegraf in the Tile below.

Create a Configuration File

To connect a monitoring service to TKGI, you must create a configuration file for the service. The configuration file is written in TOML format and consists of key-value pairs. After you create your configuration file, you can paste its contents into the Tanzu Kubernetes Grid Integrated Edition tile to connect the service.

To create a configuration file for your monitoring service:

  1. Locate the required configuration format for your monitoring service in the README.md file for the corresponding plugin in the Telegraf GitHub repository. For example, if you want to collect metrics from etcd, the etcd documentation recommends using the open-source Prometheus monitoring service.

  2. Create your configuration file using the required format of your monitoring service. For example, if you want to create a configuration file for an HTTP output plugin, create a file similar to the following:

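    # Send metrics as JSON in HTTP POST requests to the endpoint.
    [[outputs.http]]
      url = "https://example.com"
      method = "POST"
      data_format = "json"

    # Add a director tag to the metrics.
    [[processors.override]]
      [processors.override.tags]
        director = "bosh-director-1"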
    

    Note: You can add tags to your configuration file to label etcd metrics. For example, the above code snippet adds a director tag with the value bosh-director-1 to the etcd metrics. If you have multiple BOSH Directors, VMware recommends adding tags so that you can filter your metrics in your monitoring service.
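
The same pattern applies to other output plugins. As a minimal sketch, a configuration file for Datadog, one of the monitoring services mentioned in the overview, might look like the following. The apikey value is a placeholder and the timeout value is illustrative. Neither comes from the TKGI documentation, so confirm the supported settings in the Datadog output plugin README.md in the Telegraf GitHub repository before using it.

```toml
# Hypothetical example: send metrics to Datadog instead of a generic HTTP endpoint.
[[outputs.datadog]]
  # Placeholder value; replace with your own Datadog API key.
  apikey = "REPLACE_WITH_YOUR_DATADOG_API_KEY"
  # Optional write timeout; the value shown is illustrative.
  timeout = "5s"

# Same tagging approach as the HTTP example: label metrics by BOSH Director.
[[processors.override]]
  [processors.override.tags]
    director = "bosh-director-1"
```

Keeping the processors.override block identical across output plugins means the director tag is applied the same way regardless of which monitoring service receives the metrics.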

Configure Telegraf in the Tile

To configure TKGI to use Telegraf for metric collection:

  1. Navigate to the Tanzu Kubernetes Grid Integrated Edition tile > Settings > Host Monitoring.

  2. Under Enable Telegraf Outputs?, select Yes.

  3. Configure Telegraf output settings as described in the table below.

    Configuration Setting | Setting Description and Usage
    Prometheus input plugin Metric version | Controls the metrics mapping from Prometheus to Telegraf when scraping metrics using the Prometheus input plugin. The Prometheus input plugin scrapes the following metrics: node_exporter, kube_apiserver, kube_controller_manager, kube_scheduler, and etcd metrics. Your Prometheus client must be configured with the matching metric_version setting. For more information, see Prometheus Input Plugin in the Telegraf GitHub repository.
    Enable node exporter on TKGI API | Enable to send Node Exporter metrics from the TKGI API VM.
    Enable node exporter on control plane | Enable to send Node Exporter metrics from Kubernetes control plane nodes.
    Include etcd metrics | Enable to send etcd server and debugging metrics.
    Enable node exporter on worker | Enable to send Node Exporter metrics from Kubernetes worker nodes.
    Include Kubernetes Controller Manager metrics | Enable to send Kubernetes controller manager metrics. These metrics provide information about the state of each cluster.
    Include Kubernetes API Server metrics | Enable to send Kubernetes API Server metrics.
    Include Kubernetes Scheduler metrics | Enable to send Kubernetes Scheduler metrics. For more information, see Configure Include Kubernetes Scheduler Metrics.
    Include kubelet metrics | Enable to send kubelet metrics for all workloads running in all your Kubernetes clusters. If you enable Include kubelet metrics, be prepared for a high volume of metrics.
    Include Telegraf metrics when Telegraf enabled | Enable to send Telegraf process memory status, agent metrics, and write metrics. For more information, see Telegraf Internal Input Plugin in the Telegraf GitHub repository.

    Note: The Telegraf output configuration options, and the components you enable in this step, are visible to TKGI admins only.

  4. In Setup Telegraf Outputs, replace the default value [[outputs.discard]] with the contents of the configuration file that you created in Create a Configuration File above. See the following example for an HTTP output plugin:

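    # Send metrics as JSON in HTTP POST requests to the endpoint.
    [[outputs.http]]
      url = "https://example.com"
      method = "POST"
      data_format = "json"

    # Add a director tag to the metrics.
    [[processors.override]]
      [processors.override.tags]
        director = "bosh-director-1"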
    
  5. In Setup Telegraf Agent, replace the default Telegraf agent property values with your custom values for the interval, buffering, and debugging properties. For more information about the configurable Telegraf agent properties, see Agent configuration in the Telegraf documentation.

  6. Click Save.

  7. To deploy the Tanzu Kubernetes Grid Integrated Edition tile, return to the Ops Manager Installation Dashboard and click Review Pending Changes > Apply Changes.
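
As an illustration of the Setup Telegraf Agent step above, the following sketch shows the kind of interval, buffering, and debugging properties that the Telegraf agent accepts. The property names are standard Telegraf agent options, but the values are illustrative assumptions rather than TKGI defaults. Whether the tile field expects the [agent] table header is also an assumption, so compare against the default value shown in the tile before replacing it.

```toml
# Illustrative values only; not the TKGI defaults.
[agent]
  # How often inputs collect metrics.
  interval = "10s"
  # How often buffered metrics are flushed to outputs.
  flush_interval = "10s"
  # Maximum number of metrics sent to an output in one write.
  metric_batch_size = 1000
  # Maximum number of unwritten metrics buffered per output.
  metric_buffer_limit = 10000
  # Set to true to log at debug level while troubleshooting.
  debug = false
```

A longer interval lowers the volume of metrics sent to your monitoring service, which may help if you enable high-volume sources such as kubelet metrics.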


Troubleshoot etcd

VMware recommends working with Support to troubleshoot control plane/etcd node VMs. The monitoring and metrics data you retrieve from the control plane/etcd node VMs can help the Support team diagnose and troubleshoot errors.
