This topic describes how to configure Telegraf in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI).
You can configure Telegraf to collect metrics from TKGI API, control plane node, and worker node VMs and send the metrics to a monitoring service, such as Wavefront or Datadog.
For more information about these metrics, see Metrics: Telegraf in Monitoring TKGI and TKGI-Provisioned Clusters.
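To collect metrics using Telegraf, create a configuration file for your monitoring service and then configure Telegraf in the Tanzu Kubernetes Grid Integrated Edition tile, as described below.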
To connect a monitoring service to TKGI, you must create a configuration file for the service. The configuration file is written in TOML and consists of key-value pairs. After you create your configuration file, you enter its contents into the Tanzu Kubernetes Grid Integrated Edition tile to connect the service.
To create a configuration file for your monitoring service:
Locate the required format for your monitoring service in the README.md file for your service in the telegraf repository on GitHub. For example, if you want to collect metrics from etcd, the etcd documentation recommends using the open-source Prometheus monitoring service.
Create your configuration file using the required format of your monitoring service. For example, if you want to create a configuration file for an HTTP output plugin, create a file similar to the following:
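# Send metrics as JSON over HTTP POST to the monitoring endpoint.
[[outputs.http]]
  url = "https://example.com"
  method = "POST"
  data_format = "json"

# Tag metrics with the BOSH Director they come from.
[[processors.override]]
  [processors.override.tags]
    director = "bosh-director-1"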
Note: You can add tags to your configuration file to label etcd metrics. For example, the above code snippet adds a director tag with the value bosh-director-1 to the etcd metrics. If you have multiple BOSH Directors, VMware recommends adding tags so that you can filter your metrics in your monitoring service.
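As a further illustration, a configuration file for a different monitoring service might look like the following minimal sketch. It assumes Telegraf's Datadog output plugin and reuses the director tag from the HTTP example. The API key value is a placeholder rather than a real credential, and your service may require additional settings that are listed in its plugin README.md.

```toml
# Sketch only: forward metrics to Datadog instead of a generic HTTP endpoint.
[[outputs.datadog]]
  # Placeholder value (assumption); replace with your own Datadog API key.
  apikey = "REPLACE_WITH_YOUR_API_KEY"

# Tag metrics with the BOSH Director, as in the HTTP example.
[[processors.override]]
  [processors.override.tags]
    director = "bosh-director-1"
```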
To configure TKGI to use Telegraf for metric collection:
Navigate to the Tanzu Kubernetes Grid Integrated Edition tile > Settings > Host Monitoring.
Under Enable Telegraf Outputs?, select Yes.
Configure Telegraf output settings as described in the table below.
Configuration Setting | Description |
---|---|
Prometheus input plugin Metric version | Controls the metrics mapping from Prometheus to Telegraf when scraping metrics using the Prometheus input plugin. The Prometheus input plugin scrapes the following metrics: node_exporter, kube_apiserver, kube_controller_manager, kube_scheduler, and etcd metrics. Requires TKGI v1.13.7 or later. Your Prometheus client must be configured with the matching metric_version setting. For more information, see Prometheus Input Plugin in the Telegraf GitHub repository. |
Enable node exporter on TKGI API | Enable to send Node Exporter metrics from the TKGI API VM. |
Enable node exporter on control plane | Enable to send Node Exporter metrics from Kubernetes control plane nodes. |
Include etcd metrics | Enable to send etcd server and debugging metrics. |
Enable node exporter on worker | Enable to send Node Exporter metrics from Kubernetes worker nodes. |
Include Kubernetes Controller Manager metrics | Enable to send Kubernetes controller manager metrics. |
Include Kubernetes API Server metrics | Enable to send Kubernetes API Server metrics. |
Include kubelet metrics | Enable to send kubelet metrics for all workloads running in all your Kubernetes clusters. |
Note: The Telegraf output configuration options are visible to TKGI admins only.
In Setup Telegraf Outputs, replace the default value [[outputs.discard]] with the contents of the configuration file that you created in Create a Configuration File above. See the following example for an HTTP output plugin:
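# Send metrics as JSON over HTTP POST to the monitoring endpoint.
[[outputs.http]]
  url = "https://example.com"
  method = "POST"
  data_format = "json"

# Tag metrics with the BOSH Director they come from.
[[processors.override]]
  [processors.override.tags]
    director = "bosh-director-1"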
Note: In TKGI v1.13.6 and earlier, if you use the Prometheus Output plugin, your Prometheus Client must be configured with metric_version=2. For Telegraf Prometheus Output plugin configuration information, see Configuration in the Telegraf GitHub repository.
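For the Prometheus output case described in the note above, the relevant setting could be sketched as follows. This assumes Telegraf's prometheus_client output plugin. The listen address is an illustrative value, so adjust it to your environment.

```toml
# Sketch only: expose metrics for Prometheus to scrape.
[[outputs.prometheus_client]]
  # Address and port to listen on (example value, an assumption).
  listen = ":9273"
  # Metric format version required in TKGI v1.13.6 and earlier.
  metric_version = 2
```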
In Setup Telegraf Agent, replace the default Telegraf agent property values with your custom values for the interval, buffering, and debugging-related properties. For more information about the configurable Telegraf agent properties, see Agent configuration in the Telegraf documentation.
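As a rough sketch of the kinds of properties this step refers to, the following uses standard Telegraf agent options with illustrative values. The values are assumptions, not TKGI defaults. Whether the Setup Telegraf Agent field expects the [agent] table header or only the key-value pairs is not stated here, so check the default contents of the field before replacing them.

```toml
# Sketch only: illustrative values, not TKGI defaults.
[agent]
  # Interval: how often inputs are collected and outputs are flushed.
  interval = "10s"
  flush_interval = "10s"

  # Buffering: metrics sent per write, and metrics held if an output is unavailable.
  metric_batch_size = 1000
  metric_buffer_limit = 10000

  # Debugging: set debug to true for verbose agent logs.
  debug = false
  quiet = false
```

A larger metric_buffer_limit retains more metrics during an outage of the monitoring service, at the cost of more memory on the VM.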
Click Save.
To deploy the Tanzu Kubernetes Grid Integrated Edition tile, return to the Ops Manager Installation Dashboard and click Review Pending Changes > Apply Changes.
VMware recommends working with Support to troubleshoot control plane/etcd node VMs. The monitoring and metrics data you retrieve from the control plane/etcd node VMs can help the Support team diagnose and troubleshoot errors.