This topic describes how to configure Telegraf in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI).
You can configure Telegraf to collect metrics from TKGI API, control plane node, and worker node VMs and send the metrics to a monitoring service, such as Wavefront or Datadog.
For more information about these metrics, see Metrics: Telegraf in Monitoring TKGI and TKGI-Provisioned Clusters.
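To collect metrics using Telegraf, create a configuration file for your monitoring service, and then configure Telegraf in the Tanzu Kubernetes Grid Integrated Edition tile.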
To connect a monitoring service to TKGI, you must create a configuration file for the service. The configuration file is written in TOML and consists of key-value pairs. After you create the configuration file, you enter its contents into the Tanzu Kubernetes Grid Integrated Edition tile to connect the service.
To create a configuration file for your monitoring service:
Locate the required format for your monitoring service in the README.md file for your service in the telegraf repository on GitHub. For example, if you want to collect metrics from etcd, the etcd documentation recommends using the open-source Prometheus monitoring service.
Create your configuration file using the required format of your monitoring service. For example, if you want to create a configuration file for an HTTP output plugin, create a file similar to the following:
[[outputs.http]]
url="https://example.com"
method="POST"
data_format="json"
[[processors.override]]
[processors.override.tags]
director = "bosh-director-1"
Note: You can add tags to your configuration file to label etcd metrics. For example, the above code snippet adds a director tag with the value bosh-director-1 to the etcd metrics. If you have multiple BOSH Directors, VMware recommends adding tags so that you can filter your metrics by BOSH Director in your monitoring service.
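To illustrate the tagging recommendation for multiple BOSH Directors, the following is a minimal sketch of the tag section you might use in the configuration for a second director. The value bosh-director-2 and the extra environment tag are hypothetical names chosen for illustration; they do not come from the TKGI documentation.

```toml
# Sketch only: tag section for a second BOSH Director.
# It differs from the first director's configuration only in the tag values.
[[processors.override]]
  [processors.override.tags]
    # Hypothetical value identifying the second director.
    director = "bosh-director-2"
    # Hypothetical additional tag; any key-value pair can be added here.
    environment = "staging"
```

With a distinct director value in each configuration, you can filter or group metrics by that tag in your monitoring service.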
To configure TKGI to use Telegraf for metric collection:
Navigate to the Tanzu Kubernetes Grid Integrated Edition tile > Settings > Host Monitoring.
Under Enable Telegraf Outputs?, select Yes.
Configure the Telegraf checkboxes as described in the table below.
Components you enable in this step will be visible to TKGI admins only.
Enable this checkbox… | …to send these metrics to your monitoring service |
---|---|
Enable node exporter on TKGI API | Node Exporter metrics from the TKGI API VM |
Enable node exporter on control plane | Node Exporter metrics from Kubernetes control plane nodes |
Include etcd metrics | etcd server and debugging metrics |
Enable node exporter on worker | Node Exporter metrics from Kubernetes worker nodes |
Include Kubernetes Controller Manager metrics | Kubernetes controller manager metrics |
Include Kubernetes API Server metrics | Kubernetes API server metrics |
Include kubelet metrics | kubelet metrics for all workloads running in all your Kubernetes clusters |
In Setup Telegraf Outputs, replace the default value [[outputs.discard]] with the contents of the configuration file that you created in Create a Configuration File above. See the following example for an HTTP output plugin:
[[outputs.http]]
url="https://example.com"
method="POST"
data_format="json"
[[processors.override]]
[processors.override.tags]
director = "bosh-director-1"
Click Save.
To deploy the Tanzu Kubernetes Grid Integrated Edition tile, return to the Ops Manager Installation Dashboard and click Review Pending Changes > Apply Changes.
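As an illustration of what the Setup Telegraf Outputs field might contain for a service other than a generic HTTP endpoint, the following sketch uses the Telegraf Datadog output plugin. The API key is a placeholder. Whether the Telegraf version bundled with your TKGI release accepts this plugin exactly as shown is an assumption, so verify it against the plugin README.

```toml
# Sketch only: Datadog output in place of the HTTP output shown above.
[[outputs.datadog]]
  # Placeholder; replace with your own Datadog API key.
  apikey = "REPLACE_WITH_DATADOG_API_KEY"

# Same tagging pattern as the HTTP example.
[[processors.override]]
  [processors.override.tags]
    director = "bosh-director-1"
```

The processors.override section is unchanged from the HTTP example, so the same director tag is applied regardless of which output plugin you choose.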
VMware recommends working with Support to troubleshoot control plane/etcd node VMs. The monitoring and metrics data you retrieve from the control plane/etcd node VMs can help the Support team diagnose and troubleshoot errors.