observability-telegraf

This repository contains observability-telegraf, a containerized version of the Telegraf agent.

The design goal is a preconfigured container that runs the Telegraf agent with a selected set of plugins.

Minimum requirements

Pre-configuration

Some plugins need host-side pre-configuration before the container can read their metrics.

The Intel PowerStat plugin relies on Linux kernel modules that expose specific metrics over sysfs or devfs interfaces. It expects the following dependencies:

  • intel-rapl module, which exposes Intel Runtime Power Limiting metrics over sysfs (/sys/devices/virtual/powercap/intel-rapl),
  • msr kernel module, which provides access to processor model-specific registers over devfs (/dev/cpu/%d/msr),
  • cpufreq kernel module, which exposes per-CPU frequency over sysfs (/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq),
  • intel-uncore-frequency module, which exposes Intel uncore frequency metrics over sysfs (/sys/devices/system/cpu/intel_uncore_frequency).

The minimum kernel version that satisfies most of these requirements is 3.13. Uncore frequency metrics additionally require the intel-uncore-frequency module, which is available since kernel 5.6.
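The kernel thresholds above can be checked before loading any modules. The following is a minimal sketch and not part of this repository: kernel_at_least is a hypothetical helper that compares only the major and minor numbers reported by uname -r.

```shell
#!/bin/sh
# Hypothetical helper (not part of telegraf-intel-docker.sh).
# Usage: kernel_at_least MAJOR MINOR [RELEASE]
# RELEASE defaults to the running kernel as reported by `uname -r`.
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least 3 13; then
    echo "kernel $(uname -r): meets the 3.13 minimum"
else
    echo "kernel $(uname -r): older than 3.13"
fi

if kernel_at_least 5 6; then
    echo "kernel is 5.6 or newer: intel-uncore-frequency should be available"
else
    echo "uncore frequency metrics need kernel 5.6 or newer"
fi
```

The optional third argument lets the comparison be tried against a release string from another machine, for example kernel_at_least 5 6 "4.19.0".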

Make sure the kernel modules are loaded (cpufreq is built into the kernel). Modules may have to be loaded manually with modprobe. Depending on the kernel version, run:

# kernel 5.x.x:
sudo modprobe rapl
sudo modprobe msr
sudo modprobe intel_rapl_common
sudo modprobe intel_rapl_msr

# also for kernel >= 5.6.0
sudo modprobe intel-uncore-frequency

# kernel 4.x.x:
sudo modprobe msr
sudo modprobe intel_rapl
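
Once the modules are loaded, the interfaces from the dependency list should exist on the host. As a rough sketch (check_path is a hypothetical helper, and CPU 0 is used as a stand-in for the per-CPU paths), their presence could be verified like this:

```shell
#!/bin/sh
# Hypothetical helper: report whether a sysfs/devfs path exists.
check_path() {
    if [ -e "$1" ]; then
        echo "ok       $1"
    else
        echo "missing  $1"
    fi
}

# Paths follow the dependency list above, with CPU 0 substituted for %d.
# The msr device is documented in msr(4) as /dev/cpu/CPUNUM/msr.
for path in \
    /sys/devices/virtual/powercap/intel-rapl \
    /dev/cpu/0/msr \
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq \
    /sys/devices/system/cpu/intel_uncore_frequency
do
    check_path "$path"
done
```

A "missing" line for intel_uncore_frequency is expected on kernels older than 5.6.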

The Redfish plugin requires server hardware on which DMTF's Redfish is enabled.

  • The DPDK plugin requires an external application built with the Data Plane Development Kit.
  • ./telegraf-intel-docker.sh uses /var/run/dpdk/rte as the default location of the DPDK socket. If the DPDK socket is located elsewhere, specify its path at the run stage with the --dpdk_socket_path flag. Providing a path to a directory that contains the host's own Docker socket file is not recommended.

The Intel PMU plugin requires JSON files with event definitions to work properly. Their location can be specified in ./telegraf-intel-docker.sh with the --pmu_events parameter. Providing a path to a directory that contains the host's own Docker socket file is not recommended.

More information about event definitions and where to get them can be found in the plugin's README.
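
The warning about the Docker socket applies to both --dpdk_socket_path and --pmu_events. A pre-flight check along the following lines could catch the mistake before a directory is handed to the script. This is only a sketch: safe_to_mount is a hypothetical helper, and it assumes the host's Docker socket uses the default file name docker.sock.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of telegraf-intel-docker.sh).
# Returns 0 when DIR looks safe to pass to the script, 1 otherwise.
# Assumption: the host's Docker socket uses the default name docker.sock.
safe_to_mount() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ -e "$dir/docker.sock" ]; then
        echo "refusing $dir: it contains docker.sock" >&2
        return 1
    fi
    return 0
}

# Default DPDK socket location named in this README.
dpdk_dir=/var/run/dpdk/rte
if safe_to_mount "$dpdk_dir" 2>/dev/null; then
    echo "$dpdk_dir can be passed to --dpdk_socket_path"
else
    echo "$dpdk_dir is missing or unsafe; choose another directory"
fi
```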

For the RAS plugin: if rasdaemon is installed on the host OS, make sure its version is exactly v0.6.7, the version used in the container. Then mount the host's rasdaemon library directory into the container so that both stay in sync: ./telegraf-intel-docker.sh --use-host-rasdaemon. Alternatively, remove rasdaemon from the host OS.
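
The version match can be checked on the host before choosing --use-host-rasdaemon. The sketch below rests on an assumption: that rasdaemon --version prints the version number as the last word of its output. The helper is_expected_rasdaemon is hypothetical and only compares strings.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a version string names exactly 0.6.7.
# Accepts forms such as "rasdaemon 0.6.7", "0.6.7" or "v0.6.7".
is_expected_rasdaemon() {
    version=${1##* }      # keep the last space-separated word
    version=${version#v}  # tolerate a leading "v"
    [ "$version" = "0.6.7" ]
}

if command -v rasdaemon >/dev/null 2>&1; then
    # Assumption: the version is the last word printed by --version.
    host_version=$(rasdaemon --version 2>/dev/null)
    if is_expected_rasdaemon "$host_version"; then
        echo "host rasdaemon matches 0.6.7; --use-host-rasdaemon can be used"
    else
        echo "host rasdaemon reports '$host_version'; expected 0.6.7"
    fi
else
    echo "rasdaemon not found on host; no version to reconcile"
fi
```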

Installation

From source

  1. Install Docker 20.10.6 or newer (see the Docker installation guide).
  2. Clone the Telegraf Intel Docker repository. Cloning it into /tmp or any privileged directory is not recommended.
  3. Go into the cloned repository: cd telegraf_intel_docker.
  4. From the source file directory, run ./telegraf-intel-docker.sh build-run <image-name> <container-name> to build the image and run the Docker container in the background. Provide valid image and container names in place of <image-name> and <container-name>.
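
Step 1 can be verified before building. The following sketch compares the installed Docker client version against 20.10.6. version_ge is a hypothetical helper that relies on sort -V from GNU coreutils, and reading the client rather than the server version is an assumption about which one matters here.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is >= version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v docker >/dev/null 2>&1; then
    # Print only the client version string, e.g. 24.0.7.
    client=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
    if version_ge "$client" 20.10.6; then
        echo "Docker $client is new enough"
    else
        echo "Docker ${client:-unknown} is older than 20.10.6"
    fi
else
    echo "docker not found; install Docker 20.10.6 or newer first"
fi
```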

How to use it

  • See available options with:

    ./telegraf-intel-docker.sh

  • Build and run Telegraf Intel Docker container:

    ./telegraf-intel-docker.sh build-run <image-name> <container-name>

  • Build and run with DPDK socket path:

    ./telegraf-intel-docker.sh build-run <image-name> <container-name> --dpdk_socket_path <socket-path>

  • Build and run with mounted rasdaemon folder:

    ./telegraf-intel-docker.sh build-run <image-name> <container-name> --use-host-rasdaemon

  • Build and run with path to directory with PMU events definitions:

    ./telegraf-intel-docker.sh build-run <image-name> <container-name> --pmu_events <events definition path>

  • Build Telegraf Intel Docker image:

    ./telegraf-intel-docker.sh build <image-name>

  • Run with DPDK socket path:

    ./telegraf-intel-docker.sh run <image-name> <container-name> --dpdk_socket_path <socket-path>

  • Run with mounted rasdaemon folder:

    ./telegraf-intel-docker.sh run <image-name> <container-name> --use-host-rasdaemon

  • Run with path to directory with PMU events definitions:

    ./telegraf-intel-docker.sh run <image-name> <container-name> --pmu_events <events definition path>

  • Restart the Telegraf Intel Docker container (e.g. to reload the Telegraf configuration file):

    ./telegraf-intel-docker.sh restart <image-name> <container-name>

  • Stop and remove the Telegraf Intel Docker container and the images linked to it:

    ./telegraf-intel-docker.sh remove <image-name> <container-name>

  • Remove Telegraf Intel Docker images:

    ./telegraf-intel-docker.sh remove-build <image-name>

  • Open a bash shell in the Telegraf Intel Docker container:

    ./telegraf-intel-docker.sh enter <container-name>

  • See Telegraf logs with:

    ./telegraf-intel-docker.sh logs <container-name>

Changing Telegraf configuration file

What is the Telegraf configuration file?

  • Telegraf's configuration file is written in TOML and is composed of three sections: global tags, agent settings, and plugins.
  • Plugins can be loaded, unloaded, or configured in the configuration file.
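
To make the three sections concrete, the sketch below writes a small example configuration to a temporary file and prints its section headers. The values are illustrative and are not the defaults shipped in telegraf/telegraf.conf. write_example_conf is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal example Telegraf configuration to FILE.
# The settings are illustrative, not the ones shipped in this repository.
write_example_conf() {
    cat > "$1" <<'EOF'
# Global tags: key/value pairs attached to every metric.
[global_tags]
  site = "lab"

# Agent settings: collection and flush behaviour.
[agent]
  interval = "10s"
  flush_interval = "10s"

# Plugins: each [[inputs.*]] or [[outputs.*]] table loads one plugin instance.
[[inputs.cpu]]
  percpu = true
  totalcpu = true

[[outputs.file]]
  files = ["stdout"]
EOF
}

conf=$(mktemp)
write_example_conf "$conf"
# Print only the section headers to show the structure of the file.
grep '^\[' "$conf"
rm -f "$conf"
```

Commenting out a plugin's table header and its settings unloads that plugin the next time the container is restarted.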

To change the Telegraf configuration file:

  • From the source file directory, edit the Telegraf configuration file with a text editor (e.g. nano):

    nano telegraf/telegraf.conf

  • Use the script to reload the Telegraf configuration file and load new plugins:

    ./telegraf-intel-docker.sh restart <image-name> <container-name>

  • Check the Telegraf logs to verify that everything works as expected:

    ./telegraf-intel-docker.sh logs <container-name>

Usage example

  • Creating and running Telegraf Docker image:

    ./telegraf-intel-docker.sh build-run <image-name> <container-name>

    This command builds the Telegraf Docker image and runs a container, using the given names.

  • To see logs from Telegraf in the container:

    ./telegraf-intel-docker.sh logs <container-name>

    To exit the log view, press CTRL + C.

  • To load a new Telegraf configuration file:

    ./telegraf-intel-docker.sh restart <image-name> <container-name>

    This restarts the container and runs it with the new configuration.

  • To build and run the container with DPDK socket path:

    ./telegraf-intel-docker.sh build-run <image-name> <container-name> --dpdk_socket_path /var/run/dpdk/rte


Available plugins

Input plugins

List of supported Telegraf input plugins.

Enabled by default

The following plugins should work on most host configurations.

  1. CGroup
  2. CPU
  3. Disk
  4. Disk IO
  5. DNS Query
  6. ETH Tool
  7. Hugepages
  8. IP Tables
  9. Kernel VMStat
  10. Mem
  11. Net
  12. Ping
  13. Smart
  14. System
  15. Temp

Disabled by default

Some plugins need special attention regarding the host's configuration. Observability Telegraf supports them; enable them by uncommenting the associated config fields in the telegraf/telegraf.conf file. Make sure the configuration requirements for the plugins listed below are fulfilled.

  1. Intel PowerStat
  2. Intel RDT
  3. Intel PMU
  4. DPDK
  5. IPMI Sensor
  6. RAS
  7. Redfish
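
Because these plugins are enabled by uncommenting their config fields, it can help to list which input sections are currently active and which are still commented out. The helpers below are hypothetical and assume that a disabled plugin appears in the file as a commented-out header such as # [[inputs.redfish]].

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a Telegraf configuration file.
# Assumption: disabled plugins appear as commented-out section headers,
# for example "# [[inputs.redfish]]".

# Print the names of input plugins whose section header is active.
enabled_inputs() {
    sed -n 's/^[[:space:]]*\[\[inputs\.\([A-Za-z0-9_]*\)]].*/\1/p' "$1" | sort -u
}

# Print the names of input plugins whose section header is commented out.
disabled_inputs() {
    sed -n 's/^[[:space:]]*#[[:space:]]*\[\[inputs\.\([A-Za-z0-9_]*\)]].*/\1/p' "$1" | sort -u
}

conf=telegraf/telegraf.conf
if [ -r "$conf" ]; then
    echo "enabled:"
    enabled_inputs "$conf"
    echo "disabled:"
    disabled_inputs "$conf"
else
    echo "$conf not found; run this from the source file directory"
fi
```

After uncommenting a section, restart the container with the restart command so the change takes effect.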

Output plugins

List of supported Telegraf output plugins enabled by default.

  1. File
  2. Prometheus client
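
One way to confirm that the Prometheus client output is serving metrics is to fetch the endpoint and count the sample lines. This is a sketch under assumptions: it uses Telegraf's default prometheus_client listen address (port 9273, path /metrics) and assumes that port is reachable from the host, which depends on how the container is run. count_samples is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: count sample lines in Prometheus text exposition
# format read from standard input. Comment lines (starting with '#') and
# blank lines are ignored.
count_samples() {
    grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}

# Assumption: the endpoint is reachable at Telegraf's default listen address.
url=http://localhost:9273/metrics
if command -v curl >/dev/null 2>&1; then
    samples=$(curl -fsS --max-time 5 "$url" 2>/dev/null | count_samples)
    echo "$url returned $samples sample line(s)"
else
    echo "curl not found; cannot query $url"
fi
```

A count of zero usually means the endpoint was not reachable or no input plugin has produced metrics yet.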

Changelog

1.2.0

  • Update telegraf version: 1.21.3 -> 1.24.3
  • Update version of pqos (intel_cmt_cat): 4.2.0 -> 4.4.1
  • Add Hugepages plugin (enabled by default)
  • Add new features: uncore_freq and max_turbo_freq for Powerstat plugin
  • Update the final alpine image: 3.15 -> 3.16

About

This repository contains the sources of observability-telegraf, a containerized version of InfluxData's Telegraf agent. Docker Hub: https://hub.docker.com/r/intel/observability-telegraf
