cc-metric-collector's collectors

Documentation of cc-metric-collector’s collectors

CCMetric collectors

This folder contains the collectors for the cc-metric-collector.

Configuration

{
    "collector_type" : {
        <collector specific configuration>
    }
}

In contrast to the configuration files for sinks and receivers, the collectors' configuration is not a list but a set of dicts. This is required because we did not manage to partially read the type before loading the remaining configuration. We would like to change this to the same format.
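
As an illustration (a minimal sketch, not the actual collectorManager code), such a section can be decoded in Go as a map from the collector type to its raw, collector-specific JSON, which is only unmarshalled later by the collector itself:

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

func main() {
    // "collectors.json" is only an example path for the collector section.
    raw, err := os.ReadFile("collectors.json")
    if err != nil {
        fmt.Println(err)
        return
    }
    // One entry per collector type; the value stays raw until the collector
    // unmarshals it in its Init() function.
    var collectorConfigs map[string]json.RawMessage
    if err := json.Unmarshal(raw, &collectorConfigs); err != nil {
        fmt.Println(err)
        return
    }
    for collectorType, cfg := range collectorConfigs {
        fmt.Printf("collector %q gets config %s\n", collectorType, string(cfg))
    }
}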

Available collectors

Todos

  • Aggregate metrics to a higher topology entity (sum hwthread metrics to a socket metric, …). Needs to be configurable

Contributing own collectors

A collector reads data from any source, parses it into metrics and submits these metrics to the metric collector. A collector provides the following functions (summarized as an interface sketch after the list):

  • Name() string: Return the name of the collector
  • Init(config json.RawMessage) error: Initializes the collector using the given collector-specific config in JSON. Check if needed files/commands exist, …
  • Initialized() bool: Check if a collector is successfully initialized
  • Read(duration time.Duration, output chan ccMetric.CCMetric): Read, parse and submit data to the output channel as CCMetric. If the collector has to measure anything for some duration, use the provided function argument duration.
  • Close(): Closes down the collector.
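
Put together, these functions correspond to an interface roughly like the following sketch (the authoritative definition lives in the cc-metric-collector sources and may differ in detail):

package collectors

import (
    "encoding/json"
    "time"

    // lp is the alias used for the ccMetric package in the sample below.
    lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)

// Sketch of the interface implied by the function list above.
type MetricCollector interface {
    Name() string
    Init(config json.RawMessage) error
    Initialized() bool
    Read(duration time.Duration, output chan lp.CCMetric)
    Close()
}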

It is recommended to call setup() in the Init() function.

Finally, the collector needs to be registered in collectorManager.go. There is a list of collectors called AvailableCollectors which is a map (collector_type_string -> pointer to MetricCollector interface). Add a new entry with a descriptive name and the new collector.
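
A sketch of such an entry (the key "sample" is only an example of a descriptive name; the exact map layout is defined in collectorManager.go):

// In collectorManager.go (sketch): map from the collector type string used in
// the configuration to an instance implementing the MetricCollector interface.
var AvailableCollectors = map[string]MetricCollector{
    // ... existing collectors ...
    "sample": new(SampleCollector),
}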

Sample collector

package collectors

import (
    "encoding/json"
    "fmt"
    "time"

    // cclog and fmt are needed for the error logging in Read() below
    cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
    lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)

// Struct for the collector-specific JSON config
type SampleCollectorConfig struct {
    ExcludeMetrics []string `json:"exclude_metrics"`
}

type SampleCollector struct {
    metricCollector
    config SampleCollectorConfig
}

func (m *SampleCollector) Init(config json.RawMessage) error {
    // Check if already initialized
    if m.init {
        return nil
    }

    m.name = "SampleCollector"
    m.setup()
    if len(config) > 0 {
        err := json.Unmarshal(config, &m.config)
        if err != nil {
            return err
        }
    }
    m.meta = map[string]string{"source": m.name, "group": "Sample"}

    m.init = true
    return nil
}

func (m *SampleCollector) Read(interval time.Duration, output chan lp.CCMetric) {
    if !m.init {
        return
    }
    // tags for the metric, if type != node use proper type and type-id
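    // e.g. for a per-hwthread metric use:
    //   tags := map[string]string{"type": "hwthread", "type-id": "0"}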
    tags := map[string]string{"type" : "node"}

    // GetMetric() is a placeholder for the collector-specific data gathering
    x, err := GetMetric()
    if err != nil {
        cclog.ComponentError(m.name, fmt.Sprintf("Read(): %v", err))
    }

    // Each metric has exactly one field: value !
    value := map[string]interface{}{"value": int64(x)}
    if y, err := lp.New("sample_metric", tags, m.meta, value, time.Now()); err == nil {
        output <- y
    }
}

func (m *SampleCollector) Close() {
    m.init = false
    return
}

1 - BeeGFS on Demand collector

Toplevel beegfsmetaMetric

BeeGFS on Demand collector

This collector collects BeeGFS on Demand (BeeOND) metadata client statistics.

  "beegfs_meta": {
	"beegfs_path": "/usr/bin/beegfs-ctl",
    "exclude_filesystem": [
      "/mnt/ignore_me"
    ],
    "exclude_metrics": [     
          "ack",
          "entInf",
          "fndOwn"
    ]
  }

The BeeGFS On Demand (BeeOND) collector uses the beegfs-ctl command to read performance metrics for BeeGFS filesystems.

The reported filesystems can be filtered with the exclude_filesystem option in the configuration.

The path to the beegfs-ctl command can be configured with the beegfs_path option in the configuration.

When using the exclude_metrics option, the excluded metrics are summed as other.

Important: The metrics listed below are similar to the names used by BeeGFS. The collector prefixes them with beegfs_cstorage_ (BeeGFS client storage).

For example, the BeeGFS metric open becomes beegfs_cstorage_open.

Available Metrics:

  • sum
  • ack
  • close
  • entInf
  • fndOwn
  • mkdir
  • create
  • rddir
  • refrEnt
  • mdsInf
  • rmdir
  • rmLnk
  • mvDirIns
  • mvFiIns
  • open
  • ren
  • sChDrct
  • sAttr
  • sDirPat
  • stat
  • statfs
  • trunc
  • symlnk
  • unlnk
  • lookLI
  • statLI
  • revalLI
  • openLI
  • createLI
  • hardlnk
  • flckAp
  • flckEn
  • flckRg
  • dirparent
  • listXA
  • getXA
  • rmXA
  • setXA
  • mirror

The collector adds a filesystem tag to all metrics

2 - BeeGFS on Demand collector

Toplevel beegfsstorageMetric

BeeGFS on Demand collector

This collector collects BeeGFS on Demand (BeeOND) storage statistics.

  "beegfs_storage": {
	"beegfs_path": "/usr/bin/beegfs-ctl",
    "exclude_filesystem": [
      "/mnt/ignore_me"
    ],
    "exclude_metrics": [     
          "ack",
		  "storInf",
		  "unlnk"
    ]
  }

The BeeGFS On Demand (BeeOND) collector uses the beegfs-ctl command to read performance metrics for BeeGFS filesystems.

The reported filesystems can be filtered with the exclude_filesystem option in the configuration.

The path to the beegfs-ctl command can be configured with the beegfs_path option in the configuration.

When using the exclude_metrics option, the excluded metrics are summed as other.

Important: The metrics listed below are similar to the names used by BeeGFS. The collector prefixes them with beegfs_cstorage_ (BeeGFS client storage). For example, the BeeGFS metric open becomes beegfs_cstorage_open.

Note: The BeeGFS filesystem provides a lot of metadata information. It probably makes sense to exclude most of it. Nevertheless, these excluded metrics will be summed as beegfs_cstorage_other.

Available Metrics:

  • sum
  • ack
  • sChDrct
  • getFSize
  • sAttr
  • statfs
  • trunc
  • close
  • fsync
  • ops-rd
  • MiB-rd/s
  • ops-wr
  • MiB-wr/s
  • endbg
  • hrtbeat
  • remNode
  • storInf
  • unlnk

The collector adds a filesystem tag to all metrics

3 - cpufreq_cpuinfo collector

Toplevel cpufreqCpuinfoMetric

cpufreq_cpuinfo collector

  "cpufreq_cpuinfo": {}

The cpufreq_cpuinfo collector reads the clock frequency from /proc/cpuinfo and outputs a handful of hwthread metrics.

Metrics:

  • cpufreq

4 - cpufreq collector

Toplevel cpufreqMetric

cpufreq collector

  "cpufreq": {
    "exclude_metrics": []
  }

The cpufreq collector reads the clock frequency from /sys/devices/system/cpu/cpu*/cpufreq and outputs a handful of hwthread metrics.

Metrics:

  • cpufreq

5 - cpustat collector

Toplevel cpustatMetric

cpustat collector

  "cpustat": {
    "exclude_metrics": [
      "cpu_idle"
    ]
  }

The cpustat collector reads data from /proc/stat and outputs a handful of node and hwthread metrics. If a metric is not required, it can be excluded from forwarding it to the sink.

Metrics:

  • cpu_user with unit=Percent
  • cpu_nice with unit=Percent
  • cpu_system with unit=Percent
  • cpu_idle with unit=Percent
  • cpu_iowait with unit=Percent
  • cpu_irq with unit=Percent
  • cpu_softirq with unit=Percent
  • cpu_steal with unit=Percent
  • cpu_guest with unit=Percent
  • cpu_guest_nice with unit=Percent
  • cpu_used = cpu_* - cpu_idle with unit=Percent
  • num_cpus
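
The percentages are derived from the change of the /proc/stat counters between two reads. The following stand-alone sketch illustrates that calculation for cpu_used and cpu_idle; it is not the collector's actual code and the collector may handle the fields differently:

package main

import (
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
    "time"
)

// cpuTotals returns the counters of the aggregated "cpu" line in /proc/stat.
func cpuTotals() ([]uint64, error) {
    data, err := os.ReadFile("/proc/stat")
    if err != nil {
        return nil, err
    }
    for _, line := range strings.Split(string(data), "\n") {
        fields := strings.Fields(line)
        if len(fields) > 4 && fields[0] == "cpu" {
            vals := make([]uint64, 0, len(fields)-1)
            for _, f := range fields[1:] {
                v, err := strconv.ParseUint(f, 10, 64)
                if err != nil {
                    return nil, err
                }
                vals = append(vals, v)
            }
            return vals, nil
        }
    }
    return nil, fmt.Errorf("no aggregated cpu line in /proc/stat")
}

func main() {
    before, err := cpuTotals()
    if err != nil {
        log.Fatal(err)
    }
    time.Sleep(time.Second)
    after, err := cpuTotals()
    if err != nil {
        log.Fatal(err)
    }
    var total, idle uint64
    for i := 0; i < len(after) && i < 8; i++ { // user, nice, system, idle, iowait, irq, softirq, steal
        total += after[i] - before[i]
    }
    idle = after[3] - before[3] // 4th field is idle
    fmt.Printf("cpu_used %.2f%% cpu_idle %.2f%%\n",
        100.0*float64(total-idle)/float64(total),
        100.0*float64(idle)/float64(total))
}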

6 - customcmd collector

Toplevel customCmdMetric

customcmd collector

  "customcmd": {
    "exclude_metrics": [
      "mymetric"
    ],
    "files" : [
      "/var/run/myapp.metrics"
    ],
    "commands" : [
      "/usr/local/bin/getmetrics.pl"
    ]
  }

The customcmd collector reads data from files and from the output of executed commands. The files and commands can output multiple metrics (separated by newlines), but they have to be in the InfluxDB line protocol. If a metric is not parsable, it is skipped. If a metric is not required, it can be excluded from forwarding it to the sink.

7 - diskstat collector

Toplevel diskstatMetric

diskstat collector

  "diskstat": {
    "exclude_metrics": [
      "disk_total"
    ]
  }

The diskstat collector reads data from /proc/self/mounts and outputs a handful of node metrics. If a metric is not required, it can be excluded from forwarding it to the sink.

Metrics per device (with device tag):

  • disk_total (unit GBytes)
  • disk_free (unit GBytes)

Global metrics:

  • part_max_used (unit percent)

8 - gpfs collector

Toplevel gpfsMetric

gpfs collector

  "ibstat": {
    "mmpmon_path": "/path/to/mmpmon",
    "exclude_filesystem": [
      "fs1"
    ],
    "send_bandwidths": true,
    "send_total_values": true
  }

The gpfs collector uses the mmpmon command to read performance metrics for GPFS / IBM Spectrum Scale filesystems.

The reported filesystems can be filtered with the exclude_filesystem option in the configuration.

The path to the mmpmon command can be configured with the mmpmon_path option in the configuration. If nothing is set, the collector searches in $PATH for mmpmon.

Metrics:

  • gpfs_bytes_read
  • gpfs_bytes_written
  • gpfs_num_opens
  • gpfs_num_closes
  • gpfs_num_reads
  • gpfs_num_writes
  • gpfs_num_readdirs
  • gpfs_num_inode_updates
  • gpfs_bytes_total = gpfs_bytes_read + gpfs_bytes_written (if send_total_values == true)
  • gpfs_iops = gpfs_num_reads + gpfs_num_writes (if send_total_values == true)
  • gpfs_metaops = gpfs_num_inode_updates + gpfs_num_closes + gpfs_num_opens + gpfs_num_readdirs (if send_total_values == true)
  • gpfs_bw_read (if send_bandwidths == true)
  • gpfs_bw_write (if send_bandwidths == true)

The collector adds a filesystem tag to all metrics

9 - ibstat collector

Toplevel infinibandMetric

ibstat collector

  "ibstat": {
    "exclude_devices": [
      "mlx4"
    ],
    "send_abs_values": true,
    "send_derived_values": true
  }

The ibstat collector includes all Infiniband devices that can be found below /sys/class/infiniband/ and where any of the ports provides a LID file (/sys/class/infiniband/<dev>/ports/<port>/lid)

The devices can be filtered with the exclude_devices option in the configuration.

For each found LID the collector reads data through the sysfs files below /sys/class/infiniband/<device>. (See: https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-class-infiniband)

Metrics:

  • ib_recv
  • ib_xmit
  • ib_recv_pkts
  • ib_xmit_pkts
  • ib_total = ib_recv + ib_xmit (if send_total_values == true)
  • ib_total_pkts = ib_recv_pkts + ib_xmit_pkts (if send_total_values == true)
  • ib_recv_bw (if send_derived_values == true)
  • ib_xmit_bw (if send_derived_values == true)
  • ib_recv_pkts_bw (if send_derived_values == true)
  • ib_xmit_pkts_bw (if send_derived_values == true)

The collector adds a device tag to all metrics

10 - iostat collector

Toplevel iostatMetric

iostat collector

  "iostat": {
    "exclude_metrics": [
      "read_ms"
    ]
  }

The iostat collector reads data from /proc/diskstats and outputs a handful of node metrics. If a metric is not required, it can be excluded from forwarding it to the sink.

Metrics:

  • io_reads
  • io_reads_merged
  • io_read_sectors
  • io_read_ms
  • io_writes
  • io_writes_merged
  • io_writes_sectors
  • io_writes_ms
  • io_ioops
  • io_ioops_ms
  • io_ioops_weighted_ms
  • io_discards
  • io_discards_merged
  • io_discards_sectors
  • io_discards_ms
  • io_flushes
  • io_flushes_ms

The device name is added as tag device. For more details, see https://www.kernel.org/doc/html/latest/admin-guide/iostats.html

11 - ipmistat collector

Toplevel ipmiMetric

ipmistat collector

  "ipmistat": {
    "ipmitool_path": "/path/to/ipmitool",
    "ipmisensors_path": "/path/to/ipmi-sensors",
  }

The ipmistat collector reads data from ipmitool (ipmitool sensor) or ipmi-sensors (ipmi-sensors --sdr-cache-recreate --comma-separated-output).

The metrics depend on the output of the underlying tools but contain temperature, power and energy metrics.

12 - likwid collector

Toplevel likwidMetric

likwid collector

The likwid collector is probably the most complicated collector. The LIKWID library is included as a static library with direct access mode. The direct access mode is suitable if the daemon is executed by a root user. The static library does not contain the performance groups, so all information needs to be provided in the configuration.

  "likwid": {
    "force_overwrite" : false,
    "invalid_to_zero" : false,
    "liblikwid_path" : "/path/to/liblikwid.so",
    "accessdaemon_path" : "/folder/that/contains/likwid-accessD",
    "access_mode" : "direct or accessdaemon or perf_event",
    "lockfile_path" : "/var/run/likwid.lock",
    "eventsets": [
      {
        "events" : {
          "COUNTER0": "EVENT0",
          "COUNTER1": "EVENT1"
        },
        "metrics" : [
          {
            "name": "sum_01",
            "calc": "COUNTER0 + COUNTER1",
            "publish": false,
            "unit": "myunit",
            "type": "hwthread"
          }
        ]
      }
    ],
    "globalmetrics" : [
      {
        "name": "global_sum",
        "calc": "sum_01",
        "publish": true,
        "unit": "myunit",
        "type": "hwthread"
      }
    ]
  }

The likwid configuration consists of two parts, the eventsets and globalmetrics:

  • An event set list itself has two parts, the events and a set of derivable metrics. Each of the events is a counter:event pair in LIKWID’s syntax. The metrics are a list of formulas to derive the metric value from the measurements of the events’ values. Each metric has a name, the formula, a type and a publish flag. There is an optional unit field. Counter names can be used like variables in the formulas, so PMC0+PMC1 sums the measurements for both events configured in the counters PMC0 and PMC1. You can optionally use time for the measurement time and inverseClock for 1.0/baseCpuFrequency. The type tells the LikwidCollector whether it is a metric for each hardware thread (cpu) or each CPU socket (socket). The publish flag tells the LikwidCollector whether a metric should be sent to the router or is only used internally to compute a global metric.
  • The globalmetrics are metrics which require data from multiple event set measurements to be derived. The inputs are the metrics in the event sets. Similar to the metrics in the event sets, the global metrics are defined by a name, a formula, a type and a publish flag. See event set metrics for details. The only difference is that there is no access to the raw event measurements anymore but only to the metrics. Also time and inverseClock cannot be used anymore. So, the idea is to derive a metric in the eventsets section and reuse it in the globalmetrics part. If you need a metric only for deriving the global metrics, disable forwarding of the event set metrics ("publish": false). Be aware that the combination might be misleading because the “behavior” of a metric changes over time and the multiple measurements might count different computing phases. Similar to the metrics in the eventset, you can specify a metric unit with the unit field.

Additional options:

  • force_overwrite: Same as setting LIKWID_FORCE=1. In case counters are already in use, LIKWID overwrites their configuration to do its measurements
  • invalid_to_zero: In some cases, the calculations result in NaN or Inf. With this option, all NaN and Inf values are replaced with 0.0. See the separate section below
  • access_mode: Specify LIKWID access mode: direct for direct register access as root user or accessdaemon. The access mode perf_event is currently untested.
  • accessdaemon_path: Folder of the accessDaemon likwid-accessD (like /usr/local/sbin)
  • liblikwid_path: Location of liblikwid.so including file name like /usr/local/lib/liblikwid.so
  • lockfile_path: Location of LIKWID’s lock file if multiple tools should access the hardware counters. Default /var/run/likwid.lock

Available metric types

Hardware performance counters are scattered all over the system nowadays. A counter covers a specific part of the system. While there are hardware-thread-specific counters for CPU cycles, instructions and so on, others are specific to a whole CPU socket/package. To address that, the LikwidCollector provides the specification of a type for each metric.

  • hwthread : One metric per CPU hardware thread with the tags "type" : "hwthread" and "type-id" : "$hwthread_id"
  • socket : One metric per CPU socket/package with the tags "type" : "socket" and "type-id" : "$socket_id"

Note: You cannot specify socket type for a metric that is measured at hwthread type, so some kind of expert knowledge or lookup work in the Likwid Wiki is required. Get the type of each counter from the Architecture pages and as soon as one counter in a metric is socket-specific, the whole metric is socket-specific.

As a guideline:

  • All counters FIXCx, PMCy and TMAz have the type hwthread
  • All counter names containing BOX have the type socket
  • All PWRx counters have type socket, except "PWR1" : "RAPL_CORE_ENERGY" has hwthread type
  • All DFCx counters have type socket

Help with the configuration

The configuration for the likwid collector is quite complicated. Most users don’t use LIKWID with the event:counter notation but rely on the performance groups defined by the LIKWID team for each architecture. In order to help with the likwid collector configuration, we included a script scripts/likwid_perfgroup_to_cc_config.py that creates the configuration of an eventset from a performance group (using a LIKWID installation in $PATH):

$ likwid-perfctr -i
[...]
short name: ICX
[...]
$ likwid-perfctr -a
[...]
MEM_DP
MEM
FLOPS_SP
CLOCK
[...]
$ scripts/likwid_perfgroup_to_cc_config.py ICX MEM_DP
{
  "events": {
    "FIXC0": "INSTR_RETIRED_ANY",
    "FIXC1": "CPU_CLK_UNHALTED_CORE",
    "..." : "..."
  },
  "metrics" : [
    {
      "calc": "time",
      "name": "Runtime (RDTSC) [s]",
      "publish": true,
      "unit": "seconds"
      "type": "hwthread"
    },
    {
      "..." : "..."
    }
  ]
}

You can copy this JSON and add it to the eventsets list. If you specify multiple event sets, you can add globally derived metrics in the extra globalmetrics section with the metric names as variables.

Mixed usage between daemon and users

LIKWID checks the file /var/run/likwid.lock before performing any interfering operations. Who is allowed to access the counters is determined by the owner of the file. If it does not exist, it is created for the current user. So, if you want to temporarily allow counter access to a user (e.g. in a job):

Before (SLURM prolog, …)

chown $JOBUSER /var/run/likwid.lock

After (SLURM epilog, …)

chown $CCUSER /var/run/likwid.lock

invalid_to_zero option

In some cases LIKWID returns 0.0 for events that are further used in processing and may be used as a divisor in a calculation. After evaluation of a metric, the result might be NaN or +-Inf. These resulting metrics are commonly not created and not forwarded to the router because the InfluxDB line protocol does not support these special floating-point values. If you want to have them sent anyway, this option forces these metric values to be 0.0 instead.

One might think this does not happen often, but commonly used metrics in the world of performance engineering like instructions per cycle (IPC) or, even more frequently, the actual CPU clock are derived from events like CPU_CLK_UNHALTED_CORE (Intel) which do not increment in halted state (as the name implies). There are different power management systems in a chip which can cause a hardware thread to go into such a state. Moreover, if no cycles are executed by the core, many other events are not incremented either (like INSTR_RETIRED_ANY for retired instructions, which is part of IPC).
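
In effect, the option applies a replacement like the following helper to every computed metric value (a sketch, not the collector's actual code):

import "math"

// sanitize mirrors the invalid_to_zero behaviour: non-finite results become 0.0.
func sanitize(value float64) float64 {
    if math.IsNaN(value) || math.IsInf(value, 0) {
        return 0.0
    }
    return value
}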

lockfile_path option

LIKWID can be configured with a lock file with which the access to the performance monitoring registers can be disabled (only the owner of the lock file is allowed to access the registers). When the lockfile_path option is set, the collector subscribes to changes of this file and stops monitoring if the owner of the lock file changes. This feature is useful when users should be able to perform their own hardware performance counter measurements through LIKWID or any other tool.
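
A sketch of the ownership check this implies (Linux-specific; the collector's actual implementation may differ):

import (
    "os"
    "syscall"
)

// lockfileOwnedByUs reports whether the LIKWID lock file is owned by the user
// running the collector; if not, measurements should be skipped.
func lockfileOwnedByUs(path string) bool {
    info, err := os.Stat(path)
    if err != nil {
        // No lock file yet: LIKWID creates it for the current user.
        return true
    }
    st, ok := info.Sys().(*syscall.Stat_t)
    if !ok {
        return false
    }
    return int(st.Uid) == os.Getuid()
}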

send_*_total_values option

  • send_core_total_values: Metrics, which are usually collected on a per hardware thread basis, are additionally summed up per CPU core.
  • send_socket_total_values: Metrics, which are usually collected on a per hardware thread basis, are additionally summed up per CPU socket.
  • send_node_total_values: Metrics, which are usually collected on a per hardware thread basis, are additionally summed up per node.

Example configuration

AMD Zen3

  "likwid": {
    "force_overwrite" : false,
    "invalid_to_zero" : false,
    "eventsets": [
      {
        "events": {
          "FIXC1": "ACTUAL_CPU_CLOCK",
          "FIXC2": "MAX_CPU_CLOCK",
          "PMC0": "RETIRED_INSTRUCTIONS",
          "PMC1": "CPU_CLOCKS_UNHALTED",
          "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
          "PMC3": "MERGE",
          "DFC0": "DRAM_CHANNEL_0",
          "DFC1": "DRAM_CHANNEL_1",
          "DFC2": "DRAM_CHANNEL_2",
          "DFC3": "DRAM_CHANNEL_3"
        },
        "metrics": [
          {
            "name": "ipc",
            "calc": "PMC0/PMC1",
            "type": "hwthread",
            "publish": true
          },
          {
            "name": "flops_any",
            "calc": "0.000001*PMC2/time",
            "unit": "MFlops/s",
            "type": "hwthread",
            "publish": true
          },
          {
            "name": "clock",
            "calc": "0.000001*(FIXC1/FIXC2)/inverseClock",
            "type": "hwthread",
            "unit": "MHz",
            "publish": true
          },
          {
            "name": "mem1",
            "calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
            "unit": "Mbyte/s",
            "type": "socket",
            "publish": false
          }
        ]
      },
      {
        "events": {
          "DFC0": "DRAM_CHANNEL_4",
          "DFC1": "DRAM_CHANNEL_5",
          "DFC2": "DRAM_CHANNEL_6",
          "DFC3": "DRAM_CHANNEL_7",
          "PWR0": "RAPL_CORE_ENERGY",
          "PWR1": "RAPL_PKG_ENERGY"
        },
        "metrics": [
          {
            "name": "pwr_core",
            "calc": "PWR0/time",
            "unit": "Watt"
            "type": "socket",
            "publish": true
          },
          {
            "name": "pwr_pkg",
            "calc": "PWR1/time",
            "type": "socket",
            "unit": "Watt"
            "publish": true
          },
          {
            "name": "mem2",
            "calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
            "unit": "Mbyte/s",
            "type": "socket",
            "publish": false
          }
        ]
      }
    ],
    "globalmetrics": [
      {
        "name": "mem_bw",
        "calc": "mem1+mem2",
        "type": "socket",
        "unit": "Mbyte/s",
        "publish": true
      }
    ]
  }

How to get the eventsets and metrics from LIKWID

The likwid collector reads hardware performance counters at hwthread and socket level. The configuration looks quite complicated, but it is basically copy & paste from LIKWID’s performance groups. The collector went through multiple iterations and tried to use the performance groups directly, but that lacked flexibility. The current way of configuration provides the most flexibility.

The logic is as follows: there are multiple eventsets, each consisting of a list of counters+events and a list of metrics. If you compare a common performance group with the example setting above, there is not much difference:

EVENTSET                         ->   "events": {
FIXC1 ACTUAL_CPU_CLOCK           ->     "FIXC1": "ACTUAL_CPU_CLOCK",
FIXC2 MAX_CPU_CLOCK              ->     "FIXC2": "MAX_CPU_CLOCK",
PMC0  RETIRED_INSTRUCTIONS       ->     "PMC0" : "RETIRED_INSTRUCTIONS",
PMC1  CPU_CLOCKS_UNHALTED        ->     "PMC1" : "CPU_CLOCKS_UNHALTED",
PMC2  RETIRED_SSE_AVX_FLOPS_ALL  ->     "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
PMC3  MERGE                      ->     "PMC3": "MERGE",
                                 ->   }

The metrics follow the same procedure:

METRICS                          ->   "metrics": [
IPC   PMC0/PMC1                  ->     {
                                 ->       "name" : "IPC",
                                 ->       "calc" : "PMC0/PMC1",
                                 ->       "type": "hwthread",
                                 ->       "publish": true
                                 ->     }
                                 ->   ]

The script scripts/likwid_perfgroup_to_cc_config.py might help you.

13 - loadavg collector

Toplevel loadavgMetric

loadavg collector

  "loadavg": {
    "exclude_metrics": [
      "proc_run"
    ]
  }

The loadavg collector reads data from /proc/loadavg and outputs a handful of node metrics. If a metric is not required, it can be excluded from forwarding it to the sink.

Metrics:

  • load_one
  • load_five
  • load_fifteen
  • proc_run
  • proc_total

14 - lustrestat collector

Toplevel lustreMetric

lustrestat collector

  "lustrestat": {
    "lctl_command": "/path/to/lctl",
    "exclude_metrics": [
      "setattr",
      "getattr"
    ],
    "send_abs_values" : true,
    "send_derived_values" : true,
    "send_diff_values": true,
    "use_sudo": false
  }

The lustrestat collector uses the lctl application with the get_param option to get all llite metrics (Lustre client). The llite metrics are only available for root users. If password-less sudo is configured, you can enable sudo in the configuration.

Metrics:

  • lustre_read_bytes (unit bytes)
  • lustre_read_requests (unit requests)
  • lustre_write_bytes (unit bytes)
  • lustre_write_requests (unit requests)
  • lustre_open
  • lustre_close
  • lustre_getattr
  • lustre_setattr
  • lustre_statfs
  • lustre_inode_permission
  • lustre_read_bw (if send_derived_values == true, unit bytes/sec)
  • lustre_write_bw (if send_derived_values == true, unit bytes/sec)
  • lustre_read_requests_rate (if send_derived_values == true, unit requests/sec)
  • lustre_write_requests_rate (if send_derived_values == true, unit requests/sec)
  • lustre_read_bytes_diff (if send_diff_values == true, unit bytes)
  • lustre_read_requests_diff (if send_diff_values == true, unit requests)
  • lustre_write_bytes_diff (if send_diff_values == true, unit bytes)
  • lustre_write_requests_diff (if send_diff_values == true, unit requests)
  • lustre_open_diff (if send_diff_values == true)
  • lustre_close_diff (if send_diff_values == true)
  • lustre_getattr_diff (if send_diff_values == true)
  • lustre_setattr_diff (if send_diff_values == true)
  • lustre_statfs_diff (if send_diff_values == true)
  • lustre_inode_permission_diff (if send_diff_values == true)

This collector adds a device tag.

15 - memstat collector

Toplevel memstatMetric

memstat collector

  "memstat": {
    "exclude_metrics": [
      "mem_used"
    ]
  }

The memstat collector reads data from /proc/meminfo and outputs a handful of node metrics. If a metric is not required, it can be excluded from forwarding it to the sink.

Metrics:

  • mem_total
  • mem_sreclaimable
  • mem_slab
  • mem_free
  • mem_buffers
  • mem_cached
  • mem_available
  • mem_shared
  • swap_total
  • swap_free
  • mem_used = mem_total - (mem_free + mem_buffers + mem_cached)

16 - netstat collector

Toplevel netstatMetric

netstat collector

  "netstat": {
    "include_devices": [
      "eth0"
    ],
    "send_abs_values" : true,
    "send_derived_values" : true
  }

The netstat collector reads data from /proc/net/dev and outputs a handful of node metrics. With the include_devices list you can specify which network devices should be measured. Note: Most other collectors use an exclude list instead of an include list.

Metrics:

  • net_bytes_in (unit=bytes)
  • net_bytes_out (unit=bytes)
  • net_pkts_in (unit=packets)
  • net_pkts_out (unit=packets)
  • net_bytes_in_bw (unit=bytes/sec if send_derived_values == true)
  • net_bytes_out_bw (unit=bytes/sec if send_derived_values == true)
  • net_pkts_in_bw (unit=packets/sec if send_derived_values == true)
  • net_pkts_out_bw (unit=packets/sec if send_derived_values == true)

The device name is added as tag stype=network,stype-id=<device>.

17 - nfs3stat collector

Toplevel nfs3Metric

nfs3stat collector

  "nfs3stat": {
    "nfsstat" : "/path/to/nfsstat",
    "exclude_metrics": [
      "nfs3_total"
    ]
  }

The nfs3stat collector reads data from the nfsstat command and outputs a handful of node metrics. If a metric is not required, it can be excluded from forwarding it to the sink. There is currently no possibility to get the metrics per mount point.

Metrics:

  • nfs3_total
  • nfs3_null
  • nfs3_getattr
  • nfs3_setattr
  • nfs3_lookup
  • nfs3_access
  • nfs3_readlink
  • nfs3_read
  • nfs3_write
  • nfs3_create
  • nfs3_mkdir
  • nfs3_symlink
  • nfs3_remove
  • nfs3_rmdir
  • nfs3_rename
  • nfs3_link
  • nfs3_readdir
  • nfs3_readdirplus
  • nfs3_fsstat
  • nfs3_fsinfo
  • nfs3_pathconf
  • nfs3_commit

18 - nfs4stat collector

Toplevel nfs4Metric

nfs4stat collector

  "nfs4stat": {
    "nfsstat" : "/path/to/nfsstat",
    "exclude_metrics": [
      "nfs4_total"
    ]
  }

The nfs4stat collector reads data from the nfsstat command and outputs a handful of node metrics. If a metric is not required, it can be excluded from forwarding it to the sink. There is currently no possibility to get the metrics per mount point.

Metrics:

  • nfs4_total
  • nfs4_null
  • nfs4_read
  • nfs4_write
  • nfs4_commit
  • nfs4_open
  • nfs4_open_conf
  • nfs4_open_noat
  • nfs4_open_dgrd
  • nfs4_close
  • nfs4_setattr
  • nfs4_fsinfo
  • nfs4_renew
  • nfs4_setclntid
  • nfs4_confirm
  • nfs4_lock
  • nfs4_lockt
  • nfs4_locku
  • nfs4_access
  • nfs4_getattr
  • nfs4_lookup
  • nfs4_lookup_root
  • nfs4_remove
  • nfs4_rename
  • nfs4_link
  • nfs4_symlink
  • nfs4_create
  • nfs4_pathconf
  • nfs4_statfs
  • nfs4_readlink
  • nfs4_readdir
  • nfs4_server_caps
  • nfs4_delegreturn
  • nfs4_getacl
  • nfs4_setacl
  • nfs4_rel_lkowner
  • nfs4_exchange_id
  • nfs4_create_session
  • nfs4_destroy_session
  • nfs4_sequence
  • nfs4_get_lease_time
  • nfs4_reclaim_comp
  • nfs4_secinfo_no
  • nfs4_bind_conn_to_ses

19 - nfsiostat collector

Toplevel nfsiostatMetric

nfsiostat collector

  "nfsiostat": {
    "exclude_metrics": [
      "nfsio_oread"
    ],
    "exclude_filesystems" : [
        "/mnt",
    ],
    "use_server_as_stype": false
  }

The nfsiostat collector reads data from /proc/self/mountstats and outputs a handful of node metrics for each NFS filesystem. If a metric or filesystem is not required, it can be excluded from forwarding it to the sink.

Metrics:

  • nfsio_nread: Bytes transferred by normal read() calls
  • nfsio_nwrite: Bytes transferred by normal write() calls
  • nfsio_oread: Bytes transferred by read() calls with O_DIRECT
  • nfsio_owrite: Bytes transferred by write() calls with O_DIRECT
  • nfsio_pageread: Pages transferred by read() calls
  • nfsio_pagewrite: Pages transferred by write() calls
  • nfsio_nfsread: Bytes transferred for reading from the server
  • nfsio_nfswrite: Bytes transferred for writing to the server

The nfsiostat collector adds the mountpoint to the tags as stype=filesystem,stype-id=<mountpoint>. If the server address should be used instead of the mountpoint, use the use_server_as_stype config setting.

20 - numastat collector

Toplevel numastatsMetric

numastat collector

  "numastats": {}

The numastat collector reads data from /sys/devices/system/node/node*/numastat and outputs a handful of memoryDomain metrics. See: https://www.kernel.org/doc/html/latest/admin-guide/numastat.html

Metrics:

  • numastats_numa_hit: A process wanted to allocate memory from this node, and succeeded.
  • numastats_numa_miss: A process wanted to allocate memory from another node, but ended up with memory from this node.
  • numastats_numa_foreign: A process wanted to allocate on this node, but ended up with memory from another node.
  • numastats_local_node: A process ran on this node’s CPU, and got memory from this node.
  • numastats_other_node: A process ran on a different node’s CPU, and got memory from this node.
  • numastats_interleave_hit: Interleaving wanted to allocate from this node and succeeded.

21 - nvidia collector

Toplevel nvidiaMetric

nvidia collector

  "nvidia": {
    "exclude_devices": [
      "0","1", "0000000:ff:01.0"
    ],
    "exclude_metrics": [
      "nv_fb_mem_used",
      "nv_fan"
    ],
    "process_mig_devices": false,
    "use_pci_info_as_type_id": true,
    "add_pci_info_tag": false,
    "add_uuid_meta": false,
    "add_board_number_meta": false,
    "add_serial_meta": false,
    "use_uuid_for_mig_device": false,
    "use_slice_for_mig_device": false
  }

The nvidia collector can be configured to leave out specific devices with the exclude_devices option. It takes IDs as supplied to the NVML with nvmlDeviceGetHandleByIndex() or the PCI address in NVML format (%08X:%02X:%02X.0). Metrics (listed below) that should not be sent to the MetricRouter can be excluded with the exclude_metrics option. Commonly only the physical GPUs are monitored. If MIG devices should be analyzed as well, set process_mig_devices (adds stype=mig,stype-id=<mig_index>). With the options use_uuid_for_mig_device and use_slice_for_mig_device, the <mig_index> can be replaced with the UUID (e.g. MIG-6a9f7cc8-6d5b-5ce0-92de-750edc4d8849) or the MIG slice name (e.g. 1g.5gb).

The metrics sent by the nvidia collector use accelerator as type tag. For the type-id, it uses the device handle index by default. With the use_pci_info_as_type_id option, the PCI ID is used instead. If both values should be added as tags, activate the add_pci_info_tag option. It uses the device handle index as type-id and adds the PCI ID as separate pci_identifier tag.

Optionally, it is possible to add the UUID, the board part number and the serial number to the meta information. They are not sent to the sinks (if not configured otherwise).

Metrics:

  • nv_util
  • nv_mem_util
  • nv_fb_mem_total
  • nv_fb_mem_used
  • nv_bar1_mem_total
  • nv_bar1_mem_used
  • nv_temp
  • nv_fan
  • nv_ecc_mode
  • nv_perf_state
  • nv_power_usage
  • nv_graphics_clock
  • nv_sm_clock
  • nv_mem_clock
  • nv_video_clock
  • nv_max_graphics_clock
  • nv_max_sm_clock
  • nv_max_mem_clock
  • nv_max_video_clock
  • nv_ecc_uncorrected_error
  • nv_ecc_corrected_error
  • nv_power_max_limit
  • nv_encoder_util
  • nv_decoder_util
  • nv_remapped_rows_corrected
  • nv_remapped_rows_uncorrected
  • nv_remapped_rows_pending
  • nv_remapped_rows_failure
  • nv_compute_processes
  • nv_graphics_processes
  • nv_violation_power
  • nv_violation_thermal
  • nv_violation_sync_boost
  • nv_violation_board_limit
  • nv_violation_low_util
  • nv_violation_reliability
  • nv_violation_below_app_clock
  • nv_violation_below_base_clock
  • nv_nvlink_crc_flit_errors
  • nv_nvlink_crc_errors
  • nv_nvlink_ecc_errors
  • nv_nvlink_replay_errors
  • nv_nvlink_recovery_errors

Some metrics add an additional sub type tag (stype); for example, the nv_nvlink_* metrics set stype=nvlink,stype-id=<link_number>.

22 - rapl collector

Toplevel raplMetric

rapl collector

This collector reads running average power limit (RAPL) monitoring attributes to compute average power consumption metrics. See https://www.kernel.org/doc/html/latest/power/powercap/powercap.html#monitoring-attributes.

The Likwid metric collector provides similar functionality.

  "rapl": {
    "exclude_device_by_id": ["0:1", "0:2"],
    "exclude_device_by_name": ["psys"]
  }

Metrics:

  • rapl_average_power: average power consumption in Watt. The average is computed over the entire runtime from the last measurement to the current measurement
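
A stand-alone sketch of this computation using the powercap sysfs interface follows; the zone path below is only an example, and the collector's actual implementation may differ:

package main

import (
    "fmt"
    "log"
    "os"
    "strconv"
    "strings"
    "time"
)

// readEnergyMicroJoules reads a RAPL energy counter (in micro-Joules).
func readEnergyMicroJoules(path string) (uint64, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return 0, err
    }
    return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
}

func main() {
    // Example zone; a real collector enumerates the available powercap zones.
    const zone = "/sys/class/powercap/intel-rapl:0/energy_uj"
    e0, err := readEnergyMicroJoules(zone)
    if err != nil {
        log.Fatal(err)
    }
    t0 := time.Now()
    time.Sleep(time.Second)
    e1, err := readEnergyMicroJoules(zone)
    if err != nil {
        log.Fatal(err)
    }
    // energy_uj wraps around at max_energy_range_uj; a real collector has to
    // handle that overflow.
    watts := float64(e1-e0) / 1e6 / time.Since(t0).Seconds()
    fmt.Printf("rapl_average_power %.2f W\n", watts)
}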

23 - rocm_smi collector

Toplevel rocmsmiMetric

rocm_smi collector

  "rocm_smi": {
    "exclude_devices": [
      "0","1", "0000000:ff:01.0"
    ],
    "exclude_metrics": [
      "rocm_mm_util",
      "rocm_temp_vrsoc"
    ],
    "use_pci_info_as_type_id": true,
    "add_pci_info_tag": false,
    "add_serial_meta": false,
  }

The rocm_smi collector can be configured to leave out specific devices with the exclude_devices option. It takes logical IDs in the list of available devices or the PCI address similar to NVML format (%08X:%02X:%02X.0). Metrics (listed below) that should not be sent to the MetricRouter can be excluded with the exclude_metrics option.

The metrics sent by the rocm_smi collector use accelerator as type tag. For the type-id, it uses the device handle index by default. With the use_pci_info_as_type_id option, the PCI ID is used instead. If both values should be added as tags, activate the add_pci_info_tag option. It uses the device handle index as type-id and adds the PCI ID as separate pci_identifier tag.

Optionally, it is possible to add the serial number to the meta information. It is not sent to the sinks (if not configured otherwise).

Metrics:

  • rocm_gfx_util
  • rocm_umc_util
  • rocm_mm_util
  • rocm_avg_power
  • rocm_temp_mem
  • rocm_temp_hotspot
  • rocm_temp_edge
  • rocm_temp_vrgfx
  • rocm_temp_vrsoc
  • rocm_temp_vrmem
  • rocm_gfx_clock
  • rocm_soc_clock
  • rocm_u_clock
  • rocm_v0_clock
  • rocm_v1_clock
  • rocm_d0_clock
  • rocm_d1_clock
  • rocm_temp_hbm

Some metrics add an additional sub type tag (stype); for example, the rocm_temp_hbm metrics set stype=device,stype-id=<HBM_slice_number>.

24 - schedstat collector

Toplevel schedstatMetric

schedstat collector

  "schedstat": {
  }

The schedstat collector reads data from /proc/schedstat and calculates a load value, separated by hwthread. This might be useful to detect bad CPU pinning on shared nodes, etc.

Metric:

  • cpu_load_core

25 - self collector

Toplevel selfMetric

self collector

  "self": {
    "read_mem_stats" : true,
    "read_goroutines" : true,
    "read_cgo_calls" : true,
    "read_rusage" : true
  }

The self collector reads data from the runtime and syscall packages and thus monitors the execution of the cc-metric-collector itself.

Metrics:

  • If read_mem_stats == true:
    • total_alloc: The metric reports cumulative bytes allocated for heap objects.
    • heap_alloc: The metric reports bytes of allocated heap objects.
    • heap_sys: The metric reports bytes of heap memory obtained from the OS.
    • heap_idle: The metric reports bytes in idle (unused) spans.
    • heap_inuse: The metric reports bytes in in-use spans.
    • heap_released: The metric reports bytes of physical memory returned to the OS.
    • heap_objects: The metric reports the number of allocated heap objects.
  • If read_goroutines == true:
    • num_goroutines: The metric reports the number of goroutines that currently exist.
  • If read_cgo_calls == true:
    • num_cgo_calls: The metric reports the number of cgo calls made by the current process.
  • If read_rusage == true:
    • rusage_user_time: The metric reports the amount of time that this process has been scheduled in user mode.
    • rusage_system_time: The metric reports the amount of time that this process has been scheduled in kernel mode.
    • rusage_vol_ctx_switch: The metric reports the number of voluntary context switches.
    • rusage_invol_ctx_switch: The metric reports the number of involuntary context switches.
    • rusage_signals: The metric reports the number of signals received.
    • rusage_major_pgfaults: The metric reports the number of major faults the process has made which have required loading a memory page from disk.
    • rusage_minor_pgfaults: The metric reports the number of minor faults the process has made which have not required loading a memory page from disk.

26 - tempstat collector

Toplevel tempMetric

tempstat collector

  "tempstat": {
    "tag_override" : {
        "<device like hwmon1>" : {
            "type" : "socket",
            "type-id" : "0"
        }
    },
    "exclude_metrics": [
      "metric1",
      "metric2"
    ]
  }

The tempstat collector reads the data from /sys/class/hwmon/<device>/tempX_{input,label}.

Metrics:

  • temp_*: The metric name is taken from the label files.

27 - topprocs collector

Toplevel topprocsMetric

topprocs collector

  "topprocs": {
    "num_procs": 5
  }

The topprocs collector reads the top num_procs processes (sorted by CPU utilization using ps -Ao comm --sort=-pcpu).

In contrast to most other collectors, the metric value is a string.
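
A sketch of that idea in Go; metric names like topproc1 are only assumptions for illustration, and the collector's real names and output may differ:

package main

import (
    "fmt"
    "log"
    "os/exec"
    "strings"
)

func main() {
    numProcs := 5 // corresponds to the num_procs option
    out, err := exec.Command("ps", "-Ao", "comm", "--sort=-pcpu").Output()
    if err != nil {
        log.Fatal(err)
    }
    lines := strings.Split(strings.TrimSpace(string(out)), "\n")
    // lines[0] is the "COMMAND" header printed by ps.
    for i := 1; i <= numProcs && i < len(lines); i++ {
        fmt.Printf("topproc%d=%s\n", i, strings.TrimSpace(lines[i]))
    }
}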