cc-metric-collector

Documentation of cc-metric-collector

1: cc-metric-collector's collectors

1.1: BeeGFS on Demand collector
1.2: BeeGFS on Demand collector
1.3: cpufreq_cpuinfo collector
1.4: cpufreq collector
1.5: cpustat collector
1.6: customcmd collector
1.7: diskstat collector
1.8: gpfs collector
1.9: ibstat collector
1.10: iostat collector
1.11: ipmistat collector
1.12: likwid collector
1.13: loadavg collector
1.14: lustrestat collector
1.15: memstat collector
1.16: netstat collector
1.17: nfs3stat collector
1.18: nfs4stat collector
1.19: nfsiostat collector
1.20: numastat collector
1.21: nvidia collector
1.22: rapl collector
1.23: rocm_smi collector
1.24: schedstat collector
1.25: self collector
1.26: tempstat collector
1.27: topprocs collector

2: cc-metric-collector's message processor

3: cc-metric-collector's receivers

3.1: http receiver
3.2: IPMI Receiver
3.3: nats receiver
3.4: prometheus receiver
3.5: Redfish receiver

4: cc-metric-collector's router

5: cc-metric-collector's sinks

5.1: ganglia sink
5.2: http sink
5.3: influxasync sink
5.4: influxdb sink
5.5: libganglia sink
5.6: nats sink
5.7: prometheus sink
5.8: stdout sink

cc-metric-collector

A node agent for measuring, processing and forwarding node level metrics. It is part of the ClusterCockpit ecosystem.

The metric collector sends (and receives) metrics in the InfluxDB line protocol because it is flexible and separates tags (like index columns in relational databases) from fields (like data columns).
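To illustrate the split between tags and fields, here is a minimal Go sketch that renders one metric as a line protocol string. The function name formatLine, the hostname tag and all values are assumptions made for this example; escaping of special characters is left out on purpose.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLine renders one metric as an InfluxDB line protocol line:
// <name>,<tag>=<val>,... value=<v> <unix-nanoseconds>
// Simplification: names and tags are assumed to contain no spaces,
// commas or equal signs, so no escaping is performed.
func formatLine(name string, tags map[string]string, value float64, tsNanos int64) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic tag order

	var b strings.Builder
	b.WriteString(name)
	for _, k := range keys {
		fmt.Fprintf(&b, ",%s=%s", k, tags[k])
	}
	// The single field is called "value", the timestamp closes the line.
	fmt.Fprintf(&b, " value=%g %d", value, tsNanos)
	return b.String()
}

func main() {
	// Example tags and values are made up for illustration.
	tags := map[string]string{"hostname": "node01", "type": "node"}
	fmt.Println(formatLine("load_one", tags, 0.5, 1700000000000000000))
}
```

The tags identify where a value comes from, while the field carries the measured value itself.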

There is a single timer loop that triggers all collectors serially, collects the collectors’ data and sends the metrics to the sink. The collectors are triggered from one loop so that all data is submitted with a single time stamp. The sinks currently use mostly blocking APIs.

The receiver runs as a goroutine side-by-side with the timer loop and asynchronously forwards received metrics to the sink.

Configuration

Configuration is implemented using a single JSON document that is distributed over the network and may be persisted as a file. Supported metrics are documented here.

There is a main configuration file with basic settings that point to the other configuration files for the different components.

{
  "sinks": "sinks.json",
  "collectors" : "collectors.json",
  "receivers" : "receivers.json",
  "router" : "router.json",
  "interval": "10s",
  "duration": "1s"
}

The interval defines how often the metrics should be read and sent to the sink. The duration tells collectors how long one measurement has to take. This is important for some collectors, like the likwid collector. For more information, see here.
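As a rough sketch of how the interval and duration strings could be handled, the following Go code unmarshals the main configuration and converts both values with time.ParseDuration. The struct mainConfig, the function parseTimes and the check that the duration must not exceed the interval are assumptions for this example, not necessarily what cc-metric-collector does.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// mainConfig mirrors the keys of the main configuration file.
type mainConfig struct {
	Sinks      string `json:"sinks"`
	Collectors string `json:"collectors"`
	Receivers  string `json:"receivers"`
	Router     string `json:"router"`
	Interval   string `json:"interval"`
	Duration   string `json:"duration"`
}

// parseTimes converts the interval and duration strings to time.Duration.
// Assumption: a measurement duration longer than the interval is rejected.
func parseTimes(raw []byte) (interval, duration time.Duration, err error) {
	var cfg mainConfig
	if err = json.Unmarshal(raw, &cfg); err != nil {
		return 0, 0, err
	}
	if interval, err = time.ParseDuration(cfg.Interval); err != nil {
		return 0, 0, fmt.Errorf("interval: %w", err)
	}
	if duration, err = time.ParseDuration(cfg.Duration); err != nil {
		return 0, 0, fmt.Errorf("duration: %w", err)
	}
	if duration > interval {
		return 0, 0, fmt.Errorf("duration %s exceeds interval %s", duration, interval)
	}
	return interval, duration, nil
}

func main() {
	raw := []byte(`{"interval": "10s", "duration": "1s"}`)
	interval, duration, err := parseTimes(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("read every", interval, "and measure for", duration)
}
```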

See the component READMEs for their configuration:

Installation

$ git clone git@github.com:ClusterCockpit/cc-metric-collector.git
$ cd cc-metric-collector
$ make    # downloads LIKWID, builds it as static library with 'direct' accessmode and copies all required files for the collector
$ go get  # requires at least golang 1.16
$ make

For more information, see here.

Running

$ ./cc-metric-collector --help
Usage of metric-collector:
  -config string
    	Path to configuration file (default "./config.json")
  -log string
    	Path for logfile (default "stderr")
  -once
    	Run all collectors only once

Scenarios

The metric collector was designed with flexibility in mind, so it can be used in many scenarios. Here are a few:

flowchart TD
  subgraph a ["Cluster A"]
  nodeA[NodeA with CC collector]
  nodeB[NodeB with CC collector]
  nodeC[NodeC with CC collector]
  end
  a --> db[(Database)]
  db <--> ccweb("Webfrontend")

flowchart TD
  subgraph a [ClusterA]
  direction LR
  nodeA[NodeA with CC collector]
  nodeB[NodeB with CC collector]
  nodeC[NodeC with CC collector]
  end
  subgraph b [ClusterB]
  direction LR
  nodeD[NodeD with CC collector]
  nodeE[NodeE with CC collector]
  nodeF[NodeF with CC collector]
  end
  a --> ccrecv{"CC collector as receiver"}
  b --> ccrecv
  ccrecv --> db[("Database1")]
  ccrecv -.-> db2[("Database2")]
  db <-.-> ccweb("Webfrontend")

Contributing

The ClusterCockpit ecosystem is designed to be used by different HPC computing centers. Since configurations and setups differ between the centers, the centers likely have to put some work into the cc-metric-collector to gather all desired metrics.

You are free to open an issue to request a collector but we would also be happy about PRs.

Contact

1 - cc-metric-collector's collectors

Documentation of cc-metric-collector’s collectors

CCMetric collectors

This folder contains the collectors for the cc-metric-collector.

Configuration

{
    "collector_type" : {
        <collector specific configuration>
    }
}

In contrast to the configuration files for sinks and receivers, the collectors’ configuration is not a list but a set of dicts. This is required because we didn’t manage to partially read the type before loading the remaining configuration. We are eager to change this to the same format.

Available collectors

Todos

Aggregate metrics to higher topology entity (sum hwthread metrics to socket metric, …). Needs to be configurable

Contributing own collectors

A collector reads data from any source, parses it into metrics and submits these metrics to the metric-collector. A collector provides the following functions:

Name() string: Return the name of the collector
Init(config json.RawMessage) error: Initializes the collector using the given collector-specific config in JSON. Check if needed files/commands exist, …
Initialized() bool: Check if a collector is successfully initialized
Read(duration time.Duration, output chan ccMetric.CCMetric): Read, parse and submit data to the output channel as CCMetric. If the collector has to measure anything for some duration, use the provided function argument duration.
Close(): Closes down the collector.

It is recommended to call setup() in the Init() function.

Finally, the collector needs to be registered in the collectorManager.go. There is a list of collectors called AvailableCollectors which is a map (collector_type_string -> pointer to MetricCollector interface). Add a new entry with a descriptive name and the new collector.
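The registration idea can be sketched in a self-contained way as follows. This is a simplified illustration: Metric is a stand-in for the real CCMetric type, and demoCollector as well as readConfigured are hypothetical names that do not exist in cc-metric-collector.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"time"
)

// Metric is a simplified stand-in for ccMetric.CCMetric.
type Metric struct {
	Name  string
	Value float64
}

// MetricCollector lists the five functions described above,
// with Metric replacing the real CCMetric type.
type MetricCollector interface {
	Name() string
	Init(config json.RawMessage) error
	Initialized() bool
	Read(duration time.Duration, output chan Metric)
	Close()
}

// demoCollector is a hypothetical collector that emits a constant value.
type demoCollector struct{ init bool }

func (c *demoCollector) Name() string                 { return "DemoCollector" }
func (c *demoCollector) Init(_ json.RawMessage) error { c.init = true; return nil }
func (c *demoCollector) Initialized() bool            { return c.init }
func (c *demoCollector) Close()                       { c.init = false }
func (c *demoCollector) Read(_ time.Duration, output chan Metric) {
	if c.init {
		output <- Metric{Name: "demo_metric", Value: 1}
	}
}

// AvailableCollectors maps the collector type string used in the
// configuration to a collector instance.
var AvailableCollectors = map[string]MetricCollector{
	"demo": new(demoCollector),
}

// readConfigured initializes and reads every collector named in the config.
func readConfigured(raw []byte) ([]Metric, error) {
	var cfg map[string]json.RawMessage
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(cfg))
	for name := range cfg {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order for reproducible output

	var metrics []Metric
	for _, name := range names {
		c, ok := AvailableCollectors[name]
		if !ok {
			return nil, fmt.Errorf("unknown collector type %q", name)
		}
		if err := c.Init(cfg[name]); err != nil {
			return nil, err
		}
		out := make(chan Metric, 16) // buffered so this demo Read does not block
		c.Read(time.Second, out)
		close(out)
		for m := range out {
			metrics = append(metrics, m)
		}
		c.Close()
	}
	return metrics, nil
}

func main() {
	metrics, err := readConfigured([]byte(`{"demo": {}}`))
	fmt.Println(metrics, err)
}
```

The map lookup by type string is what makes the "set of dicts" configuration format work: each top-level key selects one registered collector.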

Sample collector

package collectors

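import (
    "encoding/json"
    "fmt"
    "time"

    // Logger import path assumed to sit next to ccMetric in this repository layout
    cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
    lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)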

// Struct for the collector-specific JSON config
type SampleCollectorConfig struct {
    ExcludeMetrics []string `json:"exclude_metrics"`
}

type SampleCollector struct {
    metricCollector
    config SampleCollectorConfig
}

func (m *SampleCollector) Init(config json.RawMessage) error {
    // Check if already initialized
    if m.init {
        return nil
    }

    m.name = "SampleCollector"
    m.setup()
    if len(config) > 0 {
        err := json.Unmarshal(config, &m.config)
        if err != nil {
            return err
        }
    }
    m.meta = map[string]string{"source": m.name, "group": "Sample"}

    m.init = true
    return nil
}

func (m *SampleCollector) Read(interval time.Duration, output chan lp.CCMetric) {
    if !m.init {
        return
    }
    // tags for the metric, if type != node use proper type and type-id
    tags := map[string]string{"type" : "node"}

    // GetMetric is a placeholder for the collector's own data source
    x, err := GetMetric()
    if err != nil {
        cclog.ComponentError(m.name, fmt.Sprintf("Read(): %v", err))
        return
    }

    // Each metric has exactly one field: value !
    value := map[string]interface{}{"value": int64(x)}
    if y, err := lp.New("sample_metric", tags, m.meta, value, time.Now()); err == nil {
        output <- y
    }
}

func (m *SampleCollector) Close() {
    m.init = false
    return
}

1.1 - BeeGFS on Demand collector

Toplevel beegfsmetaMetric

`BeeGFS on Demand` collector

This collector collects BeeGFS on Demand (BeeOND) metadata clientstats.

  "beegfs_meta": {
    "beegfs_path": "/usr/bin/beegfs-ctl",
    "exclude_filesystem": [
      "/mnt/ignore_me"
    ],
    "exclude_metrics": [
      "ack",
      "entInf",
      "fndOwn"
    ]
  }

The BeeGFS On Demand (BeeOND) collector uses the beegfs-ctl command to read performance metrics for BeeGFS filesystems.

The reported filesystems can be filtered with the exclude_filesystem option in the configuration.

The path to the beegfs-ctl command can be configured with the beegfs_path option in the configuration.

When using the exclude_metrics option, the excluded metrics are summed as other.

Important: The metrics listed below follow the naming used by BeeGFS. The collector prefixes them with beegfs_cstorage (beegfs client storage).

For example, the BeeGFS metric open -> beegfs_cstorage_open
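The prefixing and the summing of excluded metrics into other can be sketched as below. This is only an illustration under assumptions: the function foldExcluded and the counter values are made up, and the sketch emits no other metric when nothing is excluded, which may differ from the real collector.

```go
package main

import "fmt"

// foldExcluded prefixes every reported counter and sums all counters
// named in exclude into a single "<prefix>other" metric.
func foldExcluded(stats map[string]int64, exclude []string, prefix string) map[string]int64 {
	skip := make(map[string]bool, len(exclude))
	for _, name := range exclude {
		skip[name] = true
	}
	out := make(map[string]int64)
	for name, v := range stats {
		if skip[name] {
			out[prefix+"other"] += v // excluded counters are accumulated
		} else {
			out[prefix+name] = v
		}
	}
	return out
}

func main() {
	// Counter values are invented for this example.
	stats := map[string]int64{"open": 12, "close": 11, "ack": 3, "entInf": 4}
	out := foldExcluded(stats, []string{"ack", "entInf"}, "beegfs_cstorage_")
	fmt.Println(out["beegfs_cstorage_open"], out["beegfs_cstorage_other"])
}
```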

Available Metrics:

sum
ack
close
entInf
fndOwn
mkdir
create
rddir
refrEnt
mdsInf
rmdir
rmLnk
mvDirIns
mvFiIns
open
ren
sChDrct
sAttr
sDirPat
stat
statfs
trunc
symlnk
unlnk
lookLI
statLI
revalLI
openLI
createLI
hardlnk
flckAp
flckEn
flckRg
dirparent
listXA
getXA
rmXA
setXA
mirror

The collector adds a filesystem tag to all metrics

1.2 - BeeGFS on Demand collector

Toplevel beegfsstorageMetric

`BeeGFS on Demand` collector

This collector collects BeeGFS on Demand (BeeOND) storage stats.

  "beegfs_storage": {
    "beegfs_path": "/usr/bin/beegfs-ctl",
    "exclude_filesystem": [
      "/mnt/ignore_me"
    ],
    "exclude_metrics": [
      "ack",
      "storInf",
      "unlnk"
    ]
  }

The BeeGFS On Demand (BeeOND) collector uses the beegfs-ctl command to read performance metrics for BeeGFS filesystems.

The reported filesystems can be filtered with the exclude_filesystem option in the configuration.

The path to the beegfs-ctl command can be configured with the beegfs_path option in the configuration.

When using the exclude_metrics option, the excluded metrics are summed as other.

Important: The metrics listed below follow the naming used by BeeGFS. The collector prefixes them with beegfs_cstorage_ (beegfs client meta). For example, the BeeGFS metric open -> beegfs_cstorage_

Note: BeeGFS offers many metadata statistics. It probably makes sense to exclude most of them. Nevertheless, the excluded metrics are summed as beegfs_cstorage_other.

Available Metrics:

“sum”
“ack”
“sChDrct”
“getFSize”
“sAttr”
“statfs”
“trunc”
“close”
“fsync”
“ops-rd”
“MiB-rd/s”
“ops-wr”
“MiB-wr/s”
“endbg”
“hrtbeat”
“remNode”
“storInf”
“unlnk”

The collector adds a filesystem tag to all metrics

1.3 - cpufreq_cpuinfo collector

Toplevel cpufreqCpuinfoMetric

`cpufreq_cpuinfo` collector

  "cpufreq_cpuinfo": {}

The cpufreq_cpuinfo collector reads the clock frequency from /proc/cpuinfo and outputs a handful of hwthread metrics.

Metrics:

cpufreq

1.4 - cpufreq collector

Toplevel cpufreqMetric

`cpufreq` collector

  "cpufreq": {
    "exclude_metrics": []
  }

The cpufreq collector reads the clock frequency from /sys/devices/system/cpu/cpu*/cpufreq and outputs a handful of hwthread metrics.

Metrics:

cpufreq

1.5 - cpustat collector

Toplevel cpustatMetric

`cpustat` collector

  "cpustat": {
    "exclude_metrics": [
      "cpu_idle"
    ]
  }

The cpustat collector reads data from /proc/stat and outputs a handful of node and hwthread metrics. If a metric is not required, it can be excluded from being forwarded to the sink.

Metrics:

cpu_user with unit=Percent
cpu_nice with unit=Percent
cpu_system with unit=Percent
cpu_idle with unit=Percent
cpu_iowait with unit=Percent
cpu_irq with unit=Percent
cpu_softirq with unit=Percent
cpu_steal with unit=Percent
cpu_guest with unit=Percent
cpu_guest_nice with unit=Percent
cpu_used = cpu_* - cpu_idle with unit=Percent
num_cpus
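One way to obtain such percentages is to take two samples of a cpu line from /proc/stat and relate each counter delta to the total delta. The sketch below follows that idea with hypothetical function names and invented sample lines. It treats all columns as disjoint, which is a simplification, and it may not match the collector's actual computation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPULine extracts the counters from one "cpu..." line of /proc/stat.
func parseCPULine(line string) ([]uint64, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 || !strings.HasPrefix(fields[0], "cpu") {
		return nil, fmt.Errorf("not a cpu line: %q", line)
	}
	vals := make([]uint64, 0, len(fields)-1)
	for _, f := range fields[1:] {
		v, err := strconv.ParseUint(f, 10, 64)
		if err != nil {
			return nil, err
		}
		vals = append(vals, v)
	}
	return vals, nil
}

// percentages converts the counter deltas between two samples to percent
// of the total delta. used is the sum of all columns minus idle.
func percentages(prev, cur []uint64) (pct []float64, used float64) {
	n := len(prev)
	if len(cur) < n {
		n = len(cur)
	}
	pct = make([]float64, n)
	var total uint64
	for i := 0; i < n; i++ {
		total += cur[i] - prev[i]
	}
	if total == 0 {
		return pct, 0
	}
	for i := 0; i < n; i++ {
		pct[i] = 100 * float64(cur[i]-prev[i]) / float64(total)
		used += pct[i]
	}
	if n > 3 {
		used -= pct[3] // column 4 of a cpu line is the idle counter
	}
	return pct, used
}

func main() {
	// Two invented samples of the aggregated "cpu" line.
	prev, _ := parseCPULine("cpu  100 0 50 850 0 0 0 0 0 0")
	cur, _ := parseCPULine("cpu  150 0 75 1775 0 0 0 0 0 0")
	pct, used := percentages(prev, cur)
	fmt.Println("cpu_user:", pct[0], "cpu_idle:", pct[3], "cpu_used:", used)
}
```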

1.6 - customcmd collector

Toplevel customCmdMetric

`customcmd` collector

  "customcmd": {
    "exclude_metrics": [
      "mymetric"
    ],
    "files" : [
      "/var/run/myapp.metrics"
    ],
    "commands" : [
      "/usr/local/bin/getmetrics.pl"
    ]
  }

The customcmd collector reads data from files and the output of executed commands. The files and commands can output multiple metrics (separated by newline), but they have to be in the InfluxDB line protocol. If a metric is not parsable, it is skipped. If a metric is not required, it can be excluded from being forwarded to the sink.
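The skip-and-exclude behavior can be illustrated with a small parser for a restricted subset of the line protocol (one value field, no escaping). The types and functions below are hypothetical and written only for this example. A real deployment would rely on a complete line protocol parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type metric struct {
	Name  string
	Tags  map[string]string
	Value float64
}

// parseLine handles the subset "name[,tag=val...] value=<number> [timestamp]".
func parseLine(line string) (metric, bool) {
	parts := strings.Fields(line)
	if len(parts) < 2 || len(parts) > 3 {
		return metric{}, false
	}
	head := strings.Split(parts[0], ",")
	m := metric{Name: head[0], Tags: map[string]string{}}
	if m.Name == "" {
		return metric{}, false
	}
	for _, kv := range head[1:] {
		pair := strings.SplitN(kv, "=", 2)
		if len(pair) != 2 || pair[0] == "" {
			return metric{}, false
		}
		m.Tags[pair[0]] = pair[1]
	}
	if !strings.HasPrefix(parts[1], "value=") {
		return metric{}, false
	}
	// Integer fields carry a trailing "i" in the line protocol.
	raw := strings.TrimSuffix(strings.TrimPrefix(parts[1], "value="), "i")
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return metric{}, false
	}
	m.Value = v
	return m, true
}

// parseOutput keeps parsable metrics that are not in the exclude list.
func parseOutput(output string, exclude []string) []metric {
	skip := make(map[string]bool, len(exclude))
	for _, name := range exclude {
		skip[name] = true
	}
	var metrics []metric
	for _, line := range strings.Split(output, "\n") {
		m, ok := parseLine(strings.TrimSpace(line))
		if !ok || skip[m.Name] {
			continue // unparsable or excluded lines are skipped silently
		}
		metrics = append(metrics, m)
	}
	return metrics
}

func main() {
	out := "mymetric,type=node value=1\nthis is not a metric line\nother,type=node value=2.5 1700000000\n"
	for _, m := range parseOutput(out, []string{"mymetric"}) {
		fmt.Println(m.Name, m.Tags, m.Value)
	}
}
```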

1.7 - diskstat collector

Toplevel diskstatMetric

`diskstat` collector

  "diskstat": {
    "exclude_metrics": [
      "disk_total"
    ]
  }

The diskstat collector reads data from /proc/self/mounts and outputs a handful of node metrics. If a metric is not required, it can be excluded from being forwarded to the sink.

Metrics per device (with device tag):

disk_total (unit GBytes)
disk_free (unit GBytes)

Global metrics:

part_max_used (unit percent)

1.8 - gpfs collector

Toplevel gpfsMetric

`gpfs` collector

  "gpfs": {
    "mmpmon_path": "/path/to/mmpmon",
    "exclude_filesystem": [
      "fs1"
    ],
    "send_bandwidths": true,
    "send_total_values": true
  }

The gpfs collector uses the mmpmon command to read performance metrics for GPFS / IBM Spectrum Scale filesystems.

The reported filesystems can be filtered with the exclude_filesystem option in the configuration.

The path to the mmpmon command can be configured with the mmpmon_path option in the configuration. If nothing is set, the collector searches in $PATH for mmpmon.

Metrics:

gpfs_bytes_read
gpfs_bytes_written
gpfs_num_opens
gpfs_num_closes
gpfs_num_reads
gpfs_num_writes
gpfs_num_readdirs
gpfs_num_inode_updates
gpfs_bytes_total = gpfs_bytes_read + gpfs_bytes_written (if send_total_values == true)
gpfs_iops = gpfs_num_reads + gpfs_num_writes (if send_total_values == true)
gpfs_metaops = gpfs_num_inode_updates + gpfs_num_closes + gpfs_num_opens + gpfs_num_readdirs (if send_total_values == true)
gpfs_bw_read (if send_bandwidths == true)
gpfs_bw_write (if send_bandwidths == true)

The collector adds a filesystem tag to all metrics
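The relationship between the absolute counters and the optional derived metrics can be sketched as follows. The struct sample, the function derive and all numbers are assumptions for illustration. In particular, computing the bandwidths as byte deltas per second is an assumed interpretation, since the unit is not stated here.

```go
package main

import "fmt"

// sample holds the absolute counters reported for one filesystem.
type sample struct {
	BytesRead, BytesWritten      int64
	Opens, Closes, Reads, Writes int64
	Readdirs, InodeUpdates       int64
}

// derive computes the optional metrics from two consecutive samples
// that were taken the given number of seconds apart.
func derive(prev, cur sample, seconds float64, sendTotals, sendBandwidths bool) map[string]float64 {
	out := map[string]float64{}
	if sendTotals {
		out["gpfs_bytes_total"] = float64(cur.BytesRead + cur.BytesWritten)
		out["gpfs_iops"] = float64(cur.Reads + cur.Writes)
		out["gpfs_metaops"] = float64(cur.InodeUpdates + cur.Closes + cur.Opens + cur.Readdirs)
	}
	if sendBandwidths && seconds > 0 {
		// Assumed unit: bytes per second between the two samples.
		out["gpfs_bw_read"] = float64(cur.BytesRead-prev.BytesRead) / seconds
		out["gpfs_bw_write"] = float64(cur.BytesWritten-prev.BytesWritten) / seconds
	}
	return out
}

func main() {
	prev := sample{BytesRead: 1000, BytesWritten: 500}
	cur := sample{BytesRead: 3000, BytesWritten: 1500, Opens: 2, Closes: 1,
		Reads: 10, Writes: 5, Readdirs: 3, InodeUpdates: 4}
	fmt.Println(derive(prev, cur, 10, true, true))
}
```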

1.9 - ibstat collector

Toplevel infinibandMetric

`ibstat` collector

  "ibstat": {
    "exclude_devices": [
      "mlx4"
    ],
    "send_abs_values": true,
    "send_derived_values": true
  }

The ibstat collector includes all Infiniband devices that can be found below /sys/class/infiniband/ and where any of the ports provides a LID file (/sys/class/infiniband/<dev>/ports/<port>/lid)

The devices can be filtered with the exclude_devices option in the configuration.

For each found LID the collector reads data through the sysfs files below /sys/class/infiniband/<device>. (See: https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-class-infiniband)

Metrics:

ib_recv
ib_xmit
ib_recv_pkts
ib_xmit_pkts
ib_total = ib_recv + ib_xmit (if send_total_values == true)
ib_total_pkts = ib_recv_pkts + ib_xmit_pkts (if send_total_values == true)
ib_recv_bw (if send_derived_values == true)
ib_xmit_bw (if send_derived_values == true)
ib_recv_pkts_bw (if send_derived_values == true)
ib_xmit_pkts_bw (if send_derived_values == true)

The collector adds a device tag to all metrics

1.10 - iostat collector

Toplevel iostatMetric

`iostat` collector

  "iostat": {
    "exclude_metrics": [
      "read_ms"
    ]
  }

The iostat collector reads data from /proc/diskstats and outputs a handful of node metrics. If a metric is not required, it can be excluded from being forwarded to the sink.

Metrics:

io_reads
io_reads_merged
io_read_sectors
io_read_ms
io_writes
io_writes_merged
io_writes_sectors
io_writes_ms
io_ioops
io_ioops_ms
io_ioops_weighted_ms
io_discards
io_discards_merged
io_discards_sectors
io_discards_ms
io_flushes
io_flushes_ms

The device name is added as tag device. For more details, see https://www.kernel.org/doc/html/latest/admin-guide/iostats.html

1.11 - ipmistat collector

Toplevel ipmiMetric

`ipmistat` collector

  "ipmistat": {
    "ipmitool_path": "/path/to/ipmitool",
    "ipmisensors_path": "/path/to/ipmi-sensors"
  }

The ipmistat collector reads data from ipmitool (ipmitool sensor) or ipmi-sensors (ipmi-sensors --sdr-cache-recreate --comma-separated-output).

The metrics depend on the output of the underlying tools but contain temperature, power and energy metrics.

1.12 - likwid collector

Toplevel likwidMetric

`likwid` collector

The likwid collector is probably the most complicated collector. The LIKWID library is included as static library with direct access mode. The direct access mode is suitable if the daemon is executed by a root user. The static library does not contain the performance groups, so all information needs to be provided in the configuration.

  "likwid": {
    "force_overwrite" : false,
    "invalid_to_zero" : false,
    "liblikwid_path" : "/path/to/liblikwid.so",
    "accessdaemon_path" : "/folder/that/contains/likwid-accessD",
    "access_mode" : "direct or accessdaemon or perf_event",
    "lockfile_path" : "/var/run/likwid.lock",
    "eventsets": [
      {
        "events" : {
          "COUNTER0": "EVENT0",
          "COUNTER1": "EVENT1"
        },
        "metrics" : [
          {
            "name": "sum_01",
            "calc": "COUNTER0 + COUNTER1",
            "publish": false,
            "unit": "myunit",
            "type": "hwthread"
          }
        ]
      }
    ],
    "globalmetrics" : [
      {
        "name": "global_sum",
        "calc": "sum_01",
        "publish": true,
        "unit": "myunit",
        "type": "hwthread"
      }
    ]
  }

The likwid configuration consists of two parts, the eventsets and globalmetrics:

An event set list itself has two parts, the events and a set of derivable metrics. Each of the events is a counter:event pair in LIKWID’s syntax. The metrics are a list of formulas to derive the metric value from the measurements of the events’ values. Each metric has a name, the formula, a type and a publish flag. There is an optional unit field. Counter names can be used like variables in the formulas, so PMC0+PMC1 sums the measurements for both events configured in the counters PMC0 and PMC1. You can optionally use time for the measurement time and inverseClock for 1.0/baseCpuFrequency. The type tells the LikwidCollector whether it is a metric for each hardware thread (cpu) or each CPU socket (socket). You may specify a unit for the metric with unit. The last one is the publishing flag. It tells the LikwidCollector whether a metric should be sent to the router or is only used internally to compute a global metric.
The globalmetrics are metrics which require data from multiple event set measurements to be derived. The inputs are the metrics in the event sets. Similar to the metrics in the event sets, the global metrics are defined by a name, a formula, a type and a publish flag. See event set metrics for details. The only difference is that there is no access to the raw event measurements anymore but only to the metrics. Also time and inverseClock cannot be used anymore. So, the idea is to derive a metric in the eventsets section and reuse it in the globalmetrics part. If you need a metric only for deriving the global metrics, disable forwarding of the event set metrics ("publish": false). Be aware that the combination might be misleading because the “behavior” of a metric changes over time and the multiple measurements might count different computing phases. Similar to the metrics in the eventset, you can specify a metric unit with the unit field.
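To make the two-stage evaluation more concrete, the sketch below evaluates simple formulas with counter names as variables and then reuses the result in a global metric. It is built on Go's go/parser package purely for illustration. The functions calc and eval are hypothetical, support only +, -, *, / and parentheses, and say nothing about the expression evaluator the collector really uses.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// eval computes the value of a parsed arithmetic expression,
// looking up identifiers in vars.
func eval(e ast.Expr, vars map[string]float64) (float64, error) {
	switch n := e.(type) {
	case *ast.BasicLit:
		if n.Kind == token.INT || n.Kind == token.FLOAT {
			return strconv.ParseFloat(n.Value, 64)
		}
	case *ast.Ident:
		if v, ok := vars[n.Name]; ok {
			return v, nil
		}
		return 0, fmt.Errorf("unknown variable %q", n.Name)
	case *ast.ParenExpr:
		return eval(n.X, vars)
	case *ast.BinaryExpr:
		l, err := eval(n.X, vars)
		if err != nil {
			return 0, err
		}
		r, err := eval(n.Y, vars)
		if err != nil {
			return 0, err
		}
		switch n.Op {
		case token.ADD:
			return l + r, nil
		case token.SUB:
			return l - r, nil
		case token.MUL:
			return l * r, nil
		case token.QUO:
			return l / r, nil
		}
	}
	return 0, fmt.Errorf("unsupported expression")
}

// calc parses a formula and evaluates it with the given variables.
func calc(formula string, vars map[string]float64) (float64, error) {
	expr, err := parser.ParseExpr(formula)
	if err != nil {
		return 0, err
	}
	return eval(expr, vars)
}

func main() {
	// Event set metric: counter names act as variables (values invented).
	sum01, err := calc("COUNTER0 + COUNTER1", map[string]float64{"COUNTER0": 40, "COUNTER1": 2})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Global metric: only event set metrics are visible as variables.
	global, err := calc("sum_01", map[string]float64{"sum_01": sum01})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sum01, global)
}
```

In this sketch a global formula that references a raw counter fails with an unknown variable error, which mirrors the rule that global metrics only see event set metrics.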

Additional options:

force_overwrite: Same as setting LIKWID_FORCE=1. In case counters are already in-use, LIKWID overwrites their configuration to do its measurements
invalid_to_zero: In some cases, the calculations result in NaN or Inf. With this option, all NaN and Inf values are replaced with 0.0. See the separate section below.
access_mode: Specify LIKWID access mode: direct for direct register access as root user or accessdaemon. The access mode perf_event is currently untested.
accessdaemon_path: Folder of the accessDaemon likwid-accessD (like /usr/local/sbin)
liblikwid_path: Location of liblikwid.so including file name like /usr/local/lib/liblikwid.so
lockfile_path: Location of LIKWID’s lock file if multiple tools should access the hardware counters. Default /var/run/likwid.lock

Available metric types

Hardware performance counters are scattered all over the system nowadays. A counter covers a specific part of the system. While there are hardware thread specific counters for CPU cycles, instructions and so on, some others are specific for a whole CPU socket/package. To address that, the LikwidCollector provides the specification of a type for each metric.

hwthread : One metric per CPU hardware thread with the tags "type" : "hwthread" and "type-id" : "$hwthread_id"
socket : One metric per CPU socket/package with the tags "type" : "socket" and "type-id" : "$socket_id"

Note: You cannot specify socket type for a metric that is measured at hwthread type, so some kind of expert knowledge or lookup work in the Likwid Wiki is required. Get the type of each counter from the Architecture pages and as soon as one counter in a metric is socket-specific, the whole metric is socket-specific.

As a guideline:

All counters FIXCx, PMCy and TMAz have the type hwthread
All counters names containing BOX have the type socket
All PWRx counters have type socket, except "PWR1" : "RAPL_CORE_ENERGY" has hwthread type
All DFCx counters have type socket

Help with the configuration

The configuration for the likwid collector is quite complicated. Most users don’t use LIKWID with the event:counter notation but rely on the performance groups defined by the LIKWID team for each architecture. In order to help with the likwid collector configuration, we included a script scripts/likwid_perfgroup_to_cc_config.py that creates the configuration of an eventset from a performance group (using a LIKWID installation in $PATH):

$ likwid-perfctr -i
[...]
short name: ICX
[...]
$ likwid-perfctr -a
[...]
MEM_DP
MEM
FLOPS_SP
CLOCK
[...]
$ scripts/likwid_perfgroup_to_cc_config.py ICX MEM_DP
{
  "events": {
    "FIXC0": "INSTR_RETIRED_ANY",
    "FIXC1": "CPU_CLK_UNHALTED_CORE",
    "..." : "..."
  },
  "metrics" : [
    {
      "calc": "time",
      "name": "Runtime (RDTSC) [s]",
      "publish": true,
      "unit": "seconds",
      "type": "hwthread"
    },
    {
      "..." : "..."
    }
  ]
}

You can copy this JSON and add it to the eventsets list. If you specify multiple event sets, you can add globally derived metrics in the extra globalmetrics section with the metric names as variables.

Mixed usage between daemon and users

LIKWID checks the file /var/run/likwid.lock before performing any interfering operations. Who is allowed to access the counters is determined by the owner of the file. If it does not exist, it is created for the current user. So, if you want to temporarily allow counter access to a user (e.g. in a job):

Before (SLURM prolog, …)

chown $JOBUSER /var/run/likwid.lock

After (SLURM epilog, …)

chown $CCUSER /var/run/likwid.lock

`invalid_to_zero` option

In some cases LIKWID returns 0.0 for some events that are further used in processing and maybe used as divisor in a calculation. After evaluation of a metric, the result might be NaN or +-Inf. These resulting metrics are commonly not created and forwarded to the router because the InfluxDB line protocol does not support these special floating-point values. If you want to have them sent, this option forces these metric values to be 0.0 instead.

One might think this does not happen often, but metrics commonly used in performance engineering, like Instructions-per-Cycle (IPC) or, more frequently, the actual CPU clock, are derived from events like CPU_CLK_UNHALTED_CORE (Intel), which do not increment in halted state (as the name implies). There are different power management systems in a chip that can cause a hardware thread to enter such a state. Moreover, if the core executes no cycles, many other events are not incremented either (like INSTR_RETIRED_ANY for retired instructions, which is part of IPC).
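The effect of the option can be sketched in a few lines of Go. The function sanitize is a hypothetical helper written for this example: it either drops a NaN or Inf result or replaces it with 0.0, depending on the flag.

```go
package main

import (
	"fmt"
	"math"
)

// sanitize returns the value to forward and whether it should be
// forwarded at all. NaN and +-Inf are dropped unless invalidToZero is set.
func sanitize(v float64, invalidToZero bool) (float64, bool) {
	if math.IsNaN(v) || math.IsInf(v, 0) {
		if invalidToZero {
			return 0.0, true
		}
		return 0, false
	}
	return v, true
}

func main() {
	// A halted hardware thread: neither instructions nor cycles counted.
	instructions, cycles := 0.0, 0.0
	ipc := instructions / cycles // evaluates to NaN

	if _, ok := sanitize(ipc, false); !ok {
		fmt.Println("invalid_to_zero=false: metric is not forwarded")
	}
	if v, ok := sanitize(ipc, true); ok {
		fmt.Println("invalid_to_zero=true: metric forwarded with value", v)
	}
}
```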

`lockfile_path` option

LIKWID can be configured with a lock file with which the access to the performance monitoring registers can be disabled (only the owner of the lock file is allowed to access the registers). When the lockfile_path option is set, the collector subscribes to changes to this file to stop monitoring if the owner of the lock file changes. This feature is useful when users should be able to perform own hardware performance counter measurements through LIKWID or any other tool.

`send_*_total_values` option

send_core_total_values: Metrics, which are usually collected on a per hardware thread basis, are additionally summed up per CPU core.
send_socket_total_values: Metrics, which are usually collected on a per hardware thread basis, are additionally summed up per CPU socket.
send_node_total_values: Metrics, which are usually collected on a per hardware thread basis, are additionally summed up per node.

Example configuration

AMD Zen3

  "likwid": {
    "force_overwrite" : false,
    "invalid_to_zero" : false,
    "eventsets": [
      {
        "events": {
          "FIXC1": "ACTUAL_CPU_CLOCK",
          "FIXC2": "MAX_CPU_CLOCK",
          "PMC0": "RETIRED_INSTRUCTIONS",
          "PMC1": "CPU_CLOCKS_UNHALTED",
          "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
          "PMC3": "MERGE",
          "DFC0": "DRAM_CHANNEL_0",
          "DFC1": "DRAM_CHANNEL_1",
          "DFC2": "DRAM_CHANNEL_2",
          "DFC3": "DRAM_CHANNEL_3"
        },
        "metrics": [
          {
            "name": "ipc",
            "calc": "PMC0/PMC1",
            "type": "hwthread",
            "publish": true
          },
          {
            "name": "flops_any",
            "calc": "0.000001*PMC2/time",
            "unit": "MFlops/s",
            "type": "hwthread",
            "publish": true
          },
          {
            "name": "clock",
            "calc": "0.000001*(FIXC1/FIXC2)/inverseClock",
            "type": "hwthread",
            "unit": "MHz",
            "publish": true
          },
          {
            "name": "mem1",
            "calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
            "unit": "Mbyte/s",
            "type": "socket",
            "publish": false
          }
        ]
      },
      {
        "events": {
          "DFC0": "DRAM_CHANNEL_4",
          "DFC1": "DRAM_CHANNEL_5",
          "DFC2": "DRAM_CHANNEL_6",
          "DFC3": "DRAM_CHANNEL_7",
          "PWR0": "RAPL_CORE_ENERGY",
          "PWR1": "RAPL_PKG_ENERGY"
        },
        "metrics": [
          {
            "name": "pwr_core",
            "calc": "PWR0/time",
            "unit": "Watt",
            "type": "socket",
            "publish": true
          },
          {
            "name": "pwr_pkg",
            "calc": "PWR1/time",
            "type": "socket",
            "unit": "Watt",
            "publish": true
          },
          {
            "name": "mem2",
            "calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
            "unit": "Mbyte/s",
            "type": "socket",
            "publish": false
          }
        ]
      }
    ],
    "globalmetrics": [
      {
        "name": "mem_bw",
        "calc": "mem1+mem2",
        "type": "socket",
        "unit": "Mbyte/s",
        "publish": true
      }
    ]
  }

How to get the eventsets and metrics from LIKWID

The likwid collector reads hardware performance counters at the hwthread and socket level. The configuration looks complicated, but it is basically copy and paste from LIKWID’s performance groups. Earlier iterations of the collector tried to use the performance groups directly, but that approach lacked flexibility. The current way of configuration provides the most flexibility.

The logic is as follows: There are multiple eventsets, each consisting of a list of counters+events and a list of metrics. If you compare a common performance group with the example setting above, there is not much difference:

EVENTSET                         ->   "events": {
FIXC1 ACTUAL_CPU_CLOCK           ->     "FIXC1": "ACTUAL_CPU_CLOCK",
FIXC2 MAX_CPU_CLOCK              ->     "FIXC2": "MAX_CPU_CLOCK",
PMC0  RETIRED_INSTRUCTIONS       ->     "PMC0" : "RETIRED_INSTRUCTIONS",
PMC1  CPU_CLOCKS_UNHALTED        ->     "PMC1" : "CPU_CLOCKS_UNHALTED",
PMC2  RETIRED_SSE_AVX_FLOPS_ALL  ->     "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
PMC3  MERGE                      ->     "PMC3": "MERGE",
                                 ->   }

The metrics follow the same procedure:

METRICS                          ->   "metrics": [
IPC   PMC0/PMC1                  ->     {
                                 ->       "name" : "IPC",
                                 ->       "calc" : "PMC0/PMC1",
                                 ->       "type": "hwthread",
                                 ->       "publish": true
                                 ->     }
                                 ->   ]

The script scripts/likwid_perfgroup_to_cc_config.py might help you.

1.13 - loadavg collector

Toplevel loadavgMetric

`loadavg` collector

  "loadavg": {
    "exclude_metrics": [
      "proc_run"
    ]
  }

The loadavg collector reads data from /proc/loadavg and outputs a handful of node metrics. If a metric is not required, it can be excluded from being forwarded to the sink.

Metrics:

load_one
load_five
load_fifteen
proc_run
proc_total
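As an illustration of where these five values could come from, the sketch below parses one line in the format of /proc/loadavg (three load averages, then running/total scheduling entities, then the last PID). The function parseLoadavg and the mapping of fields to metric names are assumptions for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLoadavg maps the fields of a /proc/loadavg line to metric names.
func parseLoadavg(text string) (map[string]float64, error) {
	fields := strings.Fields(text)
	if len(fields) < 4 {
		return nil, fmt.Errorf("unexpected format: %q", text)
	}
	out := make(map[string]float64)
	for i, name := range []string{"load_one", "load_five", "load_fifteen"} {
		v, err := strconv.ParseFloat(fields[i], 64)
		if err != nil {
			return nil, err
		}
		out[name] = v
	}
	// The fourth field has the form "<running>/<total>".
	procs := strings.SplitN(fields[3], "/", 2)
	if len(procs) != 2 {
		return nil, fmt.Errorf("unexpected process field: %q", fields[3])
	}
	for i, name := range []string{"proc_run", "proc_total"} {
		v, err := strconv.ParseFloat(procs[i], 64)
		if err != nil {
			return nil, err
		}
		out[name] = v
	}
	return out, nil
}

func main() {
	// Invented sample content of /proc/loadavg.
	m, err := parseLoadavg("0.5 0.25 0.125 2/300 12345\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m["load_one"], m["proc_run"], m["proc_total"])
}
```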

1.14 - lustrestat collector

Toplevel lustreMetric

`lustrestat` collector

  "lustrestat": {
    "lctl_command": "/path/to/lctl",
    "exclude_metrics": [
      "setattr",
      "getattr"
    ],
    "send_abs_values" : true,
    "send_derived_values" : true,
    "send_diff_values": true,
    "use_sudo": false
  }

The lustrestat collector uses the lctl application with the get_param option to get all llite metrics (Lustre client). The llite metrics are only available for root users. If password-less sudo is configured, you can enable sudo in the configuration.

Metrics:

lustre_read_bytes (unit bytes)
lustre_read_requests (unit requests)
lustre_write_bytes (unit bytes)
lustre_write_requests (unit requests)
lustre_open
lustre_close
lustre_getattr
lustre_setattr
lustre_statfs
lustre_inode_permission
lustre_read_bw (if send_derived_values == true, unit bytes/sec)
lustre_write_bw (if send_derived_values == true, unit bytes/sec)
lustre_read_requests_rate (if send_derived_values == true, unit requests/sec)
lustre_write_requests_rate (if send_derived_values == true, unit requests/sec)
lustre_read_bytes_diff (if send_diff_values == true, unit bytes)
lustre_read_requests_diff (if send_diff_values == true, unit requests)
lustre_write_bytes_diff (if send_diff_values == true, unit bytes)
lustre_write_requests_diff (if send_diff_values == true, unit requests)
lustre_open_diff (if send_diff_values == true)
lustre_close_diff (if send_diff_values == true)
lustre_getattr_diff (if send_diff_values == true)
lustre_setattr_diff (if send_diff_values == true)
lustre_statfs_diff (if send_diff_values == true)
lustre_inode_permission_diff (if send_diff_values == true)

This collector adds a device tag.

1.15 - memstat collector

Toplevel memstatMetric

`memstat` collector

  "memstat": {
    "exclude_metrics": [
      "mem_used"
    ]
  }

The memstat collector reads data from /proc/meminfo and outputs a handful of node metrics. If a metric is not required, it can be excluded from being forwarded to the sink.

Metrics:

mem_total
mem_sreclaimable
mem_slab
mem_free
mem_buffers
mem_cached
mem_available
mem_shared
swap_total
swap_free
mem_used = mem_total - (mem_free + mem_buffers + mem_cached)
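The mem_used formula can be sketched against the text format of /proc/meminfo. The helper names and the mapping of mem_total, mem_free, mem_buffers and mem_cached to the keys MemTotal, MemFree, Buffers and Cached are assumptions made for this example, and the sample values are invented.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMeminfo reads "Key:  <value> kB" lines as found in /proc/meminfo.
func parseMeminfo(text string) map[string]int64 {
	info := make(map[string]int64)
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		v, err := strconv.ParseInt(fields[1], 10, 64)
		if err != nil {
			continue // skip lines without a numeric value
		}
		info[strings.TrimSuffix(fields[0], ":")] = v
	}
	return info
}

// memUsed applies mem_total - (mem_free + mem_buffers + mem_cached).
func memUsed(info map[string]int64) int64 {
	return info["MemTotal"] - (info["MemFree"] + info["Buffers"] + info["Cached"])
}

func main() {
	text := "MemTotal: 16000 kB\nMemFree: 4000 kB\nBuffers: 1000 kB\nCached: 3000 kB\n"
	fmt.Println("mem_used (kB):", memUsed(parseMeminfo(text)))
}
```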

1.16 - netstat collector

Toplevel netstatMetric

`netstat` collector

  "netstat": {
    "include_devices": [
      "eth0"
    ],
    "send_abs_values" : true,
    "send_derived_values" : true
  }

The netstat collector reads data from /proc/net/dev and outputs a handful of node metrics. With the include_devices list you can specify which network devices should be measured. Note: Most other collectors use an exclude list instead of an include list.

Metrics:

net_bytes_in (unit=bytes)
net_bytes_out (unit=bytes)
net_pkts_in (unit=packets)
net_pkts_out (unit=packets)
net_bytes_in_bw (unit=bytes/sec if send_derived_values == true)
net_bytes_out_bw (unit=bytes/sec if send_derived_values == true)
net_pkts_in_bw (unit=packets/sec if send_derived_values == true)
net_pkts_out_bw (unit=packets/sec if send_derived_values == true)

The device name is added as tag stype=network,stype-id=<device>.
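
How the absolute counters could be read from a /proc/net/dev line and filtered with an include list is sketched below. The field positions follow the kernel's /proc/net/dev layout (receive bytes and packets first, transmit bytes and packets in the ninth and tenth column); the type devCounters and the function parseNetDevLine are hypothetical names, not the collector's implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// devCounters holds the four absolute counters listed above.
type devCounters struct {
	BytesIn, PktsIn, BytesOut, PktsOut uint64
}

// parseNetDevLine parses one device line of /proc/net/dev. It returns false
// for lines without a "<device>:" prefix (such as the two header lines) or
// with too few columns. Unparsable numbers are reported as 0 in this sketch.
func parseNetDevLine(line string) (string, devCounters, bool) {
	name, rest, found := strings.Cut(line, ":")
	if !found {
		return "", devCounters{}, false
	}
	fields := strings.Fields(rest)
	if len(fields) < 10 {
		return "", devCounters{}, false
	}
	num := func(i int) uint64 {
		v, _ := strconv.ParseUint(fields[i], 10, 64)
		return v
	}
	c := devCounters{BytesIn: num(0), PktsIn: num(1), BytesOut: num(8), PktsOut: num(9)}
	return strings.TrimSpace(name), c, true
}

func main() {
	include := map[string]bool{"eth0": true} // mirrors include_devices
	lines := []string{
		"    lo: 500 5 0 0 0 0 0 0 500 5 0 0 0 0 0 0",
		"  eth0: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0",
	}
	for _, l := range lines {
		dev, c, ok := parseNetDevLine(l)
		if !ok || !include[dev] {
			continue
		}
		fmt.Printf("stype-id=%s net_bytes_in=%d net_bytes_out=%d net_pkts_in=%d net_pkts_out=%d\n",
			dev, c.BytesIn, c.BytesOut, c.PktsIn, c.PktsOut)
	}
}
```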

1.17 - nfs3stat collector

Toplevel nfs3Metric

`nfs3stat` collector

  "nfs3stat": {
    "nfsstat" : "/path/to/nfsstat",
    "exclude_metrics": [
      "nfs3_total"
    ]
  }

The nfs3stat collector reads data from the nfsstat command and outputs a handful of node metrics. If a metric is not required, it can be excluded from being forwarded to the sink. There is currently no possibility to get the metrics per mount point.

Metrics:

nfs3_total
nfs3_null
nfs3_getattr
nfs3_setattr
nfs3_lookup
nfs3_access
nfs3_readlink
nfs3_read
nfs3_write
nfs3_create
nfs3_mkdir
nfs3_symlink
nfs3_remove
nfs3_rmdir
nfs3_rename
nfs3_link
nfs3_readdir
nfs3_readdirplus
nfs3_fsstat
nfs3_fsinfo
nfs3_pathconf
nfs3_commit

1.18 - nfs4stat collector

Toplevel nfs4Metric

`nfs4stat` collector

  "nfs4stat": {
    "nfsstat" : "/path/to/nfsstat",
    "exclude_metrics": [
      "nfs4_total"
    ]
  }

The nfs4stat collector reads data from the nfsstat command and outputs a handful of node metrics. If a metric is not required, it can be excluded from being forwarded to the sink. There is currently no possibility to get the metrics per mount point.

Metrics:

nfs4_total
nfs4_null
nfs4_read
nfs4_write
nfs4_commit
nfs4_open
nfs4_open_conf
nfs4_open_noat
nfs4_open_dgrd
nfs4_close
nfs4_setattr
nfs4_fsinfo
nfs4_renew
nfs4_setclntid
nfs4_confirm
nfs4_lock
nfs4_lockt
nfs4_locku
nfs4_access
nfs4_getattr
nfs4_lookup
nfs4_lookup_root
nfs4_remove
nfs4_rename
nfs4_link
nfs4_symlink
nfs4_create
nfs4_pathconf
nfs4_statfs
nfs4_readlink
nfs4_readdir
nfs4_server_caps
nfs4_delegreturn
nfs4_getacl
nfs4_setacl
nfs4_rel_lkowner
nfs4_exchange_id
nfs4_create_session
nfs4_destroy_session
nfs4_sequence
nfs4_get_lease_time
nfs4_reclaim_comp
nfs4_secinfo_no
nfs4_bind_conn_to_ses

1.19 - nfsiostat collector

Toplevel nfsiostatMetric

`nfsiostat` collector

  "nfsiostat": {
    "exclude_metrics": [
      "nfsio_oread"
    ],
    "exclude_filesystems" : [
        "/mnt"
    ],
    "use_server_as_stype": false
  }

The nfsiostat collector reads data from /proc/self/mountstats and outputs a handful of node metrics for each NFS filesystem. If a metric or filesystem is not required, it can be excluded from being forwarded to the sink.

Metrics:

nfsio_nread: Bytes transferred by normal read() calls
nfsio_nwrite: Bytes transferred by normal write() calls
nfsio_oread: Bytes transferred by read() calls with O_DIRECT
nfsio_owrite: Bytes transferred by write() calls with O_DIRECT
nfsio_pageread: Pages transferred by read() calls
nfsio_pagewrite: Pages transferred by write() calls
nfsio_nfsread: Bytes transferred for reading from the server
nfsio_nfswrite: Bytes transferred for writing to the server

The nfsiostat collector adds the mountpoint to the tags as stype=filesystem,stype-id=<mountpoint>. If the server address should be used instead of the mountpoint, use the use_server_as_stype config setting.

1.20 - numastat collector

Toplevel numastatsMetric

`numastat` collector

  "numastats": {}

The numastat collector reads data from /sys/devices/system/node/node*/numastat and outputs a handful of memoryDomain metrics. See: https://www.kernel.org/doc/html/latest/admin-guide/numastat.html

Metrics:

numastats_numa_hit: A process wanted to allocate memory from this node, and succeeded.
numastats_numa_miss: A process wanted to allocate memory from another node, but ended up with memory from this node.
numastats_numa_foreign: A process wanted to allocate on this node, but ended up with memory from another node.
numastats_local_node: A process ran on this node’s CPU, and got memory from this node.
numastats_other_node: A process ran on a different node’s CPU, and got memory from this node.
numastats_interleave_hit: Interleaving wanted to allocate from this node and succeeded.

1.21 - nvidia collector

Toplevel nvidiaMetric

`nvidia` collector

  "nvidia": {
    "exclude_devices": [
      "0","1", "00000000:ff:01.0"
    ],
    "exclude_metrics": [
      "nv_fb_mem_used",
      "nv_fan"
    ],
    "process_mig_devices": false,
    "use_pci_info_as_type_id": true,
    "add_pci_info_tag": false,
    "add_uuid_meta": false,
    "add_board_number_meta": false,
    "add_serial_meta": false,
    "use_uuid_for_mig_device": false,
    "use_slice_for_mig_device": false
  }

The nvidia collector can be configured to leave out specific devices with the exclude_devices option. It takes IDs as supplied to the NVML with nvmlDeviceGetHandleByIndex() or the PCI address in NVML format (%08X:%02X:%02X.0). Metrics (listed below) that should not be sent to the MetricRouter can be excluded with the exclude_metrics option. Commonly only the physical GPUs are monitored. If MIG devices should be analyzed as well, set process_mig_devices (adds stype=mig,stype-id=<mig_index>). With the options use_uuid_for_mig_device and use_slice_for_mig_device, the <mig_index> can be replaced with the UUID (e.g. MIG-6a9f7cc8-6d5b-5ce0-92de-750edc4d8849) or the MIG slice name (e.g. 1g.5gb).
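
To illustrate the PCI address format mentioned above, the following Go sketch applies the format string %08X:%02X:%02X.0 to a domain, bus and device number. The function name pciAddress and the example values are hypothetical. Note that %X prints upper-case hexadecimal digits.

```go
package main

import "fmt"

// pciAddress formats domain, bus and device numbers with the format string
// quoted in the text (%08X:%02X:%02X.0).
func pciAddress(domain, bus, device uint32) string {
	return fmt.Sprintf("%08X:%02X:%02X.0", domain, bus, device)
}

func main() {
	// Example values chosen for illustration only.
	fmt.Println(pciAddress(0, 0xff, 1))
}
```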

The metrics sent by the nvidia collector use accelerator as type tag. For the type-id, it uses the device handle index by default. With the use_pci_info_as_type_id option, the PCI ID is used instead. If both values should be added as tags, activate the add_pci_info_tag option. It uses the device handle index as type-id and adds the PCI ID as separate pci_identifier tag.

Optionally, it is possible to add the UUID, the board part number and the serial to the meta information. They are not sent to the sinks (if not configured otherwise).

Metrics:

nv_util
nv_mem_util
nv_fb_mem_total
nv_fb_mem_used
nv_bar1_mem_total
nv_bar1_mem_used
nv_temp
nv_fan
nv_ecc_mode
nv_perf_state
nv_power_usage
nv_graphics_clock
nv_sm_clock
nv_mem_clock
nv_video_clock
nv_max_graphics_clock
nv_max_sm_clock
nv_max_mem_clock
nv_max_video_clock
nv_ecc_uncorrected_error
nv_ecc_corrected_error
nv_power_max_limit
nv_encoder_util
nv_decoder_util
nv_remapped_rows_corrected
nv_remapped_rows_uncorrected
nv_remapped_rows_pending
nv_remapped_rows_failure
nv_compute_processes
nv_graphics_processes
nv_violation_power
nv_violation_thermal
nv_violation_sync_boost
nv_violation_board_limit
nv_violation_low_util
nv_violation_reliability
nv_violation_below_app_clock
nv_violation_below_base_clock
nv_nvlink_crc_flit_errors
nv_nvlink_crc_errors
nv_nvlink_ecc_errors
nv_nvlink_replay_errors
nv_nvlink_recovery_errors

Some metrics add an additional sub type tag (stype). For example, the nv_nvlink_* metrics set stype=nvlink,stype-id=<link_number>.

1.22 - rapl collector

Toplevel raplMetric

`rapl` collector

This collector reads running average power limit (RAPL) monitoring attributes to compute average power consumption metrics. See https://www.kernel.org/doc/html/latest/power/powercap/powercap.html#monitoring-attributes.

The Likwid metric collector provides similar functionality.

  "rapl": {
    "exclude_device_by_id": ["0:1", "0:2"],
    "exclude_device_by_name": ["psys"]
  }

Metrics:

rapl_average_power: average power consumption in Watt. The average is computed over the entire runtime from the last measurement to the current measurement

1.23 - rocm_smi collector

Toplevel rocmsmiMetric

`rocm_smi` collector

  "rocm_smi": {
    "exclude_devices": [
      "0","1", "00000000:ff:01.0"
    ],
    "exclude_metrics": [
      "rocm_mm_util",
      "rocm_temp_vrsoc"
    ],
    "use_pci_info_as_type_id": true,
    "add_pci_info_tag": false,
    "add_serial_meta": false
  }

The rocm_smi collector can be configured to leave out specific devices with the exclude_devices option. It takes logical IDs in the list of available devices or the PCI address similar to NVML format (%08X:%02X:%02X.0). Metrics (listed below) that should not be sent to the MetricRouter can be excluded with the exclude_metrics option.

The metrics sent by the rocm_smi collector use accelerator as type tag. For the type-id, it uses the device handle index by default. With the use_pci_info_as_type_id option, the PCI ID is used instead. If both values should be added as tags, activate the add_pci_info_tag option. It uses the device handle index as type-id and adds the PCI ID as separate pci_identifier tag.

Optionally, it is possible to add the serial to the meta information. It is not sent to the sinks (if not configured otherwise).

Metrics:

rocm_gfx_util
rocm_umc_util
rocm_mm_util
rocm_avg_power
rocm_temp_mem
rocm_temp_hotspot
rocm_temp_edge
rocm_temp_vrgfx
rocm_temp_vrsoc
rocm_temp_vrmem
rocm_gfx_clock
rocm_soc_clock
rocm_u_clock
rocm_v0_clock
rocm_v1_clock
rocm_d0_clock
rocm_d1_clock
rocm_temp_hbm

Some metrics add an additional sub type tag (stype). For example, the rocm_temp_hbm metrics set stype=device,stype-id=<HBM_slice_number>.

1.24 - schedstat collector

Toplevel schedstatMetric

`schedstat` collector

  "schedstat": {
  }

The schedstat collector reads data from /proc/schedstat and calculates a load value, separated by hwthread. This might be useful to detect bad CPU pinning on shared nodes.

Metric:

cpu_load_core

1.25 - self collector

Toplevel selfMetric

`self` collector

  "self": {
    "read_mem_stats" : true,
    "read_goroutines" : true,
    "read_cgo_calls" : true,
    "read_rusage" : true
  }

The self collector reads the data from the runtime and syscall packages, so it monitors the execution of the cc-metric-collector itself.

Metrics:

If read_mem_stats == true:
- total_alloc: The metric reports cumulative bytes allocated for heap objects.
- heap_alloc: The metric reports bytes of allocated heap objects.
- heap_sys: The metric reports bytes of heap memory obtained from the OS.
- heap_idle: The metric reports bytes in idle (unused) spans.
- heap_inuse: The metric reports bytes in in-use spans.
- heap_released: The metric reports bytes of physical memory returned to the OS.
- heap_objects: The metric reports the number of allocated heap objects.
If read_goroutines == true:
- num_goroutines: The metric reports the number of goroutines that currently exist.
If read_cgo_calls == true:
- num_cgo_calls: The metric reports the number of cgo calls made by the current process.
If read_rusage == true:
- rusage_user_time: The metric reports the amount of time that this process has been scheduled in user mode.
- rusage_system_time: The metric reports the amount of time that this process has been scheduled in kernel mode.
- rusage_vol_ctx_switch: The metric reports the amount of voluntary context switches.
- rusage_invol_ctx_switch: The metric reports the amount of involuntary context switches.
- rusage_signals: The metric reports the number of signals received.
- rusage_major_pgfaults: The metric reports the number of major faults the process has made which have required loading a memory page from disk.
- rusage_minor_pgfaults: The metric reports the number of minor faults the process has made which have not required loading a memory page from disk.
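
The values behind read_mem_stats, read_goroutines and read_cgo_calls are available from Go's runtime package. A minimal sketch, assuming the metric names map to the runtime.MemStats fields of the same meaning; the function name selfMetrics is hypothetical and the rusage metrics are left out here:

```go
package main

import (
	"fmt"
	"runtime"
)

// selfMetrics gathers the runtime-based values into a map keyed by the
// metric names from the list above. The name-to-field mapping is an
// assumption of this sketch.
func selfMetrics() map[string]uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return map[string]uint64{
		"total_alloc":    ms.TotalAlloc,
		"heap_alloc":     ms.HeapAlloc,
		"heap_sys":       ms.HeapSys,
		"heap_idle":      ms.HeapIdle,
		"heap_inuse":     ms.HeapInuse,
		"heap_released":  ms.HeapReleased,
		"heap_objects":   ms.HeapObjects,
		"num_goroutines": uint64(runtime.NumGoroutine()),
		"num_cgo_calls":  uint64(runtime.NumCgoCall()),
	}
}

func main() {
	m := selfMetrics()
	for _, name := range []string{"heap_alloc", "heap_sys", "num_goroutines"} {
		fmt.Println(name, m[name])
	}
}
```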

1.26 - tempstat collector

Toplevel tempMetric

`tempstat` collector

  "tempstat": {
    "tag_override" : {
        "<device like hwmon1>" : {
            "type" : "socket",
            "type-id" : "0"
        }
    },
    "exclude_metrics": [
      "metric1",
      "metric2"
    ]
  }

The tempstat collector reads the data from /sys/class/hwmon/<device>/tempX_{input,label}.

Metrics:

temp_*: The metric name is taken from the label files.

1.27 - topprocs collector

Toplevel topprocsMetric

`topprocs` collector

  "topprocs": {
    "num_procs": 5
  }

The topprocs collector reads the TopX processes (sorted by CPU utilization, ps -Ao comm --sort=-pcpu).

In contrast to most other collectors, the metric value is a string.

2 - cc-metric-collector's message processor

Documentation of cc-metric-collector’s message processor

Message Processor Component

Multiple parts of the ClusterCockpit ecosystem require the processing of CCMessages. The main CC application using it is cc-metric-collector. The processing part there was originally in the metric router, the central hub connecting collectors (reading local data), receivers (receiving remote data) and sinks (sending data). Already in early stages, the lack of flexibility caused some trouble:

The sysadmins wanted to keep operating their Ganglia based monitoring infrastructure while we developed the CC stack. Ganglia wants the core metrics with a specific name and resolution (right unit prefix) but there was no conversion of the data in the CC stack, so CC frontend developers wanted a different resolution for some metrics. The issue was basically the mem_used metric showing the currently used memory of the node. Ganglia wants it in kByte as provided by the Linux operating system but CC wanted it in GByte.

With the message processor, the Ganglia sinks can apply the unit prefix changes individually and name the metrics as required by Ganglia.

For developers

Whenever you receive or are about to send a message out, you should provide some processing.

Configuration of component

New operations can be added to the message processor at runtime. Of course, they can also be removed again. For the initial setup, the operations can be given in a configuration file or in some fields of a configuration file.

The message processor uses the following configuration:

{
	"drop_messages": [
		"name_of_message_to_drop"
	],
	"drop_messages_if": [
		"condition_when_to_drop_message",
		"name == 'drop_this'",
		"tag.hostname == 'this_host'",
		"meta.unit != 'MB'"
	],
	"rename_messages" : {
		"old_message_name" : "new_message_name"
	},
	"rename_messages_if": {
		"condition_when_to_rename_message" : "new_name"
	},
	"add_tags_if": [
		{
			"if" : "condition_when_to_add_tag",
			"key": "name_for_new_tag",
			"value": "new_tag_value"
		}
	],
	"delete_tags_if": [
		{
			"if" : "condition_when_to_delete_tag",
			"key": "name_of_tag"
		}
	],
	"add_meta_if": [
		{
			"if" : "condition_when_to_add_meta_info",
			"key": "name_for_new_meta_info",
			"value": "new_meta_info_value"
		}
	],
	"delete_meta_if": [
		{
			"if" : "condition_when_to_delete_meta_info",
			"key": "name_of_meta_info"
		}
	],
	"add_field_if": [
		{
			"if" : "condition_when_to_add_field",
			"key": "name_for_new_field",
			"value": "new_field_value_but_only_string_at_the_moment"
		}
	],
	"delete_field_if": [
		{
			"if" : "condition_when_to_delete_field",
			"key": "name_of_field"
		}
	],
	"move_tag_to_meta_if": [
		{
			"if" : "condition_when_to_move_tag_to_meta_info_including_its_value",
			"key": "name_of_tag",
			"value": "name_of_meta_info"
		}
	],
	"move_tag_to_field_if": [
		{
			"if" : "condition_when_to_move_tag_to_fields_including_its_value",
			"key": "name_of_tag",
			"value": "name_of_field"
		}
	],
	"move_meta_to_tag_if": [
		{
			"if" : "condition_when_to_move_meta_info_to_tags_including_its_value",
			"key": "name_of_meta_info",
			"value": "name_of_tag"
		}
	],
	"move_meta_to_field_if": [
		{
			"if" : "condition_when_to_move_meta_info_to_fields_including_its_value",
			"key": "name_of_meta_info",
			"value": "name_of_field"
		}
	],
	"move_field_to_tag_if": [
		{
			"if" : "condition_when_to_move_field_to_tags_including_its_stringified_value",
			"key": "name_of_field",
			"value": "name_of_tag"
		}
	],
	"move_field_to_meta_if": [
		{
			"if" : "condition_when_to_move_field_to_meta_info_including_its_stringified_value",
			"key": "name_of_field",
			"value": "name_of_meta_info"
		}
	],
	"drop_by_message_type": [
		"metric",
		"event",
		"log",
		"control"
	],
	"change_unit_prefix": {
		"name == 'metric_with_wrong_unit_prefix'" : "G",
		"only_if_messagetype == 'metric'": "T"
	},
	"normalize_units": true,
	"add_base_env": {
		"MY_CONSTANT_FOR_CUSTOM_CONDITIONS": 1.0,
		"output_value_for_test_metrics": 42.0
	},
	"stage_order": [
		"rename_messages_if",
		"drop_messages"
	]
}

The options change_unit_prefix and normalize_units are only applied to CCMetrics. It is not possible to delete the field related to each message type as defined in cc-specification. In short:

CCMetrics always have to have a field named value
CCEvents always have to have a field named event
CCLogs always have to have a field named log
CCControl messages always have to have a field named control

With add_base_env, one can specify mykey=myvalue pairs that can be used in conditions like tag.type == mykey.

The order in which each message is processed can be specified with the stage_order option. The stage names are the keys in the JSON configuration, thus change_unit_prefix, move_field_to_meta_if, etc. Stages can be listed multiple times.
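
Why the stage order matters can be seen in a small Go sketch. It is not the message processor's implementation: the types message and stage and the functions exampleStages and runStages are hypothetical, and only two simplified stages are modeled. Renaming before dropping leads to a dropped message, while the reverse order lets the renamed message pass.

```go
package main

import "fmt"

// message is a stand-in for a CCMessage; only the name is modeled here.
type message struct{ Name string }

// stage modifies a message in place and reports whether it should be dropped.
type stage func(m *message) bool

// exampleStages returns two simplified stages keyed by their configuration name.
func exampleStages() map[string]stage {
	return map[string]stage{
		"rename_messages": func(m *message) bool {
			if m.Name == "old_message_name" {
				m.Name = "new_message_name"
			}
			return false
		},
		"drop_messages": func(m *message) bool {
			return m.Name == "new_message_name"
		},
	}
}

// runStages applies the stages in the given order and stops at the first
// stage that drops the message. Unknown stage names are reported as errors.
func runStages(order []string, stages map[string]stage, m *message) (bool, error) {
	for _, name := range order {
		s, ok := stages[name]
		if !ok {
			return false, fmt.Errorf("unknown stage %q", name)
		}
		if s(m) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, order := range [][]string{
		{"rename_messages", "drop_messages"},
		{"drop_messages", "rename_messages"},
	} {
		m := &message{Name: "old_message_name"}
		dropped, err := runStages(order, exampleStages(), m)
		fmt.Println(order, "dropped:", dropped, "name:", m.Name, "err:", err)
	}
}
```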

Using the component

In order to load the configuration from a json.RawMessage:

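mp, err := NewMessageProcessor()
if err != nil {
	log.Error("failed to create new message processor")
	return
}
// FromConfigJSON reports invalid configurations through its error value
if err := mp.FromConfigJSON(configJson); err != nil {
	log.Error("failed to load message processor configuration")
	return
}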

After initialization and adding the different operations, the ProcessMessage() function applies all operations and returns whether the message should be dropped.

var m lp2.CCMessage // the message that should be processed

x, err := mp.ProcessMessage(m)
if err != nil {
	// handle error
}
if x != nil {
	// process x further
} else {
	// this message got dropped
}

Single operations can be added and removed at runtime

type MessageProcessor interface {
	// Functions to set the execution order of the processing stages
	SetStages([]string) error
	DefaultStages() []string
	// Function to add variables to the base evaluation environment
	AddBaseEnv(env map[string]interface{}) error
	// Functions to add and remove rules
	AddDropMessagesByName(name string) error
	RemoveDropMessagesByName(name string)
	AddDropMessagesByCondition(condition string) error
	RemoveDropMessagesByCondition(condition string)
	AddRenameMetricByCondition(condition string, name string) error
	RemoveRenameMetricByCondition(condition string)
	AddRenameMetricByName(from, to string) error
	RemoveRenameMetricByName(from string)
	SetNormalizeUnits(settings bool)
	AddChangeUnitPrefix(condition string, prefix string) error
	RemoveChangeUnitPrefix(condition string)
	AddAddTagsByCondition(condition, key, value string) error
	RemoveAddTagsByCondition(condition string)
	AddDeleteTagsByCondition(condition, key, value string) error
	RemoveDeleteTagsByCondition(condition string)
	AddAddMetaByCondition(condition, key, value string) error
	RemoveAddMetaByCondition(condition string)
	AddDeleteMetaByCondition(condition, key, value string) error
	RemoveDeleteMetaByCondition(condition string)
	AddMoveTagToMeta(condition, key, value string) error
	RemoveMoveTagToMeta(condition string)
	AddMoveTagToFields(condition, key, value string) error
	RemoveMoveTagToFields(condition string)
	AddMoveMetaToTags(condition, key, value string) error
	RemoveMoveMetaToTags(condition string)
	AddMoveMetaToFields(condition, key, value string) error
	RemoveMoveMetaToFields(condition string)
	AddMoveFieldToTags(condition, key, value string) error
	RemoveMoveFieldToTags(condition string)
	AddMoveFieldToMeta(condition, key, value string) error
	RemoveMoveFieldToMeta(condition string)
	// Read in a JSON configuration
	FromConfigJSON(config json.RawMessage) error
	// Processing functions for legacy CCMetric and current CCMessage
	ProcessMetric(m lp.CCMetric) (lp2.CCMessage, error)
	ProcessMessage(m lp2.CCMessage) (lp2.CCMessage, error)
}

Syntax for evaluatable terms

The message processor uses gval for evaluating the terms. It provides a basic set of operators like string comparison and arithmetic operations.

Accessible for operations are

name of the message
timestamp or time of the message
type, type-id of the message (also tag_type, tag_type-id and tag_typeid)
stype, stype-id of the message (if the message has these tags, also tag_stype, tag_stype-id and tag_stypeid)
value for a CCMetric message (also field_value)
event for a CCEvent message (also field_event)
control for a CCControl message (also field_control)
log for a CCLog message (also field_log)
messagetype or msgtype. Possible values event, metric, log and control.

Generally, all tags are accessible with tag_<tagkey>, tags_<tagkey> or tags.<tagkey>. Similarly for all fields with field[s]?[_.]<fieldkey>. For meta information meta[_.]<metakey> (there is no metas[_.]<metakey>).

The syntax of expr is accepted with some additions:

Comparing strings: ==, !=, str matches regex (use % instead of \)
Combining conditions: &&, ||
Comparing numbers: ==, !=, <, >, <=, >=
Test lists: <value> in <list>
Topological tests: tag_type-id in getCpuListOfType("socket", "1") (test if the metric belongs to socket 1 in local node topology)

Often the operations are written in JSON files for loading them at startup. In JSON, some characters are not allowed. Therefore, the term syntax reflects that:

use '' instead of "" for strings
for the regexes, use % instead of \

For operations that should be applied on all messages, use the condition true.
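
The effect of writing % instead of \ in regexes can be sketched in Go as follows. This only illustrates the substitution and is not the message processor's evaluation code; the function name matchesTerm is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchesTerm treats % in the pattern as the regex escape character, so that
// the pattern can be written in JSON without escaping backslashes.
func matchesTerm(pattern, s string) (bool, error) {
	re, err := regexp.Compile(strings.ReplaceAll(pattern, "%", `\`))
	if err != nil {
		return false, err
	}
	return re.MatchString(s), nil
}

func main() {
	for _, name := range []string{"temp_core_12", "temp_core_x"} {
		ok, err := matchesTerm("^temp_core_%d+$", name)
		fmt.Println(name, ok, err)
	}
}
```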

Overhead

The conditions of the operations are pre-processed, which is commonly the time-consuming part. Still, with each added operation, the time to process a message increases. Moreover, the processing creates a copy of the message.

3 - cc-metric-collector's receivers

Documentation of cc-metric-collector’s receivers

CCMetric receivers

This folder contains the ReceiveManager and receiver implementations for the cc-metric-collector.

Configuration

The configuration file for the receivers is a list of configurations. The type field in each specifies which receiver to initialize.

{
  "myreceivername" : {
    "type": "receiver-type",
    <receiver-specific configuration>
  }
}

This allows specifying multiple receivers of the same type, each under its own name.

Available receivers

nats: Receive metrics from the NATS network
prometheus: Scrape data from a Prometheus client
http: Listen for HTTP Post requests transporting metrics in InfluxDB line protocol
ipmi: Read IPMI sensor readings
redfish: Use the Redfish (specification) to query thermal and power metrics
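
The configuration layout shown above, a map from receiver name to a block with a type field, can be decoded in two steps: first into raw blocks, then reading only the type of each block. The following is a minimal sketch with the standard library; the function name receiverTypes and the receiver names in the example are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// receiverTypes maps each receiver name to the value of its "type" field.
// The receiver-specific options stay untouched in the raw JSON blocks.
func receiverTypes(config []byte) (map[string]string, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(config, &raw); err != nil {
		return nil, err
	}
	types := make(map[string]string, len(raw))
	for name, block := range raw {
		var head struct {
			Type string `json:"type"`
		}
		if err := json.Unmarshal(block, &head); err != nil {
			return nil, fmt.Errorf("receiver %q: %w", name, err)
		}
		if head.Type == "" {
			return nil, fmt.Errorf("receiver %q: missing type", name)
		}
		types[name] = head.Type
	}
	return types, nil
}

func main() {
	config := []byte(`{
		"nats-a": {"type": "nats", "subject": "a"},
		"nats-b": {"type": "nats", "subject": "b"},
		"local-http": {"type": "http", "port": "8080"}
	}`)
	types, err := receiverTypes(config)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	names := make([]string, 0, len(types))
	for name := range types {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random, sort for stable output
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, types[name])
	}
}
```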

Contributing own receivers

A receiver contains a few functions and is derived from the type Receiver (in metricReceiver.go).

For an example, check the sample receiver

3.1 - http receiver

Toplevel httpReceiver

`http` receiver

The http receiver can be used to receive metrics through HTTP POST requests.

Configuration structure

{
  "<name>": {
    "type": "http",
    "address" : "",
    "port" : "8080",
    "path" : "/write",
    "idle_timeout": "120s",
    "username": "myUser",
    "password": "myPW"
  }
}

type: makes the receiver a http receiver
address: Listen address
port: Listen port
path: URL path for the write endpoint
idle_timeout: Maximum amount of time to wait for the next request when keep-alives are enabled. It should be larger than the measurement interval to keep the connection open.
keep_alives_enabled: Controls whether HTTP keep-alives are enabled. By default, keep-alives are enabled.
username: username for basic authentication
password: password for basic authentication

The HTTP endpoint listens to http://<address>:<port>/<path>

Debugging

Install curl

Use curl to send message to http receiver

curl http://localhost:8080/write \
--user "myUser:myPW" \
--data \
"myMetric,hostname=myHost,type=hwthread,type-id=0,unit=Hz value=400000i 1694777161164284635
myMetric,hostname=myHost,type=hwthread,type-id=1,unit=Hz value=400001i 1694777161164284635"

3.2 - IPMI Receiver

Toplevel ipmiReceiver

IPMI Receiver

The IPMI Receiver uses ipmi-sensors from the FreeIPMI project to read IPMI sensor readings and sensor data repository (SDR) information. The available metrics depend on the sensors provided by the hardware vendor but typically contain temperature, fan speed, voltage and power metrics.

Configuration structure

{
    "<IPMI receiver name>": {
        "type": "ipmi",
        "interval": "30s",
        "fanout": 256,
        "username": "<Username>",
        "password": "<Password>",
        "endpoint": "ipmi-sensors://%h-bmc",
        "exclude_metrics": [ "fan_speed", "voltage" ],
        "client_config": [
            {
                "host_list": "n[1,2-4]"
            },
            {
                "host_list": "n[5-6]",
                "driver_type": "LAN",
                "cli_options": [ "--workaround-flags=..." ],
                "password": "<Password 2>"
            }
        ]
    }
}

Global settings:

interval: How often the IPMI sensor metrics should be read and sent to the sink (default: 30 s)

Global and per IPMI device settings (per IPMI device settings overwrite the global settings):

exclude_metrics: list of excluded metrics e.g. fan_speed, power, temperature, utilization, voltage
fanout: Maximum number of simultaneous IPMI connections (default: 64)
driver_type: Out of band IPMI driver (default: LAN_2_0)
username: User name to authenticate with
password: Password to use for authentication
endpoint: URL of the IPMI device (placeholder %h gets replaced by the hostname)

Per IPMI device settings:

host_list: List of hosts with the same client configuration
cli_options: Additional command line options for ipmi-sensors
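
How a host_list entry and the %h placeholder of endpoint play together is sketched below. The expansion is deliberately simplified: it handles a single bracket group with comma-separated numbers and ranges such as n[1,2-4] and ignores zero padding, so it is not equivalent to the receiver's host list handling. The function names expandHostList and endpointFor are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandHostList expands a simplified host list such as "n[1,2-4]" into
// individual host names. A spec without brackets is returned as is.
func expandHostList(spec string) ([]string, error) {
	open := strings.Index(spec, "[")
	if open < 0 {
		return []string{spec}, nil
	}
	closing := strings.Index(spec, "]")
	if closing < open {
		return nil, fmt.Errorf("unbalanced brackets in %q", spec)
	}
	prefix, suffix := spec[:open], spec[closing+1:]
	var hosts []string
	for _, part := range strings.Split(spec[open+1:closing], ",") {
		lo, hi, isRange := strings.Cut(part, "-")
		if !isRange {
			hi = lo
		}
		first, err := strconv.Atoi(lo)
		if err != nil {
			return nil, err
		}
		last, err := strconv.Atoi(hi)
		if err != nil {
			return nil, err
		}
		for i := first; i <= last; i++ {
			hosts = append(hosts, fmt.Sprintf("%s%d%s", prefix, i, suffix))
		}
	}
	return hosts, nil
}

// endpointFor replaces the %h placeholder with the host name.
func endpointFor(template, host string) string {
	return strings.ReplaceAll(template, "%h", host)
}

func main() {
	hosts, err := expandHostList("n[1,2-4]")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, h := range hosts {
		fmt.Println(endpointFor("ipmi-sensors://%h-bmc", h))
	}
}
```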

3.3 - nats receiver

Toplevel natsReceiver

`nats` receiver

The nats receiver can be used to receive metrics from the NATS network. It connects to the NATS server at address and port, subscribes to the configured subject and receives metrics in the InfluxDB line protocol.

Configuration structure

{
  "<name>": {
    "type": "nats",
    "address" : "nats-server.example.org",
    "port" : "4222",
    "subject" : "subject",
    "user": "natsuser",
    "password": "natssecret",
    "nkey_file": "/path/to/nkey_file"
  }
}

type: makes the receiver a nats receiver
address: Address of the NATS control server
port: Port of the NATS control server
subject: Subscribes to this subject and receive metrics
user: Connect to nats using this user
password: Connect to nats using this password
nkey_file: Path to credentials file with NKEY

Debugging

Install NATS server and command line client

Start NATS server

nats-server --net nats-server.example.org --port 4222

Check NATS server works as expected

nats --server=nats-server.example.org:4222 server check

Use NATS command line client to subscribe to all messages

nats --server=nats-server.example.org:4222 sub ">"

Use NATS command line client to send message to NATS receiver

nats --server=nats-server.example.org:4222 pub subject \
"myMetric,hostname=myHost,type=hwthread,type-id=0,unit=Hz value=400000i 1694777161164284635
myMetric,hostname=myHost,type=hwthread,type-id=1,unit=Hz value=400001i 1694777161164284635"

3.4 - prometheus receiver

Toplevel prometheusReceiver

`prometheus` receiver

The prometheus receiver can be used to scrape the metrics of a single Prometheus client. It does not use any official Golang library but makes simple HTTP GET requests and parses the response.

Configuration structure

{
  "<name>": {
    "type": "prometheus",
    "address" : "testpromhost",
    "port" : "12345",
    "path" : "/prometheus",
    "interval": "5s",
    "ssl" : true
  }
}

type: makes the receiver a prometheus receiver
address: Hostname or IP of the Prometheus agent
port: Port of Prometheus agent
path: Path to the Prometheus endpoint
interval: Scrape the Prometheus endpoint in this interval (default ‘5s’)
ssl: Use SSL or not

The receiver requests data from http(s)://<address>:<port>/<path>.

3.5 - Redfish receiver

Toplevel redfishReceiver

Redfish receiver

The Redfish receiver uses the Redfish (specification) to query thermal and power metrics. Thermal metrics may include various fan speeds and temperatures. Power metrics may include the current power consumption of various hardware components. It may also include the minimum, maximum and average power consumption of these components in a given time interval. The receiver will poll each configured redfish device once in a given interval. Multiple devices can be accessed in parallel to increase throughput.

Configuration structure

{
    "<redfish receiver name>": {
        "type": "redfish",
        "username": "<Username>",
        "password": "<Password>",
        "endpoint": "https://%h-bmc",
        "exclude_metrics": [ "min_consumed_watts" ],
        "client_config": [
            {
                "host_list": "n[1,2-4]"
            },
            {
                "host_list": "n5",
                "disable_power_metrics": true,
                "disable_processor_metrics": true,
                "disable_thermal_metrics": true
            },
            {
                "host_list": "n6",
                "username": "<Username 2>",
                "password": "<Password 2>",
                "endpoint": "https://%h-BMC",
                "disable_sensor_metrics": true
            }
        ]
    }
}

Global settings:

fanout: Maximum number of simultaneous redfish connections (default: 64)
interval: How often the redfish power metrics should be read and sent to the sink (default: 30 s)
http_insecure: Control whether a client verifies the server’s certificate (default: true == do not verify server’s certificate)
http_timeout: Time limit for requests made by this HTTP client (default: 10 s)

Global and per redfish device settings (per redfish device settings overwrite the global settings):

disable_power_metrics: disable collection of power metrics (/redfish/v1/Chassis/{ChassisId}/Power)
disable_processor_metrics: disable collection of processor metrics (/redfish/v1/Systems/{ComputerSystemId}/Processors/{ProcessorId}/ProcessorMetrics)
disable_sensor_metrics: disable collection of fan, power and thermal sensor metrics (/redfish/v1/Chassis/{ChassisId}/Sensors/{SensorId})
disable_thermal_metrics: disable collection of thermal metrics (/redfish/v1/Chassis/{ChassisId}/Thermal)
exclude_metrics: list of excluded metrics
username: User name to authenticate with
password: Password to use for authentication
endpoint: URL of the redfish service (placeholder %h gets replaced by the hostname)

Per redfish device settings:

host_list: List of hosts with the same client configuration
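
The rule that per-device settings overwrite the global settings can be sketched as a simple merge. The struct clientSettings and the function effective are hypothetical and cover only the string options; boolean options such as the disable_* switches would need a way to tell "unset" from "false", for example pointer fields.

```go
package main

import "fmt"

// clientSettings holds the string options that exist both globally and
// per device in the configuration above.
type clientSettings struct {
	Username string
	Password string
	Endpoint string
}

// effective returns the global settings with every non-empty per-device
// value taking precedence.
func effective(global, device clientSettings) clientSettings {
	out := global
	if device.Username != "" {
		out.Username = device.Username
	}
	if device.Password != "" {
		out.Password = device.Password
	}
	if device.Endpoint != "" {
		out.Endpoint = device.Endpoint
	}
	return out
}

func main() {
	global := clientSettings{Username: "<Username>", Password: "<Password>", Endpoint: "https://%h-bmc"}
	n6 := clientSettings{Username: "<Username 2>", Password: "<Password 2>", Endpoint: "https://%h-BMC"}
	fmt.Printf("n[1,2-4]: %+v\n", effective(global, clientSettings{}))
	fmt.Printf("n6:       %+v\n", effective(global, n6))
}
```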

4 - cc-metric-collector's router

Documentation of cc-metric-collector’s router

CC Metric Router

The CCMetric router sits in between the collectors and the sinks and can be used to add and remove tags to/from traversing [CCMessages](https://pkg.go.dev/github.com/ClusterCockpit/cc-energy-manager@v0.0.0-20240919152819-92a17f2da4f7/pkg/cc-message).

Configuration

Note: Use the message processor configuration with option process_messages.

{
    "num_cache_intervals" : 1,
    "interval_timestamp" : true,
    "hostname_tag" : "hostname",
    "max_forward" : 50,
    "process_messages": {
      "see": "pkg/messageProcessor/README.md"
    },
    "add_tags" : [
        {
            "key" : "cluster",
            "value" : "testcluster",
            "if" : "*"
        },
        {
            "key" : "test",
            "value" : "testing",
            "if" : "name == 'temp_package_id_0'"
        }
    ],
    "delete_tags" : [
        {
            "key" : "unit",
            "value" : "*",
            "if" : "*"
        }
    ],
    "interval_aggregates" : [
        {
            "name" : "temp_cores_avg",
            "if" : "match('temp_core_%d+', metric.Name())",
            "function" : "avg(values)",
            "tags" : {
                "type" : "node"
            },
            "meta" : {
                "group": "IPMI",
                "unit": "degC",
                "source": "TempCollector"
            }
        }
    ],
    "drop_metrics" : [
        "not_interesting_metric_at_all"
    ],
    "drop_metrics_if" : [
        "match('temp_core_%d+', metric.Name())"
    ],
    "rename_metrics" : {
        "metric_12345" : "mymetric"
    },
    "normalize_units" : true,
    "change_unit_prefix" : {
      "mem_used" : "G",
      "mem_total" : "G"
    }
}

There are three main options add_tags, delete_tags and interval_timestamp. add_tags and delete_tags are lists consisting of dicts with key, value and if. The value can be omitted in the delete_tags part as it only uses the key for removal. The interval_timestamp setting means that a unique timestamp is applied to all metrics traversing the router during an interval.

Note: Use the message processor configuration (option process_messages) instead of add_tags, delete_tags, drop_metrics, drop_metrics_if, rename_metrics, normalize_units and change_unit_prefix. These options are deprecated and will be removed in future versions. Until then, they are added to the message processor.

Processing order in the router

Add the hostname_tag tag (if sent by collectors or cache)
If interval_timestamp == true, change time of metrics
Check if metric should be dropped (drop_metrics and drop_metrics_if)
Add tags from add_tags
Delete tags from delete_tags
Rename metric based on rename_metrics and store old name as oldname in meta information
Add tags from add_tags (if you used the new name in the if condition)
Delete tags from delete_tags (if you used the new name in the if condition)
Send to sinks
Move to cache (if num_cache_intervals > 0)

The `interval_timestamp` option

The collectors’ Read() functions are not called simultaneously and therefore the metrics gathered in an interval can have different timestamps. If you want to avoid that and have a common timestamp (the beginning of the interval), set this option to true and the MetricRouter sets the time.

The `num_cache_intervals` option

If the MetricRouter should buffer metrics of intervals in a MetricCache, this option specifies the number of past intervals that should be kept. If num_cache_intervals = 0, the cache is disabled. With num_cache_intervals = 1, only the metrics of the last interval are buffered.

A num_cache_intervals > 0 is required to use the interval_aggregates option.

The `hostname_tag` option

By default, the router tags all locally created metrics with the hostname. The default tag name is hostname, but it can be changed if your organization prefers a different name.

The `max_forward` option

Every time the router receives a metric through any of the channels, it tries to directly read up to max_forward metrics from the same channel. Without this, the router thread would go to sleep and wake up with every arriving metric. The default is 50 metrics at once, and max_forward needs to be greater than 1.
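The forwarding behaviour described for max_forward can be sketched in Go as one blocking read followed by non-blocking reads. The function name drain and the use of plain strings as metrics are assumptions made for brevity; the router's real loop handles several channels at once.

```go
package main

import "fmt"

// drain waits for one metric and then takes up to maxForward-1 more
// from the same channel without blocking.
func drain(ch <-chan string, maxForward int) []string {
	batch := []string{<-ch} // blocking read: sleep until a metric arrives
	for len(batch) < maxForward {
		select {
		case m := <-ch:
			batch = append(batch, m)
		default:
			return batch // channel is empty right now
		}
	}
	return batch
}

func main() {
	ch := make(chan string, 8)
	for i := 0; i < 5; i++ {
		ch <- fmt.Sprintf("metric_%d", i)
	}
	fmt.Println(drain(ch, 3)) // first three metrics
	fmt.Println(drain(ch, 3)) // remaining two metrics
}
```

A larger max_forward means fewer wake-ups of the router goroutine when metrics arrive in bursts.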

The `rename_metrics` option

deprecated

In the ClusterCockpit world we specified a set of standard metrics. Since some collectors determine the metric names based on files, executables and libraries, the names might change from system to system (or installation to installation, OS to OS, …). To get the common names, you can rename incoming metrics before sending them to the sink. If the metric name matches the oldname, it is changed to newname.

{
  "oldname" : "newname",
  "clock_mhz" : "clock"
}
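A minimal Go sketch of the renaming step, assuming a reduced, hypothetical Metric struct. It follows the processing order above, in which the old name is stored as oldname in the meta information.

```go
package main

import "fmt"

// Metric is a hypothetical, reduced stand-in for a CCMetric.
type Metric struct {
	Name string
	Meta map[string]string
}

// rename applies the oldname -> newname table and keeps the previous
// name under the "oldname" meta key.
func rename(m *Metric, table map[string]string) {
	newName, ok := table[m.Name]
	if !ok {
		return // metric is not listed in rename_metrics
	}
	if m.Meta == nil {
		m.Meta = map[string]string{}
	}
	m.Meta["oldname"] = m.Name
	m.Name = newName
}

func main() {
	table := map[string]string{"clock_mhz": "clock"}
	m := Metric{Name: "clock_mhz", Meta: map[string]string{"unit": "MHz"}}
	rename(&m, table)
	fmt.Println(m.Name, m.Meta["oldname"])
}
```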

Conditional manipulation of tags (`add_tags` and `delete_tags`)

deprecated

Common config format:

{
    "key" : "test",
    "value" : "testing",
    "if" : "name == 'temp_package_id_0'"
}

The `delete_tags` option

deprecated

The collectors are free to add whatever key=value pair to the metric tags (although the usage of tags should be minimized). If you want to delete a tag afterwards, you can do that. When the if condition matches on a metric, the key is removed from the metric’s tags.

If you want to remove a tag for all metrics, use the condition wildcard *. The value field can be omitted in delete_tags entries.

Never delete tags:

hostname
type
type-id

The `add_tags` option

deprecated

In some cases, metrics should be tagged or an existing tag changed based on some condition. This can be done in the add_tags section. When the if condition evaluates to true, the tag key is added or gets changed to the new value.

If the CCMetric name is equal to temp_package_id_0, it adds an additional tag test=testing to the metric.

For this metric, a more useful example would be:

[
  {
    "key" : "type",
    "value" : "socket",
    "if" : "name == 'temp_package_id_0'"
  },
  {
    "key" : "type-id",
    "value" : "0",
    "if" : "name == 'temp_package_id_0'"
  }
]

The metric temp_package_id_0 corresponds to the temperature of the first CPU socket (=package). Because the TempCollector commonly submits only node metrics, the above configuration is needed for the tags to reflect that.

To match all metrics, use the wildcard *. This adds a tag by default and is useful for attaching system-specific tags like cluster=testcluster:

{
    "key" : "cluster",
    "value" : "testcluster",
    "if" : "*"
}
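The semantics of the tag rules above can be sketched in Go. This is an illustration under assumptions: the if conditions are expression strings in the real configuration, while the sketch replaces them with Go predicates, and Metric, TagRule and always are hypothetical names.

```go
package main

import "fmt"

// Metric is a hypothetical, reduced stand-in for a CCMetric.
type Metric struct {
	Name string
	Tags map[string]string
}

// TagRule mirrors the key/value/if triple of the configuration.
type TagRule struct {
	Key, Value string
	If         func(Metric) bool
}

// always stands in for the "*" wildcard condition.
func always(Metric) bool { return true }

// addTags adds the tag or overwrites an existing value when the condition holds.
func addTags(m Metric, rules []TagRule) {
	for _, r := range rules {
		if r.If(m) {
			m.Tags[r.Key] = r.Value
		}
	}
}

// deleteTags removes the key when the condition holds; Value is ignored.
func deleteTags(m Metric, rules []TagRule) {
	for _, r := range rules {
		if r.If(m) {
			delete(m.Tags, r.Key)
		}
	}
}

func main() {
	isPkg0 := func(m Metric) bool { return m.Name == "temp_package_id_0" }
	m := Metric{
		Name: "temp_package_id_0",
		Tags: map[string]string{"type": "node", "unit": "degC"},
	}
	addTags(m, []TagRule{
		{Key: "cluster", Value: "testcluster", If: always},
		{Key: "type", Value: "socket", If: isPkg0},
		{Key: "type-id", Value: "0", If: isPkg0},
	})
	deleteTags(m, []TagRule{{Key: "unit", If: always}})
	fmt.Println(m.Tags)
}
```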

Dropping metrics

In some cases, you want to drop a metric so that it is not forwarded to the sinks. There are two options, depending on the required specification:

Based only on the metric name -> drop_metrics section
An evaluable condition with more overhead -> drop_metrics_if section

The `drop_metrics` section

deprecated

The argument is a list of metric names. No further checks are performed, only a comparison of the metric name.

{
  "drop_metrics" : [
      "drop_metric_1",
      "drop_metric_2"
  ]
}

The example drops all metrics with the name drop_metric_1 and drop_metric_2.

The `drop_metrics_if` section

deprecated

This option takes a list of evaluable conditions and performs them one after the other on all metrics incoming from the collectors and the metric cache (aka interval_aggregates).

{
  "drop_metrics_if" : [
      "match('drop_metric_%d+', name)",
      "match('cpu', type) && type-id == 0"
  ]
}

The first line is comparable to the example in drop_metrics: it drops all metrics starting with drop_metric_ and ending with a number. The second line drops all metrics of the first hardware thread (not recommended).
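The difference in cost between the two drop options can be sketched in Go. The Dropper type is hypothetical, the conditions are Go closures instead of evaluated expression strings, and the regular expression is my own translation of the first example condition into Go syntax.

```go
package main

import (
	"fmt"
	"regexp"
)

// Metric is a hypothetical, reduced stand-in for a CCMetric.
type Metric struct {
	Name string
	Tags map[string]string
}

// Dropper combines both options: exact names and evaluated conditions.
type Dropper struct {
	names map[string]struct{} // drop_metrics
	conds []func(Metric) bool // drop_metrics_if, checked in order
}

// ShouldDrop tries the cheap name lookup first and the conditions second.
func (d Dropper) ShouldDrop(m Metric) bool {
	if _, ok := d.names[m.Name]; ok {
		return true
	}
	for _, cond := range d.conds {
		if cond(m) {
			return true
		}
	}
	return false
}

func main() {
	numbered := regexp.MustCompile(`^drop_metric_\d+$`)
	d := Dropper{
		names: map[string]struct{}{"not_interesting_metric_at_all": {}},
		conds: []func(Metric) bool{
			func(m Metric) bool { return numbered.MatchString(m.Name) },
			func(m Metric) bool { return m.Tags["type"] == "cpu" && m.Tags["type-id"] == "0" },
		},
	}
	for _, name := range []string{"drop_metric_7", "load_one", "not_interesting_metric_at_all"} {
		fmt.Println(name, d.ShouldDrop(Metric{Name: name}))
	}
}
```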

Manipulating the metric units

The `normalize_units` option

deprecated

The cc-metric-collector tries to read the data from the system as it is reported. If available, it tries to read the metric unit from the system as well (e.g. from /proc/meminfo). The problem is that, depending on the source, the metric units are named differently. Just think about byte, Byte, B, bytes, … The cc-units package provides a normalization option to use the same metric unit name for all metrics. If this option is set to true, all unit meta tags are normalized.

The `change_unit_prefix` section

deprecated

It is often the case that metrics are reported by the system using a rather outdated unit prefix (for example, /proc/meminfo still uses kByte although current memory sizes are in the GByte range). If you want to change the prefix of a unit, you can do that with the help of cc-units. The setting works on the metric name and requires the new prefix for the metric. The cc-units package determines the scaling factor.
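What determining the scaling factor means can be sketched in Go. This is not the cc-units API; the factor table and convertPrefix are hypothetical and assume decimal SI prefixes only.

```go
package main

import "fmt"

// factor lists a few decimal SI prefixes (assumption: decimal only).
var factor = map[string]float64{"": 1, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

// convertPrefix rescales value from one unit prefix to another.
func convertPrefix(value float64, from, to string) (float64, error) {
	f, ok := factor[from]
	if !ok {
		return 0, fmt.Errorf("unknown prefix %q", from)
	}
	t, ok := factor[to]
	if !ok {
		return 0, fmt.Errorf("unknown prefix %q", to)
	}
	return value * f / t, nil
}

func main() {
	// A mem_total of 16000000 kByte with "mem_total": "G" configured.
	v, err := convertPrefix(16000000, "k", "G")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v, "GByte") // prints "16 GByte"
}
```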

Aggregate metric values of the current interval with the `interval_aggregates` option

Note: interval_aggregates is experimental and works only if num_cache_intervals > 0.

In some cases, you need to derive new metrics based on the metrics arriving during an interval. This can be done in the interval_aggregates section. The logic is similar to the other metric manipulation and filtering options. A cache stores all metrics that arrive during an interval. At the beginning of the next interval, the list of metrics is submitted to the MetricAggregator. It derives new metrics and submits them back to the MetricRouter, so they are sent in the next interval but have the timestamp of the previous interval beginning.

"interval_aggregates" : [
  {
    "name" : "new_metric_name",
    "if" : "match('sub_metric_%d+', metric.Name())",
    "function" : "avg(values)",
    "tags" : {
      "key" : "value",
      "type" : "node"
    },
    "meta" : {
      "key" : "value",
      "group": "IPMI",
      "unit": "<copy>"
    }
  }
]

The above configuration collects the values of all metrics for which the if condition evaluates to true. It then calculates the average (avg) of values (the list of all matching metrics’ field value) and creates a new CCMetric with the name new_metric_name, the tags in tags and the meta information in meta. The special value <copy> searches the input metrics and copies the value of the first match of key to the new CCMetric.

If you are not interested in the input metrics sub_metric_%d+ at all, you can add the same condition used here to the drop_metrics_if section to drop them.

Use cases for interval_aggregates:

Combine multiple metrics of a collector into a new one, like the MemstatCollector does for mem_used:

  {
    "name" : "mem_used",
    "if" : "source == 'MemstatCollector'",
    "function" : "sum(mem_total) - (sum(mem_free) + sum(mem_buffers) + sum(mem_cached))",
    "tags" : {
      "type" : "node"
    },
    "meta" : {
      "group": "<copy>",
      "unit": "<copy>",
      "source": "<copy>"
    }
  }

Order of operations

The router performs the above mentioned options in a specific order. In order to get the logic you want for a specific metric, it is crucial to know the processing order:

Add the hostname tag (c)
Manipulate the timestamp to the interval timestamp (c,r)
Drop metrics based on drop_metrics and drop_metrics_if (c,r)
Add tags based on add_tags (c,r)
Delete tags based on delete_tags (c,r)
Rename metric based on rename_metrics (c,r)
- Add tags based on add_tags to still work if the configuration uses the new name (c,r)
- Delete tags based on delete_tags to still work if the configuration uses the new name (c,r)
Normalize units when normalize_units is set (c,r)
Convert unit prefix based on change_unit_prefix (c,r)

Legend:

‘c’ if metric is coming from a collector
‘r’ if metric is coming from a receiver

5 - cc-metric-collector's sinks

Documentation of cc-metric-collector’s sinks

CCMetric sinks

This folder contains the SinkManager and sink implementations for the cc-metric-collector.

Available sinks:

stdout: Print all metrics to stdout, stderr or a file
http: Send metrics to an HTTP server as POST requests
influxdb: Send metrics to an InfluxDB database
influxasync: Send metrics to an InfluxDB database with non-blocking write API
nats: Publish metrics to the NATS network overlay system
ganglia: Publish metrics in the Ganglia Monitoring System using the gmetric CLI tool
libganglia: Publish metrics in the Ganglia Monitoring System directly using libganglia.so
prometheus: Publish metrics for the Prometheus Monitoring System

Configuration

The configuration file for the sinks contains one named configuration per sink. The type field in each specifies which sink to initialize.

{
  "mystdout" : {
    "type" : "stdout",
    "meta_as_tags" : [
    	"unit"
    ]
  },
  "metricstore" : {
    "type" : "http",
    "host" : "localhost",
    "port" : "4123",
    "database" : "ccmetric",
    "password" : "<jwt token>"
  }
}

Contributing own sinks

A sink contains five functions and is derived from the type sink:

Init(name string, config json.RawMessage) error
Write(point CCMetric) error
Flush() error
Close()
New<Typename>(name string, config json.RawMessage) (Sink, error) (calls the Init() function)

The data structures should be set up in Init() like opening a file or server connection. The Write() function writes/sends the data. For non-blocking sinks, the Flush() method tells the sink to drain its internal buffers. The Close() function should tear down anything created in Init().

Finally, the sink needs to be registered in the sinkManager.go. There is a list of sinks called AvailableSinks which is a map (sink_type_string -> pointer to sink interface). Add a new entry with a descriptive name and the new sink.

Sample sink

package sinks

import (
	"encoding/json"
	"fmt"
	"log"

	lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)

type SampleSinkConfig struct {
	defaultSinkConfig // defines JSON tags for 'name' and 'meta_as_tags'
}

type SampleSink struct {
	sink                    // declares 'name' and 'meta_as_tags'
	config SampleSinkConfig // entry point to the SampleSinkConfig
}

// Initialize the sink by giving it a name and reading in the config JSON
func (s *SampleSink) Init(name string, config json.RawMessage) error {
	s.name = fmt.Sprintf("SampleSink(%s)", name) // Always specify a name here
	// Read in the config JSON
	if len(config) > 0 {
		err := json.Unmarshal(config, &s.config)
		if err != nil {
			return err
		}
	}
	return nil
}

// Code to submit a single CCMetric to the sink
func (s *SampleSink) Write(point lp.CCMetric) error {
	log.Print(point)
	return nil
}

// If the sink uses batched sends internally, you can tell to flush its buffers
func (s *SampleSink) Flush() error {
	return nil
}


// Close sink: close network connection, close files, close libraries, ...
func (s *SampleSink) Close() {}


// New function to create a new instance of the sink
func NewSampleSink(name string, config json.RawMessage) (Sink, error) {
	s := new(SampleSink)
	err := s.Init(name, config)
	return s, err
}
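How a registry of sink types and the named sink configurations fit together can be sketched in a self-contained Go program. This is an illustration under assumptions: availableSinks is modelled as a map from type string to constructor function, the Sink interface is reduced to strings, and printSink, NewPrintSink and build are hypothetical names. The actual AvailableSinks in sinkManager.go may be declared differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sink is a reduced, hypothetical version of the sink interface.
type Sink interface {
	Write(point string) error
	Flush() error
	Close()
}

type printSink struct{ name string }

func (s *printSink) Write(point string) error { fmt.Println(s.name, point); return nil }
func (s *printSink) Flush() error             { return nil }
func (s *printSink) Close()                   {}

// NewPrintSink follows the New<Typename>(name, config) pattern.
func NewPrintSink(name string, _ json.RawMessage) (Sink, error) {
	return &printSink{name: name}, nil
}

// availableSinks maps the "type" string to a constructor (assumed layout).
var availableSinks = map[string]func(string, json.RawMessage) (Sink, error){
	"print": NewPrintSink,
}

// build reads {"<name>": {"type": "..."}} and creates one sink per entry.
func build(raw []byte) (map[string]Sink, error) {
	var configs map[string]json.RawMessage
	if err := json.Unmarshal(raw, &configs); err != nil {
		return nil, err
	}
	sinks := make(map[string]Sink)
	for name, cfg := range configs {
		var head struct {
			Type string `json:"type"`
		}
		if err := json.Unmarshal(cfg, &head); err != nil {
			return nil, err
		}
		ctor, ok := availableSinks[head.Type]
		if !ok {
			return nil, fmt.Errorf("sink %q: unknown type %q", name, head.Type)
		}
		s, err := ctor(name, cfg)
		if err != nil {
			return nil, err
		}
		sinks[name] = s
	}
	return sinks, nil
}

func main() {
	sinks, err := build([]byte(`{"mysink": {"type": "print"}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	sinks["mysink"].Write("load_one,hostname=node01 value=1.5 1700000000")
}
```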

5.1 - ganglia sink

Toplevel gangliaSink

`ganglia` sink

The ganglia sink uses the gmetric tool of the Ganglia Monitoring System to submit the metrics.

Configuration structure

{
  "<name>": {
    "type": "ganglia",
    "gmetric_path" : "/path/to/gmetric",
    "add_ganglia_group" : true,
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a ganglia sink
gmetric_path: Path to gmetric executable (optional). If not given, the sink searches in $PATH for gmetric.
add_ganglia_group: Add --group=X based on meta information to the gmetric call. Some old versions of gmetric do not support the --group option.
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

5.2 - http sink

Toplevel httpSink

`http` sink

The http sink uses POST requests to an HTTP server to submit the metrics in the InfluxDB line-protocol format. It uses JSON web tokens for authentication. The sink creates batches of metrics before sending to reduce the HTTP traffic.

Configuration structure

{
  "<name>": {
    "type": "http",
    "url" : "https://my-monitoring.example.com:1234/api/write",
    "jwt" : "blabla.blabla.blabla",
    "username": "myUser",
    "password": "myPW",
    "timeout": "5s",
    "idle_connection_timeout" : "5s",
    "flush_delay": "2s",
    "batch_size": 1000,
    "precision": "s",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink an http sink
url: The full URL of the endpoint
jwt: JSON web tokens for authentication (Using the Bearer scheme)
username: username for basic authentication
password: password for basic authentication
timeout: General timeout for the HTTP client (default ‘5s’)
max_retries: Maximum number of retries to connect to the http server
idle_connection_timeout: Timeout for idle connections (default ‘120s’). Should be larger than the measurement interval to keep the connection open
flush_delay: Batch all writes arriving during this duration (default ‘1s’, batching can be disabled by setting it to 0)
batch_size: Maximal batch size. If batch_size is reached before the end of flush_delay, the metrics are sent without further delay
precision: Precision of the timestamp. Valid values are ’s’, ‘ms’, ‘us’ and ’ns’. (default is ’s’)
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

Using `http` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

5.3 - influxasync sink

Toplevel influxAsyncSink

`influxasync` sink

The influxasync sink uses the official InfluxDB golang client to write the metrics to an InfluxDB database in a non-blocking fashion. It provides only support for V2 write endpoints (InfluxDB 1.8.0 or later).

Configuration structure

{
  "<name>": {
    "type": "influxasync",
    "database" : "mymetrics",
    "host": "dbhost.example.com",
    "port": "4222",
    "user": "exampleuser",
    "password" : "examplepw",
    "organization": "myorg",
    "ssl": true,
    "batch_size": 200,
    "retry_interval" : "1s",
    "retry_exponential_base" : 2,
    "precision": "s",
    "max_retries": 20,
    "max_retry_time" : "168h",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink an influxasync sink
database: All metrics are written to this bucket
host: Hostname of the InfluxDB database server
port: Port number (as string) of the InfluxDB database server
user: Username for basic authentication
password: Password for basic authentication
organization: Organization in the InfluxDB
ssl: Use SSL connection
batch_size: batch up metrics internally, default 100
retry_interval: Base retry interval for failed write requests, default 1s
retry_exponential_base: The retry interval is exponentially increased with this base, default 2
max_retries: Maximal number of retry attempts
max_retry_time: Maximal time to retry failed writes, default 168h (one week)
precision: Precision of the timestamp. Valid values are ’s’, ‘ms’, ‘us’ and ’ns’. (default is ’s’)
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

For information about the calculation of the retry interval settings, see the official influxdb-client-go documentation.

Using `influxasync` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

5.4 - influxdb sink

Toplevel influxSink

`influxdb` sink

The influxdb sink uses the official InfluxDB golang client to write the metrics to an InfluxDB database in a blocking fashion. It provides only support for V2 write endpoints (InfluxDB 1.8.0 or later).

Configuration structure

{
  "<name>": {
    "type": "influxdb",
    "database" : "mymetrics",
    "host": "dbhost.example.com",
    "port": "4222",
    "user": "exampleuser",
    "password" : "examplepw",
    "organization": "myorg",
    "ssl": true,
    "flush_delay" : "1s",
    "batch_size" : 1000,
    "use_gzip": true,
    "precision": "s",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink an influxdb sink
database: All metrics are written to this bucket
host: Hostname of the InfluxDB database server
port: Port number (as string) of the InfluxDB database server
user: Username for basic authentication
password: Password for basic authentication
organization: Organization in the InfluxDB
ssl: Use SSL connection
flush_delay: Group metrics coming in to a single batch
batch_size: Maximal batch size. If batch_size is reached before the end of flush_delay, the metrics are sent without further delay
precision: Precision of the timestamp. Valid values are ’s’, ‘ms’, ‘us’ and ’ns’. (default is ’s’)
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

Influx client options:

batch_size: Maximal batch size
meta_as_tags: move meta information keys to tags (optional)
http_request_timeout: HTTP request timeout
retry_interval: retry interval
max_retry_interval: maximum delay between each retry attempt
retry_exponential_base: base for the exponential retry delay
max_retries: maximum count of retry attempts of failed writes
max_retry_time: maximum total retry timeout
use_gzip: Specify whether to use GZip compression in write requests

Using `influxdb` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

5.5 - libganglia sink

Toplevel libgangliaSink

`libganglia` sink

The libganglia sink interacts directly with the library of the Ganglia Monitoring System to submit the metrics. Consequently, it needs to be installed on all nodes. But this is commonly the case if you want to use Ganglia, because it requires at least a node daemon (gmond or ganglia-monitor) to work.

The libganglia sink probably has less overhead than the ganglia sink because it does not need to spawn any processes and initializes the environment and UDP connections only once.

Configuration structure

{
  "<name>": {
    "type": "libganglia",
    "gmond_config" : "/path/to/gmetric/config",
    "cluster_name": "MyCluster",
    "add_ganglia_group" : true,
    "add_type_to_name": true,
    "add_units" : true,
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a libganglia sink
gmond_config: Path to the Ganglia configuration file gmond.conf (default: /etc/ganglia/gmond.conf)
cluster_name: Set a cluster name for the metric. If not set, it is taken from gmond_config
add_ganglia_group: Add a Ganglia metric group based on meta information. Some old versions of gmetric do not support the --group option
add_type_to_name: Ganglia commonly uses only node-level metrics, but with cc-metric-collector there are metrics for CPUs, memory domains, CPU sockets and the whole node. To keep these metrics distinguishable, this option prefixes the metric name with <type><type-id>_ or device_ depending on the metric tags and meta information. For metrics of the whole node (type=node), no prefix is added
add_units: Add metric value unit if there is a unit entry in the metric tags or meta information
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
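The naming scheme of add_type_to_name can be sketched in Go. The function gangliaName is hypothetical and covers only the <type><type-id>_ prefix and the node case; the device_ variant is left out because its exact rules are not described here.

```go
package main

import "fmt"

// gangliaName prefixes the metric name with <type><type-id>_ unless the
// metric belongs to the whole node or carries no type tag.
func gangliaName(name string, tags map[string]string) string {
	t, ok := tags["type"]
	if !ok || t == "node" {
		return name
	}
	return t + tags["type-id"] + "_" + name
}

func main() {
	fmt.Println(gangliaName("temp", map[string]string{"type": "socket", "type-id": "0"})) // prints "socket0_temp"
	fmt.Println(gangliaName("load_one", map[string]string{"type": "node"}))               // prints "load_one"
}
```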

Ganglia Installation

My development system is Ubuntu 20.04. To install the required libraries with apt:

$ sudo apt install libganglia1

The libganglia.so gets installed in /usr/lib. The Ganglia headers libganglia1-dev are not required.

I added a Makefile in the sinks subfolder that searches for the library in /usr and creates a symlink (sinks/libganglia.so) for running/building the cc-metric-collector. So just type make before running/building in the main folder or the sinks subfolder.

5.6 - nats sink

Toplevel natsSink

`nats` sink

The nats sink publishes all metrics into a NATS network. The publishing key is the database name provided in the configuration file.

Configuration structure

{
  "<name>": {
    "type": "nats",
    "database" : "mymetrics",
    "host": "dbhost.example.com",
    "port": "4222",
    "user": "exampleuser",
    "password" : "examplepw",
    "nkey_file": "/path/to/nkey_file",
    "flush_delay": "10s",
    "precision": "s",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a nats sink
database: All metrics are published with this subject
host: Hostname of the NATS server
port: Port number (as string) of the NATS server
user: Username for basic authentication
password: Password for basic authentication
nkey_file: Path to credentials file with NKEY
flush_delay: Maximum time until metrics are sent out (default ‘5s’)
precision: Precision of the timestamp. Valid values are ’s’, ‘ms’, ‘us’ and ’ns’. (default is ’s’)
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

Using `nats` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

5.7 - prometheus sink

Toplevel prometheusSink

`prometheus` sink

The prometheus sink publishes all metrics via an HTTP server ready to be scraped by a Prometheus server. It creates gauge metrics for all node metrics and gauge vectors for all metrics with a subtype like ‘device’, ‘cpu’ or ‘socket’.

Configuration structure

{
  "<name>": {
    "type": "prometheus",
    "host": "localhost",
    "port": "8080",
    "path": "metrics",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a prometheus sink
host: The HTTP server gets bound to that IP/hostname
port: Port number (as string) for the HTTP server
path: Path where the metrics should be served. The metrics will be published at host:port/path
group_as_namespace: Most metrics contain a group as meta information, like ‘memory’ or ‘load’. With this option, the metric names are extended to group_name if possible.
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

5.8 - stdout sink

Toplevel stdoutSink

`stdout` sink

The stdout sink is the simplest sink provided by cc-metric-collector. It writes all metrics in InfluxDB line-protocol format to the configurable output file or the common special files stdout and stderr.

Configuration structure

{
  "<name>": {
    "type": "stdout",
    "output_file" : "mylogfile.log",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a stdout sink
output_file: Write all data to the selected file (optional). There are two ‘special’ files: stdout and stderr. If this option is not provided, the default value is stdout
process_messages: Process messages with given rules before progressing or dropping, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
