cc-metric-collector's sinks

Documentation of cc-metric-collector’s sinks

1: ganglia sink
2: http sink
3: influxasync sink
4: influxdb sink
5: libganglia sink
6: nats sink
7: prometheus sink
8: stdout sink

CCMetric sinks

This folder contains the SinkManager and sink implementations for the cc-metric-collector.

Available sinks:

stdout: Print all metrics to stdout, stderr or a file
http: Send metrics to an HTTP server as POST requests
influxdb: Send metrics to an InfluxDB database
influxasync: Send metrics to an InfluxDB database with non-blocking write API
nats: Publish metrics to the NATS network overlay system
ganglia: Publish metrics in the Ganglia Monitoring System using the gmetric CLI tool
libganglia: Publish metrics in the Ganglia Monitoring System directly using libganglia.so
prometheus: Publish metrics for the Prometheus Monitoring System

Configuration

The configuration file for the sinks is a JSON object that maps a freely chosen sink name to that sink's configuration. The type field of each entry specifies which sink to initialize.

{
  "mystdout" : {
    "type" : "stdout",
    "meta_as_tags" : [
      "unit"
    ]
  },
  "metricstore" : {
    "type" : "http",
    "url" : "http://localhost:4123/api/write",
    "jwt" : "<jwt token>"
  }
}
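To illustrate how such a file can be read, here is a minimal sketch that decodes the top-level object into named entries and extracts the type field of each one. The names sinkHeader, sinkTypes and exampleConfig are made up for this example and are not part of cc-metric-collector; the real SinkManager may parse the file differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// sinkHeader holds only the field needed to pick a sink implementation.
// (Hypothetical helper type, not taken from the collector's source.)
type sinkHeader struct {
	Type string `json:"type"`
}

// sinkTypes maps each configured sink name to the value of its "type" field.
// It fails if an entry is not valid JSON or has no type.
func sinkTypes(data []byte) (map[string]string, error) {
	var entries map[string]json.RawMessage
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, err
	}
	types := make(map[string]string, len(entries))
	for name, raw := range entries {
		var h sinkHeader
		if err := json.Unmarshal(raw, &h); err != nil {
			return nil, fmt.Errorf("sink %q: %w", name, err)
		}
		if h.Type == "" {
			return nil, fmt.Errorf("sink %q: missing \"type\" field", name)
		}
		types[name] = h.Type
	}
	return types, nil
}

// exampleConfig is a shortened version of the configuration shown above.
const exampleConfig = `{
  "mystdout":    {"type": "stdout", "meta_as_tags": ["unit"]},
  "metricstore": {"type": "http"}
}`

func main() {
	types, err := sinkTypes([]byte(exampleConfig))
	if err != nil {
		panic(err)
	}
	// Sort the names so the output order is stable.
	names := make([]string, 0, len(types))
	for name := range types {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, types[name])
	}
}
```

Keeping each entry as json.RawMessage until the type is known lets every sink decode its own options later.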

Contributing own sinks

A sink consists of five functions and is derived from the type sink:

Init(name string, config json.RawMessage) error
Write(point CCMetric) error
Flush() error
Close()
New<Typename>(name string, config json.RawMessage) (Sink, error) (calls the Init() function)

Init() should set up the sink's data structures, for example by opening a file or a server connection. Write() writes or sends a single metric. For non-blocking sinks, Flush() tells the sink to drain its internal buffers. Close() should tear down anything created in Init().

Finally, the sink needs to be registered in sinkManager.go. All sinks are listed in AvailableSinks, which is a map (sink_type_string -> pointer to sink interface). Add a new entry with a descriptive name and the new sink.
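The registration idea can be sketched with a small, self-contained registry. Everything here is an assumption made for illustration: the reduced Sink interface, the countingSink type, and a map from type string to constructor function. The actual AvailableSinks in sinkManager.go may be shaped differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sink is a reduced stand-in for the collector's sink interface.
type Sink interface {
	Write(line string) error
	Flush() error
	Close()
}

// countingSink is a toy sink that only counts the lines it receives.
type countingSink struct {
	name    string
	written int
}

func (s *countingSink) Write(line string) error { s.written++; return nil }
func (s *countingSink) Flush() error            { return nil }
func (s *countingSink) Close()                  {}

// newCountingSink plays the role of New<Typename>().
func newCountingSink(name string, config json.RawMessage) (Sink, error) {
	return &countingSink{name: name}, nil
}

// availableSinks maps the "type" string from the configuration to a constructor.
var availableSinks = map[string]func(name string, config json.RawMessage) (Sink, error){
	"counting": newCountingSink,
}

// newSink looks up the constructor registered for sinkType and calls it.
func newSink(sinkType, name string, config json.RawMessage) (Sink, error) {
	ctor, ok := availableSinks[sinkType]
	if !ok {
		return nil, fmt.Errorf("unknown sink type %q", sinkType)
	}
	return ctor(name, config)
}

func main() {
	s, err := newSink("counting", "mysink", nil)
	if err != nil {
		panic(err)
	}
	defer s.Close()
	_ = s.Write("cpu_load,hostname=node01 value=0.5 1700000000")
	_ = s.Flush()
	fmt.Println("written:", s.(*countingSink).written)
}
```

With a registry like this, adding a sink is a one-line change to the map, and an unknown type in the configuration produces a clear error.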

Sample sink

package sinks

import (
	"encoding/json"
	"fmt"
	"log"

	lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)

type SampleSinkConfig struct {
	defaultSinkConfig  // defines JSON tags for 'name' and 'meta_as_tags'
}

type SampleSink struct {
	sink                    // declares 'name' and 'meta_as_tags'
	config SampleSinkConfig // entry point to the SampleSinkConfig
}

// Initialize the sink by giving it a name and reading in the config JSON
func (s *SampleSink) Init(name string, config json.RawMessage) error {
	s.name = fmt.Sprintf("SampleSink(%s)", name) // Always specify a name here
	// Read in the config JSON
	if len(config) > 0 {
		err := json.Unmarshal(config, &s.config)
		if err != nil {
			return err
		}
	}
	return nil
}

// Code to submit a single CCMetric to the sink
func (s *SampleSink) Write(point lp.CCMetric) error {
	log.Print(point)
	return nil
}

// If the sink batches sends internally, Flush tells it to send its buffered metrics
func (s *SampleSink) Flush() error {
	return nil
}


// Close sink: close network connection, close files, close libraries, ...
func (s *SampleSink) Close() {}


// New function to create a new instance of the sink
func NewSampleSink(name string, config json.RawMessage) (Sink, error) {
	s := new(SampleSink)
	err := s.Init(name, config)
	return s, err
}

1 - ganglia sink

Toplevel gangliaSink

`ganglia` sink

The ganglia sink uses the gmetric tool of the Ganglia Monitoring System to submit the metrics.

Configuration structure

{
  "<name>": {
    "type": "ganglia",
    "gmetric_path" : "/path/to/gmetric",
    "add_ganglia_group" : true,
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a ganglia sink
gmetric_path: Path to gmetric executable (optional). If not given, the sink searches in $PATH for gmetric.
add_ganglia_group: Add --group=X based on meta information to the gmetric call. Some old versions of gmetric do not support the --group option.
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
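As a rough sketch of how gmetric_path and add_ganglia_group could translate into a gmetric call, the following program builds the argument list and locates the executable. The helper names gmetricArgs and findGmetric are made up for this example, and the real sink may pass further options.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// gmetricArgs builds the argument list for one gmetric call.
// The group is only passed when addGroup is true and a group is known,
// because some old gmetric versions do not support --group.
func gmetricArgs(name, value, valueType, unit, group string, addGroup bool) []string {
	args := []string{
		"--name=" + name,
		"--value=" + value,
		"--type=" + valueType,
	}
	if unit != "" {
		args = append(args, "--units="+unit)
	}
	if addGroup && group != "" {
		args = append(args, "--group="+group)
	}
	return args
}

// findGmetric returns configuredPath if it is set, otherwise it searches $PATH.
func findGmetric(configuredPath string) (string, error) {
	if configuredPath != "" {
		return configuredPath, nil
	}
	return exec.LookPath("gmetric")
}

func main() {
	args := gmetricArgs("mem_used", "1024", "double", "MByte", "memory", true)
	fmt.Println("gmetric " + strings.Join(args, " "))

	// Only report whether gmetric is available; the command is not executed here.
	if path, err := findGmetric(""); err != nil {
		fmt.Println("gmetric not found in $PATH:", err)
	} else {
		fmt.Println("would run:", path)
	}
}
```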

2 - http sink

Toplevel httpSink

`http` sink

The http sink uses POST requests to an HTTP server to submit the metrics in the InfluxDB line-protocol format. It uses JSON web tokens for authentication. The sink batches metrics before sending to reduce HTTP traffic.

Configuration structure

{
  "<name>": {
    "type": "http",
    "url" : "https://my-monitoring.example.com:1234/api/write",
    "jwt" : "blabla.blabla.blabla",
    "username": "myUser",
    "password": "myPW",
    "timeout": "5s",
    "idle_connection_timeout" : "5s",
    "flush_delay": "2s",
    "batch_size": 1000,
    "precision": "s",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink an http sink
url: The full URL of the endpoint
jwt: JSON web tokens for authentication (Using the Bearer scheme)
username: username for basic authentication
password: password for basic authentication
timeout: General timeout for the HTTP client (default ‘5s’)
max_retries: Maximum number of retries to connect to the http server
idle_connection_timeout: Timeout for idle connections (default ‘120s’). Should be larger than the measurement interval to keep the connection open
flush_delay: Batch all writes arriving during this duration (default ‘1s’, batching can be disabled by setting it to 0)
batch_size: Maximal batch size. If batch_size is reached before the end of flush_delay, the metrics are sent without further delay
precision: Precision of the timestamp. Valid values are 's', 'ms', 'us' and 'ns' (default is 's')
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
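The interplay of flush_delay and batch_size can be sketched as follows. This is a simplified illustration, not the sink's actual code: the batcher type and newBatcher are hypothetical, and send stands in for the HTTP POST.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batcher collects lines and hands them to send either when batchSize is
// reached or when flushDelay has passed since the first buffered line.
type batcher struct {
	mu         sync.Mutex
	buf        []string
	timer      *time.Timer
	batchSize  int
	flushDelay time.Duration
	send       func(batch []string)
}

// newBatcher parses flushDelay in the same notation the configuration uses ("2s").
func newBatcher(batchSize int, flushDelay string, send func([]string)) (*batcher, error) {
	d, err := time.ParseDuration(flushDelay)
	if err != nil {
		return nil, err
	}
	return &batcher{batchSize: batchSize, flushDelay: d, send: send}, nil
}

// Write buffers one line and sends immediately if the batch is full.
func (b *batcher) Write(line string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.buf = append(b.buf, line)
	if len(b.buf) >= b.batchSize {
		b.flushLocked()
		return
	}
	if b.timer == nil {
		// First line of a new batch: start the flush_delay timer.
		b.timer = time.AfterFunc(b.flushDelay, b.Flush)
	}
}

// Flush sends whatever is buffered right now.
func (b *batcher) Flush() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.flushLocked()
}

// flushLocked must be called with b.mu held.
func (b *batcher) flushLocked() {
	if b.timer != nil {
		b.timer.Stop()
		b.timer = nil
	}
	if len(b.buf) == 0 {
		return
	}
	batch := b.buf
	b.buf = nil
	b.send(batch)
}

func main() {
	var sizes []int
	b, err := newBatcher(3, "2s", func(batch []string) { sizes = append(sizes, len(batch)) })
	if err != nil {
		panic(err)
	}
	for i := 0; i < 4; i++ {
		b.Write(fmt.Sprintf("metric_%d value=1", i))
	}
	b.Flush() // drain the remaining line instead of waiting for flush_delay
	fmt.Println("batch sizes:", sizes)
}
```

In this sketch the first three lines leave as one full batch and the fourth is sent by the explicit Flush().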

Using `http` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".
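To make the effect of the precision option concrete, the following sketch converts one point in time into the four supported precisions and appends it to a line-protocol line. The function timestampFor, the variable exampleTime and the sample metric are made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// exampleTime is an arbitrary fixed point in time used for the demonstration.
var exampleTime = time.Unix(1700000000, 123456789)

// timestampFor converts t to an integer timestamp in the given precision.
// An empty precision falls back to seconds, the documented default.
func timestampFor(t time.Time, precision string) (int64, error) {
	switch precision {
	case "", "s":
		return t.Unix(), nil
	case "ms":
		return t.UnixMilli(), nil
	case "us":
		return t.UnixMicro(), nil
	case "ns":
		return t.UnixNano(), nil
	default:
		return 0, fmt.Errorf("invalid precision %q", precision)
	}
}

func main() {
	for _, p := range []string{"s", "ms", "us", "ns"} {
		ts, err := timestampFor(exampleTime, p)
		if err != nil {
			panic(err)
		}
		// The timestamp is the last element of a line-protocol line.
		fmt.Printf("%-2s cpu_load,hostname=node01 value=0.5 %d\n", p, ts)
	}
}
```

Only the line produced with 's' carries a timestamp in seconds, which is what cc-metric-store expects.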

3 - influxasync sink

Toplevel influxAsyncSink

`influxasync` sink

The influxasync sink uses the official InfluxDB golang client to write the metrics to an InfluxDB database in a non-blocking fashion. It supports only V2 write endpoints (InfluxDB 1.8.0 or later).

Configuration structure

{
  "<name>": {
    "type": "influxasync",
    "database" : "mymetrics",
    "host": "dbhost.example.com",
    "port": "4222",
    "user": "exampleuser",
    "password" : "examplepw",
    "organization": "myorg",
    "ssl": true,
    "batch_size": 200,
    "retry_interval" : "1s",
    "retry_exponential_base" : 2,
    "precision": "s",
    "max_retries": 20,
    "max_retry_time" : "168h",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink an influxasync sink
database: All metrics are written to this bucket
host: Hostname of the InfluxDB database server
port: Port number (as string) of the InfluxDB database server
user: Username for basic authentication
password: Password for basic authentication
organization: Organization in the InfluxDB
ssl: Use SSL connection
batch_size: batch up metrics internally, default 100
retry_interval: Base retry interval for failed write requests, default 1s
retry_exponential_base: The retry interval is exponentially increased with this base, default 2
max_retries: Maximal number of retry attempts
max_retry_time: Maximal time to retry failed writes, default 168h (one week)
precision: Precision of the timestamp. Valid values are 's', 'ms', 'us' and 'ns' (default is 's')
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

For information about the calculation of the retry interval settings, see the official influxdb-client-go documentation.
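To get a feeling for how retry_interval, retry_exponential_base and max_retries interact, here is a deliberately simplified sketch of exponential backoff. It assumes the delay before attempt n is retry_interval * base^(n-1) and ignores the jitter and upper limits that the client library applies, so treat the numbers as an approximation only. The function retryDelays is made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// retryDelays returns the assumed base delay before each retry attempt:
// retryInterval, retryInterval*base, retryInterval*base^2, ...
// Jitter and the client's maximum interval are intentionally left out.
func retryDelays(retryInterval string, base int, maxRetries int) ([]time.Duration, error) {
	d, err := time.ParseDuration(retryInterval)
	if err != nil {
		return nil, err
	}
	if base < 1 || maxRetries < 0 {
		return nil, fmt.Errorf("base must be >= 1 and maxRetries >= 0")
	}
	delays := make([]time.Duration, 0, maxRetries)
	for i := 0; i < maxRetries; i++ {
		delays = append(delays, d)
		d *= time.Duration(base)
	}
	return delays, nil
}

func main() {
	delays, err := retryDelays("1s", 2, 5)
	if err != nil {
		panic(err)
	}
	for i, d := range delays {
		fmt.Printf("attempt %d: wait about %s\n", i+1, d)
	}
}
```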

Using `influxasync` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

4 - influxdb sink

Toplevel influxSink

`influxdb` sink

The influxdb sink uses the official InfluxDB golang client to write the metrics to an InfluxDB database in a blocking fashion. It supports only V2 write endpoints (InfluxDB 1.8.0 or later).

Configuration structure

{
  "<name>": {
    "type": "influxdb",
    "database" : "mymetrics",
    "host": "dbhost.example.com",
    "port": "4222",
    "user": "exampleuser",
    "password" : "examplepw",
    "organization": "myorg",
    "ssl": true,
    "flush_delay" : "1s",
    "batch_size" : 1000,
    "use_gzip": true,
    "precision": "s",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink an influxdb sink
database: All metrics are written to this bucket
host: Hostname of the InfluxDB database server
port: Port number (as string) of the InfluxDB database server
user: Username for basic authentication
password: Password for basic authentication
organization: Organization in the InfluxDB
ssl: Use SSL connection
flush_delay: Group metrics arriving during this duration into a single batch
batch_size: Maximal batch size. If batch_size is reached before the end of flush_delay, the metrics are sent without further delay
precision: Precision of the timestamp. Valid values are 's', 'ms', 'us' and 'ns' (default is 's')
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
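The host, port and ssl options together determine the server address. The sketch below shows one plausible way to combine them, assuming that ssl simply switches the URL scheme from http to https. The influxConfig struct and serverURL function are made up for this example and cover only these three fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// influxConfig covers only the options needed to build the server address.
type influxConfig struct {
	Host string `json:"host"`
	Port string `json:"port"` // the port is given as a string in the configuration
	SSL  bool   `json:"ssl"`
}

// serverURL decodes the sink configuration and returns scheme://host:port.
func serverURL(raw []byte) (string, error) {
	var c influxConfig
	if err := json.Unmarshal(raw, &c); err != nil {
		return "", err
	}
	if c.Host == "" || c.Port == "" {
		return "", fmt.Errorf("host and port are required")
	}
	scheme := "http"
	if c.SSL {
		scheme = "https"
	}
	// JoinHostPort also brackets IPv6 addresses correctly.
	return scheme + "://" + net.JoinHostPort(c.Host, c.Port), nil
}

func main() {
	url, err := serverURL([]byte(`{"host": "dbhost.example.com", "port": "4222", "ssl": true}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(url)
}
```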

Influx client options:

batch_size: Maximal batch size
meta_as_tags: move meta information keys to tags (optional)
http_request_timeout: HTTP request timeout
retry_interval: retry interval
max_retry_interval: maximum delay between each retry attempt
retry_exponential_base: base for the exponential retry delay
max_retries: maximum count of retry attempts of failed writes
max_retry_time: maximum total retry timeout
use_gzip: Specify whether to use GZip compression in write requests

Using `influxdb` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

5 - libganglia sink

Toplevel libgangliaSink

`libganglia` sink

The libganglia sink interacts directly with the library of the Ganglia Monitoring System to submit the metrics. Consequently, the library needs to be installed on all nodes. This is commonly the case if you want to use Ganglia, because it requires at least a node daemon (gmond or ganglia-monitor) to work.

The libganglia sink probably has less overhead than the ganglia sink because it does not need to spawn any processes but initializes the environment and UDP connections only once.

Configuration structure

{
  "<name>": {
    "type": "libganglia",
    "gmond_config" : "/path/to/gmond.conf",
    "cluster_name": "MyCluster",
    "add_ganglia_group" : true,
    "add_type_to_name": true,
    "add_units" : true,
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a libganglia sink
gmond_config: Path to the Ganglia configuration file gmond.conf (default: /etc/ganglia/gmond.conf)
cluster_name: Set a cluster name for the metric. If not set, it is taken from gmond_config
add_ganglia_group: Add a Ganglia metric group based on meta information. Some old versions of gmetric do not support the --group option
add_type_to_name: Ganglia commonly uses only node-level metrics, but cc-metric-collector provides metrics for CPUs, memory domains, CPU sockets and the whole node. To keep the metric names distinct, this option prefixes the metric name with <type><type-id>_ or device_ depending on the metric tags and meta information. For metrics of the whole node (type=node), no prefix is added
add_units: Add metric value unit if there is a unit entry in the metric tags or meta information
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
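The effect of add_type_to_name can be sketched with a small helper. It covers only the <type><type-id>_ prefix and the node case described above. The device_ variant is left out because its exact conditions are not spelled out here, and the function gangliaName is made up for this example.

```go
package main

import "fmt"

// gangliaName prefixes the metric name with <type><type-id>_ so that
// metrics of different hardware entities do not collide in Ganglia.
// Node-level metrics (or metrics without a type tag) keep their name.
func gangliaName(metric string, tags map[string]string) string {
	t := tags["type"]
	if t == "" || t == "node" {
		return metric
	}
	return t + tags["type-id"] + "_" + metric
}

func main() {
	fmt.Println(gangliaName("mem_used", map[string]string{"type": "node"}))
	fmt.Println(gangliaName("flops_any", map[string]string{"type": "socket", "type-id": "0"}))
	fmt.Println(gangliaName("cpu_user", map[string]string{"type": "cpu", "type-id": "12"}))
}
```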

Ganglia Installation

My development system is Ubuntu 20.04. To install the required libraries with apt:

$ sudo apt install libganglia1

The libganglia.so gets installed in /usr/lib. The Ganglia headers libganglia1-dev are not required.

I added a Makefile in the sinks subfolder that searches for the library in /usr and creates a symlink (sinks/libganglia.so) for running/building the cc-metric-collector. So just type make before running/building in the main folder or the sinks subfolder.

6 - nats sink

Toplevel natsSink

`nats` sink

The nats sink publishes all metrics into a NATS network. The publishing key is the database name provided in the configuration file.

Configuration structure

{
  "<name>": {
    "type": "nats",
    "database" : "mymetrics",
    "host": "dbhost.example.com",
    "port": "4222",
    "user": "exampleuser",
    "password" : "examplepw",
    "nkey_file": "/path/to/nkey_file",
    "flush_delay": "10s",
    "precision": "s",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a nats sink
database: All metrics are published with this subject
host: Hostname of the NATS server
port: Port number (as string) of the NATS server
user: Username for basic authentication
password: Password for basic authentication
nkey_file: Path to credentials file with NKEY
flush_delay: Maximum time until metrics are sent out (default ‘5s’)
precision: Precision of the timestamp. Valid values are 's', 'ms', 'us' and 'ns' (default is 's')
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)

Using `nats` sink for communication with cc-metric-store

The cc-metric-store only accepts metrics with a timestamp precision in seconds, so it is required to use "precision": "s".

7 - prometheus sink

Toplevel prometheusSink

`prometheus` sink

The prometheus sink publishes all metrics via an HTTP server ready to be scraped by a Prometheus server. It creates gauge metrics for all node metrics and gauge vectors for all metrics with a subtype like ‘device’, ‘cpu’ or ‘socket’.

Configuration structure

{
  "<name>": {
    "type": "prometheus",
    "host": "localhost",
    "port": "8080",
    "path": "metrics",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a prometheus sink
host: The HTTP server gets bound to that IP/hostname
port: Port number (as string) for the HTTP server
path: Path where the metrics should be served. The metrics will be published at host:port/path
group_as_namespace: Most metrics carry a group such as 'memory' or 'load' as meta information. With this option, the metric names are extended to group_name if possible.
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
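As an illustration of what a scraped gauge could look like, the sketch below builds a metric name with an optional group prefix and formats one sample in the Prometheus text exposition format. The helpers metricName and expositionLine are made up for this example, and the label handling (replacing '-' with '_') is an assumption, not the sink's confirmed behaviour.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// labelEscaper escapes the characters that are special inside label values.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// metricName prefixes name with the group when group_as_namespace is enabled.
func metricName(name, group string, groupAsNamespace bool) string {
	if groupAsNamespace && group != "" {
		return group + "_" + name
	}
	return name
}

// expositionLine formats one sample as: name{label="value",...} value
// Labels are sorted by key so the output is deterministic.
func expositionLine(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		// Prometheus label names must not contain '-'.
		label := strings.ReplaceAll(k, "-", "_")
		pairs = append(pairs, label+`="`+labelEscaper.Replace(labels[k])+`"`)
	}

	line := name
	if len(pairs) > 0 {
		line += "{" + strings.Join(pairs, ",") + "}"
	}
	return line + " " + strconv.FormatFloat(value, 'g', -1, 64)
}

func main() {
	name := metricName("used", "memory", true)
	fmt.Println(expositionLine(name, nil, 2048))
	fmt.Println(expositionLine(name, map[string]string{"type": "socket", "type-id": "0"}, 1.5))
}
```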

8 - stdout sink

Toplevel stdoutSink

`stdout` sink

The stdout sink is the simplest sink provided by cc-metric-collector. It writes all metrics in InfluxDB line-protocol format to the configurable output file or the common special files stdout and stderr.

Configuration structure

{
  "<name>": {
    "type": "stdout",
    "output_file" : "mylogfile.log",
    "process_messages" : {
      "see" : "docs of message processor for valid fields"
    },
    "meta_as_tags" : []
  }
}

type: makes the sink a stdout sink
output_file: Write all data to the selected file (optional). There are two ‘special’ files: stdout and stderr. If this option is not provided, the default value is stdout
process_messages: Process messages with the given rules before forwarding or dropping them, see the message processor documentation (optional)
meta_as_tags: print all meta information as tags in the output (deprecated, optional)
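The handling of output_file can be sketched as follows. The function openOutput is made up for this example. Opening regular files in append mode is an assumption, and the real sink may create or truncate the file instead.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// openOutput resolves the output_file option to a writer and a close function.
// The special names "stdout" and "stderr" map to the process streams,
// and an empty value falls back to stdout, the documented default.
func openOutput(outputFile string) (io.Writer, func() error, error) {
	noop := func() error { return nil }
	switch outputFile {
	case "", "stdout":
		return os.Stdout, noop, nil
	case "stderr":
		return os.Stderr, noop, nil
	default:
		// Assumption: append to the file and create it if it does not exist.
		f, err := os.OpenFile(outputFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o600)
		if err != nil {
			return nil, nil, err
		}
		return f, f.Close, nil
	}
}

func main() {
	w, closeOutput, err := openOutput("stdout")
	if err != nil {
		panic(err)
	}
	defer closeOutput()
	// One metric in InfluxDB line-protocol format.
	fmt.Fprintln(w, "cpu_load,hostname=node01 value=0.5 1700000000")
}
```

Returning a close function alongside the writer keeps Close() simple: it never closes the process streams, only files the sink opened itself.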
