Reference information regarding the ClusterCockpit component “cc-metric-store” (GitHub Repo).
cc-metric-store
1 - Command Line
This page describes the command line options for the cc-metric-store
executable.
-config <path>
Function: Specifies an alternative path to the application configuration file.
Default: ./config.json
Example: -config ./configfiles/configuration.json
-dev
Function: Enables the Swagger UI REST API documentation and playground.
-gops
Function: Go server listens via github.com/google/gops/agent (for debugging).
-version
Function: Shows version information and exits.
Example config:
{
"metrics": {
"debug_metric": {
"frequency": 60,
"aggregation": "avg"
},
"clock": {
"frequency": 60,
"aggregation": "avg"
},
"cpu_idle": {
"frequency": 60,
"aggregation": "avg"
},
"cpu_iowait": {
"frequency": 60,
"aggregation": "avg"
},
"cpu_irq": {
"frequency": 60,
"aggregation": "avg"
},
"cpu_system": {
"frequency": 60,
"aggregation": "avg"
},
"cpu_user": {
"frequency": 60,
"aggregation": "avg"
},
"nv_mem_util": {
"frequency": 60,
"aggregation": "avg"
},
"nv_temp": {
"frequency": 60,
"aggregation": "avg"
},
"nv_sm_clock": {
"frequency": 60,
"aggregation": "avg"
},
"acc_utilization": {
"frequency": 60,
"aggregation": "avg"
},
"acc_mem_used": {
"frequency": 60,
"aggregation": "sum"
},
"acc_power": {
"frequency": 60,
"aggregation": "sum"
},
"flops_any": {
"frequency": 60,
"aggregation": "sum"
},
"flops_dp": {
"frequency": 60,
"aggregation": "sum"
},
"flops_sp": {
"frequency": 60,
"aggregation": "sum"
},
"ib_recv": {
"frequency": 60,
"aggregation": "sum"
},
"ib_xmit": {
"frequency": 60,
"aggregation": "sum"
},
"ib_recv_pkts": {
"frequency": 60,
"aggregation": "sum"
},
"ib_xmit_pkts": {
"frequency": 60,
"aggregation": "sum"
},
"cpu_power": {
"frequency": 60,
"aggregation": "sum"
},
"core_power": {
"frequency": 60,
"aggregation": "sum"
},
"mem_power": {
"frequency": 60,
"aggregation": "sum"
},
"ipc": {
"frequency": 60,
"aggregation": "avg"
},
"cpu_load": {
"frequency": 60,
"aggregation": null
},
"lustre_close": {
"frequency": 60,
"aggregation": null
},
"lustre_open": {
"frequency": 60,
"aggregation": null
},
"lustre_statfs": {
"frequency": 60,
"aggregation": null
},
"lustre_read_bytes": {
"frequency": 60,
"aggregation": null
},
"lustre_write_bytes": {
"frequency": 60,
"aggregation": null
},
"net_bw": {
"frequency": 60,
"aggregation": null
},
"file_bw": {
"frequency": 60,
"aggregation": null
},
"mem_bw": {
"frequency": 60,
"aggregation": "sum"
},
"mem_cached": {
"frequency": 60,
"aggregation": null
},
"mem_used": {
"frequency": 60,
"aggregation": null
},
"vectorization_ratio": {
"frequency": 60,
"aggregation": "avg"
}
},
"checkpoints": {
"interval": "1h",
"directory": "./var/checkpoints",
"restore": "1h"
},
"archive": {
"interval": "24h",
"directory": "./var/archive"
},
"http-api": {
"address": "localhost:8082",
"https-cert-file": null,
"https-key-file": null
},
"retention-in-memory": "48h",
"nats": null,
"jwt-public-key": "kzfYrYy+TzpanWZHJ5qSdMj5uKUWgq74BWhQG6copP0="
}
2 - Configuration
Configuration options are located in a JSON file. The default path is config.json
in the current working directory. An alternative path to the configuration file can be
specified using the command line switch -config <filename>.
All durations are specified as strings that will be parsed like
this (allowed suffixes: s, m, h, …).
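To illustrate the duration format, here is a minimal Python sketch that converts strings such as "1h" or "48h" into seconds. It is an illustration only: the function name parse_duration is hypothetical, it covers just the three suffixes named above with whole-number values, and the parser inside cc-metric-store may accept more forms.

```python
import re

# Seconds per unit; only the suffixes named in the text (s, m, h) are covered.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_PATTERN = re.compile(r"(\d+)([smh])")


def parse_duration(text: str) -> int:
    """Return the number of seconds described by a string such as "48h".

    Hypothetical helper: accepts a whole number followed by one suffix.
    """
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


if __name__ == "__main__":
    # Values taken from the example config: checkpoints, archive, retention.
    for value in ("1h", "24h", "48h"):
        print(value, "=", parse_duration(value), "seconds")
```

Such a helper can be handy when checking that the retention-in-memory value is at least as long as the checkpoint interval before starting the service.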
Recognized attributes:
metrics: Map of metric name to objects with the following properties (required)
  frequency: Timestep/interval/resolution of this metric (required)
  aggregation: Can be "sum", "avg" or null (required)
    null means aggregation across nodes is forbidden for this metric
    "sum" means that values from the child levels are summed up for the parent level
    "avg" means that values from the child levels are averaged for the parent level
nats: (optional)
  address: URL of the NATS.io server, for example “nats://localhost:4222”
  creds-file-path: Path to a NATS credentials file
  subscriptions (array of objects):
    subscribe-to: Where to expect the measurements to be published
    cluster-tag: Default value for the cluster tag
http-api: (required)
  address: Address to bind to, for example 0.0.0.0:8080 (required)
  https-cert-file and https-key-file: If provided, enable HTTPS using those files as certificate/key (optional)
jwt-public-key: Base64-encoded string, used to verify requests to the HTTP API (required)
retention-in-memory: Keep all values in memory for at least that amount of time (required)
checkpoints: (required)
  interval: Do checkpoints every X seconds/minutes/hours (required)
  directory: Path to a directory (required)
  restore: After a restart, load the last X seconds/minutes/hours of data back into memory (required)
archive: (required)
  interval: Move and compress all checkpoints not needed anymore every X seconds/minutes/hours (required)
  directory: Path to a directory (required)
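The three aggregation settings can be pictured with a short Python sketch. It is not the store's implementation: the function name aggregate and the choice to raise an error for null are assumptions made for illustration, and the sample values are invented.

```python
from statistics import fmean
from typing import Optional, Sequence


def aggregate(values: Sequence[float], mode: Optional[str]) -> float:
    """Combine child-level values into one parent-level value.

    Hypothetical helper mirroring the "sum" / "avg" / null settings.
    """
    if mode is None:
        # null: aggregation is forbidden for this metric.
        raise ValueError("aggregation is forbidden for this metric")
    if mode == "sum":
        return float(sum(values))
    if mode == "avg":
        return fmean(values)
    raise ValueError(f"unknown aggregation: {mode!r}")


if __name__ == "__main__":
    per_child = [10.0, 20.0, 30.0]  # invented sample values
    print(aggregate(per_child, "sum"))  # 60.0
    print(aggregate(per_child, "avg"))  # 20.0
```

A metric such as flops_any is configured with "sum" in the example config, while cpu_load uses null and therefore would not be combined in this sketch.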
3 - Metric Store REST API
Authentication
JWT tokens
cc-metric-store supports only JWT tokens using the EdDSA/Ed25519 signing
method. The token is provided in the Authorization header using the Bearer scheme.
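When a request is rejected, it can help to check which signing method a token declares. The following Python sketch decodes the header segment of a JWT without verifying the signature. The helper names are hypothetical, and SAMPLE_HEADER is the first segment of the sample token used in the curl examples on this page.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one base64url-encoded JSON segment of a JWT (no signature check)."""
    # JWT segments omit base64 padding, so restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def jwt_algorithm(token: str) -> str:
    """Return the "alg" field declared in the token header."""
    header_b64, _payload_b64, _signature_b64 = token.split(".")
    return decode_segment(header_b64)["alg"]


# Header segment of the sample token used in this page's curl examples.
SAMPLE_HEADER = "eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9"

if __name__ == "__main__":
    print(decode_segment(SAMPLE_HEADER))
```

This only inspects the token. It does not prove that the token is valid for the configured jwt-public-key.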
Example script to test the endpoint:
# Only use a JWT token if JWT authentication has been set up
JWT="eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw"
curl -X 'GET' 'http://localhost:8081/api/query/' -H "Authorization: Bearer $JWT" -d "{ \"cluster\": \"alex\", \"from\": 1720879275, \"to\": 1720964715, \"queries\": [{\"metric\": \"cpu_load\",\"host\": \"a0124\"}] }"
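The same query can be assembled programmatically. The Python sketch below only builds the request object and does not send it. The function name build_query_request and its parameters are hypothetical, while the URL path, the HTTP method and the JSON fields are copied from the curl call above.

```python
import json
import urllib.request


def build_query_request(base_url, token, cluster, start, end, metric, host):
    """Build (but do not send) a query request like the curl example."""
    body = {
        "cluster": cluster,
        "from": start,  # start of the time range, epoch seconds
        "to": end,      # end of the time range, epoch seconds
        "queries": [{"metric": metric, "host": host}],
    }
    return urllib.request.Request(
        url=f"{base_url}/api/query/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_query_request(
        "http://localhost:8081", "<token>", "alex",
        1720879275, 1720964715, "cpu_load", "a0124",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Against a running instance, the object could be passed to urllib.request.urlopen. The placeholder token must be replaced by a real one if JWT authentication is enabled.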
NATS
TODO
Usage of Swagger UI
This Swagger UI is also available as part of cc-metric-store if you start it
with the -dev option:
./cc-metric-store -dev
You may access it at this URL.
Payload format for write endpoint
The data is sent in InfluxDB line protocol format.
<metric>,cluster=<cluster>,hostname=<hostname>,type=<node/hwthread/etc> value=<value> <epoch_time_in_ns_or_s>
Real example:
proc_run,cluster=fritz,hostname=f2163,type=node value=4i 1725620476214474893
A more detailed description of the ClusterCockpit-flavored InfluxDB line protocol and its types can be found here in the CC specification.
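As a small illustration of the payload format, the Python sketch below formats one measurement line. The function name format_measurement is hypothetical. The sketch assumes that integer values carry the `i` suffix seen in the real example, and it performs no escaping of special characters in names or tag values.

```python
def format_measurement(metric, cluster, hostname, type_, value, timestamp):
    """Format one measurement as a line-protocol string (no escaping)."""
    # Integers are written with an "i" suffix, as in the real example above.
    if isinstance(value, int) and not isinstance(value, bool):
        rendered = f"{value}i"
    else:
        rendered = f"{value}"
    tags = f"cluster={cluster},hostname={hostname},type={type_}"
    return f"{metric},{tags} value={rendered} {timestamp}"


if __name__ == "__main__":
    print(format_measurement(
        "proc_run", "fritz", "f2163", "node", 4, 1725620476214474893,
    ))
```

Several such lines, one per measurement, could be joined with newlines to form the body of a single write request.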
Example script to test endpoint:
# Only use a JWT token if JWT authentication has been set up
JWT="eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw"
curl -X 'POST' 'http://localhost:8081/api/write/?cluster=alex' -H "Authorization: Bearer $JWT" -d "proc_run,cluster=fritz,hostname=f2163,type=node value=4i 1725620476214474893"
Swagger API Reference
Non-Interactive Documentation
This reference is rendered using the swagger-ui plugin based on the original
definition file found in the ClusterCockpit repository, but without a serving
backend. This means that all interactivity (“Try It Out”) will not return actual data.
However, a Curl call and a compiled Request URL will still be displayed if
an API endpoint is executed.