archive-manager
The archive-manager tool manages and maintains ClusterCockpit job archives. It validates archives, removes jobs by date or cluster, imports jobs between different archive backends, converts archives between JSON and Parquet, and reports archive statistics.
Build
cd tools/archive-manager
go build
Command-Line Options
-s <path>
Function: Specify the source job archive path.
Default: ./var/job-archive
Example: -s /data/job-archive
-config <path>
Function: Specify alternative path to config.json.
Default: ./config.json
Example: -config /etc/clustercockpit/config.json
-validate
Function: Validate a job archive against the JSON schema.
-remove-cluster <cluster>
Function: Remove the specified cluster from the archive and the database.
Example: -remove-cluster oldcluster
-remove-before <date>
Function: Remove all jobs with start time before the specified date.
Format: YYYY-Mon-DD with an abbreviated English month name (Go layout 2006-Jan-02)
Example: -remove-before 2023-Jan-01
-remove-after <date>
Function: Remove all jobs with start time after the specified date.
Format: YYYY-Mon-DD with an abbreviated English month name (Go layout 2006-Jan-02)
Example: -remove-after 2024-Dec-31
-import
Function: Import jobs from source archive to destination archive.
Note: Requires -src-config and -dst-config options.
-convert
Function: Convert an archive between JSON and Parquet formats.
Note: Requires -src-config and -dst-config options. Use -format to specify the output format.
-format <format>
Function: Output format for archive conversion.
Arguments: json | parquet
Default: json
Example: -format parquet
-max-file-size <n>
Function: Maximum Parquet file size in MB before splitting into a new file. Only relevant when -format parquet is used.
Default: 512
Example: -max-file-size 256
-src-config <json>
Function: Source archive backend configuration in JSON format.
Example: -src-config '{"kind":"file","path":"./archive"}'
-dst-config <json>
Function: Destination archive backend configuration in JSON format.
Example: -dst-config '{"kind":"sqlite","dbPath":"./archive.db"}'
-loglevel <level>
Function: Sets the logging level.
Arguments: debug | info | warn | err | fatal | crit
Default: info
Example: -loglevel debug
-logdate
Function: Set this flag to add date and time to log messages.
Usage Examples
Validate Archive
./archive-manager -s /data/job-archive -validate
Clean Old Jobs
# Remove all jobs that started before January 1, 2023
./archive-manager -s /data/job-archive -remove-before 2023-Jan-01
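Scripts often hold cutoff dates in ISO form (YYYY-MM-DD), while the date options use an abbreviated English month name, as in 2023-Jan-01. The following is a minimal sketch of a conversion helper in POSIX shell. The function name to_archive_date is hypothetical and not part of archive-manager, and the helper checks only the month, not the year or day.

```shell
# Hypothetical helper (not part of archive-manager): convert an ISO date
# (YYYY-MM-DD) into the YYYY-Mon-DD form used by -remove-before/-remove-after.
# Only the month is checked; year and day are passed through unchanged.
to_archive_date() {
    _year=${1%%-*}        # text before the first '-'
    _rest=${1#*-}         # text after the first '-'
    _month=${_rest%%-*}   # two-digit month
    _day=${_rest#*-}      # day of month
    case "$_month" in
        01) _name=Jan ;; 02) _name=Feb ;; 03) _name=Mar ;;
        04) _name=Apr ;; 05) _name=May ;; 06) _name=Jun ;;
        07) _name=Jul ;; 08) _name=Aug ;; 09) _name=Sep ;;
        10) _name=Oct ;; 11) _name=Nov ;; 12) _name=Dec ;;
        *) echo "invalid month: $_month" >&2; return 1 ;;
    esac
    printf '%s-%s-%s\n' "$_year" "$_name" "$_day"
}

to_archive_date 2023-01-01   # prints 2023-Jan-01
```

The result can then be passed on the command line, for example as -remove-before "$(to_archive_date 2023-01-01)". Because removal is destructive, it is worth printing the converted date and checking it before running the real command.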
Import Between Archives
# Import from file-based archive to SQLite archive
./archive-manager -import \
-src-config '{"kind":"file","path":"./old-archive"}' \
-dst-config '{"kind":"sqlite","dbPath":"./new-archive.db"}'
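Because the backend configuration is passed as a JSON string, shell quoting is easy to get wrong once paths come from variables. The following is a minimal sketch that assumes only the kind, path, and dbPath keys shown in the example above. The helper names are hypothetical, and paths containing double quotes or backslashes are not escaped.

```shell
# Hypothetical helpers (not part of archive-manager): build backend
# configuration strings from shell variables.
# Assumption: paths contain no double quotes or backslashes (no escaping is done).
file_backend_config() {
    printf '{"kind":"file","path":"%s"}' "$1"
}

sqlite_backend_config() {
    printf '{"kind":"sqlite","dbPath":"%s"}' "$1"
}

SRC_CONFIG=$(file_backend_config ./old-archive)
DST_CONFIG=$(sqlite_backend_config ./new-archive.db)

# Print the resulting command line for review instead of running it.
printf "./archive-manager -import -src-config '%s' -dst-config '%s'\n" \
    "$SRC_CONFIG" "$DST_CONFIG"
```

When invoking the tool directly from the same script, pass the variables in double quotes (-src-config "$SRC_CONFIG") so that the JSON reaches archive-manager as a single argument.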
Convert Archive Format
# Convert JSON file archive to Parquet format
./archive-manager -convert \
-src-config '{"kind":"file","path":"./job-archive"}' \
-dst-config '{"kind":"s3","endpoint":"http://minio:9000","bucket":"parquet-archive","access-key":"key","secret-key":"secret"}' \
-format parquet
# Convert Parquet archive back to JSON file archive
./archive-manager -convert \
-src-config '{"kind":"s3","endpoint":"http://minio:9000","bucket":"parquet-archive","access-key":"key","secret-key":"secret"}' \
-dst-config '{"kind":"file","path":"./job-archive-restored"}' \
-format json
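The two conversion commands above repeat the same S3 configuration, including the credentials. One way to avoid the duplication is to build the string once and reuse it in both directions. The following is a sketch under stated assumptions: S3_ACCESS_KEY and S3_SECRET_KEY are environment variable names chosen for this example, archive-manager does not read them itself, and the endpoint and bucket are the placeholder values from the commands above.

```shell
# Hypothetical sketch: build the S3 backend configuration once and reuse it.
# Assumption: S3_ACCESS_KEY and S3_SECRET_KEY are names chosen for this
# example only; the defaults are the placeholder values used above.
: "${S3_ACCESS_KEY:=key}"
: "${S3_SECRET_KEY:=secret}"

S3_CONFIG=$(printf '{"kind":"s3","endpoint":"%s","bucket":"%s","access-key":"%s","secret-key":"%s"}' \
    "http://minio:9000" "parquet-archive" "$S3_ACCESS_KEY" "$S3_SECRET_KEY")

# Show the assembled configuration for review.
echo "$S3_CONFIG"
```

The variable can then serve as -dst-config "$S3_CONFIG" when converting to Parquet and as -src-config "$S3_CONFIG" when converting back to JSON. Taking the credentials from the environment keeps them out of scripts stored in version control.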
Archive Information
# Display archive statistics
./archive-manager -s /data/job-archive
Features
- Validation: Verify job archive integrity against JSON schemas
- Cleaning: Remove jobs by date range or cluster
- Import/Export: Transfer jobs between different archive backend types
- Format Conversion: Convert archives between JSON and Parquet formats
- Statistics: Display archive information and job counts
- Progress Tracking: Real-time progress reporting for long operations