
Metrics Reference

rabbitmq-backup exposes Prometheus metrics via an HTTP endpoint. Metrics are collected using the prometheus-client crate (version 0.24) and served in Prometheus text exposition format.

Enabling Metrics

Add a metrics section to your configuration file:

metrics:
  enabled: true
  port: 8080
  bind_address: "0.0.0.0"
  path: /metrics

The metrics server starts automatically when the backup or restore command runs. It serves the following endpoints:

| Endpoint | Content-Type | Description |
| --- | --- | --- |
| GET /metrics | text/plain; version=0.0.4 | Prometheus metrics in text exposition format. |
| GET /health | application/json | Health check: {"status":"ok"}. |
| GET /healthz | application/json | Same as /health (Kubernetes convention). |
| GET / | text/html | HTML index page with links to metrics and health endpoints. |
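A quick way to confirm the server is up is to probe the health endpoint. The following is a minimal Python sketch using only the standard library. The base URL http://localhost:8080 is an assumption taken from the sample configuration, and the function names are illustrative, not part of rabbitmq-backup.

```python
import json
import urllib.request

# Assumed base URL: port 8080 matches the sample configuration.
BASE_URL = "http://localhost:8080"


def is_healthy(body: bytes) -> bool:
    """Return True if a /health or /healthz body reports status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not valid JSON, or JSON that is not an object: treat as unhealthy.
        return False


def probe(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Fetch /health from a running metrics server and interpret the reply."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.status == 200 and is_healthy(resp.read())
```

Calling probe() while a backup or restore is running should return True. If the server is not listening, urlopen raises urllib.error.URLError, which a caller may want to catch.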

Backup Metrics

rabbitmq_backup_messages_read

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total number of messages read during backup. |
| Labels | queue, vhost, queue_type |

Incremented for each message successfully consumed from a queue (AMQP) or stream.

Example:

# HELP rabbitmq_backup_messages_read Total messages read during backup
# TYPE rabbitmq_backup_messages_read counter
rabbitmq_backup_messages_read{queue="orders",vhost="/",queue_type="classic"} 1542
rabbitmq_backup_messages_read{queue="payments",vhost="/",queue_type="quorum"} 823
rabbitmq_backup_messages_read{queue="events",vhost="/",queue_type="stream"} 10000
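For ad hoc checks outside Prometheus, output like the example above can be parsed with a few lines of Python. This is a simplified sketch, not a complete parser for the exposition format: it ignores timestamps and does not unescape label values. The name parse_samples is illustrative.

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: name="value" (escapes are kept as-is).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, # HELP, or # TYPE
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


def total(text: str, metric: str) -> float:
    """Sum one metric across all of its label combinations."""
    return sum(v for name, _, v in parse_samples(text) if name == metric)
```

Applied to the three sample lines above, total(text, "rabbitmq_backup_messages_read") adds up the per-queue counters, which is the same figure the sum() query in the dashboard section reports.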

rabbitmq_backup_bytes_read

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total bytes read during backup (uncompressed message payload sizes). |
| Labels | queue, vhost, queue_type |

Example:

rabbitmq_backup_bytes_read{queue="orders",vhost="/",queue_type="classic"} 2097152

rabbitmq_backup_segments_written

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total number of segments finalized and written to storage. |
| Labels | queue, vhost, queue_type |

Incremented each time a segment is finalized. A segment is finalized when it reaches the size threshold segment_max_bytes, when the time threshold segment_max_interval_ms elapses, or when the backup completes.

Example:

rabbitmq_backup_segments_written{queue="orders",vhost="/",queue_type="classic"} 2
rabbitmq_backup_segments_written{queue="payments",vhost="/",queue_type="quorum"} 1
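The three finalization triggers can be read as a single boolean condition. The Python sketch below illustrates that rule only; SegmentState and should_finalize are hypothetical names, and whether the real thresholds are inclusive (>=) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SegmentState:
    """Hypothetical view of an open segment; not the tool's real type."""
    size_bytes: int  # bytes buffered in the open segment
    age_ms: int      # milliseconds since the segment was opened


def should_finalize(seg: SegmentState,
                    segment_max_bytes: int,
                    segment_max_interval_ms: int,
                    backup_done: bool = False) -> bool:
    """Return True if any of the three documented triggers applies."""
    return (
        seg.size_bytes >= segment_max_bytes      # size threshold (assumed inclusive)
        or seg.age_ms >= segment_max_interval_ms  # time threshold (assumed inclusive)
        or backup_done                            # backup completed
    )
```

Each time this condition holds and the segment is written, rabbitmq_backup_segments_written would increase by one for that queue.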

rabbitmq_backup_segments_bytes

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total bytes written to storage (compressed segment sizes). |
| Labels | queue, vhost, queue_type |

Example:

rabbitmq_backup_segments_bytes{queue="orders",vhost="/",queue_type="classic"} 524288

rabbitmq_backup_checkpoint_syncs

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total number of checkpoint sync operations (local SQLite to remote storage). |
| Labels | (none) |

Example:

rabbitmq_backup_checkpoint_syncs 15

rabbitmq_backup_errors

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total errors by type and queue. |
| Labels | queue, vhost, error_type |

The error_type label classifies errors into one of:

| error_type | Description |
| --- | --- |
| amqp | AMQP 0-9-1 protocol errors. |
| stream | Stream protocol errors. |
| storage | Storage backend read/write errors. |
| serialization | JSON/segment serialization errors. |
| connection | TCP connection failures. |
| authentication | AMQP or Management API auth failures. |

Example:

rabbitmq_backup_errors{queue="orders",vhost="/",error_type="amqp"} 1
rabbitmq_backup_errors{queue="payments",vhost="/",error_type="storage"} 0

Restore Metrics

rabbitmq_restore_messages_published

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total messages published to the target cluster during restore. |
| Labels | queue, vhost, queue_type |

Example:

rabbitmq_restore_messages_published{queue="orders",vhost="/",queue_type="classic"} 1542

rabbitmq_restore_messages_confirmed

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total messages confirmed by the target broker (publisher confirms). |
| Labels | queue, vhost, queue_type |

Only meaningful when restore.publisher_confirms is set to true.

Example:

rabbitmq_restore_messages_confirmed{queue="orders",vhost="/",queue_type="classic"} 1542

rabbitmq_restore_messages_failed

| Property | Value |
| --- | --- |
| Type | Counter |
| Description | Total messages that failed to publish during restore. |
| Labels | queue, vhost, queue_type |

Example:

rabbitmq_restore_messages_failed{queue="orders",vhost="/",queue_type="classic"} 0
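Taken together, the three restore counters allow a simple per-queue sanity check after a restore. The Python sketch below is one possible reading, not the tool's own verification logic: it assumes a clean restore means no failures and, when publisher confirms are enabled, that every published message was confirmed. The function name restore_summary is illustrative.

```python
def restore_summary(published: int, confirmed: int, failed: int,
                    publisher_confirms: bool = True) -> dict:
    """Summarize one queue's restore counters.

    Assumption: with publisher confirms enabled, published - confirmed
    is the number of messages the broker has not (yet) acknowledged.
    """
    if publisher_confirms:
        unconfirmed = published - confirmed
        ok = failed == 0 and unconfirmed == 0
    else:
        # The confirmed counter carries no information without confirms.
        unconfirmed = None
        ok = failed == 0
    return {"unconfirmed": unconfirmed, "ok": ok}
```

With the example values above (1542 published, 1542 confirmed, 0 failed) the summary reports ok as True and 0 unconfirmed messages.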

Connection Metrics

rabbitmq_backup_amqp_connections_active

| Property | Value |
| --- | --- |
| Type | Gauge |
| Description | Number of currently active AMQP connections. |
| Labels | (none) |

Incremented when a new AMQP connection is established and decremented when a connection is closed. During backup, each queue uses its own connection.

Example:

rabbitmq_backup_amqp_connections_active 4
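The increment-on-open, decrement-on-close pattern is easiest to keep balanced when both steps live in one scope. The Python sketch below is a stand-in for illustration only; rabbitmq-backup itself uses the Rust prometheus-client crate, and the Gauge class, tracked_connection helper, and the connect/close callables here are all hypothetical.

```python
import threading
from contextlib import contextmanager


class Gauge:
    """Minimal thread-safe gauge, standing in for a real metrics library."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def inc(self) -> None:
        with self._lock:
            self._value += 1

    def dec(self) -> None:
        with self._lock:
            self._value -= 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


@contextmanager
def tracked_connection(gauge: Gauge, connect, close):
    """Open a connection, count it while it is in use, and always uncount it."""
    conn = connect()   # if this raises, the gauge is never incremented
    gauge.inc()
    try:
        yield conn
    finally:
        try:
            close(conn)
        finally:
            gauge.dec()  # runs even if close() raises
```

Because the decrement sits in a finally block, the gauge returns to its previous value even when the body or the close call fails, which keeps the reported connection count from drifting upward.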

rabbitmq_backup_stream_connections_active

| Property | Value |
| --- | --- |
| Type | Gauge |
| Description | Number of currently active stream protocol connections. |
| Labels | (none) |

Example:

rabbitmq_backup_stream_connections_active 1

Label Reference

QueueLabels

Used by per-queue backup and restore metrics.

| Label | Description | Example Values |
| --- | --- | --- |
| queue | Queue or stream name | orders, payments, events-stream |
| vhost | RabbitMQ virtual host | /, production, staging |
| queue_type | Queue type | classic, quorum, stream |

ErrorLabels

Used by the error counter.

| Label | Description | Example Values |
| --- | --- | --- |
| queue | Queue name where the error occurred | orders |
| vhost | Virtual host | / |
| error_type | Error classification | amqp, stream, storage, serialization, connection, authentication |

Prometheus Scrape Configuration

Add a scrape job to your prometheus.yml:

scrape_configs:
  - job_name: 'rabbitmq-backup'
    scrape_interval: 15s
    static_configs:
      - targets: ['rabbitmq-backup-host:8080']
        labels:
          environment: 'production'
          cluster: 'rabbitmq-prod'

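For Kubernetes clusters where Prometheus discovers targets through pod annotations (a relabeling convention in the scrape configuration, not a Prometheus Operator feature), add: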

apiVersion: v1
kind: Pod
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"

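With the Prometheus Operator, use a ServiceMonitor instead. The port field refers to a named port on the Service selected by matchLabels: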

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rabbitmq-backup
spec:
  selector:
    matchLabels:
      app: rabbitmq-backup
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics

Grafana Dashboard Setup

Building a Dashboard

Create a Grafana dashboard with the following panels. The examples below use PromQL queries.

Backup Progress: Messages Per Queue

Panel type: Time series

rate(rabbitmq_backup_messages_read[5m])

Legend: {{queue}} ({{vhost}})

Shows the backup throughput in messages per second, broken down by queue.
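To make the per-second figure concrete, the sketch below approximates what rate() computes from raw counter samples: the total increase over the window divided by the elapsed time, with a drop in value treated as a counter reset. This is a simplification under stated assumptions; Prometheus additionally extrapolates to the window boundaries, so its results can differ slightly. The function name simple_rate is illustrative.

```python
def simple_rate(samples):
    """Approximate a per-second rate from counter samples.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Returns None when fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero (e.g. process restart),
        # so the whole current value counts as new increase.
        increase += cur if cur < prev else cur - prev
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None
```

For example, a counter that goes from 0 to 900 messages over 120 seconds gives simple_rate([(0, 0), (60, 300), (120, 900)]) == 7.5 messages per second.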

Backup Progress: Bytes Written

Panel type: Time series

rate(rabbitmq_backup_segments_bytes[5m])

Legend: {{queue}}

Total Messages Backed Up

Panel type: Stat

sum(rabbitmq_backup_messages_read)

Compression Ratio

Panel type: Stat

1 - (sum(rabbitmq_backup_segments_bytes) / sum(rabbitmq_backup_bytes_read))

Unit: Percent (0-1)

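Shows the fraction of space saved by compression across all queues. A value of 0.75 means the stored segments occupy one quarter of the original payload size.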
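The panel query reduces to a single division. As a worked example in Python, using the sample values shown earlier for the orders queue (2097152 bytes read, 524288 bytes written); the function name compression_savings is illustrative:

```python
def compression_savings(segments_bytes: float, bytes_read: float):
    """Fraction of space saved: 1 - compressed / uncompressed.

    Returns None when nothing has been read yet, to avoid dividing by zero.
    """
    if bytes_read <= 0:
        return None
    return 1 - segments_bytes / bytes_read


# 524288 / 2097152 = 0.25, so 75% of the payload size was saved.
savings = compression_savings(524288, 2097152)
```

A negative result is possible when segments are larger than the payloads they contain, for instance with incompressible data plus per-segment overhead.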

Segments Written

Panel type: Bar gauge

rabbitmq_backup_segments_written

Legend: {{queue}}

Active Connections

Panel type: Gauge

rabbitmq_backup_amqp_connections_active + rabbitmq_backup_stream_connections_active

Error Rate

Panel type: Time series

rate(rabbitmq_backup_errors[5m])

Legend: {{queue}} - {{error_type}}

Use an alert threshold here: any non-zero error rate warrants investigation.

Restore Progress

Panel type: Time series

rate(rabbitmq_restore_messages_published[5m])

Legend: {{queue}}

Restore Failures

Panel type: Stat (red if > 0)

sum(rabbitmq_restore_messages_failed)

Checkpoint Sync Rate

Panel type: Time series

rate(rabbitmq_backup_checkpoint_syncs[5m])

Example Dashboard JSON

You can import this dashboard JSON into Grafana. Save as rabbitmq-backup-dashboard.json:

{
  "dashboard": {
    "title": "RabbitMQ Backup & Restore",
    "tags": ["rabbitmq", "backup"],
    "timezone": "browser",
    "panels": [
      {
        "title": "Backup: Messages Read (rate/sec)",
        "type": "timeseries",
        "targets": [
          {
            "expr": "rate(rabbitmq_backup_messages_read[5m])",
            "legendFormat": "{{queue}} ({{vhost}})"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "title": "Backup: Bytes Written (rate/sec)",
        "type": "timeseries",
        "targets": [
          {
            "expr": "rate(rabbitmq_backup_segments_bytes[5m])",
            "legendFormat": "{{queue}}"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "title": "Active Connections",
        "type": "gauge",
        "targets": [
          {
            "expr": "rabbitmq_backup_amqp_connections_active",
            "legendFormat": "AMQP"
          },
          {
            "expr": "rabbitmq_backup_stream_connections_active",
            "legendFormat": "Stream"
          }
        ],
        "gridPos": {"h": 8, "w": 6, "x": 0, "y": 8}
      },
      {
        "title": "Errors",
        "type": "timeseries",
        "targets": [
          {
            "expr": "rate(rabbitmq_backup_errors[5m])",
            "legendFormat": "{{queue}} - {{error_type}}"
          }
        ],
        "gridPos": {"h": 8, "w": 6, "x": 6, "y": 8}
      },
      {
        "title": "Restore: Messages Published (rate/sec)",
        "type": "timeseries",
        "targets": [
          {
            "expr": "rate(rabbitmq_restore_messages_published[5m])",
            "legendFormat": "{{queue}}"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8}
      }
    ]
  }
}
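Besides the import dialog in the Grafana UI, a document of this shape can be sent to Grafana's dashboard HTTP API (POST /api/dashboards/db). The Python sketch below only builds the request. The Grafana URL and API token are placeholders you would supply, and build_import_request is an illustrative name.

```python
import json
import urllib.request


def build_import_request(grafana_url: str, api_token: str,
                         dashboard_doc: dict, overwrite: bool = True):
    """Build a POST request that creates or updates a dashboard.

    dashboard_doc is the parsed JSON file, i.e. a dict with a
    top-level "dashboard" key as in the example above.
    """
    payload = dict(dashboard_doc)     # shallow copy; leave the input untouched
    payload["overwrite"] = overwrite  # replace an existing dashboard of the same identity
    return urllib.request.Request(
        url=f"{grafana_url.rstrip('/')}/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )
```

To send it, load rabbitmq-backup-dashboard.json with json.load, pass the result to build_import_request, and hand the returned request to urllib.request.urlopen. Panels in the example do not name a data source, so Grafana is expected to fall back to its default one.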

Alerting Rules

Example Prometheus alerting rules:

groups:
  - name: rabbitmq-backup
    rules:
      - alert: RabbitMQBackupErrors
        expr: rate(rabbitmq_backup_errors[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "RabbitMQ backup errors detected"
          description: "Queue {{ $labels.queue }} in vhost {{ $labels.vhost }} has {{ $labels.error_type }} errors."

      - alert: RabbitMQBackupNoProgress
        # The per-queue counter and the unlabelled gauge share only the
        # instance and job labels, so aggregate and match on those.
        expr: sum by (instance, job) (rate(rabbitmq_backup_messages_read[10m])) == 0 and on (instance, job) rabbitmq_backup_amqp_connections_active > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "RabbitMQ backup not making progress"
          description: "No messages have been read in the last 10 minutes despite active connections."

      - alert: RabbitMQRestoreFailures
        expr: rabbitmq_restore_messages_failed > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "RabbitMQ restore has failed messages"
          description: "{{ $value }} messages failed to publish during restore."