Storage Format
This page documents the storage layout used by rabbitmq-backup, the RBAK binary segment format, and the manifest JSON schema.
Storage Layout
Every backup is stored under a {backup_id}/ prefix in the configured storage backend. The directory structure is:
{prefix}/{backup_id}/
├── manifest.json
├── definitions/
│   ├── definitions.json.zst
│   └── rollback-before-import-<timestamp>.zst
├── state/
│   └── offsets.db
└── queues/
    └── {vhost}/
        └── {queue_name}/
            ├── segment-0001.zst
            ├── segment-0002.zst
            └── segment-NNNN.zst
Path Components
| Component | Description |
|---|---|
| {prefix} | Optional storage prefix from the storage.prefix configuration field. |
| {backup_id} | The backup_id from the configuration file (e.g. prod-daily-2025-04-10). |
| manifest.json | Backup manifest containing metadata about all queues, segments, and definitions. |
| definitions/definitions.json.zst | Compressed RabbitMQ definitions export (topology, users, policies). Extension matches the compression algorithm: .zst for zstd, .lz4 for LZ4, or no extension for uncompressed. |
| definitions/rollback-before-import-<timestamp>.<ext> | Target-cluster definitions snapshot written immediately before a full restore imports backed-up definitions. Used for manual rollback/recovery. |
| state/offsets.db | SQLite checkpoint database for resumable backups (synced from local to remote when configured). |
| queues/{vhost}/{queue_name}/ | Per-queue directory containing message segments. |
| segment-NNNN.zst | Individual segment file. NNNN is a zero-padded 4-digit sequence number. |
Vhost Encoding
The default vhost / is encoded as _default in storage paths to avoid path separator conflicts:
| Vhost | Storage Path |
|---|---|
| / | queues/_default/orders/segment-0001.zst |
| production | queues/production/orders/segment-0001.zst |
Checkpoint State
Checkpoint state is stored in SQLite. Backup checkpoints track:
- queue progress by backup_id, vhost, and queue_name
- last completed segment sequence
- target message count for the run
- whether the queue or stream completed
- stream offsets for stream queues
For object storage, offset_storage.s3_key sets the remote key that the SQLite database is synced to, commonly {backup_id}/state/offsets.db.
Restore checkpoints are local files configured with restore.checkpoint_state. They track processed records per queue so interrupted restores can be rerun without duplicating completed queues.
File Extension Convention
| Compression | Extension |
|---|---|
| zstd | .zst |
| LZ4 | .lz4 |
| None | (no extension) |
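
The path layout, vhost encoding, and extension convention above can be combined into a single key builder. The following is a minimal sketch, not the tool's own code: the names `encode_vhost`, `segment_key`, and the `"none"` compression label are hypothetical, and it assumes that only the default vhost `/` is rewritten while every other vhost name is used verbatim.

```python
# Hypothetical helpers; only the path layout itself comes from the tables above.
EXTENSIONS = {"zstd": ".zst", "lz4": ".lz4", "none": ""}


def encode_vhost(vhost: str) -> str:
    """Map the default vhost "/" to "_default"; other names pass through (assumption)."""
    return "_default" if vhost == "/" else vhost


def segment_key(backup_id: str, vhost: str, queue: str, sequence: int,
                compression: str = "zstd", prefix: str = "") -> str:
    """Build the storage key for one segment file."""
    # NNNN is a zero-padded 4-digit sequence number.
    name = f"segment-{sequence:04d}{EXTENSIONS[compression]}"
    parts = [prefix.strip("/"), backup_id, "queues", encode_vhost(vhost), queue, name]
    # Drop the prefix component when it is empty (the prefix is optional).
    return "/".join(part for part in parts if part)
```

How vhost names that themselves contain `/` are handled is not covered by this page, so the sketch leaves them untouched.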
Manifest JSON Schema
The manifest is stored at {backup_id}/manifest.json and contains the complete metadata for a backup. It is written as the final step of a backup operation.
BackupManifest
{
"backup_id": "prod-daily-2025-04-10",
"created_at": 1712756400000,
"completed_at": 1712756535000,
"source_cluster": "rabbit@node1",
"rabbitmq_version": "3.13.2",
"backup_tool_version": "0.1.0",
"definitions": { ... },
"queues": [ ... ],
"total_messages": 2621,
"total_bytes": 905216,
"total_segments": 4
}
| Field | Type | Nullable | Description |
|---|---|---|---|
| backup_id | string | No | Unique identifier matching the config. |
| created_at | i64 (epoch ms) | No | Timestamp when the backup started. |
| completed_at | i64 (epoch ms) | Yes | Timestamp when the backup completed. null if the backup was interrupted. |
| source_cluster | string | Yes | Cluster name from the Management API /api/overview. |
| rabbitmq_version | string | Yes | RabbitMQ version from the Management API. |
| backup_tool_version | string | No | Version of rabbitmq-backup that created this backup. |
| definitions | object | Yes | Metadata about the definitions export. See DefinitionsBackup. |
| queues | array | No | Per-queue backup metadata. See QueueBackup. |
| total_messages | u64 | No | Sum of all queue message counts. |
| total_bytes | u64 | No | Sum of all segment sizes (compressed). |
| total_segments | u64 | No | Total number of segment files. |
DefinitionsBackup
{
"key": "backup-001/definitions/definitions.json.zst",
"vhost_count": 2,
"queue_count": 5,
"exchange_count": 8,
"user_count": 3,
"size_bytes": 4300
}
| Field | Type | Description |
|---|---|---|
| key | string | Storage key for the compressed definitions file. |
| vhost_count | usize | Number of virtual hosts in the definitions. |
| queue_count | usize | Number of queues. |
| exchange_count | usize | Number of exchanges. |
| user_count | usize | Number of users. |
| size_bytes | u64 | Uncompressed size of the definitions JSON. |
QueueBackup
{
"vhost": "/",
"name": "orders",
"queue_type": "classic",
"segments": [ ... ],
"message_count": 1542,
"first_message_timestamp": 1712700000000,
"last_message_timestamp": 1712756400000
}
| Field | Type | Nullable | Description |
|---|---|---|---|
| vhost | string | No | Virtual host of the queue. |
| name | string | No | Queue name. |
| queue_type | string | No | Queue type: classic, quorum, or stream. |
| segments | array | No | List of segment metadata. See SegmentMetadata. |
| message_count | u64 | No | Total messages backed up from this queue. |
| first_message_timestamp | i64 (epoch ms) | Yes | backed_up_at timestamp of the first message. |
| last_message_timestamp | i64 (epoch ms) | Yes | backed_up_at timestamp of the last message. |
SegmentMetadata
{
"key": "backup-001/queues/_default/orders/segment-0001.zst",
"sequence": 1,
"record_count": 1000,
"size_bytes": 348160,
"uncompressed_bytes": 1048576,
"first_timestamp": 1712700000000,
"last_timestamp": 1712720000000,
"checksum": "a1b2c3d4e5f6...64-char-sha256-hex..."
}
| Field | Type | Nullable | Description |
|---|---|---|---|
| key | string | No | Storage key for the segment file. |
| sequence | u64 | No | Monotonically increasing sequence number within the queue. |
| record_count | u64 | No | Number of message records in this segment. |
| size_bytes | u64 | No | Compressed segment size (header + payload + footer). |
| uncompressed_bytes | u64 | No | Uncompressed payload size (before compression). |
| first_timestamp | i64 (epoch ms) | Yes | backed_up_at timestamp of the first record. |
| last_timestamp | i64 (epoch ms) | Yes | backed_up_at timestamp of the last record. |
| checksum | string | No | SHA-256 hex digest of the complete segment file (header + compressed payload + footer). |
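
Because the manifest totals are defined as sums over the per-queue and per-segment entries, a reader can recompute them as a quick consistency check. The sketch below assumes the manifest has already been parsed (for example with `json.load`) and follows the field definitions above literally. The function name `check_manifest_totals` is hypothetical and not part of rabbitmq-backup.

```python
def check_manifest_totals(manifest: dict) -> list:
    """Return human-readable mismatches; an empty list means the totals agree."""
    queues = manifest["queues"]
    segments = [seg for queue in queues for seg in queue["segments"]]
    recomputed = {
        # Sum of all queue message counts.
        "total_messages": sum(queue["message_count"] for queue in queues),
        # Sum of all compressed segment sizes.
        "total_bytes": sum(seg["size_bytes"] for seg in segments),
        # Total number of segment files.
        "total_segments": len(segments),
    }
    return [
        f"{field}: manifest says {manifest[field]}, recomputed {value}"
        for field, value in recomputed.items()
        if manifest[field] != value
    ]
```

A non-empty result does not say which side is wrong; it only flags a manifest worth inspecting more closely.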
RBAK Segment Format
Each segment file uses a custom binary format with a 32-byte header, a compressed payload, and an 8-byte footer.
Overview
┌──────────────────────────────────┐
│ Header (32 bytes) │
├──────────────────────────────────┤
│ Compressed Payload (var) │
│ [length-prefixed JSON records] │
├──────────────────────────────────┤
│ Footer (8 bytes) │
└──────────────────────────────────┘
Header (32 bytes)
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 | Magic | ASCII RBAK (0x5242414B). Identifies the file as an RBAK segment. |
| 4 | 1 | Version | Format version. Currently 1. |
| 5 | 1 | Compression | Compression algorithm: 0 = none, 1 = zstd, 2 = LZ4. |
| 6 | 2 | Reserved | Reserved for future use. Set to 0x0000. |
| 8 | 8 | Record Count | Little-endian u64. Number of message records in the payload. |
| 16 | 8 | First Timestamp | Little-endian i64. backed_up_at epoch milliseconds of the first record. |
| 24 | 8 | Last Timestamp | Little-endian i64. backed_up_at epoch milliseconds of the last record. |
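
The header table maps directly onto a fixed-size struct layout. As an illustration, the following sketch parses the first 32 bytes with Python's `struct` module. The names `SegmentHeader` and `parse_header` are hypothetical. The reserved field is read as a little-endian u16, which is an assumption that makes no difference while it is zero.

```python
import struct
from typing import NamedTuple

# magic, version, compression, reserved, record count, first ts, last ts
HEADER = struct.Struct("<4sBBHQqq")
COMPRESSION_NAMES = {0: "none", 1: "zstd", 2: "lz4"}


class SegmentHeader(NamedTuple):
    version: int
    compression: str
    record_count: int
    first_timestamp: int
    last_timestamp: int


def parse_header(data: bytes) -> SegmentHeader:
    """Parse and validate the 32-byte RBAK header at the start of a segment."""
    if len(data) < HEADER.size:
        raise ValueError("segment shorter than the 32-byte header")
    magic, version, compression, _reserved, count, first_ts, last_ts = HEADER.unpack_from(data)
    if magic != b"RBAK":
        raise ValueError(f"bad magic: {magic!r}")
    if compression not in COMPRESSION_NAMES:
        raise ValueError(f"unknown compression id: {compression}")
    return SegmentHeader(version, COMPRESSION_NAMES[compression], count, first_ts, last_ts)
```

The `<` prefix in the format string disables alignment padding, so `HEADER.size` is exactly 32.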
Compressed Payload
The payload is the compressed form of a sequence of length-prefixed JSON records. Before compression, the payload is structured as:
┌─────────────────────┐
│ len₁ (4 bytes, LE) │ ← u32 length of JSON₁
│ JSON₁ (len₁ bytes) │ ← BackupRecord serialized as JSON
├─────────────────────┤
│ len₂ (4 bytes, LE) │
│ JSON₂ (len₂ bytes) │
├─────────────────────┤
│ ... │
├─────────────────────┤
│ lenₙ (4 bytes, LE) │
│ JSONₙ (lenₙ bytes) │
└─────────────────────┘
Each record is a BackupRecord serialized as JSON:
{
"body": [104, 101, 108, 108, 111],
"properties": {
"content_type": "application/json",
"content_encoding": null,
"delivery_mode": 2,
"priority": null,
"correlation_id": "corr-123",
"reply_to": null,
"expiration": null,
"message_id": "msg-456",
"timestamp": 1712756400,
"type_field": "order.created",
"user_id": null,
"app_id": "my-service",
"cluster_id": null
},
"headers": [
["x-retry-count", {"Long": 3}],
["x-source", {"LongString": "api-gateway"}]
],
"exchange": "orders.exchange",
"routing_key": "order.created",
"delivery_tag": 42,
"redelivered": false,
"backed_up_at": 1712756400123,
"source_queue": "orders",
"source_vhost": "/"
}
BackupRecord Fields
| Field | Type | Description |
|---|---|---|
| body | [u8] or null | Message body as a byte array. null for zero-length bodies. |
| properties | object | AMQP basic properties (see BackupProperties below). |
| headers | array of [key, value] | AMQP message headers from basic.properties.headers. |
| exchange | string | Exchange the message was published to. Empty string for stream messages. |
| routing_key | string | Routing key. Empty string for stream messages. |
| delivery_tag | u64 | Delivery tag from the broker (or stream offset for stream queues). |
| redelivered | bool | Whether the broker marked the message as redelivered. |
| backed_up_at | i64 (epoch ms) | Timestamp when this record was captured by the backup tool. |
| source_queue | string | Queue name this message was read from. |
| source_vhost | string | Vhost of the source queue. |
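
To make the length-prefixed framing concrete, here is a minimal sketch that walks an already-decompressed payload and yields each BackupRecord as a dict. Decompression is deliberately left out, since zstd and LZ4 are not in the Python standard library. The names `iter_records` and `body_bytes` are hypothetical.

```python
import json
import struct
from typing import Iterator


def iter_records(payload: bytes) -> Iterator[dict]:
    """Yield BackupRecord dicts from an uncompressed segment payload."""
    offset = 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            raise ValueError("truncated length prefix")
        # Each record starts with a little-endian u32 giving the JSON length.
        (length,) = struct.unpack_from("<I", payload, offset)
        offset += 4
        if offset + length > len(payload):
            raise ValueError("truncated record")
        yield json.loads(payload[offset:offset + length])
        offset += length


def body_bytes(record: dict) -> bytes:
    """Convert the JSON byte array to bytes; a null body becomes b""."""
    return bytes(record["body"] or [])
```

Comparing the number of yielded records with the header's Record Count is a cheap extra sanity check after decoding.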
BackupProperties Fields
| Field | Type | Description |
|---|---|---|
| content_type | string or null | MIME content type (e.g. application/json). |
| content_encoding | string or null | Content encoding (e.g. utf-8, gzip). |
| delivery_mode | u8 or null | 1 = non-persistent, 2 = persistent. |
| priority | u8 or null | Message priority (0-255). |
| correlation_id | string or null | Correlation identifier. |
| reply_to | string or null | Reply-to queue/address. |
| expiration | string or null | Message TTL in milliseconds (as string). |
| message_id | string or null | Application-level message identifier. |
| timestamp | i64 or null | AMQP timestamp (seconds since epoch). |
| type_field | string or null | Message type (application-defined). |
| user_id | string or null | Publishing user. |
| app_id | string or null | Publishing application. |
| cluster_id | string or null | Cluster identifier (deprecated in AMQP). |
BackupHeaderValue Variants
| Variant | JSON Representation | Description |
|---|---|---|
| LongString | {"LongString": "value"} | AMQP long string. |
| ShortString | {"ShortString": "value"} | AMQP short string. |
| Long | {"Long": 42} | 64-bit integer. |
| Short | {"Short": 5} | 16-bit integer. |
| Bool | {"Bool": true} | Boolean. |
| Bytes | {"Bytes": [1, 2, 3]} | Raw byte array. |
| Timestamp | {"Timestamp": 1712756400} | AMQP timestamp. |
| Float | {"Float": 3.14} | 32-bit float. |
| Double | {"Double": 3.14159} | 64-bit float. |
| Void | "Void" | Null/void value. |
| Table | {"Table": [[key, value], ...]} | Nested field table. |
| Array | {"Array": [value, ...]} | Field array. |
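
Each header value is either the bare string "Void" or a single-key object naming its variant. The following sketch unwraps that representation into plain Python values, recursing into Table and Array. The function names are hypothetical, and collapsing the header list into a dict is a convenience that assumes keys are unique.

```python
SCALAR_VARIANTS = {
    "LongString", "ShortString", "Long", "Short",
    "Bool", "Timestamp", "Float", "Double",
}


def decode_header_value(value):
    """Convert one BackupHeaderValue JSON node to a plain Python value."""
    if value == "Void":
        return None
    if not isinstance(value, dict) or len(value) != 1:
        raise ValueError(f"unexpected header value: {value!r}")
    variant, inner = next(iter(value.items()))
    if variant in SCALAR_VARIANTS:
        return inner
    if variant == "Bytes":
        return bytes(inner)
    if variant == "Table":
        # A nested table is a list of [key, value] pairs, like the top level.
        return {key: decode_header_value(item) for key, item in inner}
    if variant == "Array":
        return [decode_header_value(item) for item in inner]
    raise ValueError(f"unknown variant: {variant}")


def decode_headers(headers: list) -> dict:
    """Decode the record's headers array of [key, value] pairs."""
    return {key: decode_header_value(item) for key, item in headers}
```

Note that unwrapping discards the variant tag, so this direction is suitable for inspection but not for re-publishing with the original AMQP field types.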
Footer (8 bytes)
| Offset | Size | Field | Description |
|---|---|---|---|
| N-8 | 4 | CRC32 | Little-endian u32. CRC32 checksum of all preceding bytes (header + compressed payload). Computed using crc32fast. |
| N-4 | 4 | End Magic | ASCII KABR (0x4B414252). Reverse of the start magic; signals the end of the segment. |
Integrity Verification
Segment integrity is verified in two stages:
- CRC32 check: The CRC32 in the footer is computed over all bytes from offset 0 to len - 8 (header + compressed payload). On read, the actual CRC32 is recomputed and compared against the stored value.
- SHA-256 check: The SHA-256 digest of the entire segment file (header + compressed payload + footer) is stored in the manifest's SegmentMetadata.checksum field. The validate --deep command verifies this.
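
The two stages above can be sketched with the Python standard library. This is an illustration rather than the tool's implementation: `verify_segment` is a hypothetical name, and it assumes that the checksum crc32fast produces is the same standard CRC-32 that `zlib.crc32` computes.

```python
import hashlib
import struct
import zlib
from typing import Optional

FOOTER = struct.Struct("<I4s")  # CRC32 (LE u32) followed by the end magic


def verify_segment(data: bytes, expected_sha256: Optional[str] = None) -> None:
    """Raise ValueError if the segment fails the CRC32 or SHA-256 check."""
    if len(data) < 32 + FOOTER.size:
        raise ValueError("segment shorter than header + footer")
    stored_crc, end_magic = FOOTER.unpack(data[-FOOTER.size:])
    if end_magic != b"KABR":
        raise ValueError("missing end magic")
    # Stage 1: CRC32 over everything before the footer (header + payload).
    actual_crc = zlib.crc32(data[:-FOOTER.size]) & 0xFFFFFFFF
    if actual_crc != stored_crc:
        raise ValueError(f"CRC32 mismatch: stored {stored_crc:#010x}, actual {actual_crc:#010x}")
    # Stage 2 (optional): SHA-256 over the whole file, compared to the manifest.
    if expected_sha256 is not None:
        if hashlib.sha256(data).hexdigest() != expected_sha256.lower():
            raise ValueError("SHA-256 mismatch against manifest checksum")
```

The CRC32 check needs only the segment file, while the SHA-256 check also needs the `checksum` value from the manifest's SegmentMetadata entry.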
Storage URL Formats
The list, describe, and validate commands accept storage paths in URL format. The from_url parser supports:
| Scheme | Format | Example |
|---|---|---|
| S3 | s3://bucket?region=...&endpoint=... | s3://my-bucket?region=us-east-1 |
| Azure | azure://account.blob.core.windows.net/container | azure://myaccount.blob.core.windows.net/backups |
| GCS | gcs://bucket | gcs://my-gcs-bucket |
| Filesystem | file:///path | file:///var/lib/rabbitmq-backup/data |
| Memory | memory:// | memory:// |
Query parameters for S3 URLs:
| Parameter | Description |
|---|---|
| region | AWS region |
| endpoint | Custom endpoint URL |
| path_style | Set to true for path-style requests |
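
As a rough illustration of how an S3 storage URL decomposes, the sketch below splits one with `urllib.parse`. It does not reproduce the actual from_url parser. The function name, the returned dict shape, and the defaults for missing parameters (None, and False for path_style) are all assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def parse_s3_url(url: str) -> dict:
    """Split an s3://bucket?region=...&endpoint=... URL into its parts."""
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an S3 URL: {url}")
    query = parse_qs(parts.query)
    return {
        "bucket": parts.netloc,
        "region": query.get("region", [None])[0],
        "endpoint": query.get("endpoint", [None])[0],
        # Assumed default: virtual-hosted style unless path_style=true is given.
        "path_style": query.get("path_style", ["false"])[0] == "true",
    }
```

An S3-compatible store would typically combine all three parameters, for example a custom endpoint together with path_style=true.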
Definitions File Format
The definitions file at {backup_id}/definitions/definitions.json.zst is a compressed JSON export from the RabbitMQ Management API. When decompressed, it follows the standard RabbitMQ definitions JSON format:
{
"rabbit_version": "3.13.2",
"rabbitmq_version": "3.13.2",
"vhosts": [...],
"users": [...],
"permissions": [...],
"exchanges": [...],
"queues": [...],
"bindings": [...],
"policies": [...],
"parameters": [...],
"global_parameters": [...]
}
This is the same format used by rabbitmqadmin export and the Management API GET /api/definitions endpoint.
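
Once the definitions file has been decompressed and parsed, the counts recorded in DefinitionsBackup can be compared with the document itself. The sketch below assumes each count is simply the length of the corresponding top-level array. That is an inference from the field descriptions, and `summarize_definitions` is a hypothetical name.

```python
def summarize_definitions(definitions: dict) -> dict:
    """Count top-level objects in a parsed RabbitMQ definitions document."""
    return {
        "vhost_count": len(definitions.get("vhosts", [])),
        "queue_count": len(definitions.get("queues", [])),
        "exchange_count": len(definitions.get("exchanges", [])),
        "user_count": len(definitions.get("users", [])),
    }
```

Comparing this result with the manifest's definitions object gives a quick signal that the stored export matches what the manifest describes.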