# Performance Tuning

This guide covers the key configuration knobs that affect backup and restore throughput, memory usage, and storage efficiency.
## Quick Reference

| Parameter | Default | Affects | Tune When |
|---|---|---|---|
| `prefetch_count` | 100 | Throughput, memory | Slow backup or high memory |
| `segment_max_bytes` | 128 MB | Storage I/O, memory | Large/small messages |
| `segment_max_interval_ms` | 60,000 ms | Segment flush frequency | Low-volume queues |
| `compression` | zstd | CPU, storage size | CPU-constrained or storage-constrained |
| `compression_level` | 3 | CPU vs. ratio tradeoff | Need smaller files or faster writes |
| `max_concurrent_queues` | 4 | Throughput, connections | Many queues or limited connections |
| `requeue_strategy` | cancel | Throughput, message ordering | Specific ordering requirements |
| `produce_batch_size` | 100 | Restore throughput | Slow restores |
## Prefetch Count

`prefetch_count` controls how many messages the broker sends to the backup client before waiting for acknowledgement. Higher values increase throughput but use more memory.

```yaml
backup:
  prefetch_count: 100  # default
```

Increase to 500-1000 for high-throughput backups with small messages:

```yaml
backup:
  prefetch_count: 500
```

Decrease to 10-50 for large messages (>1 MB each) to control memory:

```yaml
backup:
  prefetch_count: 20
```
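In AMQP terms, a prefetch window is set with `basic.qos`; a minimal pika sketch of what this knob presumably maps to under the hood (broker address and queue name are illustrative, not from this tool):

```python
import pika

# Hypothetical local broker; prefetch corresponds to AMQP basic.qos.
conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()

# The broker keeps at most 500 unacknowledged messages in flight on
# this channel -- the AMQP-level analogue of prefetch_count: 500.
channel.basic_qos(prefetch_count=500)
```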
### Memory Impact

Approximate memory per queue: `prefetch_count × average_message_size`. Multiply by the number of concurrent queues for the total.

Example: 500 prefetch × 10 KB messages × 4 concurrent queues = 20 MB.
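The estimate is easy to compute for your own workload; a small helper (hypothetical, using decimal units so 10 KB = 10,000 bytes):

```python
def estimated_memory_bytes(prefetch_count: int,
                           avg_message_bytes: int,
                           concurrent_queues: int) -> int:
    """In-flight memory estimate: prefetch x message size x queues."""
    return prefetch_count * avg_message_bytes * concurrent_queues

# Reproduces the example above: 500 x 10 KB x 4 queues.
print(estimated_memory_bytes(500, 10_000, 4))  # 20000000 bytes = 20 MB
```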
## Segment Size

Segments are the unit of storage. Each segment file contains multiple messages, compressed together.

```yaml
backup:
  segment_max_bytes: 134217728     # 128 MB (default)
  segment_max_interval_ms: 60000   # 60 seconds (default)
```

A segment is flushed when either the byte limit or the time limit is reached, whichever comes first.
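The whichever-comes-first rule fits in a few lines; a sketch of the check (the function and its defaults mirror the config above, not the tool's actual code):

```python
import time

def should_flush(segment_bytes: int, opened_at: float,
                 max_bytes: int = 134_217_728,    # 128 MB
                 max_interval_ms: int = 60_000) -> bool:
    """Flush when either limit is hit, whichever comes first."""
    age_ms = (time.monotonic() - opened_at) * 1000
    return segment_bytes >= max_bytes or age_ms >= max_interval_ms
```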
Larger segments (256 MB+):

- Better compression ratios (more data to compress together)
- Fewer storage API calls
- Higher memory usage during writes
- Coarser granularity for point-in-time recovery (PITR)

```yaml
backup:
  segment_max_bytes: 268435456  # 256 MB
```
Smaller segments (16-64 MB):

- Lower memory footprint
- Finer PITR granularity
- More storage API calls
- Slightly worse compression ratio

```yaml
backup:
  segment_max_bytes: 67108864  # 64 MB
```
For low-volume queues, reduce the time interval so partially filled segments flush sooner:

```yaml
backup:
  segment_max_interval_ms: 10000  # 10 seconds
```
## Compression

Two compression algorithms are available, plus the option to disable compression entirely:

| Algorithm | Speed | Ratio | CPU Cost | Best For |
|---|---|---|---|---|
| `zstd` | Fast | Excellent | Moderate | General use (default) |
| `lz4` | Very fast | Good | Low | CPU-constrained environments |
| `none` | N/A | 1:1 | None | Already compressed payloads, debugging |
### Zstd Compression Level

Zstd supports levels 1-22. The default is 3, which offers a good balance.

```yaml
backup:
  compression: zstd
  compression_level: 3  # default
```

| Level | Speed | Ratio | Use Case |
|---|---|---|---|
| 1 | Fastest | Lower | Network-bound, fast backups |
| 3 | Fast | Good | Default, general purpose |
| 6 | Moderate | Better | Storage-constrained |
| 12+ | Slow | Best | Archival, when storage is expensive |
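If you are unsure which level suits your payloads, measuring is cheap. A sketch using the third-party `zstandard` Python package to compare ratio and speed (the sample payload and level choices are illustrative):

```python
import time
import zstandard  # third-party: pip install zstandard

payload = b"example message body, mildly repetitive " * 25_000  # ~1 MB

for level in (1, 3, 6, 12):
    compressor = zstandard.ZstdCompressor(level=level)
    start = time.perf_counter()
    compressed = compressor.compress(payload)
    elapsed_ms = (time.perf_counter() - start) * 1000
    ratio = len(payload) / len(compressed)
    print(f"level {level:>2}: ratio {ratio:6.1f}x, {elapsed_ms:7.2f} ms")
```

Running this against a sample of your real message bodies gives a far better signal than generic benchmarks, since ratio depends heavily on payload redundancy.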
CPU-constrained? Use lz4:

```yaml
backup:
  compression: lz4
```

Already compressed payloads (images, protobuf)? Disable compression:

```yaml
backup:
  compression: none
```
## Concurrent Queues

`max_concurrent_queues` controls how many queues are backed up or restored simultaneously:

```yaml
backup:
  max_concurrent_queues: 4  # default
```

Each concurrent queue uses:

- 1 AMQP channel (backup) or publisher (restore)
- Memory proportional to `prefetch_count × message_size`
- 1 segment writer

Increase when backing up many small queues:

```yaml
backup:
  max_concurrent_queues: 8
```

Decrease when the broker has limited connections or you need to control load:

```yaml
backup:
  max_concurrent_queues: 2
```
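Conceptually, `max_concurrent_queues` behaves like a bounded worker pool. A sketch of the model (not the tool's implementation) using Python's standard library, with hypothetical queue names:

```python
from concurrent.futures import ThreadPoolExecutor

def backup_queue(name: str) -> None:
    """Hypothetical per-queue backup job."""
    print(f"backing up {name}")

queues = ["orders", "invoices", "audit", "emails", "metrics"]

# max_workers plays the role of max_concurrent_queues: at most 4 queues
# are in flight at once; the rest wait for a free slot.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(backup_queue, queues))
```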
## Requeue Strategy

The requeue strategy determines how messages are returned to the queue after being read during backup.

| Strategy | Speed | Description |
|---|---|---|
| `cancel` | Fastest | Consume a batch, cancel the consumer; all unacked messages requeue |
| `nack` | Fast | `basic.nack(requeue=true)` per message |
| `reject` | Fast | `basic.reject(requeue=true)` per message |
| `get` | Slowest | `basic.get` pull mode, one message at a time |

```yaml
backup:
  requeue_strategy: cancel  # default, fastest
```

Use `nack` if you need per-message control. Use `get` only when the broker does not support consumer cancellation properly.
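To see why `cancel` is fastest, consider what it does at the protocol level: read a batch without acking anything, then cancel the consumer and close the channel so the broker requeues every unacked message in one step. A pika sketch of that flow (broker address, queue name, and batch size are illustrative):

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.basic_qos(prefetch_count=100)

batch = []
for method, properties, body in channel.consume("orders", auto_ack=False):
    batch.append(body)
    if len(batch) >= 100:
        break

channel.cancel()  # stop the consumer without acking anything
conn.close()      # on close, the broker requeues all unacked messages
```

There is no per-message requeue traffic at all, which is what makes this strategy cheap compared to `nack` or `reject`.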
## Restore Tuning

### Batch Size

`produce_batch_size` controls how many messages are published before waiting for confirms:

```yaml
restore:
  produce_batch_size: 100  # default
```

Increase to 500-1000 for faster restores:

```yaml
restore:
  produce_batch_size: 500
```
### Rate Limiting

Limit the restore rate to avoid overwhelming the target broker:

```yaml
restore:
  rate_limit_messages_per_sec: 5000  # 0 = unlimited (default)
```
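Rate limits like this are commonly implemented as a token bucket; a minimal sketch of that mechanism (an assumption about how such a limit works, not this tool's actual internals):

```python
import time

class TokenBucket:
    """Allow at most `rate` operations per second, smoothing bursts."""

    def __init__(self, rate: float) -> None:
        self.rate = rate
        self.tokens = rate
        self.last = time.monotonic()

    def acquire(self) -> None:
        while True:
            now = time.monotonic()
            # Refill tokens in proportion to elapsed time, capped at `rate`.
            self.tokens = min(self.rate,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return
            time.sleep((1.0 - self.tokens) / self.rate)

limiter = TokenBucket(rate=5000)  # mirrors rate_limit_messages_per_sec
```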
### Publisher Confirms

Disabling publisher confirms increases speed but risks message loss:

```yaml
restore:
  publisher_confirms: true  # default, recommended
```
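For context on what confirms cost, this is how they look in plain pika: with confirms enabled, each publish waits for a broker acknowledgement and failures surface as exceptions (broker address and queue name are illustrative):

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.confirm_delivery()  # every publish now waits for a broker ack

try:
    channel.basic_publish(exchange="", routing_key="orders", body=b"payload")
except pika.exceptions.NackError:
    # The broker refused the message; without confirms this loss
    # would have gone unnoticed.
    print("publish nacked; retry or fail the restore")
conn.close()
```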
## Network Optimization

### Checkpoint Sync Interval

The checkpoint database is synced to remote storage periodically. Longer intervals reduce I/O but risk replaying more messages after a crash: at 1,000 messages/sec, a 30-second sync interval means up to 30,000 messages may be re-read on recovery.

```yaml
offset_storage:
  sync_interval_secs: 30  # default

backup:
  checkpoint_interval_secs: 5  # local checkpoint interval (default)
```
### Connection to Remote Broker

For high-latency connections to the RabbitMQ broker:

- Increase prefetch to amortize round-trip time
- Use the `cancel` strategy for batch-oriented reads
- Reduce concurrent queues to avoid connection pool exhaustion
## Benchmarking

Run a backup with verbose logging to see per-queue timing:

```bash
rabbitmq-backup backup -v --config backup.yaml 2>&1 | grep -E "completed|segment"
```

Monitor throughput via Prometheus metrics:

```promql
rate(rabbitmq_backup_messages_read[1m])
rate(rabbitmq_backup_bytes_read[1m])
```
## Example: High-Throughput Configuration

```yaml
backup:
  prefetch_count: 500
  segment_max_bytes: 268435456  # 256 MB
  compression: lz4
  max_concurrent_queues: 8
  requeue_strategy: cancel
  stop_at_current_depth: true
```
## Example: Low-Memory Configuration

```yaml
backup:
  prefetch_count: 20
  segment_max_bytes: 33554432  # 32 MB
  compression: zstd
  compression_level: 3
  max_concurrent_queues: 2
  requeue_strategy: cancel
```