
Performance Tuning

This guide covers the key configuration knobs that affect backup and restore throughput, memory usage, and storage efficiency.

Quick Reference

| Parameter | Default | Affects | Tune When |
|---|---|---|---|
| prefetch_count | 100 | Throughput, memory | Slow backup or high memory |
| segment_max_bytes | 128 MB | Storage I/O, memory | Large/small messages |
| segment_max_interval_ms | 60,000 ms | Segment flush frequency | Low-volume queues |
| compression | zstd | CPU, storage size | CPU-constrained or storage-constrained |
| compression_level | 3 | CPU vs. ratio tradeoff | Need smaller files or faster writes |
| max_concurrent_queues | 4 | Throughput, connections | Many queues or limited connections |
| requeue_strategy | cancel | Throughput, message ordering | Specific ordering requirements |
| produce_batch_size | 100 | Restore throughput | Slow restores |

Prefetch Count

prefetch_count controls how many messages the broker sends to the backup client before waiting for acknowledgement. Higher values increase throughput but use more memory.

backup:
  prefetch_count: 100 # default

Increase to 500-1000 for high-throughput backups with small messages:

backup:
  prefetch_count: 500

Decrease to 10-50 for large messages (>1 MB each) to control memory:

backup:
  prefetch_count: 20
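
Under the hood, prefetch corresponds to the AMQP basic.qos setting. Here is a minimal sketch of a flow-controlled consume loop, assuming a pika-based client and a hypothetical queue named orders (illustrative only; the tool's internals may differ):

import pika

def handle(body):
    # Placeholder for writing the message into the current segment.
    pass

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()

# basic.qos bounds how many unacked messages the broker sends ahead;
# per-queue client memory is roughly prefetch_count * average message size.
channel.basic_qos(prefetch_count=500)

for method, properties, body in channel.consume("orders", auto_ack=False):
    handle(body)
    channel.basic_ack(method.delivery_tag)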

Memory Impact

Approximate memory per queue: prefetch_count × average_message_size.

Example: 500 prefetch × 10 KB messages × 4 concurrent queues ≈ 20 MB total.
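
To sanity-check a configuration before running it, the estimate is easy to compute. A simple Python helper (actual usage also includes segment write buffers):

def estimate_backup_memory(prefetch_count, avg_message_bytes, concurrent_queues):
    """Rough upper bound on in-flight message memory across all queues."""
    return prefetch_count * avg_message_bytes * concurrent_queues

# 500 prefetch x 10 KB messages x 4 queues ~= 20 MB
print(estimate_backup_memory(500, 10 * 1024, 4) / (1024 * 1024), "MB")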

Segment Size

Segments are the unit of storage. Each segment file contains multiple messages, compressed together.

backup:
  segment_max_bytes: 134217728 # 128 MB (default)
  segment_max_interval_ms: 60000 # 60 seconds (default)

A segment is flushed when either the byte limit or time limit is reached, whichever comes first.
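
The flush decision itself is simple. A sketch of the rule (illustrative only, not the tool's actual code):

import time

class SegmentWriter:
    def __init__(self, max_bytes=128 * 1024 * 1024, max_interval_s=60.0):
        self.max_bytes = max_bytes
        self.max_interval_s = max_interval_s
        self.buffered_bytes = 0
        self.opened_at = time.monotonic()

    def should_flush(self):
        # Flush on whichever limit is hit first: size or age.
        too_big = self.buffered_bytes >= self.max_bytes
        too_old = time.monotonic() - self.opened_at >= self.max_interval_s
        return too_big or too_old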

Larger segments (256 MB+):

  • Better compression ratios (more data to compress together)
  • Fewer storage API calls
  • Higher memory usage during write
  • Coarser granularity for point-in-time recovery (PITR)

backup:
  segment_max_bytes: 268435456 # 256 MB

Smaller segments (16-64 MB):

  • Lower memory footprint
  • Finer PITR granularity
  • More storage API calls
  • Slightly worse compression ratio

backup:
  segment_max_bytes: 67108864 # 64 MB

For low-volume queues, reduce the time interval to flush segments faster:

backup:
  segment_max_interval_ms: 10000 # 10 seconds

Compression

Two compression algorithms are available, plus the option to disable compression entirely:

| Algorithm | Speed | Ratio | CPU Cost | Best For |
|---|---|---|---|---|
| zstd | Fast | Excellent | Moderate | General use (default) |
| lz4 | Very fast | Good | Low | CPU-constrained environments |
| none | N/A | 1:1 | None | Already compressed payloads, debugging |
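
When in doubt, measure on a sample of your real payloads. A rough benchmark sketch, assuming the zstandard and lz4 Python packages and a hypothetical sample_messages.bin file:

import time
import lz4.frame
import zstandard

def bench(name, compress, payload, runs=50):
    start = time.perf_counter()
    for _ in range(runs):
        out = compress(payload)
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(payload) / len(out):.2f}x ratio, "
          f"{runs * len(payload) / elapsed / 1e6:.0f} MB/s")

payload = open("sample_messages.bin", "rb").read()  # hypothetical sample file
bench("zstd-3", zstandard.ZstdCompressor(level=3).compress, payload)
bench("lz4", lz4.frame.compress, payload)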

Zstd Compression Level

Zstd supports levels 1-22. The default is 3, which offers a good balance.

backup:
  compression: zstd
  compression_level: 3 # default

| Level | Speed | Ratio | Use Case |
|---|---|---|---|
| 1 | Fastest | Lower | Network-bound, fast backups |
| 3 | Fast | Good | Default, general purpose |
| 6 | Moderate | Better | Storage-constrained |
| 12+ | Slow | Best | Archival, storage is expensive |

CPU-constrained? Use lz4:

backup:
  compression: lz4

Already compressed payloads (images, protobuf)? Disable compression:

backup:
  compression: none

Concurrent Queues

max_concurrent_queues controls how many queues are backed up or restored simultaneously:

backup:
  max_concurrent_queues: 4 # default

Each concurrent queue uses:

  • 1 AMQP channel (backup) or publisher (restore)
  • Memory proportional to prefetch_count x message_size
  • 1 segment writer

Increase when backing up many small queues:

backup:
  max_concurrent_queues: 8

Decrease when the broker has limited connections or you need to control load:

backup:
  max_concurrent_queues: 2
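
Conceptually this is a bounded worker pool. A sketch of the model, with backup_queue as a hypothetical per-queue coroutine:

import asyncio

async def backup_queue(name):
    ...  # hypothetical: one channel, one segment writer, prefetch_count in flight

async def backup_all(queues, max_concurrent=4):
    sem = asyncio.Semaphore(max_concurrent)

    async def worker(name):
        async with sem:  # at most max_concurrent queues active at once
            await backup_queue(name)

    await asyncio.gather(*(worker(q) for q in queues))

asyncio.run(backup_all(["orders", "payments", "audit"]))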

Requeue Strategy

The requeue strategy determines how messages are returned to the queue after reading during backup.

| Strategy | Speed | Description |
|---|---|---|
| cancel | Fastest | Consume batch, cancel consumer; all unacked messages requeue |
| nack | Fast | basic.nack(requeue=true) per message |
| reject | Fast | basic.reject(requeue=true) per message |
| get | Slowest | basic.get pull mode, one message at a time |

backup:
  requeue_strategy: cancel # default, fastest

Use nack if you need per-message control. Use get only when the broker does not support consumer cancellation properly.
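
The speed of cancel comes from its batch-oriented shape: read a whole batch with no per-message requeue traffic, then cancel the consumer so every unacked message returns to the queue at once. A sketch using pika's blocking API (illustrative only; the queue name is hypothetical):

import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.basic_qos(prefetch_count=100)

batch = []
for method, properties, body in channel.consume("orders", auto_ack=False,
                                                inactivity_timeout=1.0):
    if method is None:  # no message within the timeout: queue is drained
        break
    batch.append(body)  # read without acking; no per-message requeue traffic

# Cancelling the consumer returns the whole unacked batch to the queue at once.
channel.cancel()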

Restore Tuning

Batch Size

produce_batch_size controls how many messages are published before waiting for confirms:

restore:
  produce_batch_size: 100 # default

Increase to 500-1000 for faster restores:

restore:
  produce_batch_size: 500
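
The gain comes from amortizing the confirm round trip across a batch. A pseudocode-style sketch, where publish and wait_for_confirms stand in for a hypothetical publishing client:

def restore(messages, publish, wait_for_confirms, batch_size=100):
    in_flight = 0
    for msg in messages:
        publish(msg)
        in_flight += 1
        if in_flight >= batch_size:
            wait_for_confirms()  # one broker round trip per batch, not per message
            in_flight = 0
    if in_flight:
        wait_for_confirms()  # flush the final partial batch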

Rate Limiting

Limit the restore rate to avoid overwhelming the target broker:

restore:
  rate_limit_messages_per_sec: 5000 # 0 = unlimited (default)
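
A limiter like this is typically a simple pacing loop. A minimal sketch of the behavior (illustrative; the tool's limiter may differ), including the 0 = unlimited convention:

import time

def paced(messages, max_per_sec):
    """Yield messages no faster than max_per_sec; 0 means unlimited."""
    if max_per_sec <= 0:
        yield from messages
        return
    interval = 1.0 / max_per_sec
    next_slot = time.monotonic()
    for msg in messages:
        delay = next_slot - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        next_slot = max(next_slot, time.monotonic()) + interval
        yield msg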

Publisher Confirms

Disabling publisher confirms increases speed but risks message loss:

restore:
  publisher_confirms: true # default, recommended

Network Optimization

Checkpoint Sync Interval

The checkpoint database is synced to remote storage periodically. Longer intervals reduce I/O but risk replaying more messages on crash:

offset_storage:
  sync_interval_secs: 30 # default

backup:
  checkpoint_interval_secs: 5 # local checkpoint interval (default)
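
The two intervals interact: local checkpoints are cheap and frequent, while the remote sync is the expensive operation whose interval bounds the replay window after a crash. A sketch with hypothetical sync callbacks:

import time

def checkpoint_loop(write_local_checkpoint, sync_to_remote,
                    local_secs=5, remote_secs=30):
    last_remote = time.monotonic()
    while True:
        time.sleep(local_secs)
        write_local_checkpoint()  # cheap: local disk
        if time.monotonic() - last_remote >= remote_secs:
            sync_to_remote()  # costly: remote storage round trip
            last_remote = time.monotonic()

# On a crash, roughly remote_secs of already-backed-up messages may be replayed.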

Connection to Remote Broker

For high-latency connections to the RabbitMQ broker:

  1. Increase prefetch to amortize round-trip time
  2. Use cancel strategy for batch-oriented reads
  3. Reduce concurrent queues to avoid connection pool exhaustion

Benchmarking

Run a backup with verbose logging to see per-queue timing:

rabbitmq-backup backup -v --config backup.yaml 2>&1 | grep -E "completed|segment"

Monitor throughput via Prometheus metrics:

rate(rabbitmq_backup_messages_read[1m])
rate(rabbitmq_backup_bytes_read[1m])

Example: High-Throughput Configuration

backup-fast.yaml:

backup:
  prefetch_count: 500
  segment_max_bytes: 268435456 # 256 MB
  compression: lz4
  max_concurrent_queues: 8
  requeue_strategy: cancel
  stop_at_current_depth: true

Example: Low-Memory Configuration

backup-lowmem.yaml:

backup:
  prefetch_count: 20
  segment_max_bytes: 33554432 # 32 MB
  compression: zstd
  compression_level: 3
  max_concurrent_queues: 2
  requeue_strategy: cancel