
Non-Destructive Backup

The core challenge of backing up RabbitMQ messages is reading them without consuming (deleting) them. In Kafka, consumers read from a log without affecting one another; in AMQP, message delivery is inherently destructive -- acknowledging a message removes it from the queue.

This page explains how rabbitmq-backup achieves non-destructive backup, the trade-offs of each strategy, and why the cancel strategy is the recommended default.

The Problem

In AMQP 0-9-1, messages flow through a delivery lifecycle:

Queue → Deliver → Consumer → Ack/Nack/Reject

Once a consumer acknowledges (basic.ack) a message, the broker permanently removes it from the queue. To back up messages without removing them, we need to read them and then ensure they return to the queue.

Why basic.get + basic.nack Is Unreliable for Classic Queues

The naive approach is:

  1. basic.get (pull one message)
  2. Record the message
  3. basic.nack(requeue=true) (put it back)

This fails for classic queues because of message reordering. When a message is nacked with requeue=true on a classic queue, the broker places the message back at the head of the queue (approximately). This means:

  • The same message can be delivered again on the next basic.get.
  • There is no reliable way to advance through the queue.
  • The backup loops over the same messages indefinitely.

Quorum queues handle requeue differently (they maintain delivery ordering), but the get + nack approach is still slow (one message at a time) and sets the redelivered flag.


The cancel Strategy (Default)

The cancel strategy is the recommended and default approach. It exploits a key behavior of AMQP 0-9-1: when a consumer is cancelled, all unacknowledged messages are returned to the queue.

How It Works

┌────────────┐         ┌──────────────┐         ┌───────────────┐
│  RabbitMQ  │         │  rabbitmq-   │         │    Storage    │
│   Queue    │         │   backup     │         │   (S3/etc)    │
└──────┬─────┘         └──────┬───────┘         └───────┬───────┘
       │                      │                         │
       │  1. basic.consume    │                         │
       │     (no_ack=false)   │                         │
       │◄─────────────────────│                         │
       │                      │                         │
       │  2. Deliver msg₁     │                         │
       │─────────────────────►│                         │
       │     Deliver msg₂     │                         │
       │─────────────────────►│  3. Record to segment   │
       │     Deliver msg₃     │────────────────────────►│
       │─────────────────────►│                         │
       │     ...              │                         │
       │     Deliver msgₙ     │                         │
       │─────────────────────►│                         │
       │                      │                         │
       │  4. basic.cancel     │                         │
       │◄─────────────────────│                         │
       │                      │                         │
       │  (broker requeues    │                         │
       │   all N unacked      │                         │
       │   messages)          │                         │
       │                      │                         │

Step by step:

  1. basic.consume with no_ack=false: Start a push-based consumer on the queue. Messages are delivered automatically by the broker. We intentionally do NOT acknowledge any of them.

  2. Receive messages: The broker pushes messages as AMQP frames (Deliver + ContentHeader + ContentBody). The MessageAssembler state machine reconstructs complete messages from potentially multi-frame deliveries.

  3. Record to segment: Each assembled message is added to the SegmentWriter buffer. When the buffer reaches the size threshold (segment_max_bytes) or time threshold (segment_max_interval_ms), the segment is finalized (compressed, checksummed) and uploaded to storage.

  4. basic.cancel: Cancel the consumer. The AMQP specification requires the broker to requeue all unacknowledged messages when a consumer is cancelled. This is the critical step -- it returns all backed-up messages to the queue without data loss.

Implementation in Code

From backup/engine.rs (the core backup loop):

// Start consumer (no_ack=false)
client.basic_consume(channel_id, &queue.name, &consumer_tag).await?;

// Read messages in a loop
loop {
    // Receive and record frames...
    if should_stop { break; }
}

// Cancel consumer -- all unacked messages requeue
client.basic_cancel(channel_id, &consumer_tag).await?;

The same requeue behavior applies if the connection is dropped before the cancel (e.g., process crash, network failure): the broker requeues all unacknowledged messages, so they are never lost even in failure scenarios.

One-Shot Mode

With stop_at_current_depth: true (the default), the backup reads exactly the current queue depth and then stops:

  1. Query the queue depth via the Management API before consuming.
  2. Count received messages.
  3. When received_count >= initial_depth, break the loop and cancel.

This prevents the backup from running indefinitely on queues that are actively receiving new messages.


Alternative Strategies

rabbitmq-backup supports four requeue strategies, configurable via backup.requeue_strategy:

nack Strategy

backup:
  requeue_strategy: nack

Protocol flow:

  1. basic.get (pull one message)
  2. Record the message
  3. basic.nack(delivery_tag, requeue=true)
  4. Repeat

How it works: Each message is individually pulled, recorded, and then negatively acknowledged with requeue=true. The broker puts the message back in the queue.

Pros:

  • Explicit per-message control
  • Works with quorum queues (which maintain delivery order on requeue)

Cons:

  • Unreliable for classic queues (message reordering on requeue)
  • Sets redelivered=true on every message
  • Increments x-delivery-count on quorum queues
  • Slower than cancel (round-trip per message)

reject Strategy

backup:
  requeue_strategy: reject

Protocol flow:

  1. basic.get (pull one message)
  2. Record the message
  3. basic.reject(delivery_tag, requeue=true)
  4. Repeat

How it works: Same as nack but uses basic.reject instead. The main difference is that reject only handles a single message (no multiple flag), while nack supports multi-message acknowledgment.

Pros/Cons: Same as nack. Use only if your broker or client library has issues with basic.nack.

get Strategy

backup:
  requeue_strategy: get

Protocol flow:

  1. basic.get (pull one message)
  2. Record the message
  3. basic.nack(requeue=true)
  4. Repeat (one at a time)

How it works: Pure pull-mode, one message at a time. The slowest strategy.

When to use: Only for debugging or very small queues where throughput does not matter.


Trade-Offs

The redelivered Flag

All AMQP-based strategies set the redelivered flag on backed-up messages; only the Stream Protocol avoids it:

Strategy          redelivered after backup?
cancel            Yes -- messages are redelivered after consumer cancellation
nack              Yes -- nack(requeue=true) sets redelivered
reject            Yes -- reject(requeue=true) sets redelivered
get               Yes -- same as nack
Stream Protocol   No -- streams are append-only logs

Impact: If your application logic depends on the redelivered flag to detect duplicate delivery, a backup operation will cause all messages to appear as redelivered. This is generally acceptable because:

  1. Well-designed consumers should be idempotent (handle redelivery gracefully).
  2. The redelivered flag is already unreliable in production (network issues, consumer crashes, and broker restarts all cause redelivery).
  3. Stream queues do not have this limitation (reads are non-destructive by design).

Quorum Queue x-delivery-count

Quorum queues track the number of times a message has been delivered using the x-delivery-count header. Each backup operation increments this counter.

Delivery limit interaction: If a quorum queue has a delivery limit configured (e.g., x-delivery-limit: 5), backup operations count toward that limit. After enough backup runs, messages may reach the delivery limit and be routed to a dead-letter exchange.

Mitigation:

  • Set the delivery limit high enough to account for backup frequency.
  • Monitor x-delivery-count via the Management API.
  • Use stream queues for data that needs frequent backups (streams have no delivery count).

Message Ordering

Strategy   Preserves message order?
cancel     Yes -- messages are consumed in queue order and requeued in bulk
nack       No (classic) / Yes (quorum) -- classic queues requeue at approximate head position
reject     No (classic) / Yes (quorum)
get        No (classic) / Yes (quorum)

The cancel strategy is the most ordering-friendly for classic queues because it returns all messages in a single bulk operation rather than individual requeues.


Performance Comparison

Strategy   Throughput   Round-trips per message   Recommended
cancel     High         0 (push-based)            Yes (default)
nack       Medium       2 (get + nack)            Quorum queues only
reject     Medium       2 (get + reject)          Rarely
get        Low          2 (get + nack)            Debugging only
Stream     High         0 (push-based)            Stream queues

The cancel strategy achieves the highest throughput because:

  1. Push-based delivery: The broker pushes messages continuously without per-message round-trips.
  2. No per-message acknowledgment: No ack/nack overhead during the read phase.
  3. Bulk requeue: All messages are requeued in a single operation (consumer cancellation).
  4. Prefetch control: QoS is set to 0 (unlimited) with the cancel strategy, allowing the broker to push messages as fast as the network allows.

Concurrent Consumer Impact

When rabbitmq-backup is consuming from a queue, other consumers on the same queue will see reduced throughput because messages are being delivered to the backup consumer.

Mitigation strategies:

  1. Schedule backups during low-traffic periods to minimize impact on production consumers.
  2. Use stream queues for workloads that need concurrent consumers and backups. Stream consumers are completely independent.
  3. Use the cancel strategy for the shortest possible backup window.
  4. Set stop_at_current_depth: true to back up only the current queue depth and release the consumer quickly.

Stream Protocol: The Non-Destructive Alternative

For x-queue-type: stream queues, rabbitmq-backup uses the RabbitMQ Stream Protocol instead of AMQP. Streams are append-only logs where reads are inherently non-destructive:

  • No redelivered flag modification
  • No x-delivery-count increment
  • No impact on other consumers
  • Offset-based reads (start from any position)

See the Stream Protocol page for full details.

The trade-off is that stream queues have different characteristics: no per-message TTL, no priority, FIFO-only ordering, and a different programming model. They are best suited for event streaming and logging workloads rather than task queues.