
NATS Consumer Sequence Ahead of Stream Sequence: Causes and Fixes

Severity: Critical
Category: Consistency
Applies to: Consumer
Check ID: CONSUMER_005
Detection threshold: consumer last delivered sequence exceeds the stream's last sequence

A consumer whose delivered sequence is ahead of the stream’s last sequence is waiting for messages that don’t exist yet — and may never exist at the expected sequence numbers. This paradoxical state occurs when the stream’s sequence space contracts or resets while the consumer’s cursor remains at its previous position. The consumer has “seen the future” that the stream no longer has, and it won’t deliver any new messages until the stream’s last sequence catches up to the consumer’s position. In practice, this means the consumer is silently stuck.

Why this matters

Like CONSUMER_004 (delivered below first sequence), this failure mode is silent and persistent. The consumer exists, its subscription is active, and basic health checks pass. But no messages are delivered because the consumer is waiting for a sequence number the stream hasn’t reached.

The danger is proportional to how far ahead the consumer is. If the consumer’s delivered sequence is 500,000 and the stream’s last sequence is 100,000, the consumer won’t process anything until 400,000 new messages are published — assuming the stream even uses sequential numbering after the event that caused the mismatch. In many cases, the stream was rebuilt or migrated, and its sequence numbering started fresh. The consumer will wait forever.

This state also creates a misleading picture for monitoring. The consumer reports zero num_pending (since from its perspective, there are no new messages), and zero num_ack_pending. An operator glancing at dashboards sees a consumer with no backlog and assumes everything is healthy. Meanwhile, the stream is accumulating messages that no consumer is processing. The gap between “looks healthy” and “is healthy” makes this one of the hardest consumer failures to detect without explicit sequence comparison.

Common causes

  • Stream migration or movement across clusters. When a stream is moved to a different cluster — via mirror, source, or backup/restore — the new stream typically starts with a fresh sequence space. Consumers that existed on the original stream retain their old delivered positions, which are now far ahead of the new stream’s sequence range.

  • Raft state reset or leader election with data loss. If a Raft group loses quorum and is force-recovered, or if a leader election occurs while replicas were significantly behind, the stream’s last sequence may roll back. The consumer’s delivered sequence, persisted in its own Raft group, may not roll back in sync.

  • Stream deletion and recreation with the same name. An operator deletes a stream and recreates it with the same name but fresh data. If durable consumers weren’t deleted before recreation (or were restored from a backup that predates the stream deletion), their positions reference the old stream’s sequence space.

  • Backup restore with sequence mismatch. Restoring a stream from an older backup resets its sequence state to the backup point. Consumers that continued processing after the backup was taken now have delivered positions ahead of the restored stream’s last sequence.

  • JetStream domain change or migration. Moving between JetStream domains can result in sequence renumbering. Consumers that aren’t recreated alongside the stream end up with stale positions in the new domain’s sequence space.

How to diagnose

Compare consumer delivered sequence to stream last sequence

Terminal window
# Get the stream's current state
nats stream info ORDERS --json | jq '{
  first_seq: .state.first_seq,
  last_seq: .state.last_seq,
  messages: .state.messages
}'

# Get the consumer's delivered position
nats consumer info ORDERS my-consumer --json | jq '{
  stream_seq_delivered: .delivered.stream_seq,
  ack_floor_stream_seq: .ack_floor.stream_seq,
  num_pending: .num_pending
}'

If stream_seq_delivered is greater than the stream’s last_seq, the consumer is ahead of the stream.

Scan all consumers on a stream

Terminal window
STREAM="ORDERS"
LAST_SEQ=$(nats stream info "$STREAM" --json | jq '.state.last_seq')

for consumer in $(nats consumer list "$STREAM" --names); do
  DELIVERED=$(nats consumer info "$STREAM" "$consumer" --json | jq '.delivered.stream_seq')
  if [ "$DELIVERED" -gt "$LAST_SEQ" ]; then
    echo "AHEAD: consumer=$consumer delivered=$DELIVERED last_seq=$LAST_SEQ gap=$((DELIVERED - LAST_SEQ))"
  fi
done

Programmatic detection

Go

import (
	"fmt"

	"github.com/nats-io/nats.go"
)

func checkSequenceAhead(js nats.JetStreamContext, streamName string) error {
	stream, err := js.StreamInfo(streamName)
	if err != nil {
		return err
	}
	lastSeq := stream.State.LastSeq

	for consumer := range js.ConsumerNames(streamName) {
		info, err := js.ConsumerInfo(streamName, consumer)
		if err != nil {
			continue
		}
		if info.Delivered.Stream > lastSeq {
			fmt.Printf("AHEAD: stream=%s consumer=%s delivered=%d last_seq=%d gap=%d\n",
				streamName, consumer, info.Delivered.Stream, lastSeq,
				info.Delivered.Stream-lastSeq)
		}
	}
	return nil
}
Python

import asyncio

import nats

async def check_sequence_ahead(stream_name: str):
    nc = await nats.connect()
    js = nc.jetstream()

    stream = await js.stream_info(stream_name)
    last_seq = stream.state.last_seq

    # consumers_info returns a ConsumerInfo for every consumer on the stream
    for info in await js.consumers_info(stream_name):
        if info.delivered.stream_seq > last_seq:
            gap = info.delivered.stream_seq - last_seq
            print(f"AHEAD: stream={stream_name} consumer={info.name} "
                  f"delivered={info.delivered.stream_seq} last_seq={last_seq} gap={gap}")

    await nc.close()

asyncio.run(check_sequence_ahead("ORDERS"))

Determine the root cause

Check whether the stream was recently recreated, restored, or migrated:

Terminal window
# Check stream creation time vs consumer creation time
nats stream info ORDERS --json | jq '.created'
nats consumer info ORDERS my-consumer --json | jq '.created'

If the consumer was created before the stream (which is impossible in normal operation), the stream was deleted and recreated while the consumer’s metadata persisted.
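The same comparison can be scripted. Below is a minimal sketch with the nats.go client, reusing the ORDERS stream and my-consumer names from the examples above; treat it as illustrative rather than part of the check itself.

package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	stream, err := js.StreamInfo("ORDERS")
	if err != nil {
		log.Fatal(err)
	}
	consumer, err := js.ConsumerInfo("ORDERS", "my-consumer")
	if err != nil {
		log.Fatal(err)
	}

	// A consumer that predates its stream implies the stream was deleted
	// and recreated while the consumer's metadata survived.
	if consumer.Created.Before(stream.Created) {
		fmt.Printf("consumer created %s, stream created %s: stream was likely recreated\n",
			consumer.Created, stream.Created)
	}
}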

How to fix it

Delete and recreate the consumer

This is the most reliable fix. The consumer’s position is irrecoverably ahead of the stream, so resetting it requires creating a fresh consumer:

Terminal window
# Record the consumer's configuration
nats consumer info ORDERS my-consumer --json | jq '.config' > /tmp/consumer-config.json

# Delete the stale consumer
nats consumer rm ORDERS my-consumer -f

# Recreate with appropriate deliver policy
nats consumer add ORDERS my-consumer \
  --deliver-policy all \
  --ack-policy explicit \
  --filter "orders.>" \
  --max-deliver 5 \
  --replay instant

Choose the deliver policy based on your needs:

  • all — process every message currently in the stream
  • new — skip existing messages, only process future publishes
  • by-start-sequence --start-seq N — start from a specific sequence
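If your consumers are managed from code rather than the CLI, the delete-and-recreate step might look like the sketch below, written against the nats.go client. The durable name, filter subject, and limits are the placeholder values from the CLI example; DeliverPolicy is where the choice above is encoded.

package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Remove the stale consumer whose cursor is ahead of the stream.
	if err := js.DeleteConsumer("ORDERS", "my-consumer"); err != nil {
		log.Printf("delete consumer: %v", err)
	}

	// Recreate it. DeliverAllPolicy replays everything currently in the
	// stream; DeliverNewPolicy would skip existing messages instead.
	if _, err := js.AddConsumer("ORDERS", &nats.ConsumerConfig{
		Durable:       "my-consumer",
		FilterSubject: "orders.>",
		AckPolicy:     nats.AckExplicitPolicy,
		DeliverPolicy: nats.DeliverAllPolicy,
		MaxDeliver:    5,
		ReplayPolicy:  nats.ReplayInstantPolicy,
	}); err != nil {
		log.Fatal(err)
	}
}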

Reset via deliver policy edit

If your server version supports editing the deliver policy on existing consumers:

Terminal window
nats consumer edit ORDERS my-consumer --deliver-policy all

This resets the consumer’s position to the stream’s first sequence without needing to delete and recreate.

Fix the root cause

For stream migration: Always recreate consumers after migrating a stream to a new cluster. Consumer positions are tied to the source stream’s sequence space and are not transferable.

For backup/restore: Include consumer state in your backup/restore procedures. If you restore a stream, also restore or recreate its consumers to ensure sequence alignment.

For stream deletion/recreation: Delete all consumers before deleting the stream. If consumers are managed declaratively (e.g., via Terraform or a CI/CD pipeline), ensure the pipeline recreates consumers when the stream is recreated.
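If these operations are scripted, one way to keep consumers aligned is to snapshot every consumer's configuration before the stream operation and re-add them once the stream exists again, mirroring what the CLI example above does with /tmp/consumer-config.json. The sketch below uses the nats.go client; the function names are illustrative.

import (
	"log"

	"github.com/nats-io/nats.go"
)

// snapshotConsumers captures the configuration of every consumer on a stream
// so they can be recreated after the stream is rebuilt.
func snapshotConsumers(js nats.JetStreamContext, stream string) []*nats.ConsumerConfig {
	var configs []*nats.ConsumerConfig
	for name := range js.ConsumerNames(stream) {
		info, err := js.ConsumerInfo(stream, name)
		if err != nil {
			continue
		}
		cfg := info.Config // copy the consumer's configuration
		configs = append(configs, &cfg)
	}
	return configs
}

// restoreConsumers re-adds the saved consumers against the (new) stream,
// giving each one a fresh cursor in the new sequence space.
func restoreConsumers(js nats.JetStreamContext, stream string, configs []*nats.ConsumerConfig) {
	for _, cfg := range configs {
		if _, err := js.AddConsumer(stream, cfg); err != nil {
			log.Printf("recreate consumer %q: %v", cfg.Durable, err)
		}
	}
}

Call snapshotConsumers before the destructive step and restoreConsumers after the stream is back; the recreated consumers start from their configured deliver policy rather than a stale position.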

Prevent future occurrences

Pair stream operations with consumer validation. After any stream-level operation (purge, restore, migrate, recreate), run a validation pass that compares all consumer positions against the stream’s sequence range.
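Such a validation pass only needs the stream's first and last sequence plus each consumer's delivered position. A sketch with the nats.go client, in the same style as the detection example above, could flag both directions at once; the exact thresholds here are simplified relative to the hosted Checks.

import (
	"fmt"

	"github.com/nats-io/nats.go"
)

// validateConsumerPositions flags consumers whose delivered position falls
// outside the stream's current [first_seq, last_seq] range: ahead of the
// stream (CONSUMER_005) or behind its first sequence (CONSUMER_004).
func validateConsumerPositions(js nats.JetStreamContext, stream string) error {
	info, err := js.StreamInfo(stream)
	if err != nil {
		return err
	}
	first, last := info.State.FirstSeq, info.State.LastSeq

	for name := range js.ConsumerNames(stream) {
		ci, err := js.ConsumerInfo(stream, name)
		if err != nil {
			continue
		}
		delivered := ci.Delivered.Stream
		switch {
		case delivered > last:
			fmt.Printf("AHEAD (CONSUMER_005): consumer=%s delivered=%d last_seq=%d\n",
				name, delivered, last)
		case delivered+1 < first: // delivered+1 avoids uint64 underflow on an empty stream
			fmt.Printf("BEHIND (CONSUMER_004): consumer=%s delivered=%d first_seq=%d\n",
				name, delivered, first)
		}
	}
	return nil
}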

Monitor with Insights. Synadia Insights evaluates CONSUMER_005 continuously, alerting when any consumer’s delivered sequence exceeds its stream’s last sequence. This catches the problem immediately after the triggering event, before it has time to cause downstream impact.

Frequently asked questions

Will the consumer self-heal if enough new messages are published?

Theoretically, yes — if the stream eventually reaches the consumer’s delivered sequence, the consumer would resume. But this requires publishing enough messages to close the gap, which could be millions of messages. In practice, if the gap was caused by a stream recreation or restore, the sequence numbering may never reach the consumer’s position. Don’t wait for self-healing; reset the consumer.

Does this affect both push and pull consumers?

Yes. Both consumer types maintain a delivered sequence cursor. If that cursor is ahead of the stream, neither type will deliver messages. Pull consumers may be slightly more obvious — pull requests will return immediately with no messages — but push consumers will silently do nothing.

Can this happen during normal operation without any manual intervention?

It’s extremely rare during normal operation. This state almost always results from an administrative action: stream purge, restore, migration, or deletion/recreation. The one exception is a Raft group recovery after data loss, which can roll back stream sequences without correspondingly rolling back consumer sequences.

What happens to messages published while the consumer is stuck?

Messages accumulate in the stream normally. They’re just not being delivered to the stuck consumer. Once you reset the consumer (with deliver policy all), it will process all accumulated messages. If the stream has retention limits (max_msgs, max_bytes, max_age), messages may be evicted before the consumer is fixed — resulting in permanent message loss for that consumer.

How is this different from CONSUMER_004?

CONSUMER_004 fires when the consumer is behind the stream’s first sequence (pointing at deleted messages). CONSUMER_005 fires when the consumer is ahead of the stream’s last sequence (pointing at messages that don’t exist yet). Both result in a stalled consumer, but the root causes differ: CONSUMER_004 is typically caused by purges and retention eviction, while CONSUMER_005 is typically caused by stream migration, restoration, or sequence resets.

Proactive monitoring for NATS consumer sequence ahead of stream sequence with Synadia Insights

With 100+ always-on audit Checks from the NATS experts, Insights helps you find and fix problems before they become costly incidents.
No alert rules to write. No dashboards to maintain.

Start a 14-day Insights trial