
NATS Consumer Delivered Below Stream First Sequence: Causes and Fixes

Severity: Critical
Category: Consistency
Applies to: Consumer
Check ID: CONSUMER_004
Detection threshold: consumer last delivered sequence is below the stream's first sequence

A consumer whose last delivered sequence is below the stream’s first sequence is pointing at messages that no longer exist. This typically happens after a stream purge or truncation — the stream’s floor moves forward, but the consumer’s internal position doesn’t advance with it. The consumer appears healthy in monitoring but silently stops delivering new messages. No errors are raised. No alerts fire. Messages accumulate in the stream while the consumer sits idle, waiting for a sequence number that has been permanently deleted.

Why this matters

This is one of the most dangerous consumer failure modes in JetStream because it is completely silent. The consumer reports as running. Its subscription is active. Health checks that only verify “is the consumer present” pass. But no messages flow.

In a typical production scenario, an operator purges a stream to clear bad data or reclaim storage. The purge succeeds, and new messages start flowing into the stream at sequence 50,001. But an existing consumer’s delivered position is still at sequence 12,345 — a sequence that was just deleted. The consumer’s next delivery attempt targets a sequence that doesn’t exist, and depending on the server version and consumer configuration, the consumer may stall indefinitely.

The impact compounds over time. Every minute the consumer is stalled, the backlog of unprocessed messages grows. If the consumer drives a critical pipeline — order processing, event aggregation, notification dispatch — the downstream effects cascade. By the time someone notices that messages aren’t being processed, hours of data may have accumulated. In the worst case, if the stream has a max_msgs or max_bytes limit, old unprocessed messages may be evicted before the consumer is fixed, resulting in permanent data loss.

This check exists specifically because the failure is invisible to standard health monitoring. Without explicitly comparing the consumer’s delivered sequence against the stream’s first sequence, there is no signal that anything is wrong.
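The detection itself reduces to a single comparison between two server-reported numbers. A minimal Python sketch (the function name is hypothetical) over the JSON shapes that nats stream info --json and nats consumer info --json return:

```python
def delivered_below_first(stream_info: dict, consumer_info: dict) -> bool:
    """Return True when the consumer's delivered cursor points below the
    stream's first sequence, i.e. at messages that no longer exist.

    Expects the parsed JSON produced by `nats stream info --json` and
    `nats consumer info --json`.
    """
    first_seq = stream_info["state"]["first_seq"]
    delivered = consumer_info["delivered"]["stream_seq"]
    return delivered < first_seq

# Example: stream purged up to 50,000; consumer cursor still at 12,345.
stream = {"state": {"first_seq": 50001, "last_seq": 50100, "messages": 100}}
consumer = {"delivered": {"stream_seq": 12345}}
print(delivered_below_first(stream, consumer))  # True
```

The same comparison is what the CLI-based diagnostics below perform by hand.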

Common causes

  • Stream purge without consumer reset. An operator runs nats stream purge to clear a stream. All messages are deleted and the stream’s first sequence jumps forward. Existing consumers retain their old delivered position, which now points to a deleted sequence. This is the most common cause by far.

  • Stream truncation via max_msgs, max_bytes, or max_age. The stream’s retention policy evicts old messages, advancing the first sequence. If a consumer was far enough behind — perhaps due to a processing outage — its delivered position now references evicted messages. The consumer was slow, and the stream moved on without it.

  • Partial purge with subject filter. A filtered purge (nats stream purge --subject "orders.>") removes a subset of messages. If a consumer was subscribed to that subject filter and its delivered position was within the purged range, its position becomes invalid. Other consumers on different subjects are unaffected.

  • Stream restore from backup with sequence gap. Restoring a stream from a backup or snapshot can reset the stream’s sequence range. If consumers were not also restored or recreated, their positions may reference sequences outside the restored range.

  • Cross-cluster stream migration. Moving a stream between clusters using mirror or source configurations can result in sequence renumbering. Consumers on the original stream that aren’t migrated alongside it may end up with stale positions.

How to diagnose

Compare consumer delivered sequence to stream first sequence

# Check the stream's sequence range
nats stream info ORDERS --json | jq '{
  first_seq: .state.first_seq,
  last_seq: .state.last_seq,
  messages: .state.messages
}'

# Check the consumer's delivered position
nats consumer info ORDERS my-consumer --json | jq '{
  stream_seq_delivered: .delivered.stream_seq,
  ack_floor_stream_seq: .ack_floor.stream_seq,
  num_pending: .num_pending,
  num_ack_pending: .num_ack_pending
}'

If stream_seq_delivered is less than the stream’s first_seq, the consumer is in this broken state.

Check across all consumers on a stream

STREAM="ORDERS"
FIRST_SEQ=$(nats stream info "$STREAM" --json | jq '.state.first_seq')

for consumer in $(nats consumer list "$STREAM" --names); do
  DELIVERED=$(nats consumer info "$STREAM" "$consumer" --json | jq '.delivered.stream_seq')
  if [ "$DELIVERED" -lt "$FIRST_SEQ" ]; then
    echo "BROKEN: consumer=$consumer delivered=$DELIVERED first_seq=$FIRST_SEQ"
  fi
done

Programmatic detection

Go:

import (
	"fmt"

	"github.com/nats-io/nats.go"
)

func checkDeliveredBelowFirst(js nats.JetStreamContext, streamName string) error {
	stream, err := js.StreamInfo(streamName)
	if err != nil {
		return err
	}
	firstSeq := stream.State.FirstSeq

	for consumer := range js.ConsumerNames(streamName) {
		info, err := js.ConsumerInfo(streamName, consumer)
		if err != nil {
			continue
		}
		if info.Delivered.Stream < firstSeq {
			fmt.Printf("BROKEN: stream=%s consumer=%s delivered=%d first_seq=%d gap=%d\n",
				streamName, consumer, info.Delivered.Stream, firstSeq,
				firstSeq-info.Delivered.Stream)
		}
	}
	return nil
}
Python:

import asyncio

import nats

async def check_delivered_below_first(stream_name: str):
    nc = await nats.connect()
    js = nc.jetstream()

    stream = await js.stream_info(stream_name)
    first_seq = stream.state.first_seq

    # nats-py exposes consumers_info rather than a consumer-names iterator
    for info in await js.consumers_info(stream_name):
        if info.delivered.stream_seq < first_seq:
            print(f"BROKEN: stream={stream_name} consumer={info.name} "
                  f"delivered={info.delivered.stream_seq} first_seq={first_seq} "
                  f"gap={first_seq - info.delivered.stream_seq}")

    await nc.close()

asyncio.run(check_delivered_below_first("ORDERS"))

Verify the consumer is actually stalled

Check whether the consumer’s num_pending is static — if it’s not changing over time, the consumer isn’t processing:

# Run twice, 10 seconds apart
nats consumer info ORDERS my-consumer --json | jq '.num_pending'
sleep 10
nats consumer info ORDERS my-consumer --json | jq '.num_pending'

If num_pending is identical in both checks and the stream is receiving new messages, the consumer is stalled.
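This two-sample check can be expressed as a small predicate. A sketch in Python over two parsed snapshots of nats consumer info --json (the function name is hypothetical); it also treats an unmoved delivered cursor as part of the stall signal, which strengthens the num_pending heuristic above:

```python
def is_stalled(before: dict, after: dict) -> bool:
    """Two snapshots of `nats consumer info --json`, taken ~10s apart.
    If neither the delivered cursor nor the pending count moved, the
    consumer made no progress in the interval."""
    return (before["delivered"]["stream_seq"] == after["delivered"]["stream_seq"]
            and before["num_pending"] == after["num_pending"])

# Stalled: nothing changed between samples.
s1 = {"delivered": {"stream_seq": 12345}, "num_pending": 500}
s2 = {"delivered": {"stream_seq": 12345}, "num_pending": 500}
print(is_stalled(s1, s2))  # True

# Healthy: the cursor advanced and pending drained.
s3 = {"delivered": {"stream_seq": 12400}, "num_pending": 445}
print(is_stalled(s1, s3))  # False
```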

How to fix it

A consumer’s delivered cursor cannot be advanced via nats consumer edit: the --deliver flag is only accepted on consumer add and consumer copy, and the cursor itself is not editable. Recovery requires deleting the consumer and recreating it with a deliver policy that points to a valid sequence.

Option 1: Delete and recreate the consumer

# Capture the original consumer configuration for reference
nats consumer info ORDERS my-consumer --json > /tmp/consumer-config.json
# Delete the consumer
nats consumer rm ORDERS my-consumer -f
# Recreate. Choose the deliver policy that fits the recovery you need:
# --deliver all → process from the stream's current first sequence
# --deliver new → only future messages
# --deliver <seq> → start from a specific stream sequence
nats consumer add ORDERS my-consumer \
  --deliver all \
  --ack explicit \
  --filter "orders.>" \
  --max-deliver 5

If you need to preserve the original configuration verbatim, edit the JSON captured above to reset deliver_policy/opt_start_seq and feed it back through nats consumer add --config <file>. Otherwise, set the flags above to match your original consumer.
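That JSON edit can be scripted. A sketch in Python, assuming the CLI's info output nests the consumer configuration under a config key and using the JetStream schema values by_start_sequence / opt_start_seq (the function name is hypothetical):

```python
import json

def reset_deliver_policy(info_json: str, first_seq: int) -> str:
    """Take the JSON captured from `nats consumer info --json`, extract
    the nested consumer config, and point it at the stream's current
    first sequence so it can be fed back via `nats consumer add --config`."""
    info = json.loads(info_json)
    cfg = info["config"]
    cfg["deliver_policy"] = "by_start_sequence"
    cfg["opt_start_seq"] = first_seq
    return json.dumps(cfg, indent=2)

captured = '{"config": {"durable_name": "my-consumer", "deliver_policy": "all", "ack_policy": "explicit"}}'
print(reset_deliver_policy(captured, 50001))
```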

Option 2: Automate recovery for future purges

Build purge-then-reset into your operational procedures. Because consumer edit cannot reset the cursor, the script must delete each consumer and re-create it from the post-purge stream state — typically by triggering your application’s consumer-bootstrap path:

#!/bin/bash
STREAM="$1"
# Purge the stream
nats stream purge "$STREAM" -f
# Remove all consumers on this stream so the application recreates them
# against the new first_seq. Durable consumers must be recreated by the
# subscribing service; ephemeral consumers re-attach automatically.
for consumer in $(nats consumer list "$STREAM" --names); do
  echo "Removing consumer (will be recreated by client): $consumer"
  nats consumer rm "$STREAM" "$consumer" -f
done

Prevent future occurrences

Always reset consumers after stream purges. Make this a standard operating procedure. Document it in runbooks. Better yet, script it so purge-and-reset is a single atomic operation.

Monitor the gap continuously. Synadia Insights evaluates CONSUMER_004 across all consumers in your deployment, alerting immediately when any consumer’s delivered position falls below the stream’s first sequence — catching the problem in seconds rather than hours.

Set appropriate retention limits. If your streams use max_age or max_msgs retention, ensure consumers can keep up with the eviction rate. A consumer that periodically falls behind risks having its delivered position overtaken by the stream’s advancing first sequence.
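One way to watch that margin is to track how far the consumer's acknowledged floor sits above the stream's first sequence, using the same JSON fields as the diagnostics above. A hypothetical Python sketch:

```python
def lag_headroom(stream_info: dict, consumer_info: dict) -> int:
    """Distance between the consumer's acknowledged floor and the
    stream's first sequence. Values approaching zero mean retention is
    about to overtake the consumer; negative means it already has."""
    return (consumer_info["ack_floor"]["stream_seq"]
            - stream_info["state"]["first_seq"])

# Healthy consumer with 499 sequences of headroom before eviction catches up.
print(lag_headroom({"state": {"first_seq": 50001}},
                   {"ack_floor": {"stream_seq": 50500}}))  # 499
```

Alerting when this value drops below a threshold catches the slow-consumer variant of this check before the position is actually invalidated.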

Frequently asked questions

Will the consumer eventually self-heal?

No. The consumer’s delivered position is a persistent cursor. It will not automatically advance past the gap to the stream’s current first sequence. Without operator intervention, the consumer will remain stalled indefinitely.

Does this affect push consumers and pull consumers equally?

Yes. Both push and pull consumers maintain a delivered sequence position. If that position references a deleted sequence, both types stall. The difference is visibility — pull consumers may surface the problem faster because client-side pull requests will return no messages, which may trigger application-level alerts. Push consumers fail silently because the server simply has nothing to deliver.

Can I prevent this by using DeliverPolicy: Last or New?

The deliver policy only applies at consumer creation time. Once a consumer exists, its position is tracked by the delivered sequence cursor, not the original deliver policy. Purging the stream after consumer creation still leaves the cursor pointing at the old position regardless of the initial deliver policy.

Is there a risk of duplicate processing when resetting the consumer?

Yes. If you reset to deliver-policy all, the consumer will redeliver messages that were already processed before the purge (if any survived the purge). Use idempotent message processing or set deliver-policy by-start-sequence to the stream’s current first_seq to avoid reprocessing. If only new messages matter, use deliver-policy new.
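Idempotent processing can be as simple as deduplicating on the message's stream sequence. A minimal, hypothetical Python sketch (in production you would more likely key on a message ID and persist the seen-set rather than hold it in memory):

```python
processed: set[int] = set()

def handle(stream_seq: int, payload: bytes) -> bool:
    """Process each stream sequence at most once. Returns False when the
    message is a duplicate, e.g. redelivered after a consumer reset."""
    if stream_seq in processed:
        return False
    processed.add(stream_seq)
    # ... actual business processing of payload would happen here ...
    return True

print(handle(1, b"order"))  # True  (first delivery)
print(handle(1, b"order"))  # False (duplicate after reset)
```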

Proactive monitoring for NATS consumer delivered below stream first sequence with Synadia Insights

With 100+ always-on audit Checks from the NATS experts, Insights helps you find and fix problems before they become costly incidents.
No alert rules to write. No dashboards to maintain.

Start a 14-day Insights trial