In a replicated JetStream stream, all current replicas should have the same message count. When this check fires, every replica reports that it is caught up in Raft consensus — none are lagging — yet they hold different numbers of messages. This silent divergence means at least one replica’s data does not match the others, and the cluster doesn’t know it.
Raft consensus guarantees that committed operations are replicated to a majority. Under normal conditions, all current replicas apply the same sequence of operations and arrive at the same state. When replicas report “current” but have different message counts, the consensus layer believes everything is synchronized while the storage layer tells a different story.
This is dangerous because the divergence is invisible to normal operations. Publishers receive acknowledgments. Consumers read messages. Monitoring shows all replicas current with zero lag. The problem only becomes visible when you compare the actual message counts across replicas — or when a leader election promotes a replica with fewer messages, causing consumers to see gaps or duplicate deliveries.
For streams using interest-based retention, the divergence is particularly insidious. Interest-based retention deletes messages when all consumers have acknowledged them. If consumer acknowledgment propagation fails on one replica, that replica retains messages the others have deleted (or vice versa). The replicas drift further apart over time rather than converging.
For streams using limits-based retention with max_msgs or max_bytes, each replica independently enforces limits based on its local state. A replica with a higher message count may purge messages that a replica with a lower count still needs, creating permanent data loss if leadership transfers.
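For reference, this is how the two retention modes are configured with nats.go; a minimal sketch, with the stream name, subjects, and limits chosen purely for illustration. With LimitsPolicy, each replica enforces MaxMsgs and MaxBytes against its own local store, which is why already-diverged replicas can end up purging different messages.

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical R3 stream with limits-based retention: every replica
	// applies MaxMsgs/MaxBytes to its own local state.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:      "ORDERS",             // hypothetical name
		Subjects:  []string{"orders.>"}, // hypothetical subjects
		Replicas:  3,
		Retention: nats.LimitsPolicy,
		MaxMsgs:   1_000_000,
		MaxBytes:  4 * 1024 * 1024 * 1024, // 4 GiB
	}); err != nil {
		log.Fatal(err)
	}

	// Interest-based retention instead deletes messages once all consumers
	// have acknowledged them (Retention: nats.InterestPolicy).
}
```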
The worst-case scenario: a leader failure promotes the replica with the fewest messages. Every message present on the old leader but absent on the new leader is permanently lost. Consumers that already processed those messages see them vanish from the stream. Sequence numbers may recycle, breaking deduplication guarantees.
Filestore corruption on one or more replicas. Disk errors, unclean shutdowns, or storage subsystem failures can corrupt individual message blocks on a specific replica without affecting Raft log consistency. The Raft layer sees all operations as applied, but the filestore failed to persist some messages.
Raft state reset or snapshot divergence. If a replica’s Raft state is reset (e.g., due to a corrupt WAL), it rebuilds from a snapshot. If the snapshot itself was taken from an already-diverged state, the replica starts from an incorrect baseline and the divergence persists.
Interest-based retention consumer ack propagation failure. In interest-based retention streams, message deletion is driven by consumer acknowledgments. If an ack is applied on the leader but fails to propagate to a follower’s consumer state, the follower retains messages the leader has deleted. Over time, this produces a measurable count divergence.
Partial restore from backup. Restoring a stream backup to one replica without coordinating with the cluster can introduce divergence. The restored replica has a message set that doesn’t match what the other replicas have through normal replication.
Server version mismatch during rolling upgrade. Different server versions may handle edge cases in message deletion, compaction, or retention differently. During a rolling upgrade window, replicas running different versions can process the same operation with different outcomes.
Start by comparing per-replica state:

```
nats stream info STREAM_NAME --all
```

Look at the state section for each replica. Under normal operation, all replicas should show identical values for messages, bytes, first_seq, and last_seq. Example output showing divergence:
```
Cluster Information:
  Leader: server-1
  Replica: server-2, current, seen 0.12s ago (messages: 482,917)
  Replica: server-3, current, seen 0.09s ago (messages: 481,203)
```

If the leader has 483,000 messages and the replicas show 482,917 and 481,203, you have divergence despite all replicas being “current.”
```
# Get detailed per-replica state
nats stream info STREAM_NAME --json | jq '.cluster.replicas[] | {name, current, lag, active, msgs: .state.messages}'
```

The replica with the lowest message count is most likely the one that lost data. However, if using interest-based retention, the replica with the highest count may be the anomaly — it failed to delete messages that others correctly purged.
```
# On each server hosting a replica
grep -i "filestore\|corrupt\|recover" /var/log/nats-server.log | tail -50
```

Look for patterns like:
```
[ERR] JetStream filestore error: short block
[WRN] Stream STREAM_NAME recovered with X fewer messages
```

Then compare the JetStream report across servers:

```
nats server report jetstream --account ACCOUNT --stream STREAM_NAME
```

If all replicas report identical Raft applied indexes but different message counts, the divergence is in the storage layer, not the consensus layer.
To scan replicated streams programmatically with nats.go:

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("connect: %v", err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatalf("jetstream: %v", err)
	}

	// Walk every stream and flag replicated streams whose replicas all
	// report current with zero lag; their message counts still need to be
	// compared against the leader's by hand.
	for name := range js.StreamNames() {
		info, err := js.StreamInfo(name, nats.MaxWait(5*time.Second))
		if err != nil {
			log.Printf("error getting info for %s: %v", name, err)
			continue
		}
		if info.Cluster == nil || len(info.Cluster.Replicas) == 0 {
			continue
		}
		leaderMsgs := info.State.Msgs
		for _, r := range info.Cluster.Replicas {
			if r.Current && r.Lag == 0 {
				// Replica reports current; compare its stored message
				// count with the leader's manually.
				fmt.Printf("Stream %s: leader=%d, check replica %s manually\n",
					name, leaderMsgs, r.Name)
			}
		}
	}
}
```

Determine which replica has the correct data. In most cases, the leader’s state is authoritative. However, if the leader was recently elected from a previously lagging replica, a different peer may have more complete data. Check the Raft applied index and server logs to determine which replica has the highest message count with a clean recovery history.
Step the leader to the best replica. If the current leader is the diverged replica, move leadership to the replica with the most complete data:
```
nats stream cluster step-down STREAM_NAME --preferred HEALTHY_SERVER
```

Remove and replace the diverged replica. Once you’ve confirmed which replica has the correct state and ensured it’s the leader, remove each diverged peer:
```
# Remove the diverged replica
nats stream cluster peer-remove STREAM_NAME DIVERGED_PEER

# NATS will automatically schedule a replacement on an available server
# Monitor progress:
nats stream info STREAM_NAME
```

The new replica will be provisioned from the current leader’s state, eliminating the divergence.
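If you prefer to script the progress check, here is a minimal nats.go sketch (the stream name is a placeholder) that polls StreamInfo until every replica reports current with zero lag:

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	const stream = "STREAM_NAME" // placeholder

	// Poll until the replacement replica has caught up.
	for {
		info, err := js.StreamInfo(stream)
		if err != nil {
			log.Fatal(err)
		}
		if info.Cluster == nil {
			log.Fatal("stream is not clustered")
		}
		allCurrent := true
		for _, r := range info.Cluster.Replicas {
			fmt.Printf("replica %s: current=%v lag=%d\n", r.Name, r.Current, r.Lag)
			if !r.Current || r.Lag > 0 {
				allCurrent = false
			}
		}
		if allCurrent {
			fmt.Println("all replicas current")
			return
		}
		time.Sleep(2 * time.Second)
	}
}
```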
For interest-based retention streams: also check consumers. If the divergence was caused by consumer ack propagation failures, the consumer state on the rebuilt replica may also need to be recreated:
```
# List consumers and check for state inconsistencies
nats consumer info STREAM_NAME CONSUMER_NAME --all
```

Enable consistent stream checksums. Upgrade to the latest NATS server version, which includes improved filestore checksum validation during compaction and recovery.
Monitor for divergence continuously. Add per-replica message count comparison to your monitoring. Synadia Insights evaluates this automatically; for custom monitoring, compare each replica’s local message count for the same stream on a schedule and alert when the counts disagree.
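One way to build such a check (a minimal sketch, not the Insights implementation) is to query each server’s HTTP monitoring endpoint with stream detail enabled and compare the message counts reported for the same stream. The server URLs are hypothetical, and the /jsz query parameters and JSON field names below are assumptions based on the standard NATS monitoring API; verify them against your server version.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// Minimal subset of the /jsz response; field names are assumed and should
// be checked against your server's monitoring output.
type jszResponse struct {
	AccountDetails []struct {
		Name    string `json:"name"`
		Streams []struct {
			Name  string `json:"name"`
			State struct {
				Messages uint64 `json:"messages"`
			} `json:"state"`
		} `json:"stream_detail"`
	} `json:"account_details"`
}

func main() {
	// Monitoring endpoints of the servers hosting the replicas (hypothetical).
	servers := []string{
		"http://server-1:8222",
		"http://server-2:8222",
		"http://server-3:8222",
	}
	const stream = "STREAM_NAME" // placeholder

	counts := map[string]uint64{}
	for _, s := range servers {
		resp, err := http.Get(s + "/jsz?accounts=true&streams=true")
		if err != nil {
			log.Fatal(err)
		}
		var jsz jszResponse
		if err := json.NewDecoder(resp.Body).Decode(&jsz); err != nil {
			log.Fatal(err)
		}
		resp.Body.Close()
		// Record this server's local message count for the stream.
		for _, acct := range jsz.AccountDetails {
			for _, sd := range acct.Streams {
				if sd.Name == stream {
					counts[s] = sd.State.Messages
				}
			}
		}
	}

	// Alert when any two servers disagree on the count.
	distinct := map[uint64]bool{}
	for server, c := range counts {
		fmt.Printf("%s: %d messages\n", server, c)
		distinct[c] = true
	}
	if len(distinct) > 1 {
		fmt.Println("DIVERGENCE: replicas report different message counts")
	} else {
		fmt.Println("message counts match across servers")
	}
}
```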
Avoid SIGKILL in production. Always use SIGTERM for server shutdowns to allow clean filestore flushes. In Kubernetes, ensure terminationGracePeriodSeconds is sufficient for the server to complete pending writes.
Keep server versions consistent. During rolling upgrades, minimize the window where replicas run different server versions. Upgrade followers first, then step down the leader and upgrade it last.
Replica lag (JETSTREAM_001) means a follower is behind in applying Raft operations — it knows it’s behind and is catching up. Replica message count divergence (this check) means replicas believe they’re synchronized but their actual data doesn’t match. Lag is a transient operational condition; divergence is a corruption indicator.
No. R1 streams have a single copy — there’s no other replica to diverge from. This check only applies to replicated streams (R3, R5).
No. New messages are replicated correctly through Raft, but the existing divergence persists. If replica A has 1,000 messages and replica B has 990, after 100 new messages they’ll have 1,100 and 1,090. The gap doesn’t close because the missing messages are from the past and won’t be re-replicated.
Compare message counts with any external source of truth — application logs, upstream publish counters, or consumer delivery records. If no external reference exists, treat the replica with the highest message count as most likely correct (it retained data others lost). Back up from that replica before rebuilding.
Yes. This check only fires when all replicas report “current” with zero lag. If any replica reports lag, the difference in message counts is expected and is covered by JETSTREAM_001 instead. The simultaneous “current + different counts” condition is what makes this check a corruption signal rather than a normal replication delay.