A Raft group peer count mismatch occurs when a NATS JetStream stream or consumer Raft group reports more active peers than the num_replicas value in its configuration. The extra peer is typically a ghost — a node that was supposed to be removed during a cluster operation but whose removal didn’t fully propagate. The group continues to function, but the stale peer consumes resources, complicates leader elections, and can mask real quorum problems.
Every Raft group in JetStream maintains a peer set that determines quorum. For an R3 stream, quorum requires agreement from 2 of 3 peers. If a fourth ghost peer lingers in the group, the quorum calculation shifts — the group now needs 3 of 4 peers to agree, making it harder to achieve consensus and more fragile during node outages.
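To make the quorum arithmetic concrete, here is a minimal standalone sketch of the majority calculation Raft uses (plain Go, no NATS dependencies):

```go
package main

import "fmt"

// quorum returns the minimum number of peers that must agree:
// a strict majority of the peer set.
func quorum(peers int) int {
	return peers/2 + 1
}

func main() {
	fmt.Println(quorum(3)) // 2: a healthy R3 group tolerates one node outage
	fmt.Println(quorum(4)) // 3: with a ghost that never acks, one real outage stalls commits
}
```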
The ghost peer creates several operational risks. First, the leader continues sending append entries to the stale peer, wasting network bandwidth and CPU on messages that will never be acknowledged. In clusters with hundreds of streams, the aggregate overhead of ghost peers across many Raft groups becomes measurable. Second, the stale peer inflates the reported replica count, making capacity planning inaccurate — operators think the stream has more redundancy than it actually does, or they see an unexpected peer count and waste time investigating a non-existent node. Third, during rolling upgrades or maintenance windows, the ghost peer can prevent clean leader elections if the group momentarily loses quorum while waiting for a response from a node that no longer participates.
The mismatch is also a symptom of incomplete operational procedures. If one peer-remove left a ghost behind, others likely did too. Finding and fixing these mismatches now prevents a slow accumulation of Raft group inconsistencies that become much harder to untangle later.
**Incomplete peer-remove during scaling down.** An operator reduced `num_replicas` from 3 to 1, or used `nats stream cluster peer-remove` to evict a node, but the removal didn't fully propagate. The removed peer's entry persists in the Raft group's peer set even though the node is no longer replicating data. This is the most common cause.

**Node replacement without clean removal.** A server was decommissioned and replaced with a new node. The new node joined the Raft group, but the old node's peer entry was never explicitly removed. The group now has N+1 peers: the N configured replicas plus the ghost of the old node.

**Raft state divergence after network partition.** A network partition during a peer-remove operation can leave the group in a split state where some members processed the removal and others didn't. When the partition heals, the group may settle with the stale peer still in the peer set if the removal command wasn't retried.

**Leadership transfer during replica count change.** If the Raft group leader changes mid-operation while `num_replicas` is being decreased, the new leader may not complete the peer removal that the old leader initiated. The configuration update succeeds (`num_replicas` shows the new value), but the actual peer set retains the extra member.

**Manual Raft group manipulation.** Direct manipulation of JetStream metadata or Raft state files, typically during disaster recovery, can introduce peer set inconsistencies if not done carefully. The metadata says R3, but the Raft group's internal peer list has 4 or 5 entries.
List all streams and compare the configured replica count to the actual peer count:
```bash
nats stream report
```

Look for streams where the Replicas column shows more peers than expected. For a detailed view of a specific stream's Raft group:
```bash
nats stream info ORDERS --json | jq '{
  config_replicas: .config.num_replicas,
  cluster_name:    .cluster.name,
  leader:          .cluster.leader,
  peers:           [.cluster.replicas[].name],
  peer_count:      (.cluster.replicas | length) + 1
}'
```

If `peer_count` exceeds `config_replicas`, the group has a mismatch. The `+ 1` accounts for the leader, which is reported separately from the `replicas` list.
Consumer Raft groups inherit the stream’s replica count but can independently develop mismatches:
```bash
nats consumer info ORDERS my-consumer --json | jq '{
  cluster_leader: .cluster.leader,
  peers:          [.cluster.replicas[].name],
  peer_count:     (.cluster.replicas | length) + 1
}'
```

Compare the Raft group's peer list against the currently known cluster members:
```bash
# List active servers
nats server list

# Compare against the stream's peer set
nats stream info ORDERS
```

A peer that appears in the stream's replica list but not in `nats server list` is the ghost. If the ghost peer's name matches a decommissioned or replaced server, that confirms the cause.
To scan every stream programmatically with the Go client:

```go
import (
	"fmt"

	"github.com/nats-io/nats.go"
)

func checkPeerMismatches(js nats.JetStreamContext) error {
	for name := range js.StreamNames() {
		info, err := js.StreamInfo(name)
		if err != nil {
			return err
		}
		if info.Cluster == nil {
			continue // not clustered, nothing to check
		}
		expected := info.Config.Replicas
		actual := len(info.Cluster.Replicas) + 1 // +1 for the leader
		if actual > expected {
			fmt.Printf("MISMATCH: stream=%s expected=%d actual=%d extra=%d\n",
				name, expected, actual, actual-expected)
			for _, r := range info.Cluster.Replicas {
				fmt.Printf("  peer=%s current=%v offline=%v lag=%d\n",
					r.Name, r.Current, r.Offline, r.Lag)
			}
		}
	}
	return nil
}
```

The same check with the Python client:

```python
import asyncio

import nats


async def check_peer_mismatches():
    nc = await nats.connect()
    js = nc.jetstream()

    for info in await js.streams_info():
        if info.cluster is None:
            continue  # not clustered, nothing to check
        name = info.config.name
        expected = info.config.num_replicas
        replicas = info.cluster.replicas or []
        actual = len(replicas) + 1  # +1 for the leader
        if actual > expected:
            print(f"MISMATCH: stream={name} expected={expected} actual={actual}")
            for r in replicas:
                print(f"  peer={r.name} current={r.current} offline={r.offline} lag={r.lag}")

    await nc.close()


asyncio.run(check_peer_mismatches())
```
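To single out which peer is the ghost in code, the same stream info can be diffed against a set of servers you know to be live. This is a minimal sketch: `findGhostPeers` is a hypothetical helper, and `expectedServers` is assumed to be built from the output of `nats server list`.

```go
import (
	"github.com/nats-io/nats.go"
)

// findGhostPeers returns peers that appear in a stream's Raft group but
// are absent from the set of known-live servers.
func findGhostPeers(js nats.JetStreamContext, stream string, expectedServers map[string]bool) ([]string, error) {
	info, err := js.StreamInfo(stream)
	if err != nil {
		return nil, err
	}
	if info.Cluster == nil {
		return nil, nil // not clustered, nothing to check
	}
	var ghosts []string
	for _, r := range info.Cluster.Replicas {
		if !expectedServers[r.Name] {
			ghosts = append(ghosts, r.Name)
		}
	}
	return ghosts, nil
}
```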
Once you've identified the ghost peer, remove it from the stream's Raft group:

```bash
nats stream cluster peer-remove ORDERS ghost-server-name
```

For consumer Raft groups, your CLI version may not expose a direct peer-remove command, in which case you can delete and recreate the consumer:
```bash
# Export consumer config
nats consumer info ORDERS my-consumer --json > consumer-backup.json

# Delete and recreate
nats consumer rm ORDERS my-consumer -f
nats consumer add ORDERS --config consumer-backup.json
```
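The same delete-and-recreate step can be scripted with the Go client. This is a sketch, with `recreateConsumer` as a hypothetical helper; note that recreating a durable consumer discards its delivery state, so redeliveries are possible.

```go
import (
	"github.com/nats-io/nats.go"
)

// recreateConsumer deletes a consumer and recreates it with the same
// configuration, forcing a fresh Raft group without the ghost peer.
// Caution: the consumer's acknowledgment state is lost on recreation.
func recreateConsumer(js nats.JetStreamContext, stream, consumer string) error {
	info, err := js.ConsumerInfo(stream, consumer)
	if err != nil {
		return err
	}
	cfg := info.Config // snapshot the existing configuration
	if err := js.DeleteConsumer(stream, consumer); err != nil {
		return err
	}
	_, err = js.AddConsumer(stream, &cfg)
	return err
}
```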
After removing the extra peer, confirm the peer count matches the configured replicas:

```bash
nats stream info ORDERS
```

The replica list should now show exactly `num_replicas` - 1 followers plus 1 leader.
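If you're verifying from code rather than the CLI, a small polling helper can wait for the peer set to settle, since the removal takes a moment to propagate. A sketch, assuming the same Go client as above; the one-second poll interval and `maxWait` are arbitrary choices:

```go
import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

// waitForPeerCount polls until the stream's peer count (followers plus
// leader) matches its configured replica count, or gives up after maxWait.
func waitForPeerCount(js nats.JetStreamContext, stream string, maxWait time.Duration) error {
	deadline := time.Now().Add(maxWait)
	for time.Now().Before(deadline) {
		info, err := js.StreamInfo(stream)
		if err != nil {
			return err
		}
		if info.Cluster != nil && len(info.Cluster.Replicas)+1 == info.Config.Replicas {
			return nil
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("peer count for %q did not converge within %v", stream, maxWait)
}
```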
**Always verify peer removal completed.** After any peer-remove operation, check that the peer count decreased:
```bash
nats stream cluster peer-remove ORDERS old-node

# Wait a few seconds for propagation
nats stream info ORDERS
```

**Script node decommissioning.** When removing a server from the cluster, iterate over all streams and consumers hosted on that node and explicitly remove it from each Raft group before shutting down the server:
```bash
# Find all streams with replicas on the departing node
nats stream report --json | jq -r '.[] | select(.cluster.replicas[]?.name == "old-node") | .stream'
```
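If you want the removal step itself in code, the CLI command corresponds to a JetStream admin API request. The sketch below assumes the `$JS.API.STREAM.PEER.REMOVE.<stream>` subject and `{"peer": "..."}` payload used by current nats-server releases; verify against your server version, and prefer the CLI where it is available.

```go
import (
	"encoding/json"
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

// removePeerFromStream asks the JetStream API to drop a peer from a
// stream's Raft group. It sends the same request that
// `nats stream cluster peer-remove` issues under the hood.
func removePeerFromStream(nc *nats.Conn, stream, peer string) error {
	req, err := json.Marshal(map[string]string{"peer": peer})
	if err != nil {
		return err
	}
	subject := fmt.Sprintf("$JS.API.STREAM.PEER.REMOVE.%s", stream)
	msg, err := nc.Request(subject, req, 5*time.Second)
	if err != nil {
		return err
	}
	var resp struct {
		Error *struct {
			Description string `json:"description"`
		} `json:"error"`
		Success bool `json:"success"`
	}
	if err := json.Unmarshal(msg.Data, &resp); err != nil {
		return err
	}
	if resp.Error != nil {
		return fmt.Errorf("peer-remove failed: %s", resp.Error.Description)
	}
	return nil
}
```

Loop this over the stream names produced by the jq query above before shutting the node down.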
**Monitor for mismatches continuously.** Synadia Insights evaluates OPT_SYS_026 automatically across your deployment, flagging any Raft group where the observed peer count doesn't match the configured replica count, before ghost peers accumulate and cause operational surprises.

In rare cases, the peer count mismatch exists because `num_replicas` was decreased in configuration but the intent is to keep all current replicas. If so, update the replica count to match reality:
```bash
nats stream edit ORDERS --replicas 3
```

Only do this if you genuinely want the additional replica. In most cases, the ghost peer should be removed instead.
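Programmatically, the same reconciliation is a stream update with a corrected replica count. A sketch with the same Go client as above; `adoptCurrentReplicaCount` is a hypothetical helper name:

```go
import (
	"github.com/nats-io/nats.go"
)

// adoptCurrentReplicaCount raises num_replicas to match the peers that
// are actually in the group, instead of evicting the extra peer.
func adoptCurrentReplicaCount(js nats.JetStreamContext, stream string) error {
	info, err := js.StreamInfo(stream)
	if err != nil {
		return err
	}
	if info.Cluster == nil {
		return nil // not clustered, nothing to reconcile
	}
	cfg := info.Config
	cfg.Replicas = len(info.Cluster.Replicas) + 1 // followers + leader
	_, err = js.UpdateStream(&cfg)
	return err
}
```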
**Does a peer count mismatch risk data loss?** No. The extra peer doesn't corrupt data; the Raft protocol still functions correctly with an extra member. The risks are operational: degraded quorum math, wasted replication overhead, and misleading cluster topology. However, if the ghost peer pushes the group to an even number of voters, the group becomes less partition-tolerant, which indirectly increases the risk of unavailability (though not data loss) during network events.
**Can the JetStream meta-group develop the same mismatch?** Yes. The JetStream meta-group is itself a Raft group and can develop the same mismatch if a server is removed from the cluster without a clean meta-group peer removal. Check the meta-group with `nats server report jetstream` and compare the listed nodes against your expected cluster membership.
**How do I check every stream and consumer at once?** Use the programmatic approach shown in the diagnosis section. For production environments, Synadia Insights runs OPT_SYS_026 continuously across all streams and consumers, generating alerts when any Raft group has more peers than its configured replica count.
**Will a ghost peer eventually age out on its own?** No. Raft peer sets are explicit: a peer remains in the group until it is actively removed via a configuration change or a peer-remove operation. The ghost peer will persist indefinitely, even if the corresponding server no longer exists in the cluster.
**Is it safe to remove a peer while the stream is serving traffic?** Yes. Peer removal is a Raft membership change that the leader coordinates without disrupting message flow. The leader proposes the membership change, the group commits it, and the removed peer stops receiving append entries. Active publishers and consumers are not affected.