A leaderless Raft group is a stream, consumer, or meta cluster group that previously had a leader but currently has none. Without a leader, the group cannot process any writes — stream publishes stall, consumer deliveries stop, or JetStream API calls fail, depending on which group is affected.
Every replicated JetStream asset operates through Raft consensus. The leader coordinates all writes: accepting published messages for streams, tracking acknowledgments for consumers, or managing asset placement for the meta group. When a group has no leader, it has no coordinator — all operations that require consensus are blocked.
A brief leaderless state during a leader election is normal. Elections happen in milliseconds after a leader steps down or fails. What’s abnormal is a persistent leaderless state — the group can’t complete an election and remains stuck. This check specifically targets groups that had a leader before (ruling out newly created groups that haven’t elected yet), meaning something went wrong with the election process.
The impact depends on which Raft group is leaderless. A leaderless stream group means publishes to that stream return errors or time out. A leaderless consumer group means message delivery stops entirely for that consumer. A leaderless meta group is the worst case — all JetStream API operations across the entire cluster fail. In each case, the group is effectively frozen until a leader is elected.
Election loop from network instability. Raft candidates start elections by requesting votes from peers. If network latency is high or packets are being dropped, vote requests and responses time out before reaching quorum. The election restarts, times out again, and the group cycles without ever electing a leader (see the sketch after this list of causes).
All candidates have stale Raft logs. Raft requires that a leader’s log be at least as up-to-date as a quorum of peers. If replicas were partitioned and diverged, candidates may reject each other’s vote requests because each candidate’s log is missing entries the others have. This is rare but can happen after complex failure/recovery sequences.
Server overload causing election timeouts. Raft elections have timeout windows. If the servers hosting the Raft group are under heavy CPU or I/O load, the election process can’t complete within the timeout. The election resets, but the load that caused the timeout hasn’t changed — creating a loop.
Disk I/O blocking Raft operations. Raft writes proposals and votes to disk before responding. If disk I/O is saturated (from stream writes, snapshots, or other processes), the election protocol can’t persist state fast enough to complete within timeouts.
Corrupt Raft state. Rare, but possible after unclean shutdowns, disk errors, or filesystem corruption. A replica with corrupt state may participate in elections but fail to process the result, preventing the group from settling on a leader.
Bug in Raft implementation. Very rare in stable releases but possible, especially in edge cases with specific timing or failure patterns. If other causes are ruled out, this may warrant a bug report to the nats-server repository.
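To make the first failure mode concrete, here is a toy Go simulation of the election loop. This is not nats-server code, and every name and timing in it is illustrative: vote replies are delayed past the randomized election timeout, so every round expires before quorum and the group never converges.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// requestVotes simulates peers whose vote grants are delayed by network latency.
func requestVotes(peers int, latency time.Duration) <-chan bool {
	replies := make(chan bool, peers)
	for i := 0; i < peers; i++ {
		go func() {
			time.Sleep(latency) // a slow or lossy network delays every grant
			replies <- true
		}()
	}
	return replies
}

// tryElection counts votes until quorum or until the randomized timeout fires.
func tryElection(peers, quorum int, latency time.Duration) bool {
	timeout := time.Duration(150+rand.Intn(150)) * time.Millisecond
	votes := 1 // the candidate votes for itself
	deadline := time.After(timeout)
	replies := requestVotes(peers, latency)
	for {
		select {
		case <-replies:
			votes++
			if votes >= quorum {
				return true // quorum reached: a leader is elected
			}
		case <-deadline:
			return false // timed out: the election restarts from scratch
		}
	}
}

func main() {
	// With 400ms peer latency, no 150-300ms election round can reach quorum,
	// so every attempt fails: the "election loop" described above.
	for round := 1; round <= 3; round++ {
		fmt.Printf("round %d: elected=%v\n", round, tryElection(2, 2, 400*time.Millisecond))
	}
}
```

Fixing the latency (or, in the real causes above, the load or disk stall) is what lets a round finally complete inside its timeout window.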
A leaderless group is detected within 10 seconds of the group losing its leader. Wait 30 seconds after detection. Transient leaderless states during routine elections resolve quickly. If the group is still leaderless after 30 seconds, the election is stuck.
```bash
# For streams — Leader field will be empty
nats stream info <stream-name>

# For consumers
nats consumer info <stream-name> <consumer-name>

# For the meta group
nats server report jetstream
```

The Leader field will be blank for a leaderless group. The replica list will show peers but none designated as leader.
```bash
nats server list
```

Verify all servers in the Raft group are online and visible, and that a quorum of peers is reachable. If a server is missing, the group may lack quorum to elect a leader — this is more accurately a quorum loss issue (JETSTREAM_008 or CONSUMER_003) than a leaderless issue. Check server logs for election failures or network partition indicators.
```bash
nats events --js-advisory
```

Leader election advisories on $JS.EVENT.ADVISORY.STREAM.LEADER_ELECTED and $JS.EVENT.ADVISORY.CONSUMER.LEADER_ELECTED fire when elections succeed. If you see no election events for the affected group, elections are either not starting or not completing.
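These advisories can also be watched programmatically. A minimal Go sketch, assuming the subjects above (the trailing `>` wildcard covers the asset-name tokens appended to each subject); treat it as a starting point, not a complete monitoring setup:

```go
// Watch leader-election advisories; prolonged silence for an affected
// group means elections are not completing.
nc, _ := nats.Connect(nats.DefaultURL)
defer nc.Drain()
for _, subj := range []string{
	"$JS.EVENT.ADVISORY.STREAM.LEADER_ELECTED.>",
	"$JS.EVENT.ADVISORY.CONSUMER.LEADER_ELECTED.>",
} {
	nc.Subscribe(subj, func(m *nats.Msg) {
		log.Printf("leader elected: %s -> %s", m.Subject, string(m.Data))
	})
}
select {} // block; alert separately if no events arrive for a stuck group
```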
```bash
# Check CPU and connections
nats server list

# Check disk I/O on the servers hosting the group
iostat -xz 1 5
```

High CPU (see SERVER_003) or high disk I/O latency can prevent elections from completing within Raft’s timeout windows.
Server logs will show Raft election activity — vote requests, vote grants, election timeouts. Repeated election timeout messages indicate the election loop:
```text
[WRN] JetStream cluster - Loss of stream quorum for 'ORDERS'
[INF] JetStream cluster - Stream 'ORDERS' leader election started
```

Step down the group to trigger a fresh election. Even though the group is leaderless, issuing a step-down request can reset election state and break out of a stuck cycle:
```bash
# For streams
nats stream cluster step-down <stream-name>

# For consumers
nats consumer cluster step-down <stream-name> <consumer-name>
```

If the step-down command fails (there is no leader to receive it), try restarting one of the servers in the Raft group. A restart clears that server’s in-memory election state and triggers a new election round.
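Under the hood, the CLI step-down issues a JetStream API request. A minimal Go sketch of sending the same request directly, assuming an established `nc` connection and using ORDERS as an illustrative stream name:

```go
// Ask the ORDERS stream group to step down its leader and re-elect.
resp, err := nc.Request("$JS.API.STREAM.LEADER.STEPDOWN.ORDERS", nil, 2*time.Second)
if err != nil {
	log.Printf("step-down failed (possibly no leader to receive it): %v", err)
} else {
	log.Printf("step-down response: %s", string(resp.Data))
}
```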
Resolve network issues between Raft peers. If elections are failing due to network timeouts, fix the network path:
```bash
# Check RTT between cluster servers
nats server list
```

RTT between cluster peers should be consistently under 10ms in the same datacenter. High or variable RTT causes election timeouts.
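Client-observed RTT can also be measured from code. A minimal sketch with the Go client, using a hypothetical server URL; note that `nats server list` shows the server-to-server route RTTs that matter most for elections:

```go
nc, _ := nats.Connect("nats://server-a:4222") // hypothetical cluster node
defer nc.Close()
if rtt, err := nc.RTT(); err == nil {
	log.Printf("client RTT to %s: %v", nc.ConnectedUrl(), rtt)
}
```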
Reduce server load. If CPU or disk I/O pressure is preventing elections from completing:
```bash
# Check for hot subjects or high connection counts
nats server report connections --sort out-msgs
```

Consider temporarily reducing workload (pausing publishers, draining non-critical connections) to give the election process room to complete.
Check and repair disk I/O. If disk latency is the bottleneck, validate the storage health:
```bash
# Check disk performance with a scratch file on the same volume
# (delete the test file afterwards)
fio --name=write-test --rw=write --bs=4k --size=100M --runtime=10 \
    --filename=/path/to/jetstream/store/fio-test
```

If the JetStream store_dir is on slow storage, migrating to SSD is essential for stable Raft operations.
Use fast, dedicated storage for JetStream. Raft’s election protocol depends on timely disk writes. SSDs with consistent latency prevent election timeouts caused by I/O spikes:
```
jetstream {
  store_dir: "/fast-ssd/nats/jetstream"
}
```

Monitor Raft group health continuously. Set up alerts for leaderless groups so you catch stuck elections before users notice:
```go
// Go: check stream leader status
nc, _ := nats.Connect(nats.DefaultURL)
js, _ := nc.JetStream()
info, err := js.StreamInfo("ORDERS")
if err != nil {
	log.Fatal(err)
}
if info.Cluster == nil || info.Cluster.Leader == "" {
	log.Printf("ALERT: stream ORDERS has no leader")
}
```

```python
# Python: monitor for leaderless groups
import asyncio
import nats

async def check_leaders():
    nc = await nats.connect()
    js = nc.jetstream()
    info = await js.stream_info("ORDERS")
    if not info.cluster or not info.cluster.leader:
        print("ALERT: stream ORDERS is leaderless")
    await nc.close()

asyncio.run(check_leaders())
```

Ensure cluster sizes are odd. Even-numbered clusters risk split-vote scenarios where two candidates each get exactly half the votes and neither achieves majority (with four servers a majority is three, so a 2-2 split elects no one). Odd-numbered clusters eliminate this:
```
# Use 3 or 5 servers, never 2 or 4
```

Maintain headroom on server resources. Raft elections need CPU time and disk bandwidth. If servers routinely run at 90%+ CPU or disk utilization, elections during disruptions are more likely to time out.
Synadia Insights detects leaderless Raft groups automatically and distinguishes transient election states from persistent leadership failures, alerting only when a group that previously had a leader remains leaderless.
Quorum loss (JETSTREAM_008, CONSUMER_003) means not enough replicas are online to form a majority. A leaderless group may have enough replicas online but can’t complete an election — all peers are present, but they can’t agree on a leader. Quorum loss is a membership problem; leaderless is an election problem. In practice, quorum loss always causes leaderlessness, but leaderlessness can occur with full membership.
Yes. When a leader fails or steps down, the group is briefly leaderless while a new election runs. This typically lasts milliseconds to low-single-digit seconds. Clients may see a brief timeout on publishes or fetches during this window. This check only fires for groups that remain leaderless beyond the normal election window, and only for groups that previously had a leader.
No. Without a leader, the stream has no coordinator to accept and replicate messages. Publish requests to a leaderless stream will return a “no responders” error or time out. The NATS client will surface this as a publish error. If using nats.PublishAsync, the pending ack future will fail.
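A minimal Go sketch of what this looks like to a publisher (the subject is illustrative); the point is that the failure surfaces at the publish call rather than the message silently disappearing:

```go
js, _ := nc.JetStream()
if _, err := js.Publish("ORDERS.created", []byte("payload")); err != nil {
	// Typically a no-responders error or a timeout while the group is leaderless.
	log.Printf("publish failed: %v", err)
}
```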
Use `nats stream report` to get a summary of all streams with their leader status. Leaderless streams will show no leader in the cluster column. For consumers, `nats consumer report --all` provides similar visibility. Synadia Insights provides a single-pane view of all Raft groups and their leadership status.
As a last resort, yes. If a Raft group is persistently leaderless despite all peers being online and healthy, there may be corrupted Raft state. Delete and recreate the stream or consumer from configuration. For streams, ensure you have a backup or mirror to recover data. For consumers, recreating resets the delivery position — use the `opt_start_seq` or `opt_start_time` option to resume from the appropriate position (see the sketch below).
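A hypothetical sketch of recreating a consumer in Go so that delivery resumes from a known stream sequence; the durable name and sequence number are illustrative:

```go
_, err := js.AddConsumer("ORDERS", &nats.ConsumerConfig{
	Durable:       "order-worker",
	DeliverPolicy: nats.DeliverByStartSequencePolicy, // resume by sequence
	OptStartSeq:   1042,                              // e.g. last safely processed + 1
	AckPolicy:     nats.AckExplicitPolicy,
})
if err != nil {
	log.Fatal(err)
}
```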