The Raft write-ahead log (WAL) is the durability mechanism for JetStream’s consensus protocol. Every stream mutation — message publish, consumer ack, metadata change — is first written to the WAL before being applied. Normally, the WAL compacts automatically as entries are committed across replicas and applied to state. When compaction stalls, the WAL grows without bound. An excessively large WAL is a ticking time bomb: it consumes disk, causes cascading OOM failures on restart, and can render a stream unrecoverable without intervention.
The failure cascade from an unbounded WAL is one of the most severe failure modes in NATS JetStream deployments. Here’s how it unfolds:
Disk exhaustion. The WAL grows until it fills the available disk. Once disk is full, the server can’t write to any stream or WAL on that volume, affecting all JetStream assets on the node — not just the one with the bloated WAL.
Memory spike on restart. When a server restarts, it replays the WAL to rebuild in-memory state. A 50 GiB WAL means loading 50 GiB of log entries into memory during recovery. If the server doesn’t have enough RAM, it OOMs during startup.
Restart loop. The server OOMs, gets restarted by the process manager, tries to replay the same WAL, OOMs again. Without intervention, the node is permanently stuck.
Quorum impact. If the node is part of an R3 stream, the remaining two replicas are now running as an R2 group. If a second node hits the same issue (which is likely if the root cause affects all replicas), the stream loses quorum entirely.
The WAL is separate from the stream’s message storage. A stream with max_bytes=1GB can have a WAL that’s 10x larger if compaction has stalled. This means WAL growth can exhaust disk even when individual stream limits are properly configured.
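One quick way to see that imbalance on disk is to compare the stream's message store with its Raft directory. This is a sketch assuming the directory layout used in the diagnosis commands below; the msgs/ subdirectory name and /path/to/jetstream are illustrative, so adjust them to your configured store_dir:

```bash
# Message blocks vs. Raft WAL for one stream (paths are illustrative)
du -sh /path/to/jetstream/*/streams/MY_STREAM/msgs/
du -sh /path/to/jetstream/*/streams/MY_STREAM/raft/
```

If the raft/ directory dwarfs msgs/, compaction is stalled.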
Stalled follower preventing log truncation. The Raft leader can’t truncate the WAL past the oldest uncommitted entry across all followers. If one follower is disconnected, slow, or stuck, the leader’s WAL retains all entries since the follower last acknowledged. In severe cases — a follower down for days — this means the entire WAL from that point forward is retained.
No active consumers advancing the commit index. For some Raft group types, the commit index advances as consumers acknowledge messages. If a stream has no active consumers, the WAL accumulates entries that are never compacted because the state machine’s applied index doesn’t advance.
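A quick way to check whether consumers exist and are keeping up is the consumer report; the stream name here is illustrative:

```bash
# List the stream's consumers with their ack floor and pending counts;
# no consumers, or consumers showing no recent activity, fit this root cause
nats consumer report MY_STREAM
```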
Raft group with no leader. A leaderless Raft group can’t perform compaction. Entries accumulate on all replicas but no compaction or snapshotting occurs. Check for leaderless groups (OPT_SYS_009) as a root cause.
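To confirm whether a specific stream's group currently has a leader, one small check (stream name illustrative) is:

```bash
# Print the stream group's current leader; null or empty output means leaderless
nats stream info MY_STREAM --json | jq '.cluster.leader'
```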
Disk I/O contention slowing compaction. If the underlying storage is slow (saturated IOPS, degraded RAID array, noisy neighbor on shared storage), WAL writes succeed but compaction can’t keep up. The WAL grows faster than it shrinks.
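Standard I/O tooling on the server node is usually enough to confirm this; a sketch using iostat from sysstat, with the sampling interval chosen arbitrarily:

```bash
# Sample extended device statistics every 5 seconds; sustained high %util and
# rising await on the JetStream volume point to I/O contention
iostat -x 5
```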
Large message payloads. Streams receiving messages with large payloads (>100 KB each) generate proportionally larger WAL entries. Combined with any compaction delay, the WAL grows rapidly in absolute terms.
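To estimate the average payload size a stream is actually receiving, divide stored bytes by message count; a sketch using the stream info JSON (stream name illustrative):

```bash
# Average stored message size in bytes, guarding against an empty stream
nats stream info MY_STREAM --json | jq 'if .state.messages > 0 then .state.bytes / .state.messages else 0 end'
```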
Snapshot failures. Raft periodically creates snapshots to allow WAL truncation. If snapshotting fails (due to disk space, I/O errors, or internal errors), the WAL can’t be truncated and grows indefinitely.
To diagnose, start by checking JetStream storage usage across the cluster:

```bash
# Report per-server JetStream storage usage (requires a system account connection)
nats server report jetstream --json | jq '.[] | {server: .name, storage_used: .stats.store, reserved_storage: .stats.reserved_storage}'
```

For direct filesystem inspection on the server node:
```bash
# Check WAL directory sizes
du -sh /path/to/jetstream/*/streams/*/raft/
```

WAL files are stored in the raft/ subdirectory of each stream. Compare the WAL size to the stream's actual message storage.
```bash
# Check stream health and look for peers catching up
nats stream report
```

Look for streams showing replicas in a catching-up state for extended periods — these are the most likely to have WAL accumulation on the leader.
For a specific stream, inspect each replica's state directly:

```bash
nats stream info MY_STREAM --json | jq '.cluster.replicas[] | {name: .name, current: .current, lag: .lag, active: .active}'
```

A replica with current: false and a high lag value indicates a stalled follower that's preventing WAL truncation on the leader.
```bash
# Check JetStream storage usage
nats server report jetstream

# Compare reserved vs. used — WAL growth shows as used > expected
```

If storage used is significantly higher than the sum of all stream max_bytes settings, WAL growth is likely the cause.
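One way to make that comparison concrete is to sum the configured limits across streams; a rough sketch, assuming the --names flag of nats stream ls and skipping streams with max_bytes set to -1 (unlimited):

```bash
# Sum configured max_bytes across all streams for comparison with storage used
for s in $(nats stream ls --names); do
  nats stream info "$s" --json | jq '.config.max_bytes | select(. > 0)'
done | awk '{ total += $1 } END { printf "Sum of configured max_bytes: %.1f GiB\n", total / 1024 / 1024 / 1024 }'
```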
Check the server logs for related errors:

```bash
grep -i "raft\|snapshot\|compact\|wal" /var/log/nats/nats-server.log | tail -50
```

Look for errors related to snapshot creation, compaction failures, or disk I/O issues.
If disk usage is critical, identify and address the largest WAL first:
```bash
# Find the largest WAL directories
find /path/to/jetstream -name "raft" -type d -exec du -sh {} \; | sort -rh | head -10
```

Do not manually delete WAL files. This will corrupt the Raft state and can cause data loss. WAL recovery must be handled through the NATS server's own mechanisms.
If a follower is stalled, restoring it allows the leader to truncate the WAL:
```bash
# Check whether the peer is actually running. `nats server ping` pings every
# server it can reach (the --id flag only toggles whether server IDs appear in
# the output — it does not select a peer). To verify a specific peer:
nats server check connection --name <peer_name>
```
```bash
# If the peer is up but a stream replica is stalled, force a stream peer
# removal — the cluster will replace it according to placement policy
nats stream cluster peer-remove MY_STREAM <stalled_peer>
```

After removing the stalled peer, the leader can truncate the WAL entries that were waiting for that follower. Then re-add the peer:
```bash
nats stream edit MY_STREAM --replicas 3
```

The new replica will catch up via snapshot transfer rather than WAL replay, avoiding the accumulated WAL issue.
If the WAL is large but the stream is otherwise healthy, a leader step-down can trigger snapshot creation:
```bash
nats stream cluster step-down MY_STREAM
```

The new leader will create a fresh snapshot as part of the leadership transition, allowing WAL truncation.
If a server is caught in an OOM restart loop due to WAL replay:
Set GOMEMLIMIT when starting the server — this helps Go's GC operate more efficiently during WAL replay:
```bash
GOMEMLIMIT=8GiB nats-server -c nats-server.conf
```

If the node still cannot replay the WAL and the other replicas are healthy, move the stream's Raft directory out of the way so the server can start without it:

```bash
# CAUTION: only do this if other replicas are healthy
mv /path/to/jetstream/streams/MY_STREAM/raft /tmp/MY_STREAM_raft_backup
```

If any WAL exceeds 50 GiB, contact Synadia support before attempting recovery. Large WAL recovery requires careful orchestration to avoid data loss, and the support team has tooling for safe WAL compaction.
To catch this before it becomes an incident, monitor replica health from your own tooling:

```go
// Monitor WAL health programmatically by flagging stalled replicas
js, _ := nc.JetStream()
for _, name := range streamNames {
    info, err := js.StreamInfo(name)
    if err != nil || info.Cluster == nil {
        continue
    }
    for _, r := range info.Cluster.Replicas {
        if !r.Current && r.Lag > 10000 {
            log.Printf("WARNING: stream %s replica %s is stalled (lag: %d)",
                name, r.Name, r.Lag)
        }
    }
}
```

The same check in Python:

```python
import nats

nc = await nats.connect()
js = nc.jetstream()

for name in stream_names:
    info = await js.stream_info(name)
    if info.cluster and info.cluster.replicas:
        for r in info.cluster.replicas:
            if not r.current and r.lag > 10000:
                print(f"WARNING: stream {name} replica {r.name} stalled (lag: {r.lag})")
```

Set up alerting on replica lag and offline peers to catch stalled followers before the WAL accumulates significantly.
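Until alerting is wired up, the same replica check can be scripted against the CLI as a stopgap; a minimal sketch, with the stream name and lag threshold as illustrative values:

```bash
# Exit noisily if any replica of MY_STREAM is not current and has high lag
nats stream info MY_STREAM --json \
  | jq -e '[.cluster.replicas[]? | select((.current | not) and .lag > 10000)] | length == 0' > /dev/null \
  || echo "ALERT: MY_STREAM has a stalled replica holding back WAL truncation"
```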
As a rule of thumb, a healthy WAL is a small fraction of the stream's data size — usually under 1 GiB for most workloads. The exact size depends on write rate and compaction frequency. If the WAL is larger than the stream's actual message storage, something is preventing compaction.
max_bytes on the stream config limits message storage, not WAL storage. The WAL is outside the stream’s configured limits. Increasing max_bytes won’t help with WAL growth. You need to address the root cause preventing WAL compaction.
WAL growth does not directly slow publishes during normal operation — the WAL is append-only, and writes are fast. But WAL growth is a symptom of compaction issues, which often correlate with stalled replicas or disk I/O problems. These underlying issues do affect publish latency through Raft commit delays.
The WAL does shrink again once the underlying issue is resolved. Once all followers are caught up and acknowledge entries, the leader can truncate the WAL, and compaction will bring it back to its normal size. If the stalled follower can't recover (e.g., the node is permanently lost), removing it from the peer set allows truncation.
Synadia Insights monitors WAL size both as an absolute value and as a percentage of js_max_store. The check triggers well before disk exhaustion, giving you time to investigate and remediate. For critical WAL sizes (>50 GiB), the alert escalates to critical severity.
With 100+ always-on audit Checks from the NATS experts, Insights helps you find and fix problems before they become costly incidents.
No alert rules to write. No dashboards to maintain.