
NATS JetStream Storage Skew: What It Means and How to Fix It

Severity: Info
Category: Saturation
Applies to: Balance
Check ID: OPT_BALANCE_005
Detection threshold: Storage > 2× cluster average

JetStream storage skew means one server in your NATS cluster uses more than double the cluster average disk storage. This imbalance signals that data is concentrated on specific nodes rather than distributed evenly, creating a single point of storage pressure that can fill before other servers are even moderately utilized.

Why this matters

Disk capacity is finite, and the server that fills first determines your cluster’s effective storage ceiling. If one server holds 80 GiB of JetStream data while its peers hold 20 GiB each, that server hits its storage reservation limit while its peers are only a quarter as full, leaving most of the cluster’s aggregate disk capacity unused. New streams can’t be placed on the full server, and existing streams on it start rejecting writes when storage is exhausted, regardless of how much space is free elsewhere.

Storage skew also affects I/O performance. The server with the most data handles the most disk reads and writes: compaction, snapshot creation, and message retrieval all scale with data volume. During periods of high write throughput, the overloaded server’s disk becomes the bottleneck. Raft log replication to that server slows because its disk queue is deeper, which in turn increases replication lag for every stream it hosts. The cluster appears healthy, but one server is quietly degrading.

The risk becomes acute during recovery scenarios. If the storage-heavy server restarts, it must replay or restore more data than any other server, extending its recovery time. Other servers that depend on it for Raft quorum wait longer for it to catch up. A server that should be interchangeable with its peers becomes the critical path in every failure and recovery scenario.

Common causes

  • A few large streams placed on the same server. One or two high-volume streams with large retention windows can dominate storage on whatever server hosts their leader or replicas. If these streams were created without placement constraints, they may all land on the same node; the sketch after this list shows how to pin placement and retention at creation time.

  • R1 streams concentrated on one server. Unreplicated (R1) streams exist only on their leader’s server. If many R1 streams were created while one server happened to be the meta leader’s preferred target, all their storage accumulates on that single node.

  • Uneven retention policies. Streams with max_age of 30 days accumulate far more data than streams with 24-hour retention. If long-retention streams cluster on one server, that server’s storage grows disproportionately — even if replica counts are balanced.

  • Streams with different message sizes. A stream receiving 1 KB messages and another receiving 100 KB messages have very different storage footprints even at the same message rate. If the large-message streams land on the same server, storage skews quickly.

  • No storage rebalancing after cluster changes. Adding a new server to the cluster doesn’t redistribute existing data. The new server starts empty while existing servers retain their accumulated storage. Without explicit migration, the skew persists indefinitely.
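
Most of these causes come down to streams being created without explicit placement or retention. The following is a minimal sketch of setting both at creation time with nats.go; the stream name, subjects, limits, and the storage:high tag are illustrative placeholders for your own values.

// Hypothetical example: create a stream with explicit placement and retention
// so large, long-retention data lands on servers tagged for it.
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Placement tags direct replicas to servers whose configuration carries
	// matching server_tags; MaxAge and MaxBytes cap how much can accumulate.
	_, err = js.AddStream(&nats.StreamConfig{
		Name:     "ORDERS", // illustrative name
		Subjects: []string{"orders.>"},
		Replicas: 3,
		Placement: &nats.Placement{
			Tags: []string{"storage:high"},
		},
		MaxAge:   7 * 24 * time.Hour,       // example retention window
		MaxBytes: 50 * 1024 * 1024 * 1024, // example size cap
	})
	if err != nil {
		log.Fatal(err)
	}
}

Note that placement tags only take effect if the target servers were started with matching server_tags in their configuration.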

How to diagnose

Check per-server storage usage

nats server report jetstream

The output shows File and Memory storage per server. Compare the File column across all servers: a server using more than 2× the cluster average is skewed. For example, with four servers holding 80, 20, 20, and 20 GiB of file storage, the cluster average is 35 GiB, so the 80 GiB server is well past the 2× threshold of 70 GiB.

Identify which streams consume the most storage

nats stream report

This lists all streams with their byte sizes and cluster placement. Scan the size column for the largest streams and note which servers host them.

Inspect per-server JetStream details

# Direct monitoring endpoint for a specific server
curl -s http://localhost:8222/jsz | jq '{memory: .memory, storage: .storage, streams: .streams, consumers: .consumers}'

Check this endpoint on each server to get exact byte counts. The delta between the highest and lowest server storage reveals the skew magnitude.
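
To compute the skew the way this check does, you can pull the storage figure from each server’s /jsz endpoint and compare it against the cluster average. Below is a minimal sketch in Go, assuming three servers with monitoring on ports 8222–8224; replace the URLs with your own.

// Hypothetical skew calculator: fetch JetStream file storage from each
// server's monitoring endpoint and flag servers above 2x the cluster average.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type jsz struct {
	Storage uint64 `json:"storage"` // bytes of file storage in use
	Memory  uint64 `json:"memory"`  // bytes of memory storage in use
}

func main() {
	// Assumed monitoring endpoints; replace with your servers.
	servers := []string{
		"http://localhost:8222/jsz",
		"http://localhost:8223/jsz",
		"http://localhost:8224/jsz",
	}

	usage := make(map[string]uint64)
	var total uint64
	for _, url := range servers {
		resp, err := http.Get(url)
		if err != nil {
			log.Fatalf("fetch %s: %v", url, err)
		}
		var z jsz
		if err := json.NewDecoder(resp.Body).Decode(&z); err != nil {
			log.Fatalf("decode %s: %v", url, err)
		}
		resp.Body.Close()
		usage[url] = z.Storage
		total += z.Storage
	}

	avg := float64(total) / float64(len(servers))
	for url, used := range usage {
		ratio := float64(used) / avg
		flag := ""
		if ratio > 2.0 {
			flag = "  <-- exceeds 2x cluster average"
		}
		fmt.Printf("%s  %d bytes  (%.1fx average)%s\n", url, used, ratio, flag)
	}
}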

Correlate with replica placement

nats stream info <large_stream_name>

For the largest streams identified above, check which servers host their replicas. If the same server appears repeatedly as a replica host for large streams, that explains the storage skew.
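
To run this correlation across every stream at once instead of inspecting them one by one, you can walk the stream list and attribute each stream’s bytes to the servers hosting its replicas. A rough sketch with nats.go follows; it charges the full stream size to the leader and to every replica peer, which is a fair approximation for file storage.

// Hypothetical placement audit: sum stream bytes per hosting server to see
// which server carries a disproportionate share of JetStream storage.
package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	perServer := make(map[string]uint64)
	for info := range js.Streams() {
		if info.Cluster == nil {
			continue
		}
		size := info.State.Bytes
		// The leader holds a full copy of the stream's data...
		perServer[info.Cluster.Leader] += size
		// ...and so does every replica peer.
		for _, peer := range info.Cluster.Replicas {
			perServer[peer.Name] += size
		}
	}

	for server, bytes := range perServer {
		fmt.Printf("%-20s %d bytes\n", server, bytes)
	}
}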

Check storage reservation headroom

nats server report jetstream

Compare each server’s used storage against its reserved storage. The skewed server may be approaching its reservation limit while others have abundant headroom. If it’s above 90%, check SERVER_005 (JetStream Resource Pressure) as well.

How to fix it

Immediate: free space on the overloaded server

If the skewed server is approaching its storage limit, purge data from its largest streams or apply stricter retention:

# Purge all messages from a specific stream
nats stream purge <stream_name>
# Set a retention limit to cap storage
nats stream edit <stream_name> --max-bytes 10GiB
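
The same cap can be applied from code. UpdateStream takes a complete configuration, so the usual pattern is to fetch the current config, set the limit, and send it back. Below is a minimal fragment, assuming an established JetStream context js as in the Go example further down; the stream name and limit are placeholders.

// Hypothetical retention cap: fetch the stream's current config, set
// MaxBytes, and update (the programmatic equivalent of `--max-bytes`).
info, err := js.StreamInfo("EVENTS") // placeholder stream name
if err != nil {
	log.Fatal(err)
}

cfg := info.Config
cfg.MaxBytes = 10 * 1024 * 1024 * 1024 // cap file storage at 10 GiB

if _, err := js.UpdateStream(&cfg); err != nil {
	log.Fatal(err)
}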

This buys time but doesn’t address the placement problem.

If the cluster is genuinely under-provisioned for the workload — every server is filling, not just the skewed one — add storage capacity. Provisioning a larger volume on the skewed server (or expanding the existing one) raises that server’s reservation ceiling and lets it stay above water while you redistribute. Re-balance after capacity is added; otherwise the new headroom just delays the next pressure event.

Short-term: migrate large streams to other servers

Move the largest streams off the overloaded server by adjusting placement:

// Go — move a stream to a different server group via placement tags
js, _ := nc.JetStream()

_, err := js.UpdateStream(&nats.StreamConfig{
	Name:     "EVENTS",
	Subjects: []string{"events.>"},
	Replicas: 3,
	Placement: &nats.Placement{
		Tags: []string{"storage:high"},
	},
})

// TypeScript (nats.js)
const jsm = await nc.jetstreamManager();

await jsm.streams.update("EVENTS", {
  subjects: ["events.>"],
  num_replicas: 3,
  placement: { tags: ["storage:high"] },
});

For R1 streams, moving them requires scaling to R3 (which creates replicas on other servers), then scaling back to R1 — the new leader may land on a different server:

# Scale up to create replicas on other servers
nats stream edit <stream_name> --replicas 3
# Wait for sync to complete
nats stream info <stream_name>
# Step down to potentially move leadership
nats stream cluster step-down <stream_name>
# Scale back to R1
nats stream edit <stream_name> --replicas 1

Long-term: design for balanced storage

Establish per-server storage budgets and placement policies:

  1. Tag servers by storage tier. Use server tags like storage:high for servers with large disks and storage:standard for others. Direct high-volume streams to appropriate tiers.

  2. Set stream retention limits at creation time. Every stream should have at least one of max_bytes, max_age, or max_msgs to prevent unbounded growth (see OPT_SYS_001).

  3. Monitor storage distribution continuously. Track per-server JetStream storage as a Prometheus metric and alert when any server exceeds 1.5× the cluster average (a sketch of such an exporter follows this list).

  4. Review stream placement quarterly. As workloads evolve, streams grow at different rates. Periodic audits catch skew before it becomes a capacity problem.
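
As a rough starting point for the monitoring recommendation above, the sketch below polls each server’s /jsz endpoint and publishes file-storage usage as a labeled Prometheus gauge. The endpoint URLs, listen port, and metric name are all illustrative; tools such as prometheus-nats-exporter or nats-surveyor can provide equivalent metrics without custom code.

// Hypothetical exporter: poll each server's /jsz endpoint and publish
// JetStream file-storage usage as a labeled Prometheus gauge.
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var storageBytes = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "jetstream_file_storage_bytes", // illustrative metric name
		Help: "JetStream file storage in use per server.",
	},
	[]string{"server"},
)

func poll(servers map[string]string) {
	for name, url := range servers {
		resp, err := http.Get(url)
		if err != nil {
			log.Printf("fetch %s: %v", url, err)
			continue
		}
		var z struct {
			Storage float64 `json:"storage"`
		}
		if err := json.NewDecoder(resp.Body).Decode(&z); err == nil {
			storageBytes.WithLabelValues(name).Set(z.Storage)
		}
		resp.Body.Close()
	}
}

func main() {
	prometheus.MustRegister(storageBytes)

	// Assumed monitoring endpoints; replace with your servers.
	servers := map[string]string{
		"n1": "http://localhost:8222/jsz",
		"n2": "http://localhost:8223/jsz",
		"n3": "http://localhost:8224/jsz",
	}

	go func() {
		for {
			poll(servers)
			time.Sleep(30 * time.Second)
		}
	}()

	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9400", nil))
}

An alert rule that compares each server’s gauge against 1.5× the average across servers then catches drift well before it reaches this check’s 2× threshold.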

Frequently asked questions

Does NATS automatically balance storage across servers?

No. NATS distributes stream replicas at creation time based on available resources, but it does not rebalance after initial placement. As streams grow at different rates and new servers are added, storage naturally drifts out of balance. Rebalancing requires manual intervention — moving streams via placement tags or cycling replica counts.

Can I move a stream to a different server without downtime?

For replicated streams (R3+), yes. Update the stream’s placement tags to target different servers, and NATS will migrate replicas. During migration, the stream remains available because quorum is maintained. For R1 streams, there’s a brief window of reduced availability when cycling through R3 and back — the stream is always available, but the R1→R3→R1 transition involves temporary replication overhead.

How does storage skew interact with JetStream resource pressure?

Storage skew (this check) is about relative distribution — one server has more than its fair share. Resource pressure (SERVER_005) is about absolute utilization — a server is approaching its storage reservation limit. They often co-occur: the skewed server is typically the first to hit resource pressure. Fixing the skew (redistributing data) directly reduces pressure on the overloaded server.

What happens when a skewed server hits its storage limit?

The server stops accepting new writes to streams it hosts. Existing consumers can still read, but publishers get errors. If the stream is replicated, a new leader can be elected on another server — but only if the other replicas are current. If the skewed server was also the leader for its streams, write availability depends entirely on replica health on other servers.

Should I use the same disk size for all servers in a cluster?

Uniform disk sizes simplify capacity planning and make skew easier to detect. When all servers have identical storage reservations, any imbalance in usage is clearly a placement problem, not a capacity mismatch. If you must use heterogeneous storage, configure JetStream reservations proportionally and adjust your monitoring thresholds accordingly.

Proactive monitoring for NATS JetStream storage skew with Synadia Insights

With 100+ always-on audit Checks from the NATS experts, Insights helps you find and fix problems before they become costly incidents.
No alert rules to write. No dashboards to maintain.

Start a 14-day Insights trial