When a NATS server has a high number of leafnode connections configured with s2_auto compression, it risks a CPU feedback loop under load. The s2_auto mode dynamically selects a compression level based on observed RTT — as RTT increases, it escalates to more aggressive (and more CPU-intensive) compression. With many leafnode connections, the aggregate CPU cost of compression itself can increase RTT, triggering further escalation across all connections simultaneously. This positive feedback loop can drive CPU to saturation and destabilize the server.
Leafnode connections are the backbone of hub-and-spoke NATS architectures. Edge deployments, IoT gateways, multi-tenant isolation, and branch-office topologies all rely on leafnodes to connect to a central cluster. It’s common for a hub server to maintain dozens or even hundreds of leafnode connections.
The s2_auto compression mode was designed to adapt to changing network conditions. When RTT is low (indicating a healthy, fast link), it uses lightweight compression (s2_uncompressed or s2_fast). When RTT rises (suggesting congestion or a slow link), it escalates to s2_better or s2_best to reduce bandwidth consumption at the cost of CPU. This adaptation works well for a small number of connections where the CPU cost of compression is negligible relative to server capacity.
The problem emerges at scale. Consider a hub server with 100 leafnode connections, all using s2_auto. During a traffic spike, RTT on some connections increases slightly — perhaps due to normal network jitter. The s2_auto algorithm escalates compression on those connections. The additional CPU load from higher compression increases processing latency on the server, which raises RTT on the remaining connections. Those connections now escalate their compression too. Within seconds, all 100 connections are using s2_best compression, consuming maximum CPU. The server’s processing loop slows further, RTT climbs higher, and the system is stuck in a high-CPU, high-latency state.
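The dynamic can be sketched with a toy model. All constants below (per-level CPU costs, RTT thresholds, the feedback factor) are illustrative assumptions, not the server's real internals; the point is only that level selection, CPU cost, and RTT form a closed loop once connection count is large.

```python
# Toy model of the s2_auto feedback loop. All constants (CPU costs, RTT
# thresholds, feedback factor) are illustrative, not the server's real values.

LEVEL_CPU_COST = {"s2_fast": 1.0, "s2_better": 3.0, "s2_best": 6.0}

def level_for_rtt(rtt_ms: float) -> str:
    """Pick a compression level from observed RTT (hypothetical thresholds)."""
    if rtt_ms < 20:
        return "s2_fast"
    if rtt_ms < 50:
        return "s2_better"
    return "s2_best"

def simulate(connections: int = 100, base_rtt_ms: float = 15.0, steps: int = 10):
    """Iterate the loop: RTT picks a level, the level's CPU cost inflates RTT."""
    rtt = base_rtt_ms
    level = level_for_rtt(rtt)
    for _ in range(steps):
        level = level_for_rtt(rtt)
        cpu = connections * LEVEL_CPU_COST[level]  # aggregate compression cost
        rtt = base_rtt_ms + cpu * 0.2              # CPU load feeds back into RTT
    return level, rtt
```

With 100 connections the model escalates to s2_best and stays there; with 10 connections the CPU contribution to RTT is too small to cross any threshold, so the level never moves. That is the scale dependence described above.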
Even without the feedback loop, the CPU cost of s2_auto at scale is unpredictable. Capacity planning becomes difficult when compression level — and therefore CPU consumption — can shift dynamically based on network conditions outside the server’s control. A server that comfortably handles 100 leafnodes at s2_fast may struggle with 50 at s2_best if conditions trigger the escalation.
Default or templated compression setting. Many deployment guides and configuration templates recommend s2_auto as a sensible default. For servers with a handful of leafnode connections, it is. But when the same template is used for a hub server that accumulates many leafnodes over time, the risk profile changes without anyone revisiting the compression setting.
Gradual leafnode count growth. A hub server starts with 10 leafnodes and s2_auto works fine. Over months, the count grows to 80. The compression setting is never revisited because there’s no immediate symptom — the feedback loop only manifests under specific load conditions that haven’t been triggered yet.
Network instability on leafnode links. Leafnodes often connect over less reliable networks — WAN links, cellular connections, satellite links. RTT fluctuations are common and expected. Each fluctuation triggers compression level changes across all s2_auto connections, creating persistent CPU churn even at modest leafnode counts.
Heterogeneous leafnode link quality. When some leafnodes connect over fast local networks and others over slow WAN links, s2_auto escalates compression on the slow links. If enough slow links are present, the aggregate CPU cost spills over and affects processing for the fast links too.
```shell
nats server report connections --sort subs | grep leafnode
```

Count the leafnode connections on each server. Then check the server configuration for the compression setting:

```shell
# View server configuration
nats server info --json | jq '.connect_urls'

# Or check the config file directly
cat /etc/nats/nats-server.conf | grep -A 5 'leafnodes'

# Check current RTT for all leafnode connections
nats server list | grep leaf

# Check server CPU usage
nats server report server --json | jq '.[] | {name: .name, cpu: .cpu}'
```

If you observe high CPU coinciding with elevated leafnode RTT across many connections simultaneously, the feedback loop is likely active.

```shell
curl -s http://localhost:8222/leafz?subs=1 | jq '.leafs[] | {name: .name, rtt: .rtt, compression: .compression}'
```

If many leafnodes show s2_better or s2_best compression during periods that should be low-traffic, the auto-escalation is triggering inappropriately.
```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type LeafzResp struct {
	Leafs []struct {
		Name        string `json:"name"`
		RTT         string `json:"rtt"`
		Compression string `json:"compression"`
	} `json:"leafs"`
}

func checkLeafnodeCompression(monitorURL string, countThreshold int) error {
	resp, err := http.Get(monitorURL + "/leafz")
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var leafz LeafzResp
	if err := json.NewDecoder(resp.Body).Decode(&leafz); err != nil {
		return err
	}

	autoCount := 0
	for _, leaf := range leafz.Leafs {
		if leaf.Compression == "s2_auto" {
			autoCount++
		}
	}

	if autoCount > countThreshold {
		fmt.Printf("WARN: %d leafnode connections using s2_auto (threshold: %d) — risk of CPU feedback loop\n",
			autoCount, countThreshold)
	}
	return nil
}
```

The same check in Python:

```python
import httpx


async def check_leafnode_compression(monitor_url: str, count_threshold: int = 20):
    # Use a context manager so the client's connections are closed properly.
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{monitor_url}/leafz")
    data = resp.json()
    auto_count = sum(
        1 for leaf in data.get("leafs", [])
        if leaf.get("compression") == "s2_auto"
    )
    if auto_count > count_threshold:
        return {
            "auto_count": auto_count,
            "threshold": count_threshold,
            "message": "High leafnode count with s2_auto — CPU feedback loop risk",
        }
    return None
```

Replace s2_auto with a fixed compression level on the hub server’s leafnode configuration:
```
leafnodes {
  port: 7422
  compression: s2_fast
}
```

Choosing the right level:
- s2_fast — Minimal CPU overhead, good compression ratio for most payloads. Best default for high-connection-count servers.
- s2_better — Moderate CPU, better compression. Use when bandwidth is constrained but the server has CPU headroom.
- s2_best — Highest CPU, best compression. Only use for a small number of leafnodes over very constrained links.
- s2_uncompressed or off — No compression overhead. Use when bandwidth is plentiful and CPU is the constraint.

For most hub servers with many leafnodes, s2_fast is the right choice. It provides meaningful bandwidth reduction with negligible CPU impact, and — critically — it’s deterministic. CPU usage doesn’t change based on network conditions.
```shell
# Reload the server configuration (no restart needed)
nats server config reload <server-id>

# Or send SIGHUP to the server process
kill -HUP $(pidof nats-server)
```

Verify the compression level changed:

```shell
curl -s http://localhost:8222/leafz | jq '.leafs[0].compression'
```

Note: existing connections retain their compression setting until they reconnect. To apply the new setting immediately to all connections, a rolling restart of the leafnode clients may be needed.
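Because reconnecting clients pick up the new level gradually, it can help to watch for stragglers. A minimal polling sketch, using only the standard library; the monitoring URL, the polling interval, and treating s2_auto as the old level are all assumptions:

```python
import json
import time
from urllib.request import urlopen

def stragglers(leafz: dict, old_level: str = "s2_auto") -> int:
    """Count leafnode connections still reporting the old compression level."""
    return sum(1 for leaf in leafz.get("leafs", [])
               if leaf.get("compression") == old_level)

def watch(monitor_url: str = "http://localhost:8222", interval_s: int = 30):
    """Poll /leafz until every connection has picked up the new setting."""
    while True:
        with urlopen(f"{monitor_url}/leafz") as resp:
            remaining = stragglers(json.load(resp))
        if remaining == 0:
            return
        print(f"{remaining} connections still on the old compression level")
        time.sleep(interval_s)
```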
If some leafnode links genuinely benefit from higher compression (e.g., satellite connections), configure per-remote compression overrides instead of a blanket s2_auto:
```
leafnodes {
  port: 7422
  compression: s2_fast  # default for most connections

  remotes [
    {
      url: "nats-leaf://satellite-hub:7422"
      compression: s2_better  # higher compression for the slow link
    }
  ]
}
```

This gives you the bandwidth benefit on constrained links without exposing the hub server to the feedback loop across all connections.
Set alerts on leafnode count per server. When a hub server’s leafnode count crosses a threshold (e.g., 50), review the compression configuration proactively.
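Such an alert can be scripted against the /leafz monitoring endpoint. A sketch using only the standard library; the 50-connection threshold mirrors the suggestion above, and the default monitoring port is an assumption:

```python
import json
from urllib.request import urlopen

def leafnode_count_alert(leafz: dict, threshold: int = 50):
    """Return a warning string when the leafnode count crosses the threshold."""
    count = len(leafz.get("leafs", []))
    if count > threshold:
        return f"{count} leafnode connections (threshold {threshold}): review compression config"
    return None

def check(monitor_url: str = "http://localhost:8222"):
    # Fetch the /leafz payload and apply the threshold check.
    with urlopen(f"{monitor_url}/leafz") as resp:
        return leafnode_count_alert(json.load(resp))
```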
Track CPU-per-leafnode. If aggregate CPU from leafnode compression becomes a concern even with fixed levels, it’s time to scale horizontally — add more hub servers and distribute the leafnode connections across them.
Synadia Insights automatically detects the combination of high leafnode count and s2_auto compression, alerting you before the feedback loop can trigger in production.
No. s2_auto is a reasonable choice for servers with a small number of leafnode connections (under 10–20). The adaptive behavior is genuinely useful when you have a few links with varying quality. The risk emerges specifically when many connections are using s2_auto simultaneously — the aggregate CPU impact of compression level changes becomes significant enough to affect the server’s own processing latency, creating the feedback loop.
Look for three simultaneous symptoms: (1) high CPU on the hub server, (2) elevated RTT across most or all leafnode connections, and (3) most leafnode connections showing s2_better or s2_best compression in the /leafz endpoint. If all three are present, the feedback loop is active. Switching to a fixed compression level and reloading the configuration should immediately break the cycle.
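The three-symptom test can be encoded as a simple predicate. Inputs are assumed to be pre-collected metrics: server CPU percent, per-connection RTTs already converted to milliseconds, and the compression strings from /leafz; the specific thresholds are illustrative:

```python
# Compression levels that indicate auto-escalation has triggered.
ESCALATED = {"s2_better", "s2_best"}

def feedback_loop_likely(cpu_pct: float, rtt_ms: list, compression: list,
                         cpu_threshold: float = 80.0,
                         rtt_threshold_ms: float = 100.0,
                         fraction: float = 0.8) -> bool:
    """True when all three symptoms co-occur (thresholds are illustrative):
    high server CPU, elevated RTT on most connections, and escalated
    compression on most connections."""
    if not rtt_ms or not compression:
        return False
    high_cpu = cpu_pct >= cpu_threshold
    high_rtt = sum(r >= rtt_threshold_ms for r in rtt_ms) >= fraction * len(rtt_ms)
    escalated = sum(c in ESCALATED for c in compression) >= fraction * len(compression)
    return high_cpu and high_rtt and escalated
```

Any one symptom alone (a CPU spike from other work, a transient RTT bump) is ambiguous; requiring all three keeps the check specific to the feedback loop.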
The feedback loop risk is specific to leafnodes because hub servers commonly have many leafnode connections. Route connections (within a cluster) are typically 2–4 per cluster member, and gateway connections are one per remote cluster — not enough for the aggregate CPU cost to create a feedback loop. However, the same principle applies: if you had an unusually large number of route or gateway connections with s2_auto, the same risk would exist.
A configuration reload (nats server config reload <server-id>) applies the new compression setting to new connections only. Existing leafnode connections retain their current compression level until they reconnect. To force the change, either restart the leafnode clients or briefly restart the hub server. In most deployments, leafnode clients will periodically reconnect on their own, picking up the new compression setting gradually.