When a NATS server has a high number of leafnode connections configured with s2_auto compression, it risks a CPU feedback loop under load. The s2_auto mode dynamically selects a compression level based on observed RTT — as RTT increases, it escalates to more aggressive (and more CPU-intensive) compression. With many leafnode connections, the aggregate CPU cost of compression itself can increase RTT, triggering further escalation across all connections simultaneously. This positive feedback loop can drive CPU to saturation and destabilize the server.
Leafnode connections are the backbone of hub-and-spoke NATS architectures. Edge deployments, IoT gateways, multi-tenant isolation, and branch-office topologies all rely on leafnodes to connect to a central cluster. It’s common for a hub server to maintain dozens or even hundreds of leafnode connections.
The s2_auto compression mode was designed to adapt to changing network conditions. When RTT is low (indicating a healthy, fast link), it uses lightweight compression (s2_uncompressed or s2_fast). When RTT rises (suggesting congestion or a slow link), it escalates to s2_better or s2_best to reduce bandwidth consumption at the cost of CPU. This adaptation works well for a small number of connections where the CPU cost of compression is negligible relative to server capacity.
The problem emerges at scale. Consider a hub server with 100 leafnode connections, all using s2_auto. During a traffic spike, RTT on some connections increases slightly — perhaps due to normal network jitter. The s2_auto algorithm escalates compression on those connections. The additional CPU load from higher compression increases processing latency on the server, which raises RTT on the remaining connections. Those connections now escalate their compression too. Within seconds, all 100 connections are using s2_best compression, consuming maximum CPU. The server’s processing loop slows further, RTT climbs higher, and the system is stuck in a high-CPU, high-latency state.
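The dynamic can be sketched with a toy model. All constants below (per-level CPU costs, RTT thresholds, the feedback factor) are illustrative assumptions, not the server's real internals; the point is only that level selection, CPU cost, and RTT form a closed loop once connection count is large.

```python
# Toy model of the s2_auto feedback loop. All constants (CPU costs, RTT
# thresholds, feedback factor) are illustrative, not the server's real values.

LEVEL_CPU_COST = {"s2_fast": 1.0, "s2_better": 3.0, "s2_best": 6.0}

def level_for_rtt(rtt_ms: float) -> str:
    """Pick a compression level from observed RTT (hypothetical thresholds)."""
    if rtt_ms < 20:
        return "s2_fast"
    if rtt_ms < 50:
        return "s2_better"
    return "s2_best"

def simulate(connections: int = 100, base_rtt_ms: float = 15.0, steps: int = 10):
    """Iterate the loop: RTT picks a level, the level's CPU cost inflates RTT."""
    rtt = base_rtt_ms
    level = level_for_rtt(rtt)
    for _ in range(steps):
        level = level_for_rtt(rtt)
        cpu = connections * LEVEL_CPU_COST[level]  # aggregate compression cost
        rtt = base_rtt_ms + cpu * 0.2              # CPU load feeds back into RTT
    return level, rtt
```

With 100 connections the model escalates to s2_best and stays there; with 10 connections the CPU contribution to RTT is too small to cross any threshold, so the level never moves. That is the scale dependence described above.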
Even without the feedback loop, the CPU cost of s2_auto at scale is unpredictable. Capacity planning becomes difficult when compression level — and therefore CPU consumption — can shift dynamically based on network conditions outside the server’s control. A server that comfortably handles 100 leafnodes at s2_fast may struggle with 50 at s2_best if conditions trigger the escalation.
Default or templated compression setting. Many deployment guides and configuration templates recommend s2_auto as a sensible default. For servers with a handful of leafnode connections, it is. But when the same template is used for a hub server that accumulates many leafnodes over time, the risk profile changes without anyone revisiting the compression setting.
Gradual leafnode count growth. A hub server starts with 10 leafnodes and s2_auto works fine. Over months, the count grows to 80. The compression setting is never revisited because there’s no immediate symptom — the feedback loop only manifests under specific load conditions that haven’t been triggered yet.
Network instability on leafnode links. Leafnodes often connect over less reliable networks — WAN links, cellular connections, satellite links. RTT fluctuations are common and expected. Each fluctuation triggers compression level changes across all s2_auto connections, creating persistent CPU churn even at modest leafnode counts.
Heterogeneous leafnode link quality. When some leafnodes connect over fast local networks and others over slow WAN links, s2_auto escalates compression on the slow links. If enough slow links are present, the aggregate CPU cost spills over and affects processing for the fast links too.
```shell
nats server report connections --sort subs | grep leafnode
```

Count the leafnode connections on each server. Then check the server configuration for the compression setting:

```shell
# View server configuration
nats server info --json | jq '.connect_urls'

# Or check the config file directly
cat /etc/nats/nats-server.conf | grep -A 5 'leafnodes'

# Check current RTT for all leafnode connections
nats server list | grep leaf

# Check server CPU usage
nats server report server --json | jq '.[] | {name: .name, cpu: .cpu}'
```

If you observe high CPU coinciding with elevated leafnode RTT across many connections simultaneously, the feedback loop is likely active.

```shell
curl -s http://localhost:8222/leafz?subs=1 | jq '.leafs[] | {name: .name, rtt: .rtt, compression: .compression}'
```

If many leafnodes show s2_better or s2_best compression during periods that should be low-traffic, the auto-escalation is triggering inappropriately.
```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type LeafzResp struct {
	Leafs []struct {
		Name        string `json:"name"`
		RTT         string `json:"rtt"`
		Compression string `json:"compression"`
	} `json:"leafs"`
}

func checkLeafnodeCompression(monitorURL string, countThreshold int) error {
	resp, err := http.Get(monitorURL + "/leafz")
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var leafz LeafzResp
	if err := json.NewDecoder(resp.Body).Decode(&leafz); err != nil {
		return err
	}

	autoCount := 0
	for _, leaf := range leafz.Leafs {
		if leaf.Compression == "s2_auto" {
			autoCount++
		}
	}

	if autoCount > countThreshold {
		fmt.Printf("WARN: %d leafnode connections using s2_auto (threshold: %d) — risk of CPU feedback loop\n",
			autoCount, countThreshold)
	}
	return nil
}
```

The same check in Python:

```python
import httpx


async def check_leafnode_compression(monitor_url: str, count_threshold: int = 20):
    # Use a context manager so the client's connections are closed properly.
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{monitor_url}/leafz")
    data = resp.json()
    auto_count = sum(
        1 for leaf in data.get("leafs", [])
        if leaf.get("compression") == "s2_auto"
    )
    if auto_count > count_threshold:
        return {
            "auto_count": auto_count,
            "threshold": count_threshold,
            "message": "High leafnode count with s2_auto — CPU feedback loop risk",
        }
    return None
```

Replace s2_auto with a fixed compression level on the hub server’s leafnode configuration:
```
leafnodes {
  port: 7422
  compression: s2_fast
}
```

Choosing the right level:
- s2_fast — Minimal CPU overhead, good compression ratio for most payloads. Best default for high-connection-count servers.
- s2_better — Moderate CPU, better compression. Use when bandwidth is constrained but the server has CPU headroom.
- s2_best — Highest CPU, best compression. Only use for a small number of leafnodes over very constrained links.
- s2_uncompressed or off — No compression overhead. Use when bandwidth is plentiful and CPU is the constraint.

For most hub servers with many leafnodes, s2_fast is the right choice. It provides meaningful bandwidth reduction with negligible CPU impact, and — critically — it’s deterministic. CPU usage doesn’t change based on network conditions.
```shell
# Reload the server configuration (no restart needed)
nats server config reload <server-id>

# Or send SIGHUP to the server process
kill -HUP $(pidof nats-server)
```

Verify the compression level changed:

```shell
curl -s http://localhost:8222/leafz | jq '.leafs[0].compression'
```

Note: existing connections retain their compression setting until they reconnect. To apply the new setting immediately to all connections, a rolling restart of the leafnode clients may be needed.
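Because reconnecting clients pick up the new level gradually, it can help to watch for stragglers. A minimal polling sketch, using only the standard library; the monitoring URL, the polling interval, and treating s2_auto as the old level are all assumptions:

```python
import json
import time
from urllib.request import urlopen

def stragglers(leafz: dict, old_level: str = "s2_auto") -> int:
    """Count leafnode connections still reporting the old compression level."""
    return sum(1 for leaf in leafz.get("leafs", [])
               if leaf.get("compression") == old_level)

def watch(monitor_url: str = "http://localhost:8222", interval_s: int = 30):
    """Poll /leafz until every connection has picked up the new setting."""
    while True:
        with urlopen(f"{monitor_url}/leafz") as resp:
            remaining = stragglers(json.load(resp))
        if remaining == 0:
            return
        print(f"{remaining} connections still on the old compression level")
        time.sleep(interval_s)
```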
If some leafnode links genuinely benefit from higher compression (e.g., satellite connections), configure per-remote compression overrides instead of a blanket s2_auto:
```
leafnodes {
  port: 7422
  compression: s2_fast  # default for most connections

  remotes [
    {
      url: "nats-leaf://satellite-hub:7422"
      compression: s2_better  # higher compression for the slow link
    }
  ]
}
```

This gives you the bandwidth benefit on constrained links without exposing the hub server to the feedback loop across all connections.
Set alerts on leafnode count per server. When a hub server’s leafnode count crosses a threshold (e.g., 50), review the compression configuration proactively.
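Such an alert can be scripted against the /leafz monitoring endpoint. A sketch using only the standard library; the 50-connection threshold mirrors the suggestion above, and the default monitoring port is an assumption:

```python
import json
from urllib.request import urlopen

def leafnode_count_alert(leafz: dict, threshold: int = 50):
    """Return a warning string when the leafnode count crosses the threshold."""
    count = len(leafz.get("leafs", []))
    if count > threshold:
        return f"{count} leafnode connections (threshold {threshold}): review compression config"
    return None

def check(monitor_url: str = "http://localhost:8222"):
    # Fetch the /leafz payload and apply the threshold check.
    with urlopen(f"{monitor_url}/leafz") as resp:
        return leafnode_count_alert(json.load(resp))
```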
Track CPU-per-leafnode. If aggregate CPU from leafnode compression becomes a concern even with fixed levels, it’s time to scale horizontally — add more hub servers and distribute the leafnode connections across them.
Synadia Insights automatically detects the combination of high leafnode count and s2_auto compression, alerting you before the feedback loop can trigger in production.
No. s2_auto is a reasonable choice for servers with a small number of leafnode connections (under 10–20). The adaptive behavior is genuinely useful when you have a few links with varying quality. The risk emerges specifically when many connections are using s2_auto simultaneously — the aggregate CPU impact of compression level changes becomes significant enough to affect the server’s own processing latency, creating the feedback loop.
Look for three simultaneous symptoms: (1) high CPU on the hub server, (2) elevated RTT across most or all leafnode connections, and (3) most leafnode connections showing s2_better or s2_best compression in the /leafz endpoint. If all three are present, the feedback loop is active. Switching to a fixed compression level and reloading the configuration should immediately break the cycle.
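The three-symptom test can be encoded as a simple predicate. Inputs are assumed to be pre-collected metrics: server CPU percent, per-connection RTTs already converted to milliseconds, and the compression strings from /leafz; the specific thresholds are illustrative:

```python
# Compression levels that indicate auto-escalation has triggered.
ESCALATED = {"s2_better", "s2_best"}

def feedback_loop_likely(cpu_pct: float, rtt_ms: list, compression: list,
                         cpu_threshold: float = 80.0,
                         rtt_threshold_ms: float = 100.0,
                         fraction: float = 0.8) -> bool:
    """True when all three symptoms co-occur (thresholds are illustrative):
    high server CPU, elevated RTT on most connections, and escalated
    compression on most connections."""
    if not rtt_ms or not compression:
        return False
    high_cpu = cpu_pct >= cpu_threshold
    high_rtt = sum(r >= rtt_threshold_ms for r in rtt_ms) >= fraction * len(rtt_ms)
    escalated = sum(c in ESCALATED for c in compression) >= fraction * len(compression)
    return high_cpu and high_rtt and escalated
```

Any one symptom alone (a CPU spike from other work, a transient RTT bump) is ambiguous; requiring all three keeps the check specific to the feedback loop.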
The feedback loop risk is specific to leafnodes because hub servers commonly have many leafnode connections. Route connections (within a cluster) are typically 2–4 per cluster member, and gateway connections are one per remote cluster — not enough for the aggregate CPU cost to create a feedback loop. However, the same principle applies: if you had an unusually large number of route or gateway connections with s2_auto, the same risk would exist.
A configuration reload (nats server config reload <server-id>) applies the new compression setting to new connections only. Existing leafnode connections retain their current compression level until they reconnect. To force the change, either restart the leafnode clients or briefly restart the hub server. In most deployments, leafnode clients will periodically reconnect on their own, picking up the new compression setting gradually.