Connection count high means a NATS server’s active client connections have reached 80% or more of its configured max_connections limit. Once that limit is hit, the server rejects all new connection attempts — and there’s no queue or retry at the server level.
When a NATS server reaches its max_connections ceiling, every new client that attempts to connect receives an immediate rejection. This includes new application instances scaling up, existing clients reconnecting after a network blip, and monitoring tools trying to check server health. There’s no graceful degradation: the server goes from accepting connections to refusing them entirely, with no backpressure signal in between.
The 80% warning threshold exists because connection counts rarely grow linearly. A Kubernetes deployment scaling event can add dozens of connections in seconds. A network partition that resolves can trigger a reconnection storm where every disconnected client simultaneously attempts to reconnect. If the server is already at 80% capacity, these burst events push it past the limit, causing cascading failures as rejected clients retry repeatedly — each retry consuming server CPU to evaluate and reject the connection.
Connection count pressure also has second-order effects. Each active connection consumes server memory for read/write buffers and subscription state. At high connection counts, the server spends more CPU time on subscription matching, connection bookkeeping, and garbage collection. Even before hitting max_connections, a server with 50,000 connections handles each connection slightly slower than a server with 5,000 — the overhead is cumulative.
Connection leaks in client applications. Applications that create new NATS connections without properly closing old ones accumulate zombie connections on the server. This is common in long-running services that handle connection errors by creating a fresh connection without closing the failed one, or in test environments where processes crash without cleanup.
Microservice scaling creating excessive connections. Each pod or process in a containerized deployment typically opens its own NATS connection. A service with 500 replicas creates 500 connections to the NATS cluster. If multiple services each have hundreds of replicas, the total connection count grows multiplicatively. This is especially problematic when services open multiple connections per process.
Reconnect storms after network events. When a network partition heals or a load balancer fails over, every disconnected client reconnects simultaneously. If the server was already at moderate capacity, this burst can push it past max_connections. The NATS client libraries include jitter on reconnect attempts, but in large deployments the aggregate burst can still be significant.
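For Go clients, the knobs that shape this burst are the reconnect wait and jitter options. A minimal sketch with illustrative values (the server URLs and durations are placeholders, not recommendations):

```go
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Spread reconnect attempts so a healed partition does not become a
	// thundering herd against a single server. Values here are illustrative.
	nc, err := nats.Connect(
		"nats://server1:4222,nats://server2:4222,nats://server3:4222",
		nats.ReconnectWait(2*time.Second),                         // base delay between attempts
		nats.ReconnectJitter(500*time.Millisecond, 2*time.Second), // added random delay (plain, TLS)
		nats.MaxReconnects(-1),                                    // keep retrying indefinitely
	)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()
}
```

Longer waits and wider jitter smooth the aggregate burst at the cost of slower recovery for individual clients.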
Uneven distribution across cluster members. Clients configured with only a single server URL, or DNS resolving to the same server IP, concentrate connections on one cluster member while others sit idle. The overloaded server hits its connection limit while peer servers have ample capacity.
Load balancer health checks. External load balancers or monitoring systems that open TCP connections to check NATS server health may create and abandon connections frequently. Each health check connection counts against the limit, and if the health checker doesn’t properly close connections, they accumulate.
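Where possible, point health checks at the HTTP monitoring endpoints rather than the client port, since requests to the monitoring port do not consume client connection slots. A minimal Go sketch of such a probe, assuming the monitoring port is enabled on 8222:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Probe the server's HTTP monitoring endpoint instead of opening a NATS
	// client connection; this does not count against max_connections.
	resp, err := http.Get("http://localhost:8222/healthz")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("healthy:", resp.StatusCode == http.StatusOK)
}
```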
max_connections set too low for the workload. The default max_connections is 65,536, which is sufficient for many deployments. But operators who explicitly set a lower value for resource management may find their limit is too restrictive as the deployment grows.
```
nats server list
```
This shows connection counts per server. Compare the current count against the server’s max_connections value.
For the exact limit:
```
curl -s http://localhost:8222/varz | jq '{connections: .connections, max_connections: .max_connections, total_connections: .total_connections}'
```
connections is the current count, max_connections is the ceiling, and total_connections is the lifetime total (useful for spotting churn).
```
nats server report connections
```
This breaks down connections by account and client name. Look for accounts or applications with disproportionately many connections.
To find connections consuming the most resources:
```
nats server report connections --sort subs
```
Sort by subscriptions to identify clients with heavy subscription footprints — these consume more server resources per connection.
```
nats server request connections --sort idle
```
Connections with long idle times and zero messages sent/received are likely leaked or abandoned. These are safe candidates for investigation and potential cleanup.
```
nats account info
```
If account-level connection limits are configured, this shows usage against the limit. Even if the server-level limit isn’t hit, an account can exhaust its own connection quota.
```
nats server list
```
Compare connection counts across servers. If one server has significantly more connections than its peers, the distribution is uneven and that server is more likely to hit its limit.
Raise max_connections if server resources allow. If the server has available memory and CPU headroom, increasing the connection limit buys immediate time:
```
max_connections: 100000
```
Reload the configuration:
```
nats-server --signal reload
```
Each connection consumes approximately 10-30KB of memory (depending on subscription count and buffer sizes), so 100,000 connections requires roughly 1-3GB of RAM for connection state alone.
Check OS-level file descriptor limits. Each NATS connection consumes a file descriptor. If the OS ulimit is lower than max_connections, the OS limit is the real ceiling:
```
# Check current limits for the NATS server process
cat /proc/$(pidof nats-server)/limits | grep "open files"
```
Increase if needed:
```
# In /etc/security/limits.conf or systemd unit
LimitNOFILE=131072
```
Identify and close leaked connections. Find connections that have been idle with zero message activity:
```
nats server request connections --sort idle
```
If your application framework supports it, implement connection lifecycle logging to detect leaks in development.
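One way to do that lifecycle logging in Go is with the client's connection event handlers; a minimal sketch (the connection name and log messages are illustrative):

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// Log every connection lifecycle event so leaks show up in development:
	// a process that keeps opening connections without closing them will log
	// many connects and few (or no) closes.
	nc, err := nats.Connect("nats://localhost:4222",
		nats.Name("order-service"),
		nats.DisconnectErrHandler(func(_ *nats.Conn, err error) {
			log.Printf("nats disconnected: %v", err)
		}),
		nats.ReconnectHandler(func(c *nats.Conn) {
			log.Printf("nats reconnected to %s", c.ConnectedUrl())
		}),
		nats.ClosedHandler(func(_ *nats.Conn) {
			log.Printf("nats connection closed")
		}),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	// ... application work ...
}
```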
Enforce one connection per process. The NATS best practice is a single connection per application process, shared across all goroutines/threads. Multiple connections from the same process waste server resources:
```go
// Go — share a single connection
package main

import "github.com/nats-io/nats.go"

var nc *nats.Conn

func main() {
	var err error
	nc, err = nats.Connect("nats://localhost:4222",
		nats.Name("order-service"),
	)
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	// Share nc across all handlers, goroutines, etc.
}
```
```python
# Python — singleton connection pattern
import nats

_nc = None

async def get_connection():
    global _nc
    if _nc is None or _nc.is_closed:
        _nc = await nats.connect("nats://localhost:4222", name="order-service")
    return _nc
```
Distribute connections using DNS round-robin or a load balancer. Ensure client connection URLs include all servers in the cluster, and enable randomization. You can also place a DNS round-robin record or TCP load balancer in front of the cluster to distribute new connections evenly:
```go
nc, err := nats.Connect(
	"nats://server1:4222,nats://server2:4222,nats://server3:4222",
	nats.DontRandomize(), // Remove this line — let the client randomize
)
```
Most NATS client libraries randomize the server list by default, providing natural load distribution.
Set per-account connection limits. Beyond the server-wide max_connections, per-account limits control how many connections each account may open, preventing any single account from monopolizing the server:
```go
// Go — account claims with connection limits
import "github.com/nats-io/jwt/v2"

claims := jwt.NewAccountClaims(accountPub)
claims.Limits.Conn = 1000 // Max 1000 connections for this account
```
Implement connection monitoring and alerting. Alert before reaching the 80% threshold to give operators time to respond.
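As a sketch of what such a check can look like, the following Go snippet polls the same varz endpoint used in the diagnosis steps and flags the 80% threshold; the endpoint URL and threshold are assumptions to adapt to your environment and alerting pipeline:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// varz holds just the two fields this check needs from the /varz payload.
type varz struct {
	Connections    int `json:"connections"`
	MaxConnections int `json:"max_connections"`
}

func main() {
	resp, err := http.Get("http://localhost:8222/varz")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var v varz
	if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
		log.Fatal(err)
	}

	// Compute usage as a fraction of the configured ceiling.
	usage := float64(v.Connections) / float64(v.MaxConnections)
	fmt.Printf("connections: %d / %d (%.0f%%)\n", v.Connections, v.MaxConnections, usage*100)
	if usage >= 0.8 {
		fmt.Println("WARNING: connection count above 80% of max_connections")
	}
}
```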
Use leafnodes to reduce connection pressure on the cluster. In edge deployments where many clients connect from a single location, a leafnode consolidates hundreds of client connections into a single connection to the hub cluster:
```
# Leafnode at the edge — clients connect here
leafnodes {
  remotes [{
    urls: ["nats-leaf://hub-server:7422"]
  }]
}
```
Instead of 500 edge clients each connecting to the hub cluster, they connect to the local leafnode, which maintains a single connection upstream. The hub sees one connection instead of 500.
The server immediately rejects any new connection attempt with an -ERR 'maximum connections exceeded' error. The rejection happens during the CONNECT handshake — the TCP connection is accepted, the error is sent, and the connection is closed. NATS client libraries will attempt to reconnect to other servers in their URL list, which is why having multiple servers configured is important.
No. Route connections (between cluster members) and gateway connections (between super-clusters) are not counted against max_connections. The limit applies only to client connections, including leafnode connections. System account connections (used by the NATS CLI and monitoring tools) do count against the limit.
The NATS server can handle over 100,000 concurrent connections on modern hardware with adequate memory and file descriptor limits. The practical limit depends on the workload: idle connections are cheap (~10KB each), but connections with high subscription counts or high message throughput consume more resources. Deployments exceeding 50,000 connections per server should monitor CPU and memory closely.
Only if you want to enforce a hard capacity limit for operational reasons — for example, ensuring a server never consumes more than a specific amount of memory. The default of 65,536 is reasonable for most deployments. If you lower it, make sure your deployment won’t legitimately need more connections during scaling events or reconnection storms. Setting it too low causes unnecessary client rejections.
Use nats server report connections and look for application names (set via the client Name option at connect time) that appear more times than expected. If your applications don’t set connection names, fix that first — unnamed connections are nearly impossible to attribute. Sort by idle time to find connections that are consuming a slot but doing no work.