Excessive user connections means a single NATS user identity has more than 100 active connections. In most architectures, one connection per process is sufficient — a single user with hundreds of connections typically indicates a connection leak, misconfigured reconnection logic, or a deployment pattern that creates unnecessary connections.
Every NATS connection consumes server resources: memory for the connection state, CPU for subscription matching, and file descriptors at the OS level. A single user holding 100+ connections is using resources that could serve 100+ distinct clients. At scale, a few users with connection leaks can exhaust the server’s max_connections limit, blocking legitimate new connections from other users and services.
The problem is often invisible until it causes an outage. Connection leaks tend to be gradual — a process that opens a new connection on every request without closing the old one accumulates connections slowly. At 10 requests per minute, it takes just under two hours to reach 1,000 connections. By the time someone notices the server rejecting new connections, the leaking process has been running for hours and the connection table is full.
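The timeline above is simple arithmetic, and it can be worth sketching for your own request rates. A minimal back-of-the-envelope helper (the rates are illustrative, assuming one leaked connection per request):

```python
# Time for a process leaking one connection per request to accumulate
# connections at a steady request rate (illustrative, not a NATS default).

def minutes_to_reach(target_connections: int, leaks_per_minute: int) -> float:
    """Minutes until a leaking process reaches the target connection count."""
    return target_connections / leaks_per_minute

# 10 requests/minute, one leaked connection each:
print(minutes_to_reach(1_000, 10))  # 100.0 minutes — just under two hours
```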
Excessive connections from a single user also complicate debugging and monitoring. If one user credential is shared across hundreds of connections, per-user metrics (throughput, subscription interest, pending bytes) aggregate all those connections into a single identity. You can’t distinguish the healthy connections from the problematic ones without additional metadata like client name or IP address. This is why the NATS best practice is one connection per process with a descriptive client name.
Connection leak in application code. The application creates a new NATS connection for each request, batch job, or goroutine without closing the previous one. The old connections remain open on the server until they time out or the process exits. This is the most common cause.
Reconnect loop creating duplicate connections. A client with aggressive reconnect logic creates a new connection before the server has cleaned up the old one. Each reconnect cycle adds a connection, and the old connection lingers until the server’s ping/pong timeout evicts it.
Shared credentials across many processes. A single user credential (token, NKey, or JWT) is used by every instance of a service. When the service scales to many replicas — Kubernetes pods, Lambda functions, container instances — each replica opens its own connection under the same user identity.
Multiple connections per process. Some applications create separate NATS connections for different subsystems: one for publishing, one for subscribing, one for request-reply. While occasionally justified, this pattern multiplies the connection count per process by the number of subsystems.
Microservice scaling without per-instance credentials. Horizontal scaling with a single shared credential means the user connection count grows linearly with replica count. At 50 replicas with 2 connections each, a single user has 100 connections.
Query the server’s connection endpoint with authentication details:
```shell
curl -s 'http://localhost:8222/connz?auth=true&limit=500' | \
  jq '[.connections[] | .authorized_user] | group_by(.) | map({user: .[0], count: length}) | sort_by(-.count) | .[:10]'
```

This groups connections by user and shows the top 10 by connection count. Note the quotes around the URL — an unquoted `&` would background the curl command in most shells.
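If you would rather post-process connz output in code than in jq, the same grouping can be sketched in Python. The field names follow the jq example above; the sample below uses inline data rather than fetching from a live monitoring port:

```python
import json
from collections import Counter

# Sample /connz?auth=true output (trimmed); in practice, fetch the body of
# http://localhost:8222/connz?auth=true&limit=500 on a schedule.
connz = json.loads("""
{"connections": [
  {"cid": 1, "authorized_user": "order-service"},
  {"cid": 2, "authorized_user": "order-service"},
  {"cid": 3, "authorized_user": "billing"}
]}
""")

# Group connections by user and list the top users by connection count,
# mirroring the jq pipeline.
counts = Counter(c["authorized_user"] for c in connz["connections"])
top = [{"user": u, "count": n} for u, n in counts.most_common(10)]
print(top)  # [{'user': 'order-service', 'count': 2}, {'user': 'billing', 'count': 1}]
```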
Once you’ve identified the high-connection user, drill into their connections:
```shell
curl -s 'http://localhost:8222/connz?auth=true&user=<username>&limit=100' | \
  jq '.connections[] | {cid, name, ip, start, idle, pending_bytes, in_msgs, out_msgs}'
```

Look for patterns: many connections sharing the same client name or IP, start times that climb steadily (a leak adding connections over time), long idle periods, or zero message counts.
Connections that have been idle with zero lifetime messages are strong indicators of leaks:
```shell
curl -s 'http://localhost:8222/connz?auth=true&limit=500' | \
  jq '.connections[] | select(.in_msgs == 0 and .out_msgs == 0 and .idle != "") | {cid, name, ip, idle, start}'
```

Also check how close the account is to its limits:

```shell
nats server report accounts
```

If the user is approaching the account's connection limit, the impact extends beyond this single user to all users in the account.
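The leak filter above can also be expressed in Python for use in a monitor script. The selection criteria and field names mirror the jq filter; the sample data is inline and illustrative:

```python
import json

# Sample /connz output (trimmed); field names match the jq filter.
connz = json.loads("""
{"connections": [
  {"cid": 1, "name": "order-service", "ip": "10.0.0.5", "idle": "2h3m",
   "in_msgs": 0, "out_msgs": 0},
  {"cid": 2, "name": "order-service", "ip": "10.0.0.6", "idle": "1s",
   "in_msgs": 42, "out_msgs": 7}
]}
""")

# Connections with zero lifetime messages and a non-empty idle time
# are the strongest leak suspects.
suspects = [
    {"cid": c["cid"], "name": c["name"], "ip": c["ip"], "idle": c["idle"]}
    for c in connz["connections"]
    if c["in_msgs"] == 0 and c["out_msgs"] == 0 and c["idle"] != ""
]
print([s["cid"] for s in suspects])  # [1]
```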
Kick idle, zero-activity connections. If you’ve identified connections that are clearly leaked (zero messages, idle for extended periods), close them via the system account:
```shell
# Close a specific connection by CID (requires system account access)
nats server request kick <connection_id>
```

The /connz HTTP endpoint is read-only — connections cannot be terminated via HTTP. Use `nats server request kick` (or publish to `$SYS.REQ.SERVER.<id>.KICK`) instead.
Set account-level connection limits. NATS does not support per-user connection caps, but per-account limits constrain the total connections any user (and the account as a whole) can hold. This bounds the blast radius of a leaky service without changing per-user JWTs:
```shell
# JWT/nsc — limit total connections across all users in the account
nsc edit account Production --conns 100
```

For server-config-based auth, account-level limits are configured in the accounts {} block (see ACCOUNTS_001).
Ensure one connection per process, reused for the process lifetime. The NATS client connection should be created once at startup and shared across all goroutines, threads, or handlers:
```go
// Go — singleton connection pattern
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

var nc *nats.Conn

func main() {
	var err error
	nc, err = nats.Connect("nats://localhost:4222",
		nats.Name("order-service"),
		nats.MaxReconnects(-1),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	// Use nc throughout the application
	startHTTPServer(nc)
}
```

```python
# Python (nats.py) — single connection, reused
import nats

class App:
    def __init__(self):
        self.nc = None

    async def start(self):
        self.nc = await nats.connect(
            servers=["nats://localhost:4222"],
            name="order-service",
            max_reconnect_attempts=-1,
        )
        # Use self.nc throughout the application

    async def stop(self):
        if self.nc:
            await self.nc.close()
```

Fix reconnect logic to avoid duplicates. Ensure the client library's built-in reconnect handles the connection lifecycle correctly. Don't wrap the connect call in a retry loop that creates new connection objects — use the library's reconnect options instead:
```go
// Wrong — creates duplicate connections
for {
	nc, err := nats.Connect(url)
	if err != nil {
		time.Sleep(time.Second)
		continue
	}
	break
}

// Right — built-in reconnect handles it
nc, err := nats.Connect(url,
	nats.MaxReconnects(-1),
	nats.ReconnectWait(2*time.Second),
)
```

Issue per-instance user credentials. Instead of sharing one credential across all replicas, generate unique user credentials per deployment instance. This gives you per-instance visibility in connz output, lets you set per-user subscription/payload limits, and makes it easy to revoke a single instance's credentials in isolation:
```shell
# Create per-instance users
nsc add user --name order-service-pod-1 --account Production
nsc add user --name order-service-pod-2 --account Production
```

Per-user JWT limits do not include a connection cap (NATS only exposes connection limits at the account level), so use per-account --conns to bound aggregate connection usage.
Monitor per-user connection counts. Set up alerting on connection counts per user to catch leaks early.
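A minimal alerting check can be sketched on top of the /connz endpoint. The threshold of 100 matches this check's default; the function takes a connz response body, so in a real monitor you would fetch it from the monitoring port on a schedule and page when the result is non-empty:

```python
import json
from collections import Counter

THRESHOLD = 100  # default threshold for this check; tune per architecture

def users_over_threshold(connz_json: str, threshold: int = THRESHOLD) -> dict:
    """Return {user: count} for users whose connection count exceeds threshold.

    connz_json is the body of GET /connz?auth=true from the monitoring port.
    """
    connections = json.loads(connz_json)["connections"]
    counts = Counter(c["authorized_user"] for c in connections)
    return {user: n for user, n in counts.items() if n > threshold}

# Simulated connz body: 120 connections for one user, 3 for another.
sample = json.dumps({"connections":
    [{"authorized_user": "order-service"}] * 120 +
    [{"authorized_user": "billing"}] * 3})
print(users_over_threshold(sample))  # {'order-service': 120}
```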
Synadia Insights tracks per-user connection counts automatically, flagging users that exceed the threshold across all servers in your deployment — catching connection leaks and misconfigurations before they exhaust server resources.
One connection per process is the NATS best practice. If a service runs 10 replicas with a shared credential, 10 connections is expected. The default threshold of 100 is intentionally conservative — it catches runaway leaks without flagging normal scaled deployments. Adjust the threshold based on your architecture: if a service legitimately runs 200 replicas, set the threshold accordingly for that user.
Each connection consumes approximately 10–20 KB of server memory for the connection state, plus memory for any subscription interest. A thousand idle connections use roughly 10–20 MB — not catastrophic on its own, but they consume file descriptors and count against max_connections. The real risk is reaching the connection limit and blocking new legitimate connections, not the memory footprint.
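The memory figures above are straightforward to reproduce, using the 10–20 KB per-connection range quoted (subscription state excluded):

```python
# Back-of-the-envelope memory cost of idle connections.

def idle_connection_memory_mb(connections: int, kb_per_conn: float) -> float:
    """Approximate memory in MB for N idle connections at a given KB each."""
    return connections * kb_per_conn / 1024

print(idle_connection_memory_mb(1_000, 10))  # ~9.8 MB at the low end
print(idle_connection_memory_mb(1_000, 20))  # ~19.5 MB at the high end
```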
No. NATS does not expose a per-user connection cap — neither the server-config user block nor the JWT user limits include a connection field (only account JWTs do, via nsc edit account --conns). To bound a single credential’s blast radius, either issue per-instance credentials (so each replica has its own user) or rely on the account-level --conns limit covering all users in that account.
Set or update the account-level --conns limit. With JWT-based auth (nsc), nsc edit account <name> --conns N updates the account JWT and the server picks up the change via the account resolver — no restart required. For server-config-based auth, update the accounts {} block and run nats-server --signal reload. Existing excess connections aren’t terminated retroactively; new connections beyond the limit will be rejected.
Per-account limits (ACCOUNTS_001) cap the total connections across all users in an account — that’s the only knob NATS exposes. There is no per-user connection cap. To get fine-grained control, issue per-instance user credentials inside an account so the account-level limit acts as a ceiling on aggregate connections, and use per-user subscription/payload limits to constrain other resources.
No. Unlike database connections, NATS connections are fully multiplexed — a single connection supports unlimited concurrent subscriptions, publishes, and request-reply operations. There’s no performance benefit to multiple connections, and pooling adds complexity. The only valid reason for multiple connections per process is isolation (e.g., separating a high-throughput data plane connection from a low-volume control plane connection), and even that is rarely necessary.