An idle client connection is one that has been connected for more than the configured threshold (default: 5 minutes) and has sent and received zero messages over its entire lifetime. The TCP connection is alive — pings are exchanged, the socket is open — but no application data has ever flowed through it.
Every connection consumes server resources. Each one costs a file descriptor, a goroutine for the read loop, memory for the connection’s pending buffer and subscription state, and CPU cycles for periodic ping/pong health checks. On a single server, one idle connection is invisible in the resource profile. But idle connections tend to accumulate — leaked connections from applications that connect but never subscribe or publish, monitoring agents that maintain presence without traffic, or misconfigured clients that connect to the wrong server or account.
At scale, idle connections create real capacity pressure. A server configured with max_connections: 10000 that has 2,000 idle connections has effectively lost 20% of its connection capacity to clients that aren’t doing anything. Operators see the connection count climbing and add capacity or raise limits, solving the wrong problem. The actual issue is clients that connect and then do nothing, consuming slots that should be available for productive workloads.
Idle connections also obscure monitoring. Connection counts, per-server connection distribution, and account connection metrics all include idle connections. This inflates apparent load, skews balancing decisions, and makes it harder to distinguish between a server that’s busy serving real traffic and one that’s collecting dead weight. During incidents, operators investigating connection-related issues waste time examining connections that have literally zero traffic.
Connection leak in application code. The application creates a NATS connection on startup or per-request but never subscribes or publishes on it. Common in microservices that initialize a NATS client as part of their dependency injection framework but only use it in specific code paths that never execute — or that were removed in a refactor while the connection setup remained.
Monitoring or health-check agent. Some infrastructure tools connect to NATS to verify connectivity — confirming the TCP handshake and auth succeed — without subscribing to any subjects. The connection serves its purpose (health check passes) but appears idle from the server’s perspective.
Misconfigured client connecting to the wrong server or account. A client connects successfully (authentication passes) but subscribes to subjects that don’t exist in this account, or is configured to connect to a server that doesn’t handle its traffic. The connection is established but useless.
Standby or failover instance. An application connects to NATS in standby mode, waiting for a leader election or failover signal before it starts subscribing or publishing. Until failover occurs, the connection is idle. This is a legitimate pattern, but long-idle standby connections should be identifiable by name.
Legacy or abandoned process. A long-running process that was once active has been functionally replaced but never stopped. It maintains its NATS connection out of inertia — the process is alive, the connection is open, but the code paths that used NATS have been disabled or are never triggered.
The most direct approach uses the server’s connections endpoint:
```
nats server report connections --sort in-msgs
```

Connections at the bottom of the list with zero messages sent and received are idle. For more detail:
```
curl -s "http://localhost:8222/connz?sort=idle&limit=50" | jq '.connections[] | select(.in_msgs == 0 and .out_msgs == 0) | {cid: .cid, name: .name, account: .account, idle: .idle, uptime: .uptime}'
```

This filters for connections that have exchanged zero messages and shows how long they’ve been idle and how long they’ve been connected.
The name field in the connection info is set by the client at connect time. Well-named connections make this step trivial:
```
curl -s "http://localhost:8222/connz?sort=idle&limit=50" | jq '.connections[] | select(.in_msgs == 0 and .out_msgs == 0) | {name: .name, ip: .ip, lang: .lang, version: .ver}'
```

The lang field (Go, Python, Java, etc.) and client version help narrow down which application or library is responsible. The ip field identifies the source host.
An idle connection might still have subscriptions — it subscribed to subjects that never receive messages:
```
curl -s "http://localhost:8222/connz?sort=idle&subs=true&limit=50" | jq '.connections[] | select(.in_msgs == 0 and .out_msgs == 0) | {name: .name, subscriptions: .subscriptions_list}'
```

A connection with zero messages and zero subscriptions is almost certainly a leak or a health-check agent. A connection with subscriptions but zero messages might be subscribed to subjects with no publishers — which is a different kind of misconfiguration.
A key diagnostic step is checking whether the idle connection has any subscriptions at all. If the subscription count is zero, the connection is likely leaked — connected but never subscribed to anything:
```
curl -s "http://localhost:8222/connz?sort=idle&subs=detail&limit=50" | jq '.connections[] | select(.in_msgs == 0 and .out_msgs == 0) | {name: .name, subscriptions: .subscriptions, lang: .lang, version: .ver}'
```

If subscriptions exist but message counts are still zero, check the client library name and version — it may be a monitoring or health-check client that connects but does not publish or subscribe to active subjects.
Check the connection name and uptime. Standby connections typically have descriptive names (order-service-standby, failover-worker) and have been connected since the application started. Random or default names with very long idle times are more likely leaks.
Connections with zero subscriptions and zero messages are almost certainly leaked. Close them to free server resources — file descriptors, goroutines, and memory. The NATS server supports forcibly closing connections by CID:
```
# Get the CID from connz
curl -s "http://localhost:8222/connz?sort=idle&limit=10" | jq '.connections[] | select(.in_msgs == 0 and .out_msgs == 0) | .cid'

# Kick a specific connection
nats server request kick <cid> <SERVER_ID>
```

If the client has reconnect logic (most do by default), it will reconnect — and if the application is still not subscribing or publishing, the connection will be idle again. Kicking is diagnostic: it tells you whether the client is actively maintained (it reconnects) or truly orphaned (it doesn’t come back).
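To build the kick list programmatically rather than with jq, a small program can apply the same zero-message filter to a /connz payload. This is a sketch against the documented connz field names (cid, in_msgs, out_msgs); the embedded sample payload is hypothetical, and in practice you would GET the endpoint:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connz models only the slice of the /connz response this sketch needs.
type connz struct {
	Connections []struct {
		Cid     uint64 `json:"cid"`
		Name    string `json:"name"`
		InMsgs  int64  `json:"in_msgs"`
		OutMsgs int64  `json:"out_msgs"`
	} `json:"connections"`
}

// kickCandidates returns the CIDs of connections with zero lifetime
// traffic: the same filter expressed above in jq.
func kickCandidates(body []byte) ([]uint64, error) {
	var c connz
	if err := json.Unmarshal(body, &c); err != nil {
		return nil, err
	}
	var cids []uint64
	for _, conn := range c.Connections {
		if conn.InMsgs == 0 && conn.OutMsgs == 0 {
			cids = append(cids, conn.Cid)
		}
	}
	return cids, nil
}

func main() {
	// Hypothetical sample payload; in practice, fetch http://localhost:8222/connz.
	sample := []byte(`{"connections":[
		{"cid":4,"name":"order-service","in_msgs":120,"out_msgs":98},
		{"cid":7,"name":"","in_msgs":0,"out_msgs":0}]}`)
	cids, _ := kickCandidates(sample)
	fmt.Println(cids) // CIDs you would then pass to the kick command
}
```

Each returned CID is a candidate for the kick command above; reviewing the list before kicking avoids disconnecting intentionally idle standbys.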
The most common fix is ensuring applications actually use the connections they create:
```go
// Go - nats.go

// Bad: connection created but never used
nc, _ := nats.Connect(url)
// ... nc is never used for Subscribe or Publish

// Good: connection setup with subscription
nc, _ := nats.Connect(url,
	nats.Name("order-service-prod"),
)
_, err := nc.Subscribe("orders.>", func(msg *nats.Msg) {
	processOrder(msg)
})
```

```typescript
// TypeScript - nats.js
import { connect } from "nats";

// Bad: connected but never subscribing
const nc = await connect({ servers: "nats://localhost:4222" });

// Good: connect with a purpose
const nc = await connect({
  servers: "nats://localhost:4222",
  name: "order-service-prod",
});
const sub = nc.subscribe("orders.>");
for await (const msg of sub) {
  processOrder(msg);
}
```

Always set a client name. This is the single most impactful debugging improvement for connection management:
```go
nc, _ := nats.Connect(url, nats.Name("myservice-prod-v2"))
```

Without a name, idle connections show up as anonymous entries in connz, and tracing them back to a specific application requires IP address correlation with your deployment inventory.
Set per-account connection limits. Prevent any single account from accumulating unlimited connections:
```
nsc edit account -n PROD --max-conns 500
```

When idle connections push the account toward its limit, the problem becomes self-evident — new legitimate connections are rejected, forcing the team to investigate and clean up.
Configure shorter ping intervals for faster stale detection. The server’s default ping interval is 2 minutes with 2 missed pings before disconnect. For environments where idle connections are a concern, tighten this:
```
ping_interval: 30s
ping_max: 2
```

This doesn’t help with idle-but-alive connections (they respond to pings), but it catches connections where the client process has died without closing the TCP socket.
Use Synadia Insights for continuous detection. Insights flags connections with zero lifetime messages and idle time exceeding the threshold automatically, across every server and account. Instead of periodic connz audits, idle connections surface as findings every collection cycle — with the client name, IP, account, and idle duration already correlated.
No. A stale connection (SERVER_012) has stopped responding to the server’s ping health checks — the client process has likely crashed or the network has failed. An idle connection is alive and healthy (it responds to pings) but has never sent or received application messages. Stale connections are broken. Idle connections are functional but unused.
That’s a legitimate pattern. Monitoring tools that verify NATS connectivity by establishing a connection are expected to show zero messages. The fix isn’t removing the connection — it’s naming it clearly (e.g., healthcheck-prometheus) so operators can filter it out when investigating idle connections. You can also have the monitoring agent subscribe to a dedicated health subject and publish a ping, which both validates messaging and removes it from idle connection reports.
It depends on your server’s connection capacity and total count. If idle connections are less than 5% of your total connections, the resource waste is negligible. If they’re 20-30% or more, you’re leaving significant capacity on the table. The bigger concern is what idle connections represent — each one is a symptom of a process or configuration that isn’t working as intended.
Not by default. The NATS server closes connections only when they fail ping checks (stale) or fall behind on message delivery (slow consumer). A connection that responds to pings but does nothing else will persist indefinitely. Client-side inactivity timeouts don’t exist in the NATS protocol. Connection lifecycle management is the application’s responsibility.
No. Slow consumer events (SERVER_004) occur when a client can’t read messages fast enough. A client with zero subscriptions receives no messages and can’t become a slow consumer. However, idle connections do consume the same connection slots and file descriptors that active connections need. If idle connections push the server toward its max_connections limit, new clients that would actively process messages can’t connect — indirectly degrading the system’s ability to handle traffic.