Client pending pressure means a NATS client connection has accumulated more outbound data in the server’s write buffer than the configured threshold. This is the early warning stage before the server disconnects the client as a slow consumer — catching it here lets you intervene before message loss occurs.
Every NATS client connection has a server-side write buffer. When the server needs to deliver a message to a subscriber, it writes the message into this buffer. The client reads from the buffer over the TCP connection. When the client reads slower than the server writes, pending bytes accumulate. If the buffer fills completely, the server disconnects the client to protect system-wide throughput — a slow consumer eviction.
Client pending pressure is the canary in the coal mine. By the time the slow consumer counter increments, the connection is already gone — messages in the buffer are lost (for core NATS), and the client must reconnect. Monitoring pending bytes gives you a window to act: reduce the publish rate, fix the client’s processing bottleneck, or add capacity — all before any data is lost or any connection is dropped.
In clustered deployments, high pending pressure on multiple connections compounds quickly. Each buffered connection consumes server memory proportional to its pending bytes. A handful of clients each holding 10–50 MiB of pending data can push a server into memory pressure territory, affecting route and gateway connections and degrading performance for every client on that node.
Blocking message handler. The subscriber performs synchronous work — database writes, HTTP calls, file I/O — inside the message callback. While the callback blocks, the client’s TCP read loop stalls and pending bytes climb on the server side. This is the most common cause by a wide margin.
Publish rate spike. A subject’s message rate jumps during a batch import, backfill, or incident-driven traffic surge. The subscriber’s throughput ceiling hasn’t changed, but the inbound rate has temporarily exceeded it.
Network degradation between client and server. Increased latency, packet loss, or reduced bandwidth slows the TCP drain rate from the server’s write buffer. Pending bytes build even though the client is processing messages as fast as it receives them. Often correlated with High Client RTT (CONN_001).
Garbage collection pauses. Runtime GC pauses (JVM, Go, .NET) freeze the client’s read loop. A 200ms pause at 50,000 msg/s means 10,000 messages buffer while the client is frozen. Short pauses at high message rates are enough to trigger the threshold.
Wildcard subscription fan-out. A subscriber on a broad wildcard like events.> receives aggregate traffic from hundreds or thousands of specific subjects. No single subject is hot, but the combined rate overwhelms a single consumer.
Undersized client pending limits. The client library’s own pending buffer is too small, causing the client to apply backpressure on its TCP read, which backs up into the server’s write buffer. Many libraries default to 64 MB, which may not be enough for high-throughput subjects.
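When this last cause is in play, the Go client can surface it directly: an async error handler is invoked with nats.ErrSlowConsumer when a subscription’s client-side pending limits are exceeded and messages start being dropped. A minimal sketch, assuming the nats.go client (the log wording is illustrative):

// Register a connection-level error handler; nats.ErrSlowConsumer fires when a
// subscription's client-side pending limits are exceeded and messages are dropped.
nc, err := nats.Connect(url,
    nats.ErrorHandler(func(_ *nats.Conn, sub *nats.Subscription, err error) {
        if err == nats.ErrSlowConsumer {
            dropped, _ := sub.Dropped() // messages dropped by this subscription so far
            log.Printf("slow consumer on %q: %d messages dropped", sub.Subject, dropped)
        }
    }),
)

Seeing this error on the client while connz shows climbing pending bytes for the same connection points at the subscription’s limits or a stalled read loop rather than the network path.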
Sort all client connections by pending data:
curl -s 'http://localhost:8222/connz?sort=pending&limit=20' | jq '.connections[]'

Look for connections with pending bytes in the MiB range. Any connection consistently above 1 MiB is at risk.
For more detail, query the monitoring endpoint directly:
curl -s 'http://localhost:8222/connz?sort=pending&subs=true&limit=10' | jq '.connections[] | {cid, name, ip, pending_bytes, subscriptions_list}'

Check the publish rate on subjects the high-pending client subscribes to:
# Check per-subject rates for a specific account
nats server report accounts --account <account_name>

If pending pressure appears only during rate spikes, the client cannot sustain peak throughput. If pending pressure is constant even at normal rates, the client has a baseline processing bottleneck.
# Client-side RTT
nats rtt
# Server-side view of all client RTTs
nats server list

If high pending correlates with high RTT on the same connections, the network path is the bottleneck. See also: High Client RTT (CONN_001).
nats server info

The write_deadline setting (default 10s) controls how long the server waits for a slow client before disconnecting it. If pending pressure is building but clients aren’t being disconnected, the write deadline may be configured high — giving you more warning time but also allowing more memory buildup per connection.
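Both the write deadline and the server-side pending limit are set in the server configuration. A minimal excerpt for reference, shown with the documented defaults (raising max_pending trades memory for warning time, so change it deliberately):

# nats-server.conf (excerpt)
write_deadline: "10s"    # how long a blocked write to a slow client is tolerated before disconnect
max_pending: 67108864    # 64 MiB outbound buffer per client connection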
Increase the client pending buffer. The default server-side pending buffer is 64 MiB (max_pending in server config). On the client side, pending limits are set per subscription — a larger limit absorbs transient spikes without triggering server-side pressure:
// Go client — pending limits are per-subscription
sub, _ := nc.Subscribe("orders.>", handler)
sub.SetPendingLimits(1_000_000, 256*1024*1024) // 1M msgs, 256MB

This buys time but doesn’t fix the root cause. Note that the server disconnects a client as a slow consumer when the server-side pending buffer (default 64 MiB, configured via max_pending) is exhausted — this check fires at 1 MiB as an early warning well before that limit. Monitor pending bytes after the change to confirm the client is actually draining the buffer during normal operation.
Verify client names are set. Without client names, identifying the problematic process from connz output is nearly impossible:
nc, err := nats.Connect(url,
    nats.Name("order-processor-east-1"),
)

Move blocking work out of the message callback. Decouple reading from NATS and processing messages using an internal channel or queue:
work := make(chan *nats.Msg, 10_000)
sub, _ := nc.Subscribe("orders.>", func(msg *nats.Msg) {
    work <- msg // non-blocking enqueue
})

// Worker pool processes messages independently
for i := 0; i < numWorkers; i++ {
    go func() {
        for msg := range work {
            processOrder(msg.Data)
        }
    }()
}

// TypeScript (nats.js) — async iterator with worker pool
import { connect } from "nats";

const nc = await connect({ servers: "nats://localhost:4222" });
const sub = nc.subscribe("orders.>");

const process = async (msg) => {
  // do work outside the read loop
  await db.insert(msg.data);
};

for await (const msg of sub) {
  // Fire and forget — don't await in the loop
  process(msg).catch(console.error);
}

Add queue group subscribers. Distribute load across multiple consumer instances. NATS round-robins messages within a queue group:
# Each instance joins the same queue group
nats sub "orders.>" --queue order-processors

Adding instances linearly increases aggregate throughput without changing any publisher.
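The same pattern from application code, sketched with the Go client (the queue group name matches the CLI example above, and the work channel is the hand-off channel from the worker-pool example earlier):

// Every subscriber that joins the same queue group shares the load;
// NATS delivers each message to exactly one member of the group.
sub, err := nc.QueueSubscribe("orders.>", "order-processors", func(msg *nats.Msg) {
    work <- msg // hand off to the worker pool, as in the earlier example
})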
Switch to JetStream pull consumers for high-throughput subjects. Pull consumers give the client explicit flow control — it fetches batches at its own pace, eliminating server-side pending pressure entirely:
// Go — JetStream pull consumer with batched fetch
js, _ := nc.JetStream()
sub, _ := js.PullSubscribe("orders.>", "order-processor")

for {
    msgs, err := sub.Fetch(100, nats.MaxWait(5*time.Second))
    if err != nil {
        continue
    }
    for _, msg := range msgs {
        processOrder(msg.Data)
        msg.Ack()
    }
}

Synadia Insights evaluates pending pressure automatically every collection epoch, alerting on clients approaching danger without any manual metric configuration.
Client pending pressure (CONN_002) is the early warning — the server’s outbound buffer for a client is accumulating data faster than the client reads it. A slow consumer event (SERVER_004) is the result — the server has given up waiting and disconnected the client. Pending pressure gives you time to intervene; a slow consumer event means the damage is already done. Monitoring pending bytes is how you prevent slow consumer disconnections.
The connz endpoint shows the subscription list for each connection. Query it with curl 'http://localhost:8222/connz?sort=pending&subs=detail&limit=10' to see both the pending bytes and the subjects each high-pending client subscribes to. Cross-reference with per-subject publish rates from nats server report accounts to identify the hot subject.
Not directly — each connection has its own write buffer. But indirectly, yes: pending bytes consume server memory. If multiple connections each hold tens of MiB in pending data, the aggregate memory usage impacts server performance, garbage collection frequency, and ultimately latency for all clients on that server.
No. Pull consumers request messages explicitly via fetch, so the server never pushes data the client hasn’t asked for. There is no server-side write buffer accumulating for a pull consumer. This is one of the main advantages of pull-based consumption for high-throughput workloads. Push consumers, including JetStream push consumers, are still subject to pending pressure.
A good starting point is 1 MiB — the Insights default. For high-throughput subjects with large messages, you may want to raise this to 10 MiB to reduce noise. The key metric is trend, not absolute value: a connection whose pending bytes are steadily climbing is more concerning than one that briefly spikes and drains. If pending bytes never return to near-zero between collection intervals, the client is structurally too slow.
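To watch the trend rather than a single sample, a small poller is enough. A sketch, assuming the monitoring port is reachable at localhost:8222 (the poll interval and output format are illustrative):

// Poll connz every 10 seconds and print pending bytes per connection, so you can
// see whether buffers drain back toward zero or climb steadily.
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "time"
)

type connz struct {
    Connections []struct {
        Cid          uint64 `json:"cid"`
        Name         string `json:"name"`
        PendingBytes int64  `json:"pending_bytes"`
    } `json:"connections"`
}

func main() {
    for {
        resp, err := http.Get("http://localhost:8222/connz?sort=pending&limit=10")
        if err == nil {
            var cz connz
            if json.NewDecoder(resp.Body).Decode(&cz) == nil {
                for _, c := range cz.Connections {
                    fmt.Printf("%s cid=%d name=%q pending_bytes=%d\n",
                        time.Now().Format(time.RFC3339), c.Cid, c.Name, c.PendingBytes)
                }
            }
            resp.Body.Close()
        }
        time.Sleep(10 * time.Second)
    }
}

A connection whose pending bytes drain back toward zero after each spike is keeping up; one that only climbs needs one of the resolutions above.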