A slow consumer is a NATS client that cannot read messages from the server fast enough. When the server’s outbound buffer for that client fills up, the server disconnects the client to protect overall system throughput. In production, slow consumer events are one of the most common — and most disruptive — failure modes in high-throughput NATS deployments.
Slow consumer disconnections are not graceful. The server drops the connection without waiting for the client to catch up. For core NATS subscribers (non-JetStream), every message in the pending buffer and every message published while the client reconnects is permanently lost. There is no replay.
In clustered deployments, a single slow consumer can trigger a cascade. The server buffers messages for the slow client, consuming memory that affects all connections on that server. If multiple clients fall behind simultaneously — during a traffic spike or a downstream dependency slowdown — server memory spikes, garbage collection stalls increase, and other healthy clients start experiencing latency. In severe cases, the server itself becomes resource-constrained, affecting route and gateway connections to other cluster members.
The problem is especially insidious because slow consumer events are often intermittent. A client that keeps up at normal traffic volumes may fail during peak load, during garbage collection pauses, or when network latency spikes. By the time someone notices the disconnections in logs, the damage — dropped messages, broken request-reply chains, stale cache state — has already propagated through the system.
Processing bottleneck in the message handler. The subscriber does blocking work — database writes, HTTP calls, complex computation — inside the message callback. Every message that takes longer than the inter-message arrival time pushes the client further behind. This is the most common cause by far.
Publish rate exceeds subscriber capacity. A subject that normally sees 1,000 msg/s spikes to 50,000 msg/s. The subscriber’s throughput ceiling hasn’t changed, but the inbound rate has. Common during batch imports, backfills, or incident-driven traffic surges.
Server outbound write timeout (max_pending saturation). Slow consumer eviction triggers when the server’s outbound write to a client stalls due to buffer saturation; the write timeout is governed by the server’s write_deadline setting (10 seconds by default). The server-side max_pending buffer (default 64 MiB) fills when the client cannot read fast enough. Once full, the server disconnects the client to protect overall throughput. Increasing max_pending in the server config provides more buffer headroom but does not fix the underlying throughput mismatch.
Insufficient client pending buffer. The NATS client library maintains a pending messages/bytes buffer between the TCP reader and the application callback. If this buffer is too small relative to message size and rate, it fills before the server-side buffer does — but the effect is the same. Many client libraries default to 64MB / 65,536 messages, which may be inadequate for high-throughput subjects.
Network latency or packet loss. The TCP connection between client and server slows down. The server writes messages to the socket, the OS send buffer fills, and the server marks the client as slow. This commonly surfaces as slow consumer events correlated with network latency spikes (see also: CONN_001 High Client RTT).
Garbage collection pauses. JVM, Go, .NET, and other runtime GC pauses freeze the client’s read loop. If the pause exceeds the time it takes for the server buffer to fill at the current publish rate, the client gets disconnected. A 200ms GC pause at 100,000 msg/s means 20,000 messages buffer — easily enough to trigger eviction.
Fan-out on a wildcard subscription. A subscriber on events.> receives traffic for thousands of specific subjects. The aggregate rate across all matching subjects exceeds what a single subscriber can process, even though no individual subject is particularly hot.
Check the server-level slow consumer counter:
```bash
nats server report connections
```

Look for the `slow` column. Any non-zero value indicates slow consumer disconnections have occurred on that server. For real-time monitoring:
```bash
curl -s 'http://localhost:8222/connz?state=closed&sort=stop&limit=50' | jq '.connections[] | select(.reason | test("Slow"))'
```

Server logs record the client details at disconnection time:
```text
[WRN] Slow Consumer Detected: ReadBufferSize=0, MaxPending=67108864, PendingBytes=67108864, PendingMsgs=12847 - cid:42 "order-processor" ...
```

Key fields: MaxPending is the eviction threshold in bytes (64 MiB here), PendingBytes and PendingMsgs show how much was buffered at disconnection (in this example the byte buffer is completely full), and cid plus the quoted connection name identify the client.
Compare the publish rate on affected subjects with the subscriber’s processing rate:
```bash
# Check per-subject message rates
nats server report jetstream --subjects

# For core NATS, check the subscribe interest
nats server report accounts --account <account_name>
```

If the publish rate is stable and the slow consumer events are intermittent, suspect processing bottlenecks or GC pauses. If slow consumer events correlate with rate spikes, the subscriber simply can’t keep up with peak load.
```bash
# Check RTT to the server from the client's perspective
nats rtt

# Check server-side RTT for all connections
nats server list
```

RTT above 10ms between client and server in the same datacenter, or above 50ms across regions, combined with high message rates, increases slow consumer risk.
Increase the client pending buffer. Pending limits in nats.go are configured per subscription, not on the connection. Raising them buys time for transient slowdowns:
```go
// Go client — set pending limits on the subscription
sub, _ := nc.Subscribe("orders.>", handler)
sub.SetPendingLimits(1_000_000, 256*1024*1024) // 1M msgs, 256MB
```

This is a band-aid. It widens the per-subscription buffer but doesn’t fix the underlying throughput mismatch.
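To know whether you’re approaching those limits before messages are dropped, the client itself can tell you. A minimal sketch, assuming the nats.go client: the async error handler fires with nats.ErrSlowConsumer when the library drops messages client-side, and Subscription.Pending/Dropped expose buffer depth.

```go
package main

import (
    "errors"
    "log"
    "time"

    "github.com/nats-io/nats.go"
)

func main() {
    // The error handler is invoked when a subscription hits its pending
    // limits and the client starts dropping messages.
    nc, err := nats.Connect(nats.DefaultURL,
        nats.ErrorHandler(func(_ *nats.Conn, sub *nats.Subscription, err error) {
            if errors.Is(err, nats.ErrSlowConsumer) {
                dropped, _ := sub.Dropped()
                log.Printf("slow consumer on %q: %d messages dropped", sub.Subject, dropped)
            }
        }))
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Drain()

    sub, _ := nc.Subscribe("orders.>", func(msg *nats.Msg) { /* handle */ })

    // Sample buffer depth periodically to see how close the subscription
    // runs to its pending limits at peak traffic.
    for range time.Tick(5 * time.Second) {
        msgs, bytes, _ := sub.Pending()
        log.Printf("pending: %d msgs, %d bytes", msgs, bytes)
    }
}
```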
Increase server-side max_pending. The server’s max_pending setting (default 64 MiB) controls how much data the server buffers before evicting a slow client. Increasing it provides more time for clients to catch up during transient slowdowns:
```
max_pending: 128MB
```

Have clients subscribe to fewer subjects. Reducing the number of subjects a single client subscribes to lowers the aggregate message rate it must handle, reducing the likelihood of buffer saturation.
Restart disconnected consumers. Ensure your client connection has reconnect logic enabled (most NATS client libraries auto-reconnect by default). If the consumer process itself crashed, restart it. Every second of downtime means lost messages (core NATS) or delivery delay (JetStream).
Move blocking work out of the message handler. The message callback should do the minimum: deserialize, validate, enqueue to an internal channel/queue. A separate worker pool processes the queue. This decouples “reading from NATS” from “doing the work”:
```go
// Instead of this:
sub, _ := nc.Subscribe("orders.>", func(msg *nats.Msg) {
    db.Insert(msg.Data) // blocks the read loop
})

// Do this:
work := make(chan *nats.Msg, 10_000)
sub, _ := nc.Subscribe("orders.>", func(msg *nats.Msg) {
    work <- msg // fast hand-off; blocks only if the channel buffer fills
})
for i := 0; i < numWorkers; i++ {
    go func() {
        for msg := range work {
            db.Insert(msg.Data)
        }
    }()
}
```

Add queue group subscribers. Distribute the message processing load across multiple consumer instances. NATS round-robins messages within a queue group, so adding instances scales throughput near-linearly:
```bash
# Each instance subscribes to the same queue group
nats sub "orders.>" --queue order-processors
```

Tune GC if applicable. For JVM-based consumers, consider ZGC or Shenandoah for low-pause collection. For Go, GOGC tuning or GOMEMLIMIT can reduce pause frequency; both knobs can also be set from code, as sketched below. The goal is to keep GC pauses well below the time-to-fill-buffer at peak message rate.
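A minimal sketch of the Go knobs (the values are illustrative, not recommendations):

```go
package main

import "runtime/debug"

func init() {
    // Equivalent to GOGC=200: let the heap grow 200% over live data
    // before the next collection, trading memory for fewer GC cycles.
    debug.SetGCPercent(200)

    // Equivalent to GOMEMLIMIT=4GiB (illustrative value): a soft memory
    // limit that makes the GC run earlier as the process approaches it.
    debug.SetMemoryLimit(4 << 30)
}
```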
Use JetStream instead of core NATS for critical data flows. JetStream consumers can resume from where they left off after a disconnection. Slow consumer disconnections still happen, but no messages are lost — the consumer catches up on reconnect. This is the single most impactful architectural change for systems where message loss is unacceptable.
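A minimal sketch of that resume behavior using nats.go’s JetStream API, assuming nc is an established connection and a stream already captures orders.> (the durable name order-processor is hypothetical):

```go
js, _ := nc.JetStream()

// A durable consumer's delivery position lives on the server, so after a
// slow consumer eviction the client resumes from its last acknowledged
// message on reconnect instead of losing the gap.
sub, _ := js.Subscribe("orders.>", func(msg *nats.Msg) {
    // process, then acknowledge so the server advances the cursor
    msg.Ack()
}, nats.Durable("order-processor"), nats.ManualAck())
```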
Implement backpressure signaling. If publishers can slow down when consumers are behind, the system self-regulates. Request-reply patterns naturally provide this — the publisher waits for a response before sending the next message. For fire-and-forget patterns, consider a rate-limiting mechanism tied to consumer lag.
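A sketch of the request-reply form of this, assuming a connected nats.go client nc and a payload data: switching a hot path from Publish to Request makes the publisher pace itself against consumer throughput.

```go
// Fire-and-forget: the publisher never learns the consumer is behind.
_ = nc.Publish("orders.created", data)

// Request-reply: the publisher blocks until the consumer responds (or
// times out), so a slow consumer naturally throttles the publish rate.
if _, err := nc.Request("orders.created", data, 2*time.Second); err != nil {
    // timeout or no responders: back off before retrying
}
```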
Partition high-fan-out subjects. If a wildcard subscription like events.> aggregates too many subjects, split consumers by subject prefix: one consumer for events.orders.>, another for events.inventory.>, etc. Each consumer handles a fraction of the total rate.
A stalled client (SERVER_013) is an early warning — the client is reading slowly but hasn’t been disconnected yet. A slow consumer is the next stage: the server has given up waiting and dropped the connection. Stalled client events let you intervene before disconnection happens. If you’re seeing slow consumer events without prior stalled client warnings, the buffer is filling too fast for the early warning to trigger.
A slow consumer can degrade the server itself. The server allocates memory to buffer messages for the slow client, and under heavy load this memory pressure can increase garbage collection pauses on the server, adding latency for all connections. In extreme cases, memory exhaustion can destabilize the server. This is exactly why the server disconnects slow consumers — it’s protecting the health of the overall system.
Slow consumer limits apply to JetStream consumers as well: they use the same underlying NATS connection and are subject to the same pending buffer limits. The difference is resilience: when a JetStream consumer is disconnected for being slow, it can resume from its last acknowledged message on reconnect. Core NATS subscribers lose all buffered and in-flight messages permanently. This is why JetStream is strongly recommended for any data flow where message loss is unacceptable.
The NATS server exposes the slow_consumers counter via the /varz monitoring endpoint. Alert when the delta of this counter is greater than zero over your collection interval; in Prometheus, that is an increase(...) > 0 rule over the scrape window on whichever metric name your exporter assigns to this counter.
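If you are not running an exporter, the counter is easy to poll directly. A minimal Go sketch that reads slow_consumers from /varz and reports the per-interval delta (the endpoint URL and interval are illustrative):

```go
package main

import (
    "encoding/json"
    "log"
    "net/http"
    "time"
)

func main() {
    var last int64 // first iteration reports the total since server start
    for range time.Tick(30 * time.Second) {
        resp, err := http.Get("http://localhost:8222/varz")
        if err != nil {
            log.Println("varz poll failed:", err)
            continue
        }
        var varz struct {
            SlowConsumers int64 `json:"slow_consumers"`
        }
        err = json.NewDecoder(resp.Body).Decode(&varz)
        resp.Body.Close()
        if err != nil {
            log.Println("decode failed:", err)
            continue
        }
        if delta := varz.SlowConsumers - last; delta > 0 {
            log.Printf("ALERT: %d new slow consumer events", delta)
        }
        last = varz.SlowConsumers
    }
}
```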
Synadia Insights evaluates this automatically every collection epoch across your entire deployment, including per-account attribution (ACCOUNTS_002) so you know which tenant or workload is affected.
There’s no universal buffer size — it depends on your message sizes, publish rates, and tolerance for memory usage per connection. A good starting heuristic: set the pending bytes limit to hold at least 10 seconds of messages at peak publish rate. If your peak is 10,000 msg/s at 1KB each, that’s ~100MB. The default 64MB in most client libraries is fine for moderate workloads but inadequate for high-throughput subjects. Applied in code, the heuristic looks like the sketch below.
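A sketch in nats.go using the example figures above, where sub is an existing subscription (in nats.go, a negative pending limit disables that dimension):

```go
// 10 seconds of headroom at peak: 10,000 msg/s * 1 KiB * 10 s ≈ 100 MiB
const pendingBytes = 10_000 * 1024 * 10

// A negative message limit leaves the count unbounded so that only
// bytes bound the buffer.
sub.SetPendingLimits(-1, pendingBytes)
```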