Ack pending buildup occurs when a JetStream consumer has a growing number of delivered-but-unacknowledged messages approaching the max_ack_pending limit. When that limit is reached, the server stops delivering new messages to the consumer entirely, creating a hard stall that persists until in-flight messages are acknowledged, nak’d, or expire past their ack_wait window.
The max_ack_pending limit is JetStream’s built-in backpressure mechanism. It prevents a consumer from accepting more messages than it can handle by capping how many can be in-flight simultaneously. When ack pending climbs to 80% or more of this limit, the consumer is approaching the point where delivery will pause — and a minor slowdown or latency spike is enough to push it over the edge.
Once max_ack_pending is reached, the server holds all new messages in the stream. They don’t disappear — they accumulate as num_pending. But no new messages are delivered until the consumer acknowledges or otherwise resolves some of its in-flight messages. For real-time workloads, this means the consumer falls behind. For request-reply patterns built on JetStream, it means timeouts. For event-driven architectures, it means cascading delays across every downstream service waiting on those messages.
The buildup is often gradual. A consumer that processes messages in 50ms at average load may creep to 200ms under peak traffic, slowly filling the ack pending pool. Everything looks fine until the limit is hit and delivery stops abruptly. There’s no graceful degradation — it’s a cliff. By the time you notice, the consumer may have thousands of undelivered messages queued in the stream, and recovery requires both clearing the backlog and addressing whatever caused the processing slowdown.
Processing throughput below publish rate. The consumer processes messages slower than they arrive. If the stream receives 5,000 msg/s and the consumer processes 3,000 msg/s, the ack pending pool fills at 2,000 msg/s. At the default max_ack_pending of 1,000, the limit is hit in under a second.
Downstream dependency latency. The consumer calls a database, API, or external service during processing. When that dependency slows down — connection pool exhaustion, query timeouts, rate limiting — each message takes longer to process, and ack pending climbs.
Single-threaded or under-parallelized processing. The consumer processes messages sequentially when the workload could be parallelized. A single-threaded consumer with 100ms processing time per message can handle at most 10 msg/s, regardless of how high max_ack_pending is set.
Batch processing holding messages. The consumer collects messages into batches (for bulk database inserts, for example) and only acknowledges them when the batch completes. A batch size of 500 with 30-second batch windows means 500 messages sit in ack pending for the entire window.
max_ack_pending set too low. The default max_ack_pending is 1,000. A limit of 256 or 1,000 may be adequate for low-throughput streams but completely insufficient for high-rate workloads. The limit should reflect how many messages your consumer can realistically have in-flight at any moment. Note that ack pending can also be limited at the stream level via consumer limits and at the account level.
Consumer not acknowledging on error paths. The happy path sends Ack, but error handlers that log and continue without calling Nak or Term leave messages in limbo until ack_wait expires. Each unresolved message occupies an ack pending slot.
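To make the last cause concrete, here is a minimal sketch of the anti-pattern using the Go client; parseOrder and handleOrder are hypothetical helpers, and the corrected handler appears in the remediation section below:

```go
// Anti-pattern: error paths return without resolving the message, so each
// failure holds an ack pending slot until ack_wait expires.
func handleMessage(msg *nats.Msg) {
	order, err := parseOrder(msg.Data)
	if err != nil {
		log.Printf("bad payload: %v", err)
		return // BUG: should call msg.Term() for a permanent failure
	}
	if err := handleOrder(order); err != nil {
		log.Printf("processing failed: %v", err)
		return // BUG: should call msg.Nak() so the message can be redelivered promptly
	}
	_ = msg.Ack()
}
```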
Get the current ack pending count and limit for a specific consumer:
```
nats consumer info <stream_name> <consumer_name>
```

Look for the outstanding (ack pending) message count and the configured Max Ack Pending limit in the output.
Compare these to calculate how close you are to stalling. At 80%+, the consumer is in the danger zone.
For a quick overview across all consumers on a stream:
```
nats consumer report <stream_name>
```

If num_pending is growing while num_ack_pending equals max_ack_pending, the consumer is stalled. Messages are accumulating in the stream with no delivery:
```
nats consumer info <stream_name> <consumer_name> --json | jq '{
  ack_pending: .num_ack_pending,
  max_ack_pending: .config.max_ack_pending,
  pending: .num_pending,
  stalled: (.num_ack_pending >= .config.max_ack_pending)
}'
```

Check whether the issue is throughput or latency: if ack pending climbs steadily whenever the publish rate exceeds the consumer's processing rate, it is a throughput problem; if it spikes only when processing time stretches out, it is a latency problem. Also check for ack_wait misconfiguration: if ack_wait is shorter than worst-case processing time, messages are redelivered while the first attempt is still running, which wastes processing capacity and keeps ack pending elevated. Finally, check for downstream dependency issues by correlating ack pending spikes with latency metrics on databases, APIs, or other services your consumer depends on.
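If you would rather track this programmatically, for example to export it as a metric you can correlate with dependency latency, a minimal sketch using the Go client's ConsumerInfo call might look like this; the stream and consumer names are placeholders:

```go
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, _ := nc.JetStream()
	for range time.Tick(10 * time.Second) {
		info, err := js.ConsumerInfo("EVENTS", "event-processor")
		if err != nil {
			log.Printf("consumer info: %v", err)
			continue
		}
		ratio := float64(info.NumAckPending) / float64(info.Config.MaxAckPending)
		log.Printf("ack_pending=%d max=%d ratio=%.2f pending=%d stalled=%t",
			info.NumAckPending, info.Config.MaxAckPending, ratio,
			info.NumPending, info.NumAckPending >= info.Config.MaxAckPending)
	}
}
```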
Increase max_ack_pending if the consumer can handle more in-flight messages. The default is 1,000. This is only appropriate if the consumer has processing headroom but the limit is artificially low:
```
nats consumer edit <stream_name> <consumer_name> --max-pending=5000
```

Don’t set this arbitrarily high. max_ack_pending should reflect the actual number of messages your consumer can have in-flight without degrading. Setting it to 1,000,000 when your consumer can only process 100/s just delays the stall and increases memory usage.
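If you manage consumers from code rather than the CLI, the same limit can be set in the consumer configuration. A sketch using the Go client, with placeholder stream and durable names and an assumed existing JetStream context:

```go
// raiseMaxAckPending updates an existing durable consumer's MaxAckPending.
// Use AddConsumer instead when creating the consumer for the first time.
func raiseMaxAckPending(js nats.JetStreamContext) error {
	_, err := js.UpdateConsumer("EVENTS", &nats.ConsumerConfig{
		Durable:       "event-processor",
		AckPolicy:     nats.AckExplicitPolicy,
		AckWait:       30 * time.Second,
		MaxAckPending: 5000,
	})
	return err
}
```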
Check for stream-level and account-level limits. Ack pending can also be constrained at the stream level via consumer limits and at the account level via account JetStream limits. If the consumer’s max_ack_pending is set appropriately but ack pending is still capped, check these higher-level limits.
Acknowledge messages on all code paths. Audit your message handler to ensure every exit path — success, transient error, permanent error — sends an appropriate response:
```go
// Go client (nats.go)
func handleMessage(msg *nats.Msg) {
	data, err := unmarshal(msg.Data)
	if err != nil {
		_ = msg.Term() // Permanent failure — stop redelivering
		return
	}
	if err := process(data); err != nil {
		_ = msg.NakWithDelay(5 * time.Second) // Transient — retry later
		return
	}
	_ = msg.Ack()
}
```

Parallelize message processing within a single consumer instance. Fetch messages in batches and process them concurrently:
```go
// Go client — parallel processing with pull subscribe
sub, _ := js.PullSubscribe("EVENTS.>", "event-processor")
for {
	msgs, _ := sub.Fetch(100, nats.MaxWait(5*time.Second))
	var wg sync.WaitGroup
	for _, msg := range msgs {
		wg.Add(1)
		go func(m *nats.Msg) {
			defer wg.Done()
			if err := processEvent(m); err != nil {
				_ = m.Nak()
				return
			}
			_ = m.Ack()
		}(msg)
	}
	wg.Wait()
}
```

```typescript
// TypeScript (nats.js)
import { connect } from "nats";

const nc = await connect();
const js = nc.jetstream();
const consumer = await js.consumers.get("EVENTS", "event-processor");

while (true) {
  const batch = await consumer.fetch({ max_messages: 100, expires: 5000 });
  const promises: Promise<void>[] = [];
  for await (const msg of batch) {
    promises.push(
      processEvent(msg.data)
        .then(() => msg.ack())
        .catch(() => msg.nak())
    );
  }
  await Promise.all(promises);
}
```

Scale out with multiple consumer instances. For pull consumers, multiple instances can pull from the same durable consumer. For push consumers, use a queue group. Each additional instance linearly increases aggregate throughput:
```
# Verify the consumer allows multiple instances
nats consumer info <stream_name> <consumer_name> --json | jq '.config.max_ack_pending'
```

Right-size max_ack_pending based on measured capacity. Profile your consumer under load: measure P99 processing time, maximum concurrent processing slots, and sustained throughput. Set max_ack_pending to (concurrent_workers) × (processing_time_p99 / fetch_interval) × safety_margin. This ensures the limit reflects actual capacity.
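For example, plugging illustrative numbers into that formula (20 concurrent workers, a 500 ms P99 processing time, a 100 ms fetch interval, and a 2x safety margin) gives 20 × (500 / 100) × 2 = 200 as a starting value, which you would then validate against observed ack pending under peak load.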
Decouple acknowledgment from downstream completion. If the consumer writes to a database, consider acknowledging the NATS message once the data is durably queued in a local write-ahead buffer, rather than waiting for the full downstream write to complete. This trades exactly-once downstream delivery (which NATS doesn’t guarantee anyway) for significantly better ack throughput.
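A rough sketch of this pattern, assuming a hypothetical WriteAheadLog type whose Append durably buffers a payload on local disk, Next returns the oldest unwritten entry, and Commit removes it after a successful database write:

```go
// Ack as soon as the payload is durably buffered locally; a background
// goroutine drains the buffer into the database at its own pace.
func handle(msg *nats.Msg, wal *WriteAheadLog) {
	if err := wal.Append(msg.Data); err != nil {
		_ = msg.Nak() // Could not buffer durably, let JetStream redeliver
		return
	}
	_ = msg.Ack() // The ack pending slot is freed immediately
}

func drain(wal *WriteAheadLog, db *sql.DB) {
	for {
		entry, ok := wal.Next() // Oldest entry not yet written downstream
		if !ok {
			time.Sleep(100 * time.Millisecond)
			continue
		}
		if _, err := db.Exec("INSERT INTO events (payload) VALUES ($1)", entry); err != nil {
			continue // Leave the entry in the buffer; it is retried on the next pass
		}
		wal.Commit(entry) // Remove from the buffer only after the write succeeds
	}
}
```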
Implement adaptive concurrency. Monitor your consumer’s ack pending ratio and automatically adjust worker pool size. When ack pending exceeds 50% of the limit, scale up workers. When it drops below 20%, scale down to conserve resources.
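A sketch of that loop, reusing the ConsumerInfo polling from the detection section and a hypothetical WorkerPool with Size and Resize methods; the thresholds, step size, and worker bounds are illustrative:

```go
// Periodically adjust worker count based on the ack pending ratio.
func adaptConcurrency(js nats.JetStreamContext, pool *WorkerPool) {
	const minWorkers, maxWorkers = 4, 64
	for range time.Tick(15 * time.Second) {
		info, err := js.ConsumerInfo("EVENTS", "event-processor")
		if err != nil {
			continue
		}
		ratio := float64(info.NumAckPending) / float64(info.Config.MaxAckPending)
		switch {
		case ratio > 0.5:
			pool.Resize(min(maxWorkers, pool.Size()+4)) // Falling behind: add workers
		case ratio < 0.2:
			pool.Resize(max(minWorkers, pool.Size()-4)) // Plenty of headroom: shrink
		}
	}
}
```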
What happens when max_ack_pending is reached?
The server stops delivering new messages to the consumer. Messages continue arriving in the stream and accumulate as num_pending, but no new deliveries occur. Delivery resumes only when in-flight messages are acknowledged, nak’d, or expire past ack_wait. This is an abrupt stop, not a gradual slowdown — the consumer goes from receiving messages to receiving nothing.
How should I set max_ack_pending?
It depends on your consumer’s processing parallelism and latency. For a consumer with 10 worker threads and 100ms average processing time, 1,000 is a reasonable starting point — it provides 10 seconds of buffer. For high-throughput consumers with fast processing, 5,000-10,000 may be appropriate. The key principle: set it high enough to absorb latency variance but low enough that hitting the limit is a meaningful signal, not a catastrophic surprise.
What is the difference between ack pending and consumer lag?
Ack pending counts messages that have been delivered to the consumer but not yet acknowledged — they’re in-flight. Consumer lag (often shown as num_pending) counts messages in the stream that haven’t been delivered yet. A consumer can have zero lag but high ack pending (all messages delivered, none acknowledged) or high lag but zero ack pending (delivery stalled at the limit). Both metrics together tell the full story.
Can multiple consumer instances share the same durable consumer?
Yes. Multiple instances can pull from the same pull-based durable consumer. The server distributes messages across instances, effectively multiplying your processing throughput. Each instance contributes to the shared ack pending pool, so the same max_ack_pending limit applies across all instances. This is the simplest way to scale consumer throughput horizontally.
How do rolling deployments affect ack pending?
During a rolling deployment, consumer instances restart sequentially. When an instance shuts down, its in-flight messages remain in ack pending until ack_wait expires — they can’t be redelivered to another instance until then. With a 30-second ack_wait and 500 in-flight messages per instance, a restart temporarily locks 500 ack pending slots for 30 seconds. Use graceful shutdown that acknowledges or nak’s in-flight messages before stopping to avoid this.
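A rough sketch of such a shutdown, built on the parallel pull loop shown earlier: stop fetching when the context is cancelled, nak messages that were fetched but never started, and wait for in-flight handlers to ack before exiting; processEvent is the same hypothetical handler used above:

```go
// Drain on shutdown so in-flight messages do not sit in ack pending for a full ack_wait.
func run(ctx context.Context, sub *nats.Subscription) {
	var wg sync.WaitGroup
	for ctx.Err() == nil {
		msgs, _ := sub.Fetch(100, nats.MaxWait(2*time.Second))
		for _, msg := range msgs {
			if ctx.Err() != nil {
				_ = msg.Nak() // Shutting down: hand this back for immediate redelivery
				continue
			}
			wg.Add(1)
			go func(m *nats.Msg) {
				defer wg.Done()
				if err := processEvent(m); err != nil {
					_ = m.Nak()
					return
				}
				_ = m.Ack()
			}(msg)
		}
	}
	wg.Wait() // Let in-flight handlers finish and ack before the process exits
}
```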