A common community question is whether JetStream consumer pinning guarantees that only one client can ever process messages from a consumer at a time.

The short answer: consumer pinning is useful for affinity and failover, but it is not a hard exclusivity or distributed-lock mechanism. Once messages have been delivered to a client, NATS cannot take them back from that client’s local buffers or stop user code that is already handling them.

In NATS, this capability is provided by JetStream priority groups. A pull consumer is configured with one or more priority groups and a priority policy; the pinned client policy is the one that keeps a single client pinned so that it receives the group’s messages. Priority groups were introduced in NATS Server 2.11, and the design is explicit that pinning steers delivery rather than enforcing strict exclusivity. The rest of this post explains why that distinction matters in practice.
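
As a rough illustration, the consumer settings described above could look like the following. This is a sketch, not a definitive configuration: the field names (priority_groups, priority_policy, priority_timeout) follow my reading of the priority groups design and should be checked against the server version you run, and the consumer name, group name, and 30-second timeout are made-up values.

```python
import json

# The JetStream JSON API expresses durations in nanoseconds.
NANOS_PER_SECOND = 1_000_000_000

# Hypothetical values: "shard-7-workers", "shard-7", and the timeout are
# illustrative only. Field names are assumptions to verify against your server.
consumer_config = {
    "durable_name": "shard-7-workers",
    "ack_policy": "explicit",
    "priority_groups": ["shard-7"],
    "priority_policy": "pinned_client",
    "priority_timeout": 30 * NANOS_PER_SECOND,
}

print(json.dumps(consumer_config, indent=2))
```

Nothing in this configuration expresses exclusivity. The timeout only controls how long the server waits before letting the pin move to another client.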

What consumer pinning is good for

Consumer pinning is a good fit when you want a worker to have affinity for a subset of work, such as a shard or partition-like stream of entities. For example, you might have services that:

  • maintain an in-memory cache of entity state,
  • rebuild that cache from JetStream after startup or failover,
  • process messages for a shard while they hold the pin, and
  • allow another service instance to take over if the current one disappears or stops making progress.

That maps well to actor-like or stateful-worker designs where one active worker should normally own a shard’s processing path.

What pinning does not guarantee

Pinning does not guarantee that no overlap can ever occur.

The important boundary is message delivery. If a client has already pulled a batch of messages, those messages may already be:

  • in the client’s receive buffer,
  • queued in application code,
  • being processed by a handler, or
  • waiting to be acknowledged.

If the pin later moves to another client, the server cannot revoke those already-delivered messages from the old client. The new pinned client may begin receiving work while the previous client is still processing messages it received earlier.

This is why consumer pinning should not be treated as a strict leader-election or locking mechanism. It provides useful affinity and failover behavior, but it does not guarantee that only one process is ever executing handler code for that consumer or shard.

What happens after a timeout or disconnect?

Consider this sequence:

  1. Worker A is pinned and pulls a batch of messages.
  2. Worker A becomes disconnected from the server, stalls, or otherwise fails to maintain the pin.
  3. After the configured pinned-client timeout elapses, Worker B becomes the pinned client.
  4. Worker B starts receiving messages.
  5. Worker A may still be running application code for messages it already received.

From the server’s point of view, it can stop delivering new work to Worker A after the pin is lost. But it cannot erase messages from Worker A’s memory or interrupt business logic already in progress.

This matters most when handlers are long-running, when pull batches are large, or when the application buffers work internally before acknowledging it.
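
The sequence above can be replayed with a small toy model. ToyServer and Worker are hypothetical stand-ins written for this post, not the NATS protocol or any client API, and the model ignores acknowledgments and redelivery. It only shows that moving the pin stops new deliveries without touching what a worker already holds.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    buffer: deque = field(default_factory=deque)  # messages already delivered

    def process_one(self, log):
        """Handle one locally buffered message, if any."""
        if self.buffer:
            log.append((self.name, self.buffer.popleft()))


class ToyServer:
    """Toy stand-in for a pinned consumer; not the real NATS protocol."""

    def __init__(self, messages):
        self.pending = deque(messages)
        self.pinned = None

    def pin(self, worker):
        self.pinned = worker

    def pull(self, worker, batch):
        """Deliver up to `batch` messages, but only to the pinned worker."""
        if worker is not self.pinned:
            return 0
        count = min(batch, len(self.pending))
        for _ in range(count):
            worker.buffer.append(self.pending.popleft())
        return count


server = ToyServer(range(1, 7))
a, b = Worker("A"), Worker("B")
log = []

server.pin(a)
server.pull(a, batch=3)           # step 1: A pulls messages 1-3
a.process_one(log)                # A handles message 1, then stalls

server.pin(b)                     # steps 2-3: timeout elapses, pin moves to B
refused = server.pull(a, batch=3) # the server gives A no new work
server.pull(b, batch=3)           # step 4: B receives messages 4-6

b.process_one(log)                # B handles message 4
a.process_one(log)                # step 5: A resumes and handles message 2
a.process_one(log)                # ... and message 3

print(refused)  # 0
print(log)      # [('A', 1), ('B', 4), ('A', 2), ('A', 3)]
```

Worker A's entries after the pin moved are the overlap. A smaller batch in step 1 would shrink that tail but not eliminate it.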

Does MaxAckPending: 1 make it exclusive?

MaxAckPending: 1 can reduce concurrency by limiting how many messages are outstanding without acknowledgment. It is a useful control when you want at most one unacknowledged message in flight for a consumer.

However, it still does not create a hard exclusivity guarantee.

For example:

  1. Worker A receives one message.
  2. Worker A starts processing but takes longer than the acknowledgment wait period.
  3. The message becomes eligible for redelivery.
  4. Worker B may receive the same message while Worker A is still processing it.

In that case, even with only one pending acknowledgment allowed, the same message can be processed by more than one client if the first worker is slow, disconnected, or unable to acknowledge in time.
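
The size of that overlap is simple arithmetic. The sketch below assumes a simplified timeline: Worker A receives the message at time zero and never signals progress, the server redelivers as soon as the acknowledgment wait expires, and Worker B picks the message up immediately. The function name and the numbers are illustrative.

```python
def overlap_seconds(handler_seconds, ack_wait_seconds):
    """Seconds during which two workers may be handling the same message.

    Toy model: the first worker starts at t=0, the message is redelivered
    at t=ack_wait_seconds, and the first worker finishes at t=handler_seconds.
    """
    return max(0.0, float(handler_seconds) - float(ack_wait_seconds))


print(overlap_seconds(handler_seconds=45, ack_wait_seconds=30))  # 15.0
print(overlap_seconds(handler_seconds=10, ack_wait_seconds=30))  # 0.0
```

MaxAckPending does not appear in the calculation. It limits how many messages are outstanding, not how many workers can be handling the same one.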

What about limiting redeliveries?

A low maximum delivery count (MaxDeliver) caps how many times a message can be delivered, which bounds duplicate processing, but it also changes the failure behavior. If a message reaches the configured maximum delivery count without being acknowledged, JetStream stops redelivering it and publishes a max deliveries advisory event for that consumer. That may be acceptable for some workloads, but it can also mean work is silently left incomplete unless your application watches for those advisories, or has another mechanism, to detect and handle the unprocessed message.
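
A toy model makes the tradeoff concrete. The function below is a hypothetical simulation, not server code, and the advisory dictionary is a simplified stand-in for the real advisory payload.

```python
def deliver_until_limit(max_deliver, acks_on_attempt=None):
    """Simulate delivery attempts for one message under a MaxDeliver limit.

    acks_on_attempt is the 1-based attempt on which the handler acknowledges,
    or None if it never does. Returns (attempts, acked, advisory).
    """
    for attempt in range(1, max_deliver + 1):
        if acks_on_attempt == attempt:
            return attempt, True, None
    # Limit reached without an ack: no more redeliveries, only an advisory.
    advisory = {"event": "max_deliveries", "deliveries": max_deliver}
    return max_deliver, False, advisory


print(deliver_until_limit(max_deliver=3, acks_on_attempt=2))
# (2, True, None)
print(deliver_until_limit(max_deliver=2))
# (2, False, {'event': 'max_deliveries', 'deliveries': 2})
```

In the second call the message is never processed successfully, and the only trace is the advisory. An application that ignores it has no record that the work was dropped.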

Use delivery limits as a reliability tradeoff, not as an exclusivity primitive.

Design guidance for pinned stateful workers

If you are using consumer pinning for stateful workers with in-memory caches, a practical design is:

  • Treat pinning as shard affinity plus failover, not as a lock.
  • Keep pull batch sizes small if overlap would be expensive.
  • Set acknowledgment wait (AckWait) values above realistic worst-case handler latency, so that slow but healthy handlers do not trigger redelivery.
  • Consider MaxAckPending: 1 when sequential processing is important.
  • Make message handling idempotent where possible.
  • Include entity versions, sequence numbers, or other ordering checks in your state model.
  • Rebuild local state from the stream when a worker acquires ownership.
  • Keep standby workers connected and issuing pull requests so the pin can move to one of them quickly when the current owner fails.
  • Assume a previous owner may still be finishing work for a short period after failover.
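
To see why batch size matters, here is a back-of-the-envelope estimate of how long a former owner might keep working after it loses the pin. It assumes the worker handles its buffered messages one at a time and does not check ownership between messages. The function name and figures are illustrative.

```python
def stale_work_window_ms(buffered_messages, handler_ms):
    """Upper bound on how long a former owner keeps working after losing the pin.

    Assumes serial handling of already-buffered messages and no ownership
    check between messages.
    """
    return buffered_messages * handler_ms


large_batch = stale_work_window_ms(buffered_messages=100, handler_ms=200)
small_batch = stale_work_window_ms(buffered_messages=5, handler_ms=200)

print(large_batch)  # 20000 (ms), about 20 seconds of possible overlap
print(small_batch)  # 1000 (ms)
```

The estimate is a ceiling, not a prediction. A worker that checks whether it still owns the shard before each message can cut the window to roughly one handler run.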

For actor-like workloads, the safest mental model is: only one worker should normally be active for a shard, but your application must tolerate brief overlap around failure, timeout, and redelivery boundaries.
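
One way to tolerate redelivery is to record the highest stream sequence applied per entity and skip anything at or below it. The sketch below is a minimal in-memory version with hypothetical names. Because the state lives in one process, it protects against redelivery and replay on the same worker, not against a second worker writing at the same time.

```python
class EntityState:
    """Per-entity state that ignores duplicate or out-of-date messages."""

    def __init__(self):
        self.last_seq = {}  # entity id -> highest stream sequence applied
        self.balance = {}   # entity id -> example state value

    def apply(self, entity_id, stream_seq, delta):
        """Apply a change once; return False if it was already applied."""
        if stream_seq <= self.last_seq.get(entity_id, 0):
            return False
        self.balance[entity_id] = self.balance.get(entity_id, 0) + delta
        self.last_seq[entity_id] = stream_seq
        return True


state = EntityState()
first = state.apply("acct-1", stream_seq=10, delta=50)
redelivered = state.apply("acct-1", stream_seq=10, delta=50)  # same message again
later = state.apply("acct-1", stream_seq=11, delta=-20)

print(first, redelivered, later)  # True False True
print(state.balance["acct-1"])    # 30
```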

When strict exclusivity is required

If your system cannot tolerate any overlap at all, consumer pinning alone is not enough. You may need an additional application-level coordination mechanism, such as fencing tokens, compare-and-set state transitions, external leases, or idempotent writes guarded by entity versions.

The right choice depends on what must be protected:

  • If duplicate side effects are the concern, make side effects idempotent or fenced.
  • If stale workers are the concern, include an ownership epoch or generation in writes.
  • If ordering is the concern, validate stream sequence, entity version, or expected state before applying changes.
  • If availability is more important than perfect exclusivity, tune acknowledgment and redelivery behavior for fast recovery.
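
The ownership epoch idea can be sketched as a fencing check on every write. FencedStore is a hypothetical in-memory stand-in for a shared store. A real deployment would need the store itself to perform the comparison atomically, for example with a conditional update.

```python
class FencedStore:
    """In-memory stand-in for a shared store that rejects stale owners."""

    def __init__(self):
        self.epoch = 0  # current ownership generation for the shard
        self.data = {}

    def acquire(self):
        """Called by a worker when it takes ownership; returns its token."""
        self.epoch += 1
        return self.epoch

    def write(self, token, key, value):
        """Apply the write only if the caller still holds the latest epoch."""
        if token != self.epoch:
            return False  # stale owner: fenced off
        self.data[key] = value
        return True


store = FencedStore()
token_a = store.acquire()                          # Worker A becomes owner
ok_a = store.write(token_a, "order-1", "packed")

token_b = store.acquire()                          # failover: Worker B takes over
late_a = store.write(token_a, "order-1", "lost")   # A finishing old work
ok_b = store.write(token_b, "order-1", "shipped")

print(ok_a, late_a, ok_b)     # True False True
print(store.data["order-1"])  # shipped
```

Worker A can still run its handler after failover, but its writes no longer take effect. The exclusivity comes from the store, not from the pin.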

Bottom line

JetStream consumer pinning is a strong fit for stateful shard ownership and failover-oriented worker designs. It helps route work to a pinned client, but it does not cancel messages that have already been delivered or stop handlers that are already running.

Use pinning to get affinity. Use acknowledgments, small batches, idempotency, version checks, and careful timeout tuning to make the workload safe during failover and redelivery.

