Pull Consumer Priority Groups in NATS JetStream: Overflow, Pinned, and Prioritized Routing

Watch the video for a quick walkthrough, or keep reading for the full breakdown.

The Problem

Picture several worker processes all pulling from the same JetStream consumer (the cursor that tracks which messages have been received and acknowledged). By default, NATS spreads messages across whichever clients happen to be asking, in simple rotating turns. This is round-robin distribution, and it has no notion of which client should get the work.

That is fine for plain horizontal scaling, where every worker is interchangeable. But it cannot express patterns that teams keep asking for. “Send all the work to one client, but fail over instantly if it dies.” “Let local workers drain the backlog first, and only spill to another region when they fall behind.” “Give one pool of workers higher priority than another.” None of these fit a model where the next message goes to whoever pulled most recently.

The usual workaround is to build coordination outside of NATS — a lock service, a leader election, or custom client-side logic to decide who processes what. That is exactly the kind of moving part you do not want in the middle of a message pipeline.

Pull Consumer Priority Groups: The Solution

Pull consumer priority groups, introduced in NATS Server 2.11, add a routing layer directly to the consumer. A priority group is a named set of clients that all pull from one consumer under a shared rule, and a priority policy decides which waiting client is served next instead of plain round-robin. Every pull request names the group it belongs to, and the server applies the group’s policy to decide whether that pull is served now, served later, or rejected.

The important part: all of this comes entirely from consumer configuration. There is no leader election, no external lock, and no coordination code for you to write and maintain.

How It Works

A consumer opts in by declaring one or more group names and a policy. From then on, every pull request for that consumer must name its group, and the server routes accordingly. The policy you pick determines the behavior.

The Three Policies

Overflow serves a client’s pull only once the consumer’s backlog crosses a threshold that client set. A pull is served only when num_pending (undelivered messages) is at least min_pending, or num_ack_pending (delivered-but-unacknowledged messages) is at least min_ack_pending — a simple either-or of whichever thresholds the pull sets. Clients that pull with no threshold drain the consumer normally, so local workers clear the backlog first while remote workers sit idle, receiving heartbeats (empty keep-alive status messages), until the backlog grows past their threshold.

Pinned client sends all of a group’s messages to one designated client while the rest wait on standby. The server stamps the active client’s messages with a Nats-Pin-Id header carrying the pin id — the identifier the server assigns to the current pinned client. The client echoes that id back on every later pull to prove it still holds the pin. If the pinned client goes quiet past the configured timeout, or an admin unpins it, the server re-pins one of the standbys and issues a fresh pin id. This is failover (shifting work to a healthy standby automatically) built into the consumer.

Prioritized (NATS Server 2.12+) serves pulls in order of a per-pull priority value from 0 to 9, with lower numbers served first, rather than round-robin. A primary pool wins, but a secondary pool picks up immediately when the primary isn’t pulling — no threshold, no backlog buildup. The trade-off versus overflow is some flip-flopping, where work bounces between clients more readily because there is no deliberate threshold holding it in place.

Configuration

Priority groups are configured with three ConsumerConfig fields. They are pull-only and expect explicit acknowledgment.

Parameter	JSON tag	Description
`PriorityGroups`	`priority_groups`	Group names the consumer supports (one, in the initial implementation).
`PriorityPolicy`	`priority_policy`	One of `none` (default, feature off), `overflow`, `pinned_client`, `prioritized`.
`PinnedTTL`	`priority_timeout`	Grace period before a quiet pinned client is replaced. Defaults to 2 minutes; a wire value of 0 means “use the default.”

Per-pull, a request carries the group plus the fields its policy needs: min_pending / min_ack_pending for overflow, id (the echoed pin id) for pinned client, and priority for prioritized. On the wire, durations such as priority_timeout and a pull’s expires are expressed in nanoseconds.

Observability and Operator Control

The server publishes an advisory — a system message on a well-known subject — on every pin change: consumer_group_pinned when a group pins a new client, and consumer_group_unpinned (with a reason of admin or timeout) when a pin is lost. An operator can force a hand-off out of band with the UNPIN API, which is handy for rotating the active client before a deploy.

Code Examples

Go — Configuring overflow, pinned, and prioritized consumers

1
js, _ := jetstream.New(nc)
2

3
// Overflow: local workers drain first; remote workers only get
4
// messages once the backlog crosses their threshold.
5
overflow, _ := stream.CreateOrUpdateConsumer(ctx, jetstream.ConsumerConfig{
6
  Durable:        "workers",
7
  AckPolicy:      jetstream.AckExplicitPolicy, // required for priority groups
8
  PriorityPolicy: jetstream.PriorityPolicyOverflow,
9
  PriorityGroups: []string{"jobs"},
10
})
11

12
// A local worker pulls normally (no min_pending).
13
local, _ := overflow.Fetch(50, jetstream.FetchPriorityGroup("jobs"))
14

15
// A remote worker only receives once >= 1000 messages are pending.
16
remote, _ := overflow.Fetch(50,
17
  jetstream.FetchPriorityGroup("jobs"),
18
  jetstream.FetchMinPending(1000),
19
)
20

21
// Pinned: one active client, others on standby, re-pin after 2m quiet.
22
pinned, _ := stream.CreateOrUpdateConsumer(ctx, jetstream.ConsumerConfig{
23
  Durable:        "exclusive",
24
  AckPolicy:      jetstream.AckExplicitPolicy,
25
  PriorityPolicy: jetstream.PriorityPolicyPinned,
26
  PriorityGroups: []string{"jobs"},
27
  PinnedTTL:      2 * time.Minute,
28
})
29
// The client library tracks the Nats-Pin-Id header automatically.
30
cc, _ := pinned.Consume(func(msg jetstream.Msg) { msg.Ack() },
31
  jetstream.PullPriorityGroup("jobs"))
32
defer cc.Stop()
33

34
// Prioritized (2.12+): priority 0 served before 9.
35
prioritized, _ := stream.CreateOrUpdateConsumer(ctx, jetstream.ConsumerConfig{
36
  Durable:        "tiered",
37
  AckPolicy:      jetstream.AckExplicitPolicy,
38
  PriorityPolicy: jetstream.PriorityPolicyPrioritized,
39
  PriorityGroups: []string{"jobs"},
40
})
41
high, _ := prioritized.Fetch(20,
42
  jetstream.FetchPriorityGroup("jobs"),
43
  jetstream.FetchPrioritized(0), // 0 = highest, served first
44
)

Notice that the routing behavior is entirely a property of the consumer config and the per-pull options — the worker loops themselves are ordinary Fetch/Consume calls. Overflow and pinned support landed in nats.go v1.41.0; the prioritized options in v1.46.0.

JSON — How it looks on the wire

1
// Consumer config (overflow)
2
{ "priority_groups": ["jobs"], "priority_policy": "overflow", "ack_policy": "explicit" }
3

4
// Pull request with an overflow threshold (expires is 30s in nanoseconds)
5
{ "batch": 100, "expires": 30000000000, "group": "jobs", "min_pending": 1000 }
6

7
Because the fields live in the shared consumer-configuration wire schema, priority groups are not Go-only — `priority_policy` and the related fields are present across the major clients (Python, Rust, JS, Java, .NET, C).
8

9
### CLI — Force a pin hand-off
10

11
```bash
12
nats consumer unpin <stream> <consumer> <group>

This wraps the UNPIN API and lets an operator deliberately rotate the active client, for example to drain it cleanly before a deploy.

When to Use This

Active/standby exclusive processing — Some jobs should not run in parallel: a billing run, a stateful migration, anything where you want one processor at a time. pinned_client routes all work to one client while others wait on standby, and re-pins a backup within PinnedTTL if the active one dies. You get high availability without running duplicate processors and without standing up a separate lock service.

Latency- and cost-aware regional routing — Say your stream and most of your workers live in US East. With overflow, US East workers pull with no threshold and drain the backlog first, while US West workers pull with a min_pending threshold and only wake up once the queue gets deep. You pay for cross-region traffic only when there is genuinely too much work for one region to keep up.

Tiered worker pools with no waiting — Want that same regional preference but with zero delay? Use prioritized: give US East priority 0 and US West priority 1, and the moment US East stops pulling, US West picks up immediately — no backlog required. You can stack a third tier, putting EU West at priority 2 so it only sees traffic when both US regions are quiet.

Tips and Gotchas

Pinning is affinity and failover, not a distributed lock. Affinity means a preference for routing to one instance, not a hard guarantee. There are windows where an old client still believes it is pinned after the server has moved on, and messages already in its buffer cannot be revoked. Your handlers must be idempotent — producing the same result whether a message is processed once or twice.
Fetch(N) is not an atomic reservation. With multiple clients on one consumer, the server may spread those N messages across concurrent pulls rather than hand one client a consecutive batch. Each client still receives its own messages in order, but the global sequence is interleaved.
Keep priority_timeout comfortably above the pull’s expires. The pin only resets when the pinned client issues a new pull inside the window, so the client needs room to let a pull expire, process, and re-pull without losing the pin. A 2-minute timeout (the server default) pairs well with pulls that expire in one minute or less.
The feature is pull-only, and explicit ack is expected. Configuring a priority policy on a push consumer is rejected by the server. Explicit ack is required by design — but note the shipped server validation does not currently reject other ack policies, so do not rely on an error to catch the mistake.
One group per consumer, for now. The feature is designed for exactly one priority group per consumer today, and group names are capped at 16 characters from a restricted alphabet.
Handle the 423. Under pinned_client, a client that sends a stale or wrong id gets a 423 Nats-Pin-Id mismatch. It should clear its id and re-pull with no id to rejoin the standby pool.

Wrapping Up

Pull consumer priority groups move routing decisions out of your application and into the consumer: one consumer, several clients, and a policy that lets the server decide who gets the next message — whether that means draining locally first, pinning one active worker with failover, or serving a primary pool ahead of a backup. The key thing to remember is that pinning gives you affinity and failover, not a lock, so keep your handlers idempotent and keep your timeout above your pull expiry.

For the full reference, including the per-policy field semantics and the advisory subjects, see the official NATS documentation.

Want to explore more advanced NATS JetStream topics with Synadia’s experts? Check out our on-demand workshop on JetStream.

Requires NATS Server 2.11+ (the prioritized policy requires 2.12+). See the official documentation for the full reference.

FEATURED

RESOURCES

Comparisons