A subscription hotspot occurs when one server in a NATS cluster carries more than double the cluster average number of subscriptions. This imbalance concentrates subject-matching CPU work and memory overhead on a single node, creating a bottleneck that limits cluster-wide throughput.
Every message published to a NATS cluster requires the server to match the subject against its subscription interest table. The more subscriptions a server holds, the more work it does per publish — even with NATS’s highly optimized subject trie. When subscriptions concentrate on one server, that server spends disproportionate CPU time on matching, while other servers sit underutilized. The bottleneck isn’t theoretical: at high message rates, a server with 50,000 subscriptions performs measurably more work than one with 5,000, and publish latency on the hot server increases accordingly.
The problem compounds with wildcard subscriptions. A single events.> subscription on a hot server matches every subject under the events. hierarchy, which means the server evaluates that match for every publish to any of those subjects. If the hot server also holds the most connections, it becomes the chokepoint for both subscription matching and message delivery. Other servers in the cluster have spare capacity that goes unused.
Subscription hotspots also affect cluster resilience. If the overloaded server goes down, all those subscriptions must reestablish on remaining servers — potentially overloading them during the reconnection storm. What was a performance imbalance during steady state becomes a cascading failure during a disruption.
Client connection configuration lists servers in a fixed order. Most NATS client libraries connect to the first reachable server in the URL list. If every client uses the same ordered list (e.g., nats://s1,s2,s3), the majority connect to s1 and bring their subscriptions with them. Without randomization, the first server absorbs the bulk of the subscription load.
Wildcard subscribers concentrated on one node. A monitoring or analytics service subscribing to > or *.> patterns runs on a single host that happens to connect to one server. That one wildcard subscription generates enormous fan-in on that server for every subject in the system.
Microservice deployments scaled unevenly. One service runs 20 replicas, each with 50 subscriptions, and all replicas land on the same server due to infrastructure affinity (same availability zone, same Kubernetes node, same DNS resolution).
Queue group subscribers not distributed. Queue groups balance message delivery, but if all members of the queue group connect to the same server, the subscription interest is still concentrated. The server must track each group member’s subscription individually (see the sketch after this list of causes).
Leafnode hub funneling subscriptions. A leafnode connection propagates all remote subscriptions to the hub server. If a single leafnode connects a large edge deployment with thousands of subscriptions, the hub server it connects to becomes a subscription hotspot.
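For the queue-group case, the fix is to let each member connect through the full, randomized server list so members land on different servers and their subscription interest spreads. A minimal Go sketch, assuming placeholder servers s1–s3 and an illustrative orders.created subject:

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// List every cluster member; the Go client randomizes this list by
	// default, so each replica of this worker tends to land on a
	// different server, spreading queue-group subscription state.
	nc, err := nats.Connect("nats://s1:4222,nats://s2:4222,nats://s3:4222")
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	// All members share the queue group name, so each message goes to
	// only one member, regardless of which server that member uses.
	_, err = nc.QueueSubscribe("orders.created", "order-workers", func(msg *nats.Msg) {
		log.Printf("processing %s", msg.Subject)
	})
	if err != nil {
		log.Fatal(err)
	}
	select {} // block forever
}
```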
```sh
nats server report connections --sort subs
```
This shows total subscription count per server, sorted highest first. Compare the top server against the cluster average — if it’s more than 2× the mean, you have a hotspot.
```sh
# Per-server subscription stats
nats server request subscriptions --help
```
```sh
# Direct monitoring endpoint
curl -s http://localhost:8222/subsz?subs=1 | jq '.num_subscriptions'
```
The /subsz endpoint returns the subscription count and optionally the full subscription list. Compare across all servers to confirm the imbalance.
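To compare across the whole cluster in one shot, a small program can poll each server's /subsz and flag anything above the 2× threshold mentioned above. A minimal sketch, assuming monitor ports on placeholder hosts s1–s3:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// subsz mirrors the one field we need from the /subsz endpoint.
type subsz struct {
	NumSubs int `json:"num_subscriptions"`
}

func main() {
	// Placeholder monitoring endpoints — substitute your servers' monitor ports.
	servers := []string{"http://s1:8222", "http://s2:8222", "http://s3:8222"}

	counts := make(map[string]int)
	total := 0
	for _, s := range servers {
		resp, err := http.Get(s + "/subsz")
		if err != nil {
			fmt.Printf("%s: %v\n", s, err)
			continue
		}
		var sz subsz
		if err := json.NewDecoder(resp.Body).Decode(&sz); err != nil {
			fmt.Printf("%s: %v\n", s, err)
		}
		resp.Body.Close()
		counts[s] = sz.NumSubs
		total += sz.NumSubs
	}
	if len(counts) == 0 {
		return
	}
	mean := float64(total) / float64(len(counts))
	for s, n := range counts {
		flag := ""
		if float64(n) > 2*mean {
			flag = "  <-- hotspot (more than 2x the mean)"
		}
		fmt.Printf("%s: %d subscriptions%s\n", s, n, flag)
	}
}
```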
```sh
nats server report connections --sort subs --account <account_name>
```
This breaks down per-client subscription counts. Look for clients with unusually high subscription counts, or many clients from the same application clustered on one server.
```sh
curl -s http://localhost:8222/subsz?subs=1 | jq -r '.subscriptions_list[].subject' | grep -E '[*>]'
```
Wildcard subscriptions (containing > or *) on the hot server are prime suspects. A single > subscription matches everything and generates maximum fan-in load.
```sh
nats server list
```
Compare connection counts across servers. If connections are also skewed, the subscription hotspot is likely a side effect of a connection hotspot (see OPT_BALANCE_002).
Force clients to reconnect with balanced distribution by performing a rolling restart or drain of the hot server:
```sh
# Drain the overloaded server — clients reconnect to other servers
nats-server --signal ldm=<pid>   # sends SIGUSR2 to put the server into lame-duck mode
```
Draining gracefully migrates connections (and their subscriptions) to other cluster members. This is a temporary fix — clients may re-concentrate on reconnect if the underlying cause isn’t addressed.
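After a drain, it is worth verifying that clients actually spread out rather than re-concentrating. A minimal Go sketch using the client's reconnect callback (server URLs are placeholders):

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(
		"nats://s1:4222,nats://s2:4222,nats://s3:4222",
		// Log the server this client lands on after every reconnect,
		// e.g. following a drain of the hot server.
		nats.ReconnectHandler(func(nc *nats.Conn) {
			log.Printf("reconnected to %s", nc.ConnectedUrl())
		}),
		nats.DisconnectErrHandler(func(nc *nats.Conn, err error) {
			log.Printf("disconnected: %v", err)
		}),
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("connected to %s", nc.ConnectedUrl())
	select {} // keep the process alive to observe reconnects
}
```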
Ensure all clients list every server in the cluster and enable randomization:
```go
// Go client — list all servers; randomization is on by default
nc, err := nats.Connect(
	"nats://s1:4222,nats://s2:4222,nats://s3:4222",
	nats.DontRandomize(), // REMOVE this if present — randomize is on by default
)
```

```python
# Python (nats.py) — list all servers
nc = await nats.connect(
    servers=["nats://s1:4222", "nats://s2:4222", "nats://s3:4222"],
)
# Randomization is enabled by default in nats.py
```

If you’re using DNS-based discovery, ensure the DNS record returns all server IPs and that the client library randomizes the resolved addresses.
Applications that create many fine-grained subscriptions can often consolidate them with wildcards at the application level:
```go
// Instead of 1,000 individual subscriptions:
// nc.Subscribe("orders.us.ny.12345", handler)
// nc.Subscribe("orders.us.ny.12346", handler)
// ...

// Use a wildcard and filter in the handler:
nc.Subscribe("orders.us.ny.*", func(msg *nats.Msg) {
	orderID := extractOrderID(msg.Subject)
	if shouldProcess(orderID) {
		process(msg)
	}
})
```
Fewer subscriptions per client means less concentration impact when clients aren’t perfectly distributed.
Design your deployment pipeline to distribute clients across servers deliberately, for example by spreading service replicas across availability zones and avoiding the infrastructure affinity (same Kubernetes node, same DNS resolution) that lands every replica on the same server.
Monitor subscription distribution as a standard operational metric. Alert when any server exceeds 1.5× the cluster average to catch imbalances before they become hotspots.
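That alert condition is simple to express. A sketch, with the function name and sample counts being illustrative:

```go
package main

import "fmt"

// hotspotAlert returns the servers whose subscription count exceeds
// threshold × the cluster mean (e.g. threshold = 1.5 for early warning).
func hotspotAlert(counts map[string]int, threshold float64) []string {
	if len(counts) == 0 {
		return nil
	}
	total := 0
	for _, n := range counts {
		total += n
	}
	mean := float64(total) / float64(len(counts))
	var alerts []string
	for server, n := range counts {
		if float64(n) > threshold*mean {
			alerts = append(alerts, server)
		}
	}
	return alerts
}

func main() {
	// Illustrative counts, e.g. gathered from each server's /subsz endpoint.
	counts := map[string]int{"s1": 50000, "s2": 5000, "s3": 6000}
	fmt.Println(hotspotAlert(counts, 1.5)) // [s1]
}
```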
A connection hotspot (OPT_BALANCE_002) means one server has disproportionately many client connections. A subscription hotspot means one server has disproportionately many subscriptions. They often co-occur — more connections typically means more subscriptions — but they can diverge. A server with few connections that each create hundreds of subscriptions (e.g., a monitoring service subscribing to events.> for every account) can be a subscription hotspot without being a connection hotspot.
Doesn’t subscription interest propagate to every server in the cluster anyway? Yes. When a client subscribes on one server, that interest is propagated to all servers in the cluster via route connections. However, the server that holds the actual client connection does the final delivery and tracking work. The hotspot server bears the cost of maintaining the subscription state, matching incoming messages, and writing to client buffers — work that doesn’t transfer to other servers just because interest is propagated.
Can a subscription hotspot cause slow consumer errors? Indirectly, yes. A server spending excessive CPU on subscription matching may deliver messages to clients more slowly. If the delivery pipeline backs up, the server’s per-client outbound buffer fills, and the client gets disconnected as a slow consumer (SERVER_004). The root cause is the subscription imbalance, but the symptom appears as slow consumer events on the hot server.
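On the client side, the Go client's error handler can surface slow-consumer conditions as they happen. A minimal sketch (the URL is a placeholder); note that nats.ErrSlowConsumer here reflects the client's own buffer overruns, while a server-side cutoff surfaces as a disconnect and in the server's logs:

```go
package main

import (
	"errors"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://s1:4222", // placeholder URL
		// Fires when this client's own pending buffers overflow
		// (a client-side slow consumer).
		nats.ErrorHandler(func(_ *nats.Conn, sub *nats.Subscription, err error) {
			if errors.Is(err, nats.ErrSlowConsumer) {
				pending, _, _ := sub.Pending()
				log.Printf("slow consumer on %q, %d msgs pending", sub.Subject, pending)
			}
		}),
		// A server-side slow-consumer cutoff appears here as a disconnect.
		nats.DisconnectErrHandler(func(_ *nats.Conn, err error) {
			log.Printf("disconnected: %v", err)
		}),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()
	select {} // keep the process alive to observe events
}
```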
Should I put a load balancer in front of the cluster? Generally, no. NATS client libraries have built-in server randomization and cluster discovery that handle distribution without external infrastructure. A TCP load balancer adds latency, complicates TLS, and can mask server identity from clients. The better approach is to list all cluster server URLs in your client configuration and let the client library randomize. Use a load balancer only if you have specific network topology constraints that prevent direct client-to-server connectivity.