Gateway pending pressure occurs when the outbound write buffer on a NATS gateway connection exceeds 1 MiB. Gateways are the inter-cluster communication links in a NATS super-cluster — when pending data accumulates faster than it drains, it signals that the receiving cluster cannot keep up with the sending cluster’s message rate, or the network between them is a bottleneck. Left unchecked, gateway pending pressure leads to message delivery latency across clusters, memory growth on the sending server, and eventually gateway disconnections.
Gateways carry all inter-cluster traffic: subscriptions, messages, and protocol control data. Unlike client connections where the server can evict a slow consumer to protect itself, gateway connections are infrastructure — losing a gateway means losing connectivity to an entire cluster. The server will tolerate much more pending data before severing a gateway than it would for a regular client.
This tolerance comes at a cost. While the gateway buffer fills, the sending server allocates increasingly large amounts of memory to hold unsent data. In a super-cluster with multiple gateways, one congested link can consume hundreds of megabytes. The sending server’s write loop for that gateway also backs up, adding latency to every message destined for the remote cluster. If the congestion is mutual — both clusters sending to each other through saturated links — the latency compounds bidirectionally.
The operational impact is subtle but severe. Request-reply patterns that span clusters start timing out. Consumers in one cluster that subscribe to streams sourced from another cluster see growing delivery delays. Monitoring and control plane traffic between clusters slows down, making the system harder to observe exactly when you need visibility most. If the pending pressure persists long enough, the server may disconnect the gateway entirely, triggering a full reconnect cycle that temporarily partitions the super-cluster.
Insufficient network bandwidth between clusters. The most straightforward cause. If the aggregate message rate across a gateway link exceeds the available bandwidth (accounting for protocol overhead, encryption, and other traffic), pending data accumulates. This is especially common when clusters are connected over WAN links, shared VPNs, or cloud inter-region peering with throughput caps.
Poor stream and consumer placement. When a stream lives in Cluster A but most of its consumers are in Cluster B, every message crosses the gateway. If the stream is high-throughput, this single placement decision can saturate the inter-cluster link. The same applies to mirrors and sources that replicate data across clusters — each replicated message consumes gateway bandwidth.
Wildcard subscriptions generating excessive fan-out. A subscriber in one cluster with a broad wildcard (`>` or `events.>`) forces the gateway to forward matching messages from every other cluster. The aggregate rate of matched messages can far exceed what any individual subject would produce.
Gateway interest mode not engaged. By default, NATS gateways start in “optimistic” mode, forwarding all messages to the remote cluster. The gateway switches to “interest” mode once it learns the remote cluster has no subscribers for a subject. If subscriptions churn rapidly or subjects are highly dynamic, the gateway may spend significant time in optimistic mode, forwarding messages that will be dropped on arrival.
Burst traffic from batch operations. Imports, backfills, or migration jobs that publish large volumes of data in short bursts can overwhelm gateway capacity even when steady-state traffic fits comfortably. The pending buffer absorbs the burst, but if the burst duration exceeds the buffer’s drain time, pressure builds.
TLS overhead on constrained links. Gateway connections are typically TLS-encrypted. On links where CPU is limited or bandwidth is marginal, the encryption/decryption overhead reduces effective throughput, contributing to pending buildup.
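The bandwidth and burst causes above lend themselves to back-of-envelope arithmetic. Here is a minimal sketch; all rates, payload sizes, and the per-message overhead figure are illustrative assumptions, not measured values:

```python
def required_bandwidth_mbps(msg_rate: int, payload_bytes: int,
                            overhead_bytes: int = 64) -> float:
    """Estimate the bandwidth a gateway link needs, in Mbit/s.

    overhead_bytes is a rough per-message allowance for NATS protocol
    framing plus TCP/TLS overhead (an assumption, not a measured value).
    """
    return msg_rate * (payload_bytes + overhead_bytes) * 8 / 1_000_000

def burst_buildup_bytes(burst_rate_bps: float, drain_rate_bps: float,
                        seconds: float) -> float:
    """Pending bytes accumulated while a burst's ingress rate
    exceeds the link's drain rate."""
    return max(0.0, (burst_rate_bps - drain_rate_bps) * seconds)

# 50,000 msgs/s of 1 KiB payloads needs ~435 Mbit/s -- tight on a
# 500 Mbit/s inter-region link once other traffic shares it.
print(round(required_bandwidth_mbps(50_000, 1024), 1))  # 435.2

# A 30 s backfill pushing 80 MB/s over a link that drains 50 MB/s piles
# up 900 MB of pending data, far past the 1 MiB pressure threshold.
print(int(burst_buildup_bytes(80e6, 50e6, 30)))  # 900000000
```

The point of the second function: the pending buffer only helps when the burst is short enough that the excess rate times the burst duration stays small; anything longer must be throttled or given more bandwidth.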
Use the NATS CLI to inspect gateway connections and their pending state:
```shell
curl -s 'http://localhost:8222/connz?sort=pending_bytes&limit=20' | jq '.connections[]'
```

Look for gateway connections (indicated by the connection type) with elevated pending bytes. Any gateway connection consistently above 1 MiB warrants investigation.
For a more detailed view of gateway-specific metrics:
```shell
nats server report gateways
```

This shows per-gateway statistics including RTT, message rates, and byte rates for each inter-cluster link.
The server’s /gatewayz endpoint provides detailed gateway state:
```shell
curl -s http://localhost:8222/gatewayz | jq '.outbound_gateways | to_entries[] | {name: .key, pending: .value.connection.pending_bytes, rtt: .value.connection.rtt}'
```

Key fields: `pending_bytes` is the amount of outbound data queued for the remote cluster, and `rtt` is the measured round-trip time on the gateway connection.
```shell
# Check gateway RTT from the NATS perspective
nats server list | grep gateway
```

If RTT is high (above 20ms for same-region, above 100ms for cross-region), the network link is likely the bottleneck. Cross-reference with infrastructure monitoring for packet loss, bandwidth utilization, and interface errors on the hosts running the NATS servers.
Determine which subjects are driving the most inter-cluster traffic:
```shell
# Check stream placements and consumer locations
nats stream ls -a --json | jq -r '.[] | "\(.config.name) cluster=\(.cluster.name) replicas=\(.cluster.replicas | length)"'
```

```shell
# Check which accounts are generating gateway traffic
nats server report accounts
```

If a single stream or account dominates gateway traffic, targeted placement changes can relieve the pressure.
The same check can be automated by polling the `/gatewayz` endpoint, in Go or Python:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type GatewayzResp struct {
	OutboundGateways map[string]struct {
		Connection struct {
			PendingBytes int64  `json:"pending_bytes"`
			RTT          string `json:"rtt"`
		} `json:"connection"`
	} `json:"outbound_gateways"`
}

func checkGatewayPending(monitorURL string, thresholdBytes int64) error {
	resp, err := http.Get(monitorURL + "/gatewayz")
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var gw GatewayzResp
	if err := json.NewDecoder(resp.Body).Decode(&gw); err != nil {
		return err
	}

	for name, info := range gw.OutboundGateways {
		if info.Connection.PendingBytes > thresholdBytes {
			fmt.Printf("WARN: gateway %s pending=%d bytes rtt=%s\n",
				name, info.Connection.PendingBytes, info.Connection.RTT)
		}
	}
	return nil
}
```

```python
import httpx

async def check_gateway_pending(monitor_url: str, threshold_bytes: int = 1_048_576):
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{monitor_url}/gatewayz")
    data = resp.json()
    alerts = []
    for name, gw in data.get("outbound_gateways", {}).items():
        pending = gw.get("connection", {}).get("pending_bytes", 0)
        if pending > threshold_bytes:
            alerts.append({
                "gateway": name,
                "pending_bytes": pending,
                "rtt": gw["connection"].get("rtt"),
            })
    return alerts
```

Relocate streams closer to their consumers. If most consumers of a stream are in a remote cluster, move the stream (or add a mirror) to that cluster. This converts inter-cluster gateway traffic into intra-cluster route traffic, which typically has much higher bandwidth available:
```shell
# Create a cross-domain mirror in the consumer's cluster.
# Cross-domain mirrors are configured via the `--mirror nats:STREAM@DOMAIN`
# syntax (or by feeding a JSON stream config to `--config`); there is no
# `--mirror-domain` flag on `nats stream add`.
nats stream add orders-mirror \
  --mirror nats:orders@hub \
  --storage file \
  --replicas 3
```

Restrict wildcard subscriptions. If broad wildcards are driving excessive gateway fan-out, narrow the subscription scope or move the subscribing service to the cluster where the data originates.
Throttle batch operations. If bulk imports or backfills are spiking gateway traffic, rate-limit the publisher or schedule batch work during off-peak hours.
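One way to rate-limit a batch publisher is a token bucket. The sketch below is independent of any NATS client API; the `backfill_messages` iterable and `nc` client in the usage comment are hypothetical:

```python
import time

class PublishThrottle:
    """Token-bucket pacer for batch publishers (a sketch, not a NATS API).

    The clock is injectable so the pacing math can be tested without sleeping.
    """

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def delay_before_send(self) -> float:
        """Account for one message; return seconds the caller should sleep first."""
        now = self.clock()
        # Refill tokens for elapsed time, capped at the burst allowance.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= 1.0
        if self.tokens >= 0.0:
            return 0.0
        # Not enough tokens: wait until the deficit would be refilled.
        return -self.tokens / self.rate

# Usage sketch: cap a backfill at 5,000 msgs/s with a burst allowance of 500.
# throttle = PublishThrottle(rate_per_sec=5_000, burst=500)
# for msg in backfill_messages:               # backfill_messages is hypothetical
#     time.sleep(throttle.delay_before_send())
#     nc.publish("orders.backfill", msg)      # nc is a connected NATS client
```

The burst allowance lets short spikes through without sleeping, while the steady rate keeps the gateway's drain rate ahead of ingress.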
Upgrade inter-cluster bandwidth. If the gateway link is genuinely saturated, the fix is more bandwidth. In cloud environments, this may mean moving to dedicated interconnects, larger instance types with higher network performance, or placement groups that optimize cross-AZ throughput.
Enable or verify gateway compression. NATS supports S2 compression on gateway connections, which can significantly reduce bandwidth consumption for compressible payloads:
```
gateway {
  name: "cluster-east"
  port: 7222
  compression: s2_auto
  gateways: [
    { name: "cluster-west", urls: ["nats://west-1:7222"] }
  ]
}
```

For constrained links, s2_fast minimizes CPU overhead at the cost of some compression ratio; s2_better and s2_best trade more CPU for higher ratios.
Design topic topologies that minimize cross-cluster traffic. Keep producers and consumers of high-throughput subjects in the same cluster. Use NATS subject mapping or account-level imports/exports to control which traffic crosses cluster boundaries.
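As one illustration (the subject names here are hypothetical), a server-level subject mapping can rewrite locally published subjects onto a namespace that is never exported across the account boundary, so that traffic never becomes a candidate for gateway forwarding:

```
# Hypothetical subjects: keep raw order events on a subject space that
# is not exported to other accounts, so they stay inside this cluster.
mappings: {
  "orders.raw.received": "local.orders.raw.received"
}
```

Only the subjects you deliberately export then compete for gateway bandwidth.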
Use JetStream sources instead of raw subscriptions for cross-cluster data. Sources give you explicit control over which streams replicate across clusters and can be paused or rate-limited. Raw subscriptions on gateways offer no such control — every matching message flows through.
Monitor gateway metrics continuously. Set up Prometheus alerts on gateway pending bytes to catch pressure before it becomes critical.
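For instance, an alert rule along these lines; the metric name is hypothetical and depends on how you scrape `/gatewayz` (e.g. via prometheus-nats-exporter), so adjust it to your setup:

```yaml
groups:
  - name: nats-gateway
    rules:
      - alert: GatewayPendingPressure
        # gateway_pending_bytes is a placeholder metric name.
        expr: gateway_pending_bytes > 1048576
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NATS gateway {{ $labels.gateway }} pending above 1 MiB for 5m"
```

The `for: 5m` clause filters out transient bursts that the pending buffer absorbs on its own, alerting only on sustained pressure.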
Synadia Insights evaluates gateway pending pressure automatically across your entire super-cluster deployment, flagging links that are under sustained pressure before they degrade into disconnections.
Route pending pressure (CLUSTER_005) occurs between servers within the same cluster. Gateway pending pressure occurs between servers in different clusters. Routes typically use high-bandwidth local network links, so pressure is less common. Gateways often cross WAN links with lower bandwidth and higher latency, making them more susceptible to pending buildup. The diagnostic and remediation approaches are similar, but gateway pressure usually requires network-level or architectural changes rather than simple configuration adjustments.
The server does not expose a configurable pending limit for gateways in the same way it does for client connections. Gateway connections are handled differently — the server maintains larger internal buffers and tolerates more pending data because gateway health is critical to super-cluster connectivity. If pending pressure reaches the point of disconnection, the root cause is sustained bandwidth exhaustion, not buffer sizing. Focus on reducing traffic volume or increasing network capacity.
NATS itself does not support message prioritization on gateway connections. All messages share the same TCP connection and pending buffer. However, you can achieve effective prioritization by placing latency-sensitive streams in the same cluster as their consumers (eliminating gateway traversal entirely) and reserving gateway bandwidth for traffic that genuinely needs to cross cluster boundaries.
Yes, interest-only mode can reduce gateway traffic significantly. When a gateway transitions from optimistic mode to interest-only mode for a subject, it stops forwarding messages that the remote cluster has no subscribers for. However, interest mode transitions require the remote cluster to have no subscriptions on the subject — wildcard subscriptions prevent interest-mode optimization. Check gateway interest mode status with `nats server report gateways` and see the Gateway Interest Mode check (OPT_SYS_003) for optimization guidance.