A NATS gateway stuck in optimistic interest mode floods all messages to remote clusters regardless of whether any subscriber there is interested, wasting inter-cluster bandwidth and adding unnecessary load to gateway connections. The server auto-transitions to interest-only mode after a subscription activity threshold is reached, but certain conditions can prevent or delay this convergence.
NATS super-clusters connect multiple clusters through gateway connections. Each gateway operates in one of two modes: optimistic or interest-only. In optimistic mode, the local cluster sends every message on every subject to the remote cluster. The remote cluster’s gateway connection then drops messages that have no local subscribers. This is the default starting mode — it ensures no messages are missed while the gateway learns what the remote cluster actually needs.
The problem is when gateways stay in optimistic mode. In a healthy super-cluster, gateways converge to interest-only mode within seconds of connecting. Once converged, the local cluster only sends messages for subjects that have active subscriptions in the remote cluster. The bandwidth savings can be enormous — if a remote cluster subscribes to 50 subjects but the local cluster publishes on 10,000, interest-only mode eliminates 99.5% of cross-cluster traffic. When gateways keep resetting to optimistic mode (typically due to frequent reconnections), that savings disappears and inter-cluster links carry traffic that serves no purpose.
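The 99.5% figure follows from simple arithmetic. The sketch below reproduces it, assuming every subject carries the same message volume; that is an illustrative simplification, and the function name `traffic_eliminated` is hypothetical.

```python
def traffic_eliminated(subscribed_subjects: int, published_subjects: int) -> float:
    """Fraction of cross-cluster traffic avoided in interest-only mode.

    Assumes uniform message volume per subject (an illustrative
    simplification; real traffic is rarely this even).
    """
    if published_subjects <= 0:
        raise ValueError("published_subjects must be positive")
    # Only subjects with remote interest are forwarded in interest-only mode.
    forwarded = min(subscribed_subjects, published_subjects)
    return 1 - forwarded / published_subjects


# 50 subscribed subjects out of 10,000 published ones
print(f"{traffic_eliminated(50, 10_000):.1%}")  # prints 99.5%
```

If traffic is concentrated on the subscribed subjects, the real savings will be smaller than this estimate.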
In cloud deployments where inter-region data transfer is metered, persistent optimistic mode translates directly to cost. Even in on-premises environments, unnecessary gateway traffic competes with legitimate cross-cluster messages for the same network links. High gateway utilization increases message latency for traffic that actually matters, and in extreme cases can trigger gateway pending pressure (OPT_SYS_005) — a cascade where the gateway connection itself becomes a bottleneck.
Frequent gateway reconnections. Every time a gateway connection drops and re-establishes, it starts in optimistic mode and must re-learn remote interest. If the underlying network is unstable or a remote cluster node keeps restarting, gateways cycle through optimistic mode repeatedly. Check CLUSTER_007 for gateway disconnection events.
Cluster topology changes. Adding or removing servers from a cluster forces gateways to update their routing tables. During this transition, some gateway connections may reset to optimistic mode temporarily. Frequent scaling events (e.g., aggressive auto-scaling) can keep gateways from ever settling.
Subscription churn on the remote cluster. If the remote cluster has services that rapidly create and destroy subscriptions, the gateway’s interest map keeps changing. Extreme subscription churn can delay or prevent stable convergence to interest-only mode.
Mismatched gateway configuration. If gateways between clusters have inconsistent configuration — different timeouts, missing cluster entries, or incorrect URLs — connections may flap, resetting interest mode each time. See CLUSTER_008 for configuration mismatches.
Large number of accounts. Interest mode is tracked per account per gateway connection. With many accounts, the convergence process takes longer. Combined with any of the above factors, this can keep individual account-gateway pairs in optimistic mode for extended periods.
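Of the causes above, a configuration mismatch is the easiest to check mechanically. The following is a minimal sketch, assuming you have already extracted each cluster's gateway entries into a plain dictionary. The input shape and the function name `audit_gateways` are hypothetical, not part of any NATS tooling.

```python
def audit_gateways(configs):
    """Report inconsistencies between per-cluster gateway definitions.

    `configs` maps a cluster name to {remote gateway name: list of URLs}.
    This shape is an assumption made for the example.
    """
    problems = []
    all_names = set(configs)

    # Every cluster should define an entry for every other cluster.
    for cluster, remotes in configs.items():
        missing = all_names - set(remotes) - {cluster}
        for name in sorted(missing):
            problems.append(f"{cluster}: no gateway entry for {name}")

    # Every cluster should agree on the URLs for a given remote gateway.
    seen = {}
    for cluster, remotes in sorted(configs.items()):
        for name, urls in remotes.items():
            key = frozenset(urls)
            first_cluster, first_key = seen.setdefault(name, (cluster, key))
            if first_key != key:
                problems.append(
                    f"{name}: URLs differ between {first_cluster} and {cluster}"
                )
    return problems


# Hypothetical three-cluster layout with two deliberate mistakes
configs = {
    "cluster-east": {
        "cluster-west": ["nats://west-1:7222"],
        "cluster-eu": ["nats://eu-1:7222"],
    },
    "cluster-west": {
        "cluster-east": ["nats://east-1:7222"],
        "cluster-eu": ["nats://eu-9:7222"],
    },
    "cluster-eu": {
        "cluster-east": ["nats://east-1:7222"],
    },
}

for problem in audit_gateways(configs):
    print(problem)
```

An empty result means the definitions agree with each other; it does not prove the URLs are reachable.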
The server’s gateway monitoring endpoint shows the current mode for each gateway connection:
```shell
# Check gateway status via monitoring endpoint
curl -s http://localhost:8222/gatewayz | jq '.outbound_gateways, .inbound_gateways'
```

Look for the interest_mode field on each gateway connection. The value will be Optimistic or Interest-Only. Any connection showing Optimistic after being established for more than a few seconds warrants investigation.
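Scanning the JSON by eye gets tedious with many gateways. The sketch below walks a parsed /gatewayz document and lists every object whose interest_mode is not Interest-Only. It deliberately makes no assumption about where in the document the field lives. The sample payload is hypothetical and shaped only for illustration, and `find_optimistic` is not part of any NATS library.

```python
import json


def find_optimistic(node, path=()):
    """Return (path, mode) pairs for every object in a parsed /gatewayz
    document whose interest_mode is anything other than Interest-Only."""
    hits = []
    if isinstance(node, dict):
        mode = node.get("interest_mode")
        if isinstance(mode, str) and mode.lower() != "interest-only":
            hits.append(("/".join(path), mode))
        for key, value in node.items():
            hits.extend(find_optimistic(value, path + (str(key),)))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_optimistic(value, path + (str(index),)))
    return hits


# Hypothetical sample; a real response carries many more fields.
sample = json.loads("""
{
  "outbound_gateways": {
    "cluster-west": {"interest_mode": "Optimistic"},
    "cluster-eu": {"interest_mode": "Interest-Only"}
  }
}
""")

for location, mode in find_optimistic(sample):
    print(f"{location}: {mode}")
```

To run it against a live server, replace `sample` with the parsed output of the curl command above.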
```shell
# List all servers with gateway information
nats server list

# Check gateway connections across the super-cluster
nats server report gateways
```

Note which cluster-to-cluster gateway pairs are in optimistic mode. If all gateways from cluster A to cluster B are optimistic but A-to-C is interest-only, the problem is specific to the A-B relationship.
Gateway reconnections are the most common reason gateways stay in optimistic mode. Check server logs for reconnection patterns:
```shell
# Look for gateway reconnection events in server logs
grep -i "gateway" /var/log/nats/nats-server.log | grep -i "reconnect\|disconnect\|connect"
```

If you see repeated connect/disconnect cycles on a gateway, address the underlying connectivity issue first.
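Whether a run of disconnects counts as "repeated" is a judgment call. One way to make it explicit is a sliding-window count over the disconnect timestamps. The sketch below assumes you have already parsed those timestamps out of the log (the log format is not assumed here). The window and limit defaults are arbitrary illustrative values, and `is_flapping` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def is_flapping(disconnects, window=timedelta(minutes=10), limit=3):
    """Return True if more than `limit` disconnects fall inside any
    sliding window of length `window`."""
    times = sorted(disconnects)
    start = 0
    for end, current in enumerate(times):
        # Shrink the window from the left until it spans at most `window`.
        while current - times[start] > window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False


base = datetime(2024, 1, 1, 12, 0)

# Four disconnects within six minutes: flapping under the default limits
burst = [base + timedelta(minutes=m) for m in (0, 2, 4, 6)]
print(is_flapping(burst))

# Four disconnects spread over an hour: not flapping
spread = [base + timedelta(minutes=m) for m in (0, 20, 40, 60)]
print(is_flapping(spread))
```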
```shell
# Check gateway traffic rates
curl -s http://localhost:8222/gatewayz | jq '.outbound_gateways[] | {name, msgs_sent: .connection.out_msgs, bytes_sent: .connection.out_bytes}'
```

Compare the outbound message rate on optimistic gateways against the actual subscription interest in the remote cluster. A large disparity, with many messages sent and few actually consumed, confirms the waste.
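A single reading of out_msgs is a count, not a rate. Assuming the counter is cumulative for the lifetime of the connection, a rate can be derived from two samples taken a known interval apart. The helper name `per_second` is hypothetical, and treating a decreasing counter as a reconnect is an assumption of this sketch.

```python
def per_second(before: int, after: int, interval_seconds: float):
    """Message rate between two counter samples.

    Returns None when the counter went backwards, which this sketch
    interprets as the connection having been re-established in between.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if after < before:
        return None
    return (after - before) / interval_seconds


# Two samples of out_msgs taken 30 seconds apart
print(per_second(1_000, 4_000, 30))  # 100.0 messages per second

# Counter dropped: likely a reconnect, so no meaningful rate
print(per_second(5_000, 10, 30))  # None
```

A None result is itself a useful signal, since every reconnect restarts the gateway in optimistic mode.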
The most common fix is simply ensuring gateway connections stay up long enough to converge. Check the underlying network path between clusters:
```shell
# Check RTT between clusters via gateway connections
nats rtt --server nats://gateway-remote:4222
```

If RTT is high or unstable, address the network issue. Stable gateway connections will converge to interest-only mode automatically within seconds.
Verify the gateway is running NATS 2.9+. Starting with NATS 2.9, interest-only mode is the default for new gateway connections. If your servers are running an older version, gateways start in optimistic mode and must learn interest over time. Upgrading to 2.9+ significantly reduces the window where optimistic mode is active.
Check for high subscription churn preventing the transition. If clients in the account are rapidly creating and destroying subscriptions, the gateway’s interest map keeps changing, which can delay or prevent the auto-transition to interest-only mode.
Fix gateway disconnection issues. If CLUSTER_007 is also firing, resolve that first. Common fixes include increasing gateway connection timeouts and ensuring all cluster servers are reachable:
```go
// Go client connecting through a super-cluster
// Ensure all gateway-connected cluster URLs are listed
nc, err := nats.Connect(
	"nats://cluster-a-1:4222,nats://cluster-a-2:4222,nats://cluster-a-3:4222",
	nats.Name("order-processor"),
	nats.ReconnectWait(2*time.Second),
	nats.MaxReconnects(-1), // unlimited reconnects
)
if err != nil {
	log.Fatal(err)
}
defer nc.Close()
```

Reduce subscription churn. If remote cluster services create and destroy subscriptions rapidly, consider using durable subscriptions or restructuring subject hierarchies to reduce the frequency of interest map changes:
```python
# Python: use stable subscriptions instead of dynamic subscribe/unsubscribe
import asyncio

import nats


async def process_order(msg):
    # Placeholder handler; replace with real order processing
    print(f"received {msg.subject}: {msg.data.decode()}")


async def main():
    nc = await nats.connect("nats://cluster-b:4222")

    # Prefer long-lived wildcard subscriptions over many short-lived specific ones
    sub = await nc.subscribe("orders.>")
    async for msg in sub.messages:
        await process_order(msg)


asyncio.run(main())
```

Lock down cluster membership. Avoid frequent auto-scaling of NATS cluster nodes. If you need elastic capacity, use leafnode connections for ephemeral workloads rather than adding/removing full cluster members, which forces gateway re-convergence.
Monitor gateway mode as a metric. Export the interest mode from /gatewayz and alert when any gateway stays in optimistic mode beyond a convergence threshold (e.g., 60 seconds after connection establishment). Synadia Insights automates this check across your entire super-cluster topology.
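The alert rule described here can be sketched as a small filter over per-gateway samples. This is a sketch under stated assumptions: the `GatewaySample` fields are a hypothetical shape for data your exporter would need to collect, and the 60-second default mirrors the example threshold above.

```python
from dataclasses import dataclass


@dataclass
class GatewaySample:
    # Hypothetical record shape; fill it from your own /gatewayz exporter.
    name: str
    interest_mode: str
    connected_seconds: float


def stuck_gateways(samples, threshold_seconds=60.0):
    """Names of gateways still in optimistic mode after the threshold."""
    return [
        s.name
        for s in samples
        if s.interest_mode.lower() == "optimistic"
        and s.connected_seconds > threshold_seconds
    ]


samples = [
    GatewaySample("cluster-west", "Optimistic", 300.0),    # stuck
    GatewaySample("cluster-eu", "Interest-Only", 300.0),   # converged
    GatewaySample("cluster-ap", "Optimistic", 5.0),        # still starting up
]

print(stuck_gateways(samples))  # ['cluster-west']
```

Gating on connection age keeps the alert quiet during the brief optimistic period that follows every new connection.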
Audit gateway configuration consistency. Ensure all clusters define the same set of gateways with matching names and URLs. Mismatched configs cause asymmetric connectivity that leads to repeated reconnections:
```
# Server config: gateway block should match across all clusters
gateway {
  name: "cluster-east"
  listen: "0.0.0.0:7222"
  gateways: [
    { name: "cluster-west", urls: ["nats://west-1:7222", "nats://west-2:7222", "nats://west-3:7222"] }
    { name: "cluster-eu", urls: ["nats://eu-1:7222", "nats://eu-2:7222", "nats://eu-3:7222"] }
  ]
}
```

In a healthy super-cluster, gateways converge to interest-only mode within seconds of establishing a connection. The time depends on the number of accounts and active subscriptions in the remote cluster: more accounts means more interest maps to exchange. If a gateway is still in optimistic mode after 30-60 seconds, something is preventing convergence, typically subscription churn or connection instability.
Optimistic mode does not cause message loss. It delivers strictly more messages than interest-only mode, because it sends everything regardless of remote interest. The problem is the opposite: wasted bandwidth and resources. Messages sent to a remote cluster with no matching subscribers are simply discarded at the remote gateway, consuming network bandwidth for nothing.
You cannot force a gateway into interest-only mode. Gateway interest mode is managed automatically by the NATS server protocol and cannot be set via configuration. The correct approach is to ensure the conditions for convergence are met: stable gateway connections and stable subscription interest. If the gateway keeps resetting to optimistic mode, the root cause is always an upstream issue: connectivity, churn, or misconfiguration.
Interest mode is tracked per account per gateway connection. One account may be in interest-only mode while another on the same gateway is still in optimistic. This is normal — accounts with stable subscriptions converge faster. If a specific account is stuck in optimistic mode, look at subscription churn within that account rather than at gateway-level issues.
Synadia Insights evaluates this check across a time range, not at a single instant. A brief period of optimistic mode during gateway startup is expected and won’t trigger the check. The check flags gateway-account combinations that remain in optimistic mode persistently across the evaluation window, indicating a convergence failure rather than normal startup behavior.