A high gateway traffic ratio means more than 30% of an account’s total byte throughput is crossing inter-cluster gateways rather than staying local to a single cluster. This indicates that workloads — streams, publishers, consumers — are not placed in the same cluster as the clients that use them, resulting in unnecessary latency, bandwidth consumption, and in cloud deployments, avoidable data transfer costs.
Gateways in NATS connect clusters and transparently route messages between them. This transparency is a feature — applications don’t need to know which cluster holds a stream or where a subscriber lives. But when gateway traffic dominates an account’s throughput, it means the system is doing a lot of work that better placement would eliminate. Every byte that crosses a gateway adds latency, consumes bandwidth on the inter-cluster link, and in cloud environments, incurs data transfer charges.
The 30% threshold is a signal, not a hard failure. Some cross-cluster traffic is normal and expected — that’s why gateways exist. But when a third or more of an account’s traffic is gateway traffic, it usually means that streams, consumers, or subscribers are systematically in the wrong cluster relative to the clients that interact with them. This is not a misconfiguration of any single component; it’s a placement problem across the account’s workload distribution.
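As a minimal sketch of the metric itself (function name and byte figures are illustrative, the 30% threshold is from this check), the ratio is simply gateway bytes over total account bytes:

```python
def gateway_traffic_ratio(gateway_bytes: int, total_bytes: int) -> float:
    """Fraction of an account's total byte throughput that crossed gateways."""
    if total_bytes == 0:
        return 0.0
    return gateway_bytes / total_bytes

# An account moving 450 MB over gateways out of 1 GB total is above the threshold.
ratio = gateway_traffic_ratio(450_000_000, 1_000_000_000)
exceeds_threshold = ratio > 0.30  # the check fires above 30%
```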
The cost implications are concrete. In AWS, cross-AZ data transfer costs $0.01/GB in each direction. Cross-region costs are higher. For an account moving 100GB/day of gateway traffic, that’s $2/day — $730/year — in avoidable transfer charges. Scale that to multiple accounts and higher throughput, and gateway traffic becomes a meaningful infrastructure cost that placement optimization directly reduces. And that’s before accounting for the performance benefit of eliminating 20-100ms of latency per cross-cluster operation.
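The arithmetic above can be checked directly (volumes as stated in the example; AWS pricing varies by region and changes over time):

```python
# Cross-AZ transfer: $0.01/GB in each direction = $0.02/GB for a gateway hop
cost_per_gb = 0.01 * 2
gb_per_day = 100

daily_cost = gb_per_day * cost_per_gb  # dollars per day
annual_cost = daily_cost * 365         # dollars per year
```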
Publishers and subscribers in different clusters. The most straightforward cause: publishers connect to cluster A, subscribers connect to cluster B. Every message published crosses the gateway to reach the subscriber. This pattern often emerges when teams in different offices or regions independently deploy their services without coordinating client placement.
Streams placed in a different cluster than their clients. JetStream streams have a leader in one cluster. If the clients that publish to or consume from those streams connect to a different cluster, all stream operations traverse the gateway. See also: OPT_PLACE_001 (Cross-Cluster Stream Access).
Optimistic gateway interest mode. In optimistic mode (the default for new gateway connections), all messages are forwarded to remote clusters until the gateway learns that there’s no interest. For subjects with intermittent subscribers, this can generate substantial gateway traffic for messages that are ultimately dropped by the remote cluster. See also: OPT_PLACE_004 (Gateway Interest Mode).
Wildcard subscriptions spanning clusters. A subscriber on events.> in one cluster triggers interest propagation across all gateways. Every message matching that wildcard, regardless of which cluster published it, is forwarded. The gateway traffic grows with the number of subjects and clusters.
No locality-aware client routing. Clients connect to whichever cluster the load balancer or DNS round-robin assigns them, without awareness of where their streams or subscribers are. This random distribution guarantees that some portion of traffic is always cross-cluster.
Centralized stream architecture. All streams live in a single “primary” cluster for administrative simplicity, while clients are distributed across multiple clusters. Every client outside the primary cluster generates gateway traffic for every operation.
Check overall gateway traffic from the server’s monitoring endpoint:
```shell
curl -s http://localhost:8222/gatewayz | jq '.outbound_gateways | to_entries[] | {cluster: .key, sent_msgs: .value.connection.sent_msgs, sent_bytes: .value.connection.sent_bytes, recv_msgs: .value.connection.recv_msgs, recv_bytes: .value.connection.recv_bytes}'
```

This shows total gateway traffic per remote cluster. Compare these bytes with total account throughput to calculate the ratio.
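A small script can sum the per-cluster byte counters for the ratio calculation. This sketch assumes only the response shape that the `jq` filter above already selects; the sample payload is illustrative:

```python
import json

# Sample /gatewayz payload, reduced to the fields the jq filter selects.
gatewayz = json.loads("""
{
  "outbound_gateways": {
    "us-west": {"connection": {"sent_msgs": 120000, "sent_bytes": 480000000,
                               "recv_msgs": 90000, "recv_bytes": 360000000}},
    "eu-west": {"connection": {"sent_msgs": 10000, "sent_bytes": 40000000,
                               "recv_msgs": 8000, "recv_bytes": 32000000}}
  }
}
""")

def total_gateway_bytes(gwz: dict) -> int:
    """Sum bytes sent and received across all outbound gateway connections."""
    total = 0
    for gw in gwz.get("outbound_gateways", {}).values():
        conn = gw.get("connection", {})
        total += conn.get("sent_bytes", 0) + conn.get("recv_bytes", 0)
    return total

gw_bytes = total_gateway_bytes(gatewayz)
```

Divide this total by the account's overall byte throughput to get the gateway traffic ratio.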
Get per-account throughput numbers to establish the baseline:
```shell
nats server report accounts
```

Compare the total bytes in/out for each account with the gateway traffic to identify which accounts have the highest gateway ratio.
Check which subjects have cross-cluster interest:
```shell
nats server report connections --sort in-msgs
```

Look for connections with high message rates that are connected to clusters different from where their target streams or subscribers are.
Understand where each account’s clients are connected:
```shell
nats server report connections --account <account-name>
```

Group by cluster. If clients are evenly distributed across clusters but streams are in one cluster, the gateway ratio will be close to (N-1)/N for N clusters — nearly all traffic crosses the gateway.
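The (N-1)/N figure follows from uniform client placement: only the 1/N of clients that happen to share the streams' cluster stay local, so the remainder of the traffic crosses a gateway. A one-line sketch:

```python
def expected_gateway_ratio(n_clusters: int) -> float:
    """Expected gateway share when clients are uniform but all streams live in one cluster."""
    return (n_clusters - 1) / n_clusters

# With 3 clusters, roughly two thirds of traffic crosses a gateway.
ratio_3 = expected_gateway_ratio(3)
```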
High gateway traffic may also cause pending pressure on the gateway connections:
```shell
curl -s 'http://localhost:8222/connz?sort=pending_bytes&limit=20' | jq '.connections[]'
```

If gateway connections show elevated pending bytes, the gateway link is becoming saturated — a compounding problem on top of the placement inefficiency.
Rank accounts and subjects by gateway traffic. Not all cross-cluster traffic is worth optimizing. Focus on the highest-volume accounts first:
```shell
# Check which accounts have the most throughput
nats server report accounts --sort in-bytes
```

For each high-throughput account, determine what percentage of its traffic is gateway traffic. Start optimization with accounts where the absolute gateway byte count is highest — they yield the largest improvement.
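Ranking by absolute gateway bytes, rather than by ratio alone, surfaces the biggest wins first. A sketch with illustrative per-account figures:

```python
# Hypothetical per-account figures, in bytes.
accounts = {
    "ORDERS":  {"total_bytes": 2_000_000_000, "gateway_bytes": 900_000_000},
    "METRICS": {"total_bytes":   500_000_000, "gateway_bytes": 400_000_000},
    "AUDIT":   {"total_bytes":   100_000_000, "gateway_bytes":  90_000_000},
}

# Sort by absolute gateway bytes: the biggest movers yield the largest savings,
# even though AUDIT has the worst ratio (90%).
ranked = sorted(accounts.items(), key=lambda kv: kv[1]["gateway_bytes"], reverse=True)
top_account = ranked[0][0]
```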
Relocate streams to the cluster where clients connect. For each misplaced stream, use placement tags to move it:
```shell
nats stream edit ORDERS --tag cluster:us-east --replicas 3
```

Create mirrors for multi-cluster read patterns. If an account legitimately has clients in multiple clusters that all need to read the same stream, create mirrors instead of routing all reads through the gateway:
```go
// Go — create a local mirror to avoid gateway reads
js, _ := nc.JetStream()
_, err := js.AddStream(&nats.StreamConfig{
	Name: "EVENTS-WEST",
	Mirror: &nats.StreamSource{
		Name: "EVENTS",
	},
	Placement: &nats.Placement{
		Tags: []string{"cluster:us-west"},
	},
})
if err != nil {
	log.Fatal(err)
}
```

```python
# Python (nats.py) — create a local mirror
js = nc.jetstream()
await js.add_stream(
    name="EVENTS-WEST",
    mirror={"name": "EVENTS"},
    placement={"tags": ["cluster:us-west"]},
)
```

Route clients to the cluster with their data. Update client connection URLs to prefer the cluster where their primary streams are located:
```shell
# Client connects to cluster with local stream leaders
nats sub "orders.>" --server nats://us-east-1:4222,nats://us-east-2:4222
```

Establish a placement policy that ties streams to client clusters. Document which account/stream combinations should live in which cluster. Enforce this through CI/CD — validate stream placement tags before deployment.
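One way to encode such a policy (cluster names, stream names, and URLs here are hypothetical) is a simple lookup that clients consult before connecting, so each client starts from the server list for its primary stream's home cluster:

```python
# Hypothetical mapping from a client's primary stream to its home cluster.
STREAM_HOME = {"ORDERS": "us-east", "EVENTS-WEST": "us-west"}

CLUSTER_SERVERS = {
    "us-east": ["nats://us-east-1:4222", "nats://us-east-2:4222"],
    "us-west": ["nats://us-west-1:4222", "nats://us-west-2:4222"],
}

def servers_for_stream(stream: str) -> list[str]:
    """Return the connection URLs for the cluster where the stream lives."""
    cluster = STREAM_HOME[stream]
    return CLUSTER_SERVERS[cluster]

urls = servers_for_stream("ORDERS")
```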
Switch to interest-only gateway mode where appropriate. If accounts have stable subscription patterns, switching from optimistic to interest-only mode eliminates speculative gateway forwarding:
```
# Server config — gateway account configuration
gateway {
  name: "us-east"
  gateways: [
    {name: "us-west", urls: ["nats://us-west-1:7222"]}
  ]
}
```

Interest-only mode is configured per-account by the server automatically as subscriptions stabilize — but you can influence it by ensuring subscriptions are registered before high-volume publishing begins.
Monitor the ratio continuously. Track gateway traffic ratio as a key infrastructure metric. Set thresholds per account and alert when the ratio climbs above your target. Synadia Insights evaluates this automatically every epoch across all accounts and clusters, catching placement drift before it becomes a significant cost or performance issue.
Partition by geography. For global deployments, design the subject namespace with regional prefixes (orders.us-east.>, orders.eu-west.>) and place streams per region. Clients publish to their regional prefix, keeping traffic local. A central aggregation stream can use sources to pull from all regional streams for global views.
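A small helper keeps publishers on their regional prefix (subject layout as described above; region and token names are illustrative):

```python
def regional_subject(base: str, region: str, suffix: str) -> str:
    """Build a region-prefixed subject, e.g. orders.us-east.created."""
    return f"{base}.{region}.{suffix}"

# A publisher in us-east stays on its regional prefix, keeping traffic local.
subject = regional_subject("orders", "us-east", "created")
```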
It depends on your architecture, but as a rule of thumb: below 10% is well-optimized, 10-30% is typical for multi-cluster deployments with some cross-cluster patterns, and above 30% indicates placement optimization is needed. Some cross-cluster traffic is expected and healthy — gateways exist for exactly this purpose. The check fires at 30% to flag accounts where the ratio is high enough to warrant investigation.
No. NATS gateways preserve message ordering within a subject for a given publisher. Messages a single publisher sends on the same subject are delivered to each subscriber in the order they were published, regardless of whether they traverse a gateway. Gateway routing does not introduce reordering.
Not directly at the message level. NATS gateways operate transparently — there’s no per-message routing log. You can infer cross-cluster traffic by comparing publisher and subscriber locations for each subject. Synadia Insights calculates the gateway traffic ratio by comparing gateway byte counters with total account throughput across collection epochs, giving you an account-level view without per-message tracing.
Yes, for JetStream publishes that cross the gateway. When a publisher in cluster A publishes to a stream in cluster B, the publish acknowledgment must traverse the gateway twice (publish + ack). Moving the stream to cluster A or the publisher to cluster B eliminates this round trip. For core NATS publishes without acknowledgments, the latency improvement is on the subscriber side — messages arrive faster when they don’t cross the gateway.
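The double traversal can be quantified (the 30 ms one-way figure is illustrative, within the 20-100 ms range mentioned earlier):

```python
one_way_gateway_ms = 30  # illustrative inter-cluster one-way latency

# A JetStream publish to a remote stream crosses the gateway twice:
# once for the publish, once for the acknowledgment.
added_publish_latency_ms = 2 * one_way_gateway_ms
```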
They’re complementary. Cross-Cluster Stream Access (OPT_PLACE_001) identifies the structural problem — clients in a cluster with no local stream leaders. High Gateway Traffic Ratio (OPT_PLACE_003) measures the consequence — the actual byte overhead of cross-cluster routing. You can have a high gateway ratio without cross-cluster stream access (e.g., from core NATS subject routing), and you can have cross-cluster stream access with a low gateway ratio (if the affected streams are low-throughput). Fixing OPT_PLACE_001 typically reduces OPT_PLACE_003.