Cross-cluster stream access means client connections in one cluster are accessing JetStream streams whose leaders are in a different cluster. Every operation — publishes, acknowledgments, consumer fetches — routes through the inter-cluster gateway, adding latency and consuming gateway bandwidth that could be avoided with better stream placement.
In a multi-cluster NATS deployment, gateways connect clusters and transparently route messages between them. This transparency is powerful — a client in cluster A can publish to a stream whose leader is in cluster B without any application-level awareness. But that transparency hides a real performance cost. Every publish to a remote stream requires a round trip across the gateway for the acknowledgment. Every consumer fetch traverses the gateway twice: once for the request, once for the delivery.
The latency impact compounds with replication. For an R3 stream in cluster B, a publish from cluster A must cross the gateway to reach the stream leader, wait for the leader to replicate to its peers within cluster B, and then send the acknowledgment back across the gateway. What would be a sub-millisecond operation for a local client becomes a multi-millisecond operation dominated by network round trips. At high publish rates, this latency overhead translates directly into reduced throughput and higher tail latencies.
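A back-of-envelope sketch shows how this bounds a synchronous publisher's throughput (the latencies below are illustrative assumptions, not measurements):

```python
# A single client publishing synchronously can complete at most
# 1 / (per-publish latency) operations per second. Numbers are illustrative.
def max_sync_publish_rate(local_ms: float, gateway_rtt_ms: float = 0.0) -> float:
    """Upper bound on msgs/sec for one synchronous publisher."""
    return 1000.0 / (local_ms + gateway_rtt_ms)

local = max_sync_publish_rate(0.5)                        # leader in the same cluster
remote = max_sync_publish_rate(0.5, gateway_rtt_ms=4.5)   # leader behind a 4.5 ms gateway RTT

print(f"local:  {local:.0f} msgs/s")   # local:  2000 msgs/s
print(f"remote: {remote:.0f} msgs/s")  # remote: 200 msgs/s
```

A 4.5 ms gateway round trip turns a 2,000 msgs/s publisher into a 200 msgs/s publisher; pipelining with async publishes softens this, but the per-operation acknowledgment latency remains.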
The cost impact is equally significant. In cloud deployments, cross-availability-zone and cross-region traffic incurs data transfer charges. Gateway traffic between clusters in different regions can become a substantial line item. If 100% of an account’s stream operations traverse the gateway because no streams are local, you’re paying a premium for every byte — and getting worse performance for it.
Streams created in the default cluster. When a stream is created without explicit placement constraints, the meta leader assigns it to the cluster where the API request was processed. If the operator or CI/CD system that creates streams connects to a different cluster than the clients that use them, the stream ends up in the wrong place.
Client migration without stream migration. Clients are moved to a new cluster (closer datacenter, new region, infrastructure consolidation) but the streams they depend on are left in the original cluster. The streams continue to work — just more slowly and at higher cost.
Multi-region deployment with centralized streams. All streams are placed in a single “primary” cluster for simplicity, but clients are distributed across multiple clusters in different regions. Every client outside the primary cluster pays the cross-cluster penalty.
Organic growth without placement review. As clusters and clients are added over time, no one reviews whether stream placement still matches access patterns. What started as a well-placed stream becomes a cross-cluster access problem as the client topology evolves.
Account spanning multiple clusters. A single account has clients in multiple clusters. Streams are placed in one cluster, but the account’s clients are spread across several. Unless streams are replicated or mirrored to each cluster, some clients always access remotely.
List all streams and their cluster placement:
```shell
nats stream report
```

The Cluster column shows which cluster holds the stream leader. Compare this with where your clients are connected.
Check client connection distribution across clusters:
```shell
nats server report connections --account <account-name>
```

Group connections by cluster. If clients are concentrated in a cluster that has no stream leaders, you’ve found the mismatch.
Check how much traffic is traversing gateways versus staying local:
```shell
curl -s http://localhost:8222/gatewayz | jq '.outbound_gateways | to_entries[] | {cluster: .key, out_bytes: .value.connection.out_bytes, in_bytes: .value.connection.in_bytes}'
```

High gateway byte counts relative to total throughput confirm that cross-cluster traffic is significant.
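The same ratio can be computed programmatically. The JSON below is a trimmed, hypothetical /gatewayz payload, and the server-wide total would come from another monitoring endpoint such as /varz in a real setup:

```python
import json

# Trimmed, hypothetical /gatewayz response; real payloads carry more fields.
gatewayz = json.loads("""
{
  "outbound_gateways": {
    "us-west": {"connection": {"out_bytes": 800000000, "in_bytes": 200000000}}
  }
}
""")

gateway_bytes = sum(
    gw["connection"]["out_bytes"] + gw["connection"]["in_bytes"]
    for gw in gatewayz["outbound_gateways"].values()
)
total_bytes = 1_250_000_000  # assumed server-wide total (e.g. from /varz)

share = gateway_bytes / total_bytes
print(f"gateway share of traffic: {share:.0%}")  # gateway share of traffic: 80%
```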
From a client in the remote cluster, measure JetStream publish latency to gauge the gateway overhead:
```shell
nats bench js pub sync benchmark.publish --msgs 1000 --size 128 --clients 1
```

Compare this with the same benchmark run from a client in the stream’s local cluster. The difference is the gateway overhead.
For each stream, check if any account with clients in remote clusters publishes to or consumes from it:
```shell
nats stream info <stream-name> --json | jq '{name: .config.name, cluster: .cluster.name, replicas: [.cluster.replicas[].name]}'
```

Cross-reference the stream’s cluster with the account’s client distribution to identify misplaced streams.
Measure the latency and bandwidth cost. Before moving streams, understand the scale of the problem. Calculate the gateway traffic percentage for each affected account and the latency difference between local and remote access. This prioritizes which streams to move first.
```shell
# Check per-account gateway traffic vs total
nats server report accounts
```

Focus on streams with the highest publish rates or the most consumers in remote clusters — these yield the biggest improvement when moved.
Use placement tags to move streams to the client’s cluster. If your servers are tagged with cluster or region identifiers, use placement constraints:
```
# Add a tag to servers in the target cluster (in server config)
# server.conf:
server_tags: ["cluster:us-east"]
```
```shell
# Edit stream to prefer the target cluster
nats stream edit ORDERS --tag cluster:us-east
```

Create mirrors for read-heavy cross-cluster access. If clients in multiple clusters need to read from the same stream, create read-only mirrors in each cluster rather than routing all reads through gateways:
```go
// Go — create a mirror stream in the local cluster
js, _ := nc.JetStream()
_, err := js.AddStream(&nats.StreamConfig{
	Name: "ORDERS-MIRROR",
	Mirror: &nats.StreamSource{
		Name: "ORDERS",
	},
	Placement: &nats.Placement{
		Tags: []string{"cluster:us-west"},
	},
})
if err != nil {
	log.Fatal(err)
}
```

```python
# Python (nats.py) — create a local mirror
js = nc.jetstream()
await js.add_stream(
    name="ORDERS-MIRROR",
    mirror={"name": "ORDERS"},
    placement={"tags": ["cluster:us-west"]},
)
```

Use stream sources for aggregation patterns. If multiple clusters produce data that needs to be available centrally, use sources to pull data from remote streams rather than forcing publishers to write cross-cluster.
Establish a placement policy. Define a standard that streams must be placed in the cluster where the majority of their clients connect. Document this as part of your stream creation workflow and enforce it in CI/CD pipelines.
Automate placement review. Periodically compare stream placement with client connection distribution. Synadia Insights does this automatically — for manual setups, create a script that flags streams where the majority of client connections are in a different cluster than the stream leader.
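A minimal sketch of such a script, operating on data you would assemble from `nats stream info --json` and `nats server report connections` (all stream names, cluster names, and counts below are hypothetical):

```python
# Flag streams whose leader cluster holds a minority of the account's
# client connections. Inputs are hypothetical placeholders.
stream_clusters = {"ORDERS": "us-east", "EVENTS": "us-west"}
connections_by_cluster = {"us-east": 12, "us-west": 88}

total = sum(connections_by_cluster.values())
flagged = [
    stream for stream, cluster in stream_clusters.items()
    if connections_by_cluster.get(cluster, 0) / total < 0.5
]
print("misplaced streams:", flagged)  # misplaced streams: ['ORDERS']
```

Run against live data, ORDERS would be flagged because only 12 of 100 client connections are in its leader’s cluster.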
Consider per-region stream partitioning. For workloads where clients are distributed across many clusters, partition streams by region rather than centralizing them. Each region gets its own stream instance, eliminating cross-cluster access entirely. Use sources or mirrors for any cross-region aggregation needs.
No. Cross-cluster stream access is fully functional — NATS gateways provide transparent routing with the same durability guarantees as local access. The issue is performance and cost, not correctness. Publishes are acknowledged only after the stream leader persists the message, regardless of whether the publisher is local or remote.
The additional latency equals approximately one gateway round-trip time per operation. For clusters in the same region, this is typically 1-5ms. For cross-region gateways, it can be 20-100ms or more depending on geographic distance. For high-throughput streams with thousands of operations per second, even a few milliseconds per operation significantly reduces aggregate throughput.
Yes. A mirror stream in the local cluster replicates all data from the source stream asynchronously. Consumers reading from the mirror get local latency for reads. The tradeoff is mirror lag — the local mirror will always be slightly behind the source. For most use cases, the lag is sub-second and acceptable. For use cases requiring strict ordering or real-time consistency, consumers must read from the source stream.
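Mirror lag is observable: the output of `nats stream info` for a mirror includes a mirror section with lag and activity figures. A sketch of reading it, using a trimmed, hypothetical JSON payload (field names assume the StreamSourceInfo shape, with `active` in nanoseconds):

```python
import json

# Trimmed, hypothetical `nats stream info ORDERS-MIRROR --json` output.
info = json.loads("""
{"mirror": {"name": "ORDERS", "lag": 42, "active": 120000000}}
""")

lag_msgs = info["mirror"]["lag"]      # messages not yet replicated from the source
active_ns = info["mirror"]["active"]  # time since last activity, in nanoseconds
print(f"mirror is {lag_msgs} msgs behind; last active {active_ns / 1e9:.2f}s ago")
```

Alerting when lag grows beyond an acceptable bound is a reasonable complement to the placement fix.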
NATS R3 replication places replicas across servers within a cluster, not across clusters. Cross-cluster replication is handled by mirrors and sources, which are asynchronous. Placing Raft replicas across clusters would make every write depend on cross-cluster latency for consensus, which defeats the purpose of local placement. Use mirrors for cross-cluster durability instead.
Insights compares client connection distribution per account across clusters with stream leader placement. If an account has client connections in a cluster that has zero stream leaders for that account, the check fires. This identifies the structural mismatch — clients exist in a cluster with no local JetStream leadership — without needing to trace individual message paths.
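The core of that comparison can be sketched in a few lines (cluster names and counts below are hypothetical):

```python
# Structural mismatch: a cluster has client connections for an account
# but zero stream leaders for that account. Data is hypothetical.
leaders_by_cluster = {"us-east": 5, "us-west": 0}
connections_by_cluster = {"us-east": 40, "us-west": 60}

mismatched = [
    cluster for cluster, conns in connections_by_cluster.items()
    if conns > 0 and leaders_by_cluster.get(cluster, 0) == 0
]
print("clusters with clients but no local stream leaders:", mismatched)
```

Here us-west would fire the check: 60 connections, no local JetStream leadership.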