
NATS High Subject Cardinality: What It Means and How to Fix It

Severity: Warning
Category: Saturation
Applies to: JetStream
Check ID: JETSTREAM_002
Detection threshold: unique subjects >= 1,000,000 in a stream

High subject cardinality means a JetStream stream is tracking 1,000,000 or more unique subjects. Each unique subject consumes memory for metadata tracking, and at scale this degrades stream performance — slower lookups, larger snapshots, and increased memory pressure on the server.

Why this matters

JetStream streams can capture messages across many subjects using wildcards. A stream configured with orders.> stores every message published to any subject starting with orders. — and tracks each unique subject in an internal index. This index enables features like max_msgs_per_subject (per-subject retention), subject-based filtering for consumers, and the nats stream subjects command.

The problem is that this index lives in memory. At 1,000 unique subjects, the overhead is negligible. At 1,000,000, it’s measurable. At 10,000,000+, it dominates the server’s memory profile for that stream and introduces real latency to every operation that touches the index. nats stream info calls slow down because they enumerate subject metadata. Consumer filter matching becomes expensive because the server evaluates each filter against the full subject space. Raft snapshots grow large because the subject index must be serialized and transferred to followers during leader elections or replica sync.

The operational impact compounds over time. Subject cardinality typically grows monotonically — new entity IDs, session IDs, or device IDs create new subjects, but old subjects are rarely removed (even after their messages are purged). A stream that works fine in staging with 10,000 subjects exhibits completely different performance characteristics in production after six months of accumulating unique subjects. By the time someone notices the degradation, the stream’s subject index is already enormous, and fixing it requires careful migration rather than a simple configuration change.

Common causes

  • Per-entity subject naming. The most common pattern: embedding a unique identifier in the subject. orders.<customer_id>, events.<device_id>, telemetry.<sensor_id>. If you have millions of customers, devices, or sensors, you get millions of unique subjects. The subject hierarchy becomes a de facto database index.

  • Session or request IDs in subjects. Using subjects like responses.<request_id> for request-reply patterns over JetStream. Each request creates a new unique subject, so at thousands of requests per second, subject cardinality grows by tens of millions per day.

  • Wildcard capture on a broad namespace. A stream configured with > or a very broad wildcard like events.> captures every subject in a large namespace. The stream creator may not realize how many unique subjects exist under that hierarchy.

  • Timestamp or date-based subjects. Patterns like logs.<date>.<service>.<level> where the date component creates new subjects daily. Over months, the accumulated subjects add up — and unlike message retention, subject metadata isn’t automatically cleaned up.

  • No subject hierarchy design. Flat subject namespaces without deliberate hierarchy lead to sprawl. Instead of orders.created, orders.fulfilled, orders.cancelled (3 subjects), teams create order_created_<id>, order_fulfilled_<id>, order_cancelled_<id> (3 × N subjects).
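The growth rates these patterns imply are easy to underestimate. A quick back-of-the-envelope sketch (illustrative numbers, not measurements):

```python
# Rough cardinality-growth estimates for the patterns above.
# These figures are illustrative assumptions, not benchmarks.

SECONDS_PER_DAY = 86_400

def new_subjects_per_day(rate_per_second: float) -> int:
    """Subjects minted per day when every message creates a unique subject."""
    return int(rate_per_second * SECONDS_PER_DAY)

# responses.<request_id> at 1,000 requests/second:
print(new_subjects_per_day(1_000))  # 86400000 -- 86.4M new subjects per day

# logs.<date>.<service>.<level> with 50 services and 4 levels:
print(50 * 4)  # 200 new subjects per day -- slow, but never cleaned up
```

Even the "slow" date-based pattern crosses 70,000 subjects per service-year of accumulation across a large fleet; the request-ID pattern crosses the million-subject threshold in well under a day.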

How to diagnose

Check subject count for a stream

nats stream info <stream_name>

Look for the Subjects field in the State section. This shows the total number of unique subjects tracked by the stream. If this number is approaching or exceeding 1,000,000, this check fires.
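For scripted checks, the same number is available from machine-readable output. A minimal sketch, assuming `nats stream info <stream_name> --json` reports the count under `state.num_subjects` (verify the field name against your server version):

```python
import json

# Flag a stream from `nats stream info <name> --json` output.
# Assumes the state object carries a "num_subjects" field, as recent
# NATS server releases report; adjust the key if your version differs.

THRESHOLD = 1_000_000

def subjects_over_threshold(info_json: str) -> bool:
    info = json.loads(info_json)
    return info.get("state", {}).get("num_subjects", 0) >= THRESHOLD

sample = '{"config": {"name": "ORDERS"}, "state": {"messages": 5, "num_subjects": 1500000}}'
print(subjects_over_threshold(sample))  # True
```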

List the actual subjects

For streams with manageable cardinality, you can inspect the subjects:

nats stream subjects <stream_name>

Warning: For streams with millions of subjects, this command produces enormous output and can take significant time. Pipe to head or use filtering:

nats stream subjects <stream_name> --filter "orders.*" | head -100

Check memory impact across all streams

nats stream report

Compare the memory usage of the high-cardinality stream against other streams. Disproportionate memory usage relative to message count and byte size indicates subject index overhead.

Monitor server memory

nats server report jetstream

If the server hosting the high-cardinality stream shows elevated memory usage compared to cluster peers, subject cardinality is a likely contributor.

Identify the subject pattern

Determine what’s driving cardinality by sampling subjects:

nats stream subjects <stream_name> | head -1000 | awk -F. '{print $1"."$2}' | sort | uniq -c | sort -rn

This groups subjects by their first two tokens, revealing which prefix generates the most unique subjects.
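The same grouping can be done in a short script, which is easier to extend than the awk pipeline (for example, to vary the prefix depth):

```python
from collections import Counter

# Group a subject sample by its first N tokens to see which prefix
# drives cardinality. Feed it lines from `nats stream subjects <stream_name>`.

def prefix_counts(subjects: list[str], depth: int = 2) -> list[tuple[str, int]]:
    counts = Counter(".".join(s.split(".")[:depth]) for s in subjects)
    return counts.most_common()

sample = [
    "orders.us.cust-1", "orders.us.cust-2", "orders.us.cust-3",
    "orders.eu.cust-9", "events.device.42",
]
print(prefix_counts(sample))
# [('orders.us', 3), ('orders.eu', 1), ('events.device', 1)]
```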

How to fix it

Immediate: assess the impact

Before making changes, determine whether the high cardinality is actively causing problems:

# Time how long stream info takes — should be <1s
time nats stream info <stream_name>
# Check server memory trend
nats server report jetstream

If nats stream info responds quickly and server memory is stable, the cardinality may be tolerable for now. If it takes seconds or the server is under memory pressure, act urgently.

Short-term: redesign subject hierarchy

The root fix is almost always changing how subjects are named. Move unique identifiers out of the subject and into message headers or the message payload:

Before (high cardinality):

// Every customer creates a new subject
js.Publish(fmt.Sprintf("orders.%s", customerID), orderData)

After (bounded cardinality):

// Fixed set of subjects, customer ID in header
msg := nats.NewMsg("orders.created")
msg.Header.Set("Customer-ID", customerID)
msg.Data = orderData
js.PublishMsg(msg)

# Python equivalent
await js.publish(
    "orders.created",
    order_data,
    headers={"Customer-ID": customer_id},
)

This changes the cardinality from O(customers) to O(order_event_types) — typically single digits instead of millions.

Reconsider per-entity retention. If you’re using per-entity subjects specifically for per-entity retention via max_msgs_per_subject (keep the last N messages per customer), consider whether the same result can be achieved with a consumer filter or a key-value bucket. KV buckets are designed for high-cardinality key spaces.

Split into multiple streams

If per-entity subjects are architecturally necessary, partition the subject space across multiple streams to cap per-stream cardinality:

# Instead of one stream capturing orders.>
# Create streams by region or partition
nats stream add ORDERS_US --subjects "orders.us.>"
nats stream add ORDERS_EU --subjects "orders.eu.>"
nats stream add ORDERS_APAC --subjects "orders.apac.>"

Each stream tracks only a fraction of the total subject space.
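If there is no natural partition key such as region, a deterministic hash of the entity ID can assign a bounded partition token instead. A hypothetical sketch — the orders.p<N>.<event> layout and partition count are assumptions, not an established convention:

```python
import zlib

# Map an unbounded entity ID to one of a fixed number of partitions, so
# subjects look like orders.p7.created instead of orders.<customer_id>.
# crc32 keeps the mapping stable across processes and restarts
# (unlike Python's per-run salted hash()).

NUM_PARTITIONS = 16

def partition_subject(customer_id: str, event: str) -> str:
    p = zlib.crc32(customer_id.encode()) % NUM_PARTITIONS
    return f"orders.p{p}.{event}"

print(partition_subject("cust-12345", "created"))
```

Publishers and filtered consumers only need to share the same partition function; each partition can then back a separate stream capturing orders.pN.>.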

Long-term: enforce cardinality limits in your naming conventions

Document subject naming standards. Establish a team-wide rule: never embed unbounded identifiers in NATS subjects. Subjects are for routing, not for entity addressing. Use headers or payload fields for entity IDs.

Monitor cardinality as a deployment metric. Track subject counts per stream over time and alert before they reach the million-subject threshold.
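A minimal sketch of such an alert, assuming per-stream subject counts are collected (for example from `nats stream info --json`); the 50% warning fraction is an arbitrary example:

```python
# Flag streams once they pass a fraction of the 1,000,000-subject limit,
# leaving time to redesign subjects before performance degrades.

LIMIT = 1_000_000
WARN_AT = 0.5  # warn at 50% of the limit (tune to taste)

def streams_to_review(subject_counts: dict[str, int]) -> list[str]:
    return [name for name, n in subject_counts.items() if n >= LIMIT * WARN_AT]

print(streams_to_review({"ORDERS": 620_000, "EVENTS": 40_000, "LOGS": 900_000}))
# ['ORDERS', 'LOGS']
```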

Synadia Insights evaluates subject cardinality every collection epoch, flagging streams as they cross the 1,000,000 threshold so you can address the growth before it impacts server performance.

Frequently asked questions

Does purging a stream reset its subject count?

Purging removes all messages but does not necessarily remove all subject tracking metadata immediately. Subject metadata is cleaned up when the last message for a subject is removed through purge or retention. After a full purge, the subject count should eventually drop to zero, but this may not be instantaneous — it depends on the server’s internal cleanup cycle.

Can I disable subject tracking for a stream?

There is no explicit toggle to disable subject tracking; the server tracks subjects whenever messages are stored. The index exists whether or not you use max_msgs_per_subject or ever run nats stream subjects — its overhead is proportional to the number of unique subjects, not to whether any feature queries them. The only effective mitigation is reducing cardinality through subject redesign.

What’s the practical limit for subject cardinality?

There’s no hard server limit, but performance degrades progressively. Streams with 100,000 subjects work fine. At 1,000,000 (the check threshold), you’ll notice slower stream info responses and increased memory usage. At 10,000,000+, stream operations become significantly slower and memory pressure can affect other streams on the same server. The practical limit depends on your server’s available memory and your tolerance for operational latency.

Is a NATS KV bucket better for high-cardinality data?

Yes, for key-value workloads. KV buckets are backed by JetStream streams internally, but their access patterns are optimized for many keys with small values. If your use case is “store the latest state for millions of entities,” a KV bucket with max_msgs_per_subject=1 is the intended pattern. It still creates subjects internally, but the access API (Get/Put/Watch) is optimized for this cardinality profile.

How does high subject cardinality affect consumers?

Consumers with subject filters (e.g., filter_subject: "orders.us.*") must evaluate their filter against the stream’s subject space. With millions of subjects, this filter evaluation takes more CPU time. Consumers without filters are less affected — they receive all messages regardless of subject. If you have many filtered consumers on a high-cardinality stream, the combined CPU cost of filter evaluation can become significant.

Proactive monitoring for NATS high subject cardinality with Synadia Insights

With 100+ always-on audit Checks from the NATS experts, Insights helps you find and fix problems before they become costly incidents.
No alert rules to write. No dashboards to maintain.

Start a 14-day Insights trial