A common community question is why a NATS JetStream node, especially a stream or consumer leader, can show high RAM usage even after clients stop consuming or publishing.
The short answer: JetStream uses memory for more than active client connections. Some memory is used for stream and consumer state, some for temporary file-store caching, some for duplicate-message tracking, and some can be related to the node’s leadership role in the cluster. If the usage looks unexpectedly high, the most useful next step is usually to capture a memory profile while the condition is happening.
JetStream memory usage depends on configuration and workload. Common contributors include the following.
JetStream supports publish deduplication using the Nats-Msg-Id header. To do that, it keeps duplicate-tracking state for the configured duplicate window.
The default duplicate window of two minutes is usually appropriate, but it can be changed or disabled on a per-stream basis. Reducing or disabling the window is likely to reduce memory usage only if you are actually publishing with Nats-Msg-Id values and relying on JetStream deduplication.
If your publishers do not use Nats-Msg-Id, JetStream is not maintaining message-ID deduplication entries for those publishes, so disabling deduplication is unlikely to change memory use meaningfully.
The tradeoff is important: if you disable or shrink the duplicate window, you reduce the server-side protection against duplicate publishes during retries or reconnects.
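A toy model of time-windowed deduplication shows why the window only costs memory when publishers set Nats-Msg-Id. This is an illustrative sketch, not the server's implementation; the class `DedupWindow` and its methods are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```python
from collections import OrderedDict


class DedupWindow:
    """Toy time-windowed duplicate tracker (hypothetical, not NATS server code)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._seen = OrderedDict()  # msg_id -> time first seen, oldest first

    def _expire(self, now):
        # Drop IDs whose age has reached the window length.
        while self._seen:
            msg_id, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.window_seconds:
                break
            self._seen.popitem(last=False)

    def publish(self, now, msg_id=None):
        """Return True if the message is accepted, False if it is a duplicate."""
        self._expire(now)
        if msg_id is None:
            return True  # no ID supplied: nothing to track
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = now
        return True

    def tracked(self):
        """Number of IDs currently held in memory."""
        return len(self._seen)


window = DedupWindow(window_seconds=120)
window.publish(now=0, msg_id="order-1")
window.publish(now=1, msg_id="order-1")  # retry inside the window: rejected
window.publish(now=2)                    # no ID: accepted, nothing stored
print(window.tracked())                  # 1
```

In this model, tracked state grows with the number of distinct IDs published inside the window, and publishes without an ID add nothing. It also shows the tradeoff: once an ID ages out of the window, a late retry with the same ID is accepted again.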
Disk-backed streams still use memory. JetStream’s file store can hold caches in memory for some time before releasing them. That means memory may not immediately drop just because publishers or consumers have stopped.
Cache pressure is more likely when workloads repeatedly scan large ranges of stream data, create many short-lived consumers, or otherwise access historical messages in patterns that cause file-store data to be read and cached.
The number of messages in a stream is not always the main memory signal, especially for file-backed streams. The number of unique subjects can also matter because JetStream tracks per-subject information in memory.
A stream with hundreds of millions of messages on a single subject is different from a stream with hundreds of millions of messages spread across a very large subject space. If memory usage is surprising, check the subject cardinality as well as the message count.
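The difference between message count and subject cardinality can be made concrete with a small sketch. The helper `tracked_subjects` and the subject names are hypothetical, and the example only counts distinct subjects; it does not model how much memory the server uses per subject.

```python
def tracked_subjects(subjects):
    """Count messages and distinct subjects in an iterable of subject names."""
    total = 0
    unique = set()
    for subject in subjects:
        total += 1
        unique.add(subject)
    return total, len(unique)


# Two hypothetical streams with the same message count.
single_subject = ["orders.created"] * 100_000
wide_subject_space = [f"orders.{n}.created" for n in range(100_000)]

print(tracked_subjects(single_subject))      # (100000, 1)
print(tracked_subjects(wide_subject_space))  # (100000, 100000)
```

Both streams hold the same number of messages, but anything tracked per subject is needed once in the first case and 100,000 times in the second.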
In a clustered deployment, leaders do more work. A node that is the leader for streams or consumers may use more memory than followers for the same assets.
The cluster metadata leader can also be involved in cluster-wide stream and consumer metadata. If one node is the metadata leader and is using substantially more memory than peers, that fact is worth noting during investigation, but it does not by itself explain every high-memory case.
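Stopping clients removes current client activity, but it does not necessarily remove all memory pressure immediately. For example, file-store caches can stay in memory for some time before being released, and duplicate-tracking entries remain until the duplicate window expires.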
If you are using GOMEMLIMIT, remember that it is a Go runtime soft memory limit, not a simple explanation for why memory is being retained. Seeing a process approach a configured limit is a reason to inspect a memory profile, not a reason to assume the largest streams are necessarily responsible.
For file-backed streams, the number of retained messages can be less important than the shape of the workload.
Useful questions include:
Are publishers setting Nats-Msg-Id?
For example, a file-backed WorkQueue-style stream with one long-lived consumer and one subject has a very different memory profile from a stream with many unique subjects and repeated historical scans.
Disabling deduplication may help, but only in a specific case.
If you have an external deduplication layer and you do not need JetStream’s publish deduplication, you can consider disabling or reducing the duplicate window per stream. This is most relevant when publishers set Nats-Msg-Id and the stream therefore maintains duplicate-tracking state.
If publishers are not using Nats-Msg-Id, JetStream is not storing duplicate IDs for those messages, so changing the duplicate window is unlikely to address high memory usage.
Before changing it, confirm the application-level behavior you need during retries. JetStream deduplication is often useful precisely when publishers retry after timeouts, reconnects, or uncertain publish acknowledgements.
If a node is using much more RAM than its peers and the cause is not obvious, capture a memory profile while the usage is high.
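One rough way to decide whether a node is "using much more RAM than its peers" is to compare each node's memory with the median of the others. The sketch below assumes you have already collected one memory reading per node, for example the `mem` value reported by each server's `/varz` monitoring endpoint; the function name, node names, and the 1.5x threshold are arbitrary illustrations, not NATS guidance.

```python
from statistics import median


def memory_outliers(mem_by_node, ratio=1.5):
    """Return nodes whose memory exceeds `ratio` times the median of their peers."""
    outliers = []
    for node, mem in mem_by_node.items():
        peers = [m for name, m in mem_by_node.items() if name != node]
        if peers and mem > ratio * median(peers):
            outliers.append(node)
    return sorted(outliers)


# Hypothetical readings in bytes for a three-node cluster.
readings = {"n1": 6 * 1024**3, "n2": 2 * 1024**3, "n3": 2 * 1024**3}
print(memory_outliers(readings))  # ['n1']
```

A node flagged this way is a candidate for profiling, not a diagnosis; noting which assets it leads at the time helps interpret the profile.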
The NATS documentation describes how to gather profiling data.
A profile is especially useful when the environment looks simple on the surface, for example when streams are file-backed, publishers do not set Nats-Msg-Id, and clients have already stopped.
In those cases, guessing from stream size alone can be misleading. An allocation or heap profile can show whether the memory is associated with deduplication, cache behavior, subject tracking, consumer state, or something else.
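As a minimal sketch of capturing such a profile, the helper below downloads a Go pprof profile over HTTP. It assumes the server was started with its profiling port enabled (the `prof_port` setting) and that the port is reachable from where the script runs; the port number 65432, the host, and the function names are example values, so check the NATS profiling documentation for the procedure that applies to your deployment.

```python
import urllib.request


def profile_url(host, port, kind="allocs"):
    """Build a Go pprof URL; `kind` is a pprof profile name such as 'allocs' or 'heap'."""
    return f"http://{host}:{port}/debug/pprof/{kind}"


def save_profile(host, port, path, kind="allocs", timeout=30):
    """Download one profile to `path` so it can be inspected with `go tool pprof`."""
    with urllib.request.urlopen(profile_url(host, port, kind), timeout=timeout) as resp:
        data = resp.read()
    with open(path, "wb") as out:
        out.write(data)
    return len(data)


# Assumes profiling was enabled with prof_port = 65432 (an example port).
print(profile_url("localhost", 65432))  # http://localhost:65432/debug/pprof/allocs
```

Calling `save_profile("localhost", 65432, "mem.prof")` while memory is high would write a file that can then be opened with `go tool pprof mem.prof`.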
When you see unexpected JetStream RAM usage, collect the following before making tuning changes:
Whether publishers set Nats-Msg-Id, and each stream’s duplicate window.
This data helps distinguish expected cache or metadata use from a workload pattern or configuration that should be adjusted.
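Several of these data points can be pulled from a stream's info document, for example one saved from the NATS CLI as JSON. The sketch below assumes the field names `config.duplicate_window` (in nanoseconds), `config.storage`, `state.messages`, `state.num_subjects`, and `cluster.leader`; verify them against the output of your server version. The sample document is hypothetical and abbreviated.

```python
import json


def summarize_stream(info):
    """Extract memory-relevant fields from one stream info document."""
    config = info.get("config", {})
    state = info.get("state", {})
    dup_ns = config.get("duplicate_window", 0)  # assumed to be nanoseconds
    return {
        "name": config.get("name"),
        "storage": config.get("storage"),
        "messages": state.get("messages", 0),
        "subjects": state.get("num_subjects", 0),
        "duplicate_window_s": dup_ns / 1e9,
        "leader": info.get("cluster", {}).get("leader"),
    }


# Hypothetical, abbreviated stream info document.
sample = json.loads("""
{"config": {"name": "ORDERS", "storage": "file", "duplicate_window": 120000000000},
 "state": {"messages": 250000000, "num_subjects": 1},
 "cluster": {"leader": "n1"}}
""")
print(summarize_stream(sample))
```

Collecting this summary for every stream puts message count, subject count, duplicate window, and current leader side by side, which makes it easier to relate a memory profile to a specific stream.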
JetStream RAM usage can come from deduplication, file-store caching, per-subject tracking, stream and consumer state, and cluster leadership responsibilities. Disk-backed streams still use memory, and process memory may not immediately fall when clients stop.
If you are using Nats-Msg-Id, the duplicate window is a real tuning knob, but reducing it trades away server-side duplicate protection. If you are not using Nats-Msg-Id, deduplication state is probably not the source of high RAM usage.
For unusually high or asymmetric memory use, capture a NATS server memory profile while the issue is occurring. It is the fastest way to move from speculation to evidence.