NATS Min Sources: Detecting Missing Stream Sources

Severity: Critical
Category: Health
Applies to: JetStream
Check ID: JETSTREAM_019
Detection threshold: source count below the operator-defined io.nats.monitor.min-sources value

A JetStream stream can aggregate messages from multiple upstream source streams into a single unified stream. When the number of configured sources drops below the operator-defined minimum — set via the io.nats.monitor.min-sources metadata tag — this check fires. A missing source means the aggregated stream is no longer receiving a complete picture of upstream data, which typically indicates an operational problem: a source stream was deleted, renamed, or its configuration was changed without updating the downstream aggregate.

Why this matters

Source-backed streams are the backbone of cross-account, cross-cluster, and cross-domain data aggregation in NATS. A common pattern is to create a central analytics or audit stream that sources from several upstream streams — one per service, region, or tenant. If one of those sources disappears, the aggregate stream silently stops receiving data from that origin. There is no automatic error or backpressure signal; the stream simply has a gap.

In production, this creates several problems. Analytics dashboards that rely on the aggregate stream will undercount events. Alerting systems watching the aggregate for anomalies may miss signals from the missing source entirely. Compliance and audit requirements that depend on complete data capture are violated. The longer the gap persists, the harder it becomes to backfill, because the upstream source stream — if it still exists — may have already discarded messages based on its own retention policy.

The insidious part is that a missing source is easy to overlook. The aggregate stream keeps working; it just has fewer messages. If the missing source contributed a small fraction of total traffic, the volume change may not be noticeable in high-level metrics. Explicit source-count monitoring, which is exactly what this check provides, catches the problem as soon as the check next runs, before data loss accumulates.

Common causes

  • Upstream source stream was deleted. An operator or automated process removed a source stream without updating the aggregate stream’s configuration. This is the most common cause, especially in environments where stream lifecycle management is handled by separate teams or CI/CD pipelines.

  • Source stream was renamed. JetStream does not support renaming streams directly — the old stream is deleted and a new one is created. If the aggregate stream’s sources reference the old name, the source link breaks silently.

  • Cross-account or cross-cluster connectivity loss. If the source stream is in a different account or cluster accessed via a leafnode or gateway, connectivity disruptions can stall replication. The source stays in the configuration, so the source count does not change and this check does not fire; the symptom is rising lag or a growing active value on that source, with the same data gap as a missing source.

  • Configuration drift during deployment. A new version of the aggregate stream was deployed with an incomplete sources list. A YAML or JSON config file was edited and a source entry was accidentally removed.

  • Source stream exists but its subject filter no longer matches. A source can be configured with a subject filter or subject transform. If the upstream stream’s subjects changed, the filter may no longer match any messages, leaving the source effectively dead even though it is still configured and still counted.

How to diagnose

Check the current source count

Use the NATS CLI to inspect the stream and count its sources:

nats stream info AGGREGATE_STREAM --json | jq '.config.sources | length'

Compare this value against your expected minimum. If you have set the io.nats.monitor.min-sources metadata tag, you can view it with:

nats stream info AGGREGATE_STREAM --json | jq '.config.metadata'
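
The two lookups above can be folded into a single comparison. The following is a minimal Python sketch, assuming the JSON printed by `nats stream info AGGREGATE_STREAM --json` has already been parsed into a dictionary and that the tag value is stored as a string. The function name `below_min_sources` and the sample stream names are illustrative, not part of any NATS tooling.

```python
import json

MIN_SOURCES_TAG = "io.nats.monitor.min-sources"


def below_min_sources(stream_info):
    """Return True when the configured source count is under the tagged minimum."""
    config = stream_info.get("config", {})
    # Either key may be absent or null in the JSON, so fall back to empty values.
    sources = config.get("sources") or []
    metadata = config.get("metadata") or {}
    minimum = metadata.get(MIN_SOURCES_TAG)
    if minimum is None:
        return False  # no tag set, so there is nothing to compare against
    return len(sources) < int(minimum)


if __name__ == "__main__":
    # Hypothetical sample shaped like the relevant part of the CLI output.
    sample = json.loads("""
    {"config": {"name": "AGGREGATE_STREAM",
                "sources": [{"name": "EVENTS_A"}, {"name": "EVENTS_B"}],
                "metadata": {"io.nats.monitor.min-sources": "3"}}}
    """)
    print(below_min_sources(sample))
```

A stream without the tag is treated as passing, which mirrors the idea that the threshold is operator-defined rather than implied.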

List all configured sources

nats stream info AGGREGATE_STREAM --json | jq '.config.sources[] | .name'

Compare this list against the expected set of upstream source streams. Identify which source is missing.
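
The comparison against the expected set is a plain set difference. A minimal sketch, assuming the stream info JSON has been parsed into a dictionary and that the expected source names come from your own inventory; `missing_sources` and the `EVENTS_*` names are hypothetical.

```python
def missing_sources(stream_info, expected):
    """Return the expected source names that are not configured on the stream."""
    sources = stream_info.get("config", {}).get("sources") or []
    configured = {src["name"] for src in sources}
    # Sorted so the result is stable when printed or compared.
    return sorted(set(expected) - configured)


if __name__ == "__main__":
    # Hypothetical sample: EVENTS_C was expected but is no longer configured.
    info = {"config": {"sources": [{"name": "EVENTS_A"}, {"name": "EVENTS_B"}]}}
    print(missing_sources(info, ["EVENTS_A", "EVENTS_B", "EVENTS_C"]))
```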

Verify source streams exist

For each expected source, confirm it still exists:

nats stream ls --json | jq '.[].config.name'

If a source stream is missing from this list, it was deleted or renamed, or it lives in an account or JetStream domain that your current CLI context cannot see.

Check source replication status

The stream info endpoint reports the state of each source, including lag and last seen time:

nats stream info AGGREGATE_STREAM --json | jq '.sources'

Each source entry shows lag (number of messages behind) and active (time since last message received). A source with high lag or a large active value may indicate connectivity problems even though the source is still configured.
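
The lag and active fields can be screened automatically. This is a minimal sketch under stated assumptions: the stream info JSON is already parsed, `active` is serialized as an integer number of nanoseconds (negative when the source has never been seen), and the thresholds of 1000 messages and 5 minutes are placeholders to tune for your traffic. `stalled_sources` is a hypothetical helper name.

```python
from datetime import timedelta


def stalled_sources(stream_info, max_lag=1000, max_idle=timedelta(minutes=5)):
    """Return names of sources that look stalled by lag or idle time."""
    # Assumption: "active" is an integer count of nanoseconds in the JSON output.
    idle_limit_ns = int(max_idle.total_seconds() * 1_000_000_000)
    flagged = []
    for src in stream_info.get("sources") or []:
        lag = src.get("lag", 0)
        # A missing or negative value is treated as "never seen".
        active_ns = src.get("active", -1)
        if lag > max_lag or active_ns < 0 or active_ns > idle_limit_ns:
            flagged.append(src["name"])
    return flagged


if __name__ == "__main__":
    # Hypothetical sample: EVENTS_B is far behind, EVENTS_C has been idle for 10 minutes.
    info = {"sources": [
        {"name": "EVENTS_A", "lag": 0, "active": 1_000_000_000},
        {"name": "EVENTS_B", "lag": 5000, "active": 1_000_000_000},
        {"name": "EVENTS_C", "lag": 0, "active": 600_000_000_000},
    ]}
    print(stalled_sources(info))
```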

Programmatic detection in Go

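// Assumes nc is an established *nats.Conn and that "log" is imported.
js, err := nc.JetStream()
if err != nil {
	log.Fatalf("jetstream context: %v", err)
}
info, err := js.StreamInfo("AGGREGATE_STREAM")
if err != nil {
	log.Fatalf("stream info: %v", err)
}

minSources := 3 // your expected minimum
actualSources := len(info.Config.Sources)

if actualSources < minSources {
	log.Printf("ALERT: stream has %d sources, expected at least %d",
		actualSources, minSources)
	// List what is still configured so the missing source is easy to spot.
	for _, src := range info.Config.Sources {
		log.Printf("  configured source: %s", src.Name)
	}
}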

Programmatic detection in Python
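
A minimal Python counterpart to the Go example, assuming the `nats-py` client package is installed and a server is reachable at the URL shown. The URL, the stream name, and the helper names `source_names` and `check_min_sources` are assumptions for illustration.

```python
import asyncio


def source_names(config):
    """Names of the sources on a stream config object; empty when none are set."""
    return [src.name for src in (config.sources or [])]


async def check_min_sources(stream, min_sources, url="nats://localhost:4222"):
    """Return True when the stream has at least min_sources configured sources."""
    # Imported here so source_names stays usable without the client installed.
    import nats  # assumption: the nats-py package

    nc = await nats.connect(url)
    try:
        js = nc.jetstream()
        info = await js.stream_info(stream)
    finally:
        await nc.close()

    names = source_names(info.config)
    if len(names) < min_sources:
        print(f"ALERT: stream has {len(names)} sources, expected at least {min_sources}")
        # List what is still configured so the missing source is easy to spot.
        for name in names:
            print(f"  configured source: {name}")
        return False
    return True
```

To run the check once, call `asyncio.run(check_min_sources("AGGREGATE_STREAM", 3))` with your own stream name and minimum.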

How to fix it

Immediate: restore the missing source

If the source stream still exists but was removed from the aggregate’s configuration, add it back:

nats stream edit AGGREGATE_STREAM --source MISSING_SOURCE_STREAM

If the source stream was deleted, recreate it first:

nats stream add MISSING_SOURCE_STREAM \
  --subjects "events.service_a.>" \
  --retention limits \
  --max-msgs -1 \
  --max-bytes 10GB \
  --storage file \
  --replicas 3

Then add it as a source to the aggregate:

nats stream edit AGGREGATE_STREAM --source MISSING_SOURCE_STREAM

Short-term: audit all aggregate streams

List all streams that use sources and verify each has the expected count:

for stream in $(nats stream ls -n); do
  count=$(nats stream info "$stream" --json 2>/dev/null | jq '.config.sources | length')
  # Treat a failed lookup as zero sources so the comparison below stays valid
  if [ "${count:-0}" -gt 0 ]; then
    echo "$stream: $count sources"
  fi
done

Cross-reference with your infrastructure-as-code or documentation to identify any other streams with missing sources.

Long-term: codify source configuration

Use infrastructure-as-code. Define all stream configurations — including sources lists — in version-controlled configuration files. Use tools like Terraform with the NATS provider, or a custom CI/CD pipeline that applies stream configs declaratively. This prevents accidental source removal during manual edits.

Set the io.nats.monitor.min-sources metadata tag. This enables automated monitoring to catch source-count drops before they impact downstream consumers:

nats stream edit AGGREGATE_STREAM \
  --metadata "io.nats.monitor.min-sources=3"

Implement lifecycle hooks. Before deleting any stream, check whether it is referenced as a source by other streams. A pre-deletion check prevents accidentally breaking aggregate streams:

# Check if any stream sources from TARGET_STREAM
nats stream ls --json | jq -r '.[] | select(.config.sources[]?.name == "TARGET_STREAM") | .config.name'
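
The same pre-deletion check can be expressed over parsed stream info documents, which is convenient inside a deployment script. A minimal sketch, assuming you have already collected one parsed `nats stream info NAME --json` document per stream into a list; `dependents_of` and the sample stream names are hypothetical.

```python
def dependents_of(target, stream_infos):
    """Return names of streams that list target as one of their sources."""
    dependents = []
    for info in stream_infos:
        config = info.get("config", {})
        sources = config.get("sources") or []
        if any(src.get("name") == target for src in sources):
            dependents.append(config.get("name"))
    return sorted(dependents)


if __name__ == "__main__":
    # Hypothetical inventory: only ANALYTICS sources from TARGET_STREAM.
    infos = [
        {"config": {"name": "ANALYTICS", "sources": [{"name": "TARGET_STREAM"}]}},
        {"config": {"name": "AUDIT", "sources": [{"name": "OTHER_STREAM"}]}},
        {"config": {"name": "TARGET_STREAM"}},
    ]
    blockers = dependents_of("TARGET_STREAM", infos)
    if blockers:
        print("Do not delete; still sourced by:", ", ".join(blockers))
```

An empty result means no stream in the inspected account references the target as a source; streams in other accounts or domains are outside what this sketch can see.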

Monitor source health continuously. Beyond just counting sources, monitor the active and lag fields for each source. A source that exists but hasn’t delivered messages in an unexpectedly long time is functionally equivalent to a missing source.

Frequently asked questions

What is a stream source in NATS JetStream?

A stream source is a configuration that tells one stream to replicate messages from another stream. Unlike mirrors (which create a read-only copy of a single stream), sources allow a stream to aggregate messages from multiple upstream streams. The downstream stream can also have its own direct subject bindings in addition to sourced data. Sources are commonly used for cross-account aggregation, regional roll-ups, and building unified audit or analytics streams.

How is this different from the Max Sources check (JETSTREAM_020)?

Min Sources (JETSTREAM_019) fires when the source count drops below a minimum threshold, indicating a source was lost. Max Sources (JETSTREAM_020) fires when the source count exceeds a maximum, indicating an unexpected source was added. Together, they provide a bounds check on source configuration. Use both when you have a strict expectation of exactly how many sources a stream should have.

Can a source stream be in a different NATS cluster?

Yes. Sources can replicate data across clusters, accounts, and even JetStream domains. The source configuration supports specifying an external API prefix and delivery subject for cross-domain replication. When using cross-cluster sources, network connectivity issues between clusters can cause source replication to stall without removing the source from the configuration — which this check would not catch. Use the Mirror Lag Critical and Mirror Seen Critical checks for detecting stalled replication.

What happens to existing messages if I add a source back?

When you add a source to a stream, NATS replicates from what the source stream currently retains. Messages that were published to the source stream while it was not configured as a source are replicated if they are still available there (not yet removed by retention policy). If the source stream has already discarded those messages, they are permanently lost for the aggregate stream.

Should I use the min-sources tag or external monitoring?

Both. The io.nats.monitor.min-sources metadata tag enables Synadia Insights to detect source-count drops automatically as part of its continuous monitoring. External monitoring via Prometheus or custom scripts provides defense-in-depth and can integrate with your existing alerting infrastructure. The metadata tag requires no additional infrastructure to deploy.
