The mirror stream has not received any data from its source within the operator-defined io.nats.monitor.seen-critical time window. Unlike JETSTREAM_015 (Mirror Last Seen Staleness), which uses a built-in 5-minute heuristic, this check fires when inactivity exceeds a threshold you explicitly set — meaning the mirror has been silent longer than you’ve determined is acceptable for this workload.
The “last seen” timestamp on a mirror stream indicates when the internal mirror consumer last received a message from the source. When this timestamp exceeds your critical threshold, the mirror has effectively stopped replicating. Unlike lag-based checks that measure how far behind the mirror is in message count, the seen-critical check measures how long the mirror has been disconnected from the data flow entirely.
This distinction matters for two reasons. First, a mirror can have zero lag and still be stale if the source was idle when the mirror disconnected — lag measures message count difference, not time since last activity. Second, a mirror with high lag is at least still receiving messages (slowly). A mirror that hasn’t been “seen” is receiving nothing at all.
For disaster recovery, the seen-critical threshold defines your maximum tolerable replication blackout. If you set io.nats.monitor.seen-critical to 120 seconds and the mirror hasn’t been seen in 3 minutes, you know the mirror is at least 3 minutes stale — and possibly much more, depending on source publish rate during that window. Every second beyond the threshold increases your potential data loss in a failover.
For compliance and SLA scenarios, the seen-critical threshold provides an auditable guarantee. You can configure it to match your RPO requirements and know that any violation of this threshold will generate an alert, regardless of other monitoring conditions.
Network connectivity loss between source and mirror clusters. Gateway or leaf node connections between the clusters hosting the source and mirror have dropped. This is the most common cause — the mirror consumer cannot reach the source, so it receives nothing.
Source stream deleted or reconfigured. If the source stream is deleted, the mirror has nothing to consume from. Similarly, if the source is reconfigured in a way that invalidates the mirror’s internal consumer (e.g., changing subjects in a way that breaks the mirror filter), replication stops.
Authentication or authorization change. If account credentials, tokens, or permissions change on the source cluster, the mirror’s internal consumer may lose authorization to read from the source stream. The connection fails silently — no new messages arrive.
Internal mirror consumer failure. The NATS server’s internal consumer that drives mirror replication may crash or enter an unrecoverable error state. This is similar to JETSTREAM_015 but detected by the operator-defined time threshold rather than the built-in heuristic.
Source cluster unavailable. If the entire source cluster is down — maintenance, outage, or network isolation — all mirrors sourcing from it will stop receiving messages. This is expected during planned maintenance but should still trigger alerts to confirm mirrors resume afterward.
DNS resolution failure for cross-cluster connections. If the mirror uses DNS-based gateway or leaf node URLs and DNS resolution fails or returns stale results, the mirror consumer cannot establish a connection to the source.
```shell
nats stream info MIRROR_STREAM_NAME
```

Check the Mirror section:
```
Mirror Information:
  Stream Name: SOURCE_STREAM
  Lag: 12,847
  Last Seen: 4m18s
```

A “Last Seen” exceeding your configured threshold confirms the check condition.
```shell
nats stream info MIRROR_STREAM_NAME --json | jq '.config.metadata["io.nats.monitor.seen-critical"]'
```

This returns the threshold in seconds (e.g., `"120"` for 2 minutes).
```shell
# Check gateway connections to the source cluster
nats server report gateways
```
```shell
# Check leaf node connections if the mirror uses leaf nodes
nats server report leafnodes
```

If the source cluster doesn’t appear in gateway or leaf node reports, the connectivity path is broken.
```shell
# From a client connected to the source cluster
nats stream info SOURCE_STREAM_NAME
```

If the source stream is deleted or the command fails, the mirror has nothing to replicate from.
```shell
# Look for auth-related errors on the mirror's server
grep -i "authorization\|permission\|auth" /var/log/nats-server.log | tail -20
```

```go
package main

import (
	"fmt"
	"log"
	"strconv"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	for name := range js.StreamNames() {
		info, err := js.StreamInfo(name)
		if err != nil {
			log.Printf("error: %v", err)
			continue
		}
		// Only mirror streams carry mirror state.
		if info.Mirror == nil {
			continue
		}
		thresholdStr := info.Config.Metadata["io.nats.monitor.seen-critical"]
		if thresholdStr == "" {
			continue
		}
		thresholdSec, err := strconv.ParseFloat(thresholdStr, 64)
		if err != nil {
			log.Printf("invalid seen-critical on %s: %v", name, err)
			continue
		}
		threshold := time.Duration(thresholdSec) * time.Second
		// Mirror.Active is the time since the source was last seen.
		if info.Mirror.Active > threshold {
			fmt.Printf("CRITICAL: mirror %s last seen %s ago (threshold: %s)\n",
				name, info.Mirror.Active.Round(time.Second), threshold)
		}
	}
}
```

Check and restore gateway/leaf node connections. If the connectivity path between clusters is broken, restoring it is the first priority:
```shell
# Verify gateways are configured and connected
nats server report gateways
```
```shell
# If a gateway is down, check server config and restart if needed
nats server config reload <server-id>
```

Step down the mirror’s leader. This forces a new leader election and recreation of the internal mirror consumer with a fresh connection to the source:
```shell
nats stream cluster step-down MIRROR_STREAM_NAME
```

After step-down, monitor for resumed activity:
```shell
watch -n 5 'nats stream info MIRROR_STREAM_NAME --json | jq "{lag: .mirror.lag, active: .mirror.active_ns}"'
```

If the source stream was deleted or reconfigured: Recreate the source stream or update the mirror configuration to point to the correct source:
```shell
# Delete and recreate the mirror with the correct source
nats stream delete MIRROR_STREAM -f
nats stream add MIRROR_STREAM --mirror CORRECT_SOURCE_STREAM
```

If authentication changed: Update credentials on the mirror cluster to match the source cluster’s current authentication requirements. Reload the server configuration:
```shell
nats server config reload <server-id>
```

If DNS resolution is failing: Verify DNS records for the source cluster’s gateway URLs. Consider using IP-based gateway URLs for critical mirror configurations to eliminate DNS as a failure point.
Monitor gateway health independently. Don’t rely solely on mirror activity to detect cross-cluster connectivity issues. Monitor gateway connections, RTT, and throughput as separate signals.
Set appropriate seen-critical thresholds. The threshold should be longer than the maximum expected gap between messages on the source stream. If the source publishes at least once per minute, a 120-second threshold is reasonable. If the source publishes in bursts with long idle periods, set the threshold accordingly or use lag-based monitoring instead.
```shell
nats stream edit MIRROR_STREAM --metadata "io.nats.monitor.seen-critical=120"
```

Implement connectivity health checks between clusters. Use NATS service pings or a dedicated heartbeat subject that publishes at a known interval. If the heartbeat stops arriving at the mirror cluster, you know connectivity is broken before any mirror-specific check fires.
Document failover procedures. Since seen-critical violations indicate the mirror is stale, your runbook should include steps to assess data loss before failing over: compare the mirror’s last sequence with the source’s current sequence (if the source is still reachable from another path) to quantify the gap.
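The gap assessment itself is simple sequence arithmetic; a sketch under assumed values (the sequence numbers are illustrative — in practice you would read `.state.last_seq` from `nats stream info --json` on each side):

```go
package main

import "fmt"

func main() {
	// Hypothetical values taken from `nats stream info --json`:
	// .state.last_seq on the source and on the mirror.
	sourceLastSeq := uint64(182_440)
	mirrorLastSeq := uint64(179_112)

	// Messages the mirror never received — the data you would
	// lose by failing over right now.
	gap := sourceLastSeq - mirrorLastSeq
	fmt.Printf("failover would lose %d messages\n", gap)
	// → failover would lose 3328 messages
}
```

If the source is unreachable from every path, this comparison is impossible, which is exactly why the runbook should record the mirror's last sequence at alert time.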
JETSTREAM_015 uses a built-in 5-minute heuristic and only fires when the mirror reports zero lag (suggesting the internal consumer thinks it’s caught up but has actually stalled). JETSTREAM_018 uses your explicitly configured io.nats.monitor.seen-critical threshold and fires regardless of lag value. JETSTREAM_018 is the operator-defined version — you control when it’s critical.
They measure different dimensions of mirror health. lag-critical (JETSTREAM_017) measures message count difference — how far behind the mirror is. seen-critical (this check) measures time since last activity — how long the mirror has been disconnected. A mirror can have low lag but high seen time (source went idle after mirror caught up, then mirror disconnected). It can also have high lag but low seen time (mirror is connected but slow). Both thresholds should be configured for comprehensive mirror monitoring.
If the source stream has no new messages, the mirror won’t receive any either. However, the NATS internal mirror consumer maintains a heartbeat with the source even when no messages flow. The “last seen” timer resets on heartbeats, not just data messages. If seen-critical fires during a legitimately idle source, the heartbeat itself has stopped — indicating a connectivity issue, not a data flow issue.
Yes. The io.nats.monitor.seen-critical threshold is per-stream metadata. Each mirror stream can have its own threshold based on its criticality and RPO requirements:
```shell
# Critical payment mirror: 60-second threshold
nats stream edit PAYMENTS_MIRROR --metadata "io.nats.monitor.seen-critical=60"
```
```shell
# Analytics mirror: 10-minute threshold
nats stream edit ANALYTICS_MIRROR --metadata "io.nats.monitor.seen-critical=600"
```

For any mirror that serves a disaster recovery or real-time read purpose, yes. For mirrors used for batch analytics or non-time-sensitive processing, the built-in JETSTREAM_015 heuristic may be sufficient. The operator-defined threshold is most valuable when you have a specific RPO or SLA to enforce.