Automated Audit Checks

Insights continuously monitors your NATS deployment to surface operational issues and optimization opportunities before they become incidents.

All Checks

143 checks
| ID | Check | Severity | Category | Type | Description |
| --- | --- | --- | --- | --- | --- |
| ACCOUNTS_001 | Account Connection Limit | warning | Account | operational | Flags accounts where connections are at or above 90% of the configured limit. |
| ACCOUNTS_002 | Slow Consumers | critical | Account | operational | Flags accounts with new slow consumer events since the previous epoch, aggregated across servers. |
| ACCOUNTS_003 | Inactive JWT Import | critical | Account | operational | Detects imports declared in the account JWT but not activated by the server. Diagnoses root cause: missing activation token, expired token, token signed by rotated signing key, or source export not found. |
| ACCOUNTS_004 | Orphaned Export | warning | Account | operational | Flags exports with no matching importer in any account. Uses NATS wildcard subject matching. |
| ACCOUNTS_005 | No Subscription Interest | info | Account | operational | Finds active imports where no client in the importing account subscribes to the imported subject. Uses NATS wildcard subject matching. |
| ACCOUNTS_006 | Account Subscription Limit | warning | Account | operational | Flags accounts where subscriptions are at or above 90% of the configured limit. |
| CHANGE_001 | Config Reload Detected | info | Change | operational | Detects servers whose configuration was reloaded by comparing config_load_time between consecutive epochs. |
| CHANGE_002 | JetStream Domain Changed | warning | Change | operational | Detects servers whose JetStream domain value changed between consecutive epochs. |
| CHANGE_003 | Account Added or Removed | info | Change | operational | Detects accounts that appeared or disappeared between consecutive epochs. |
| CHANGE_004 | Stream Configuration Changed | info | Change | operational | Detects streams whose configuration fields (replicas, retention, limits) changed between consecutive epochs. |
| CLUSTER_001 | Memory Usage Outlier | warning | Cluster | operational | Flags servers whose memory usage exceeds the configured multiplier of their cluster average. |
| CLUSTER_003 | High HA Assets | warning | Cluster | operational | Flags servers with 1000 or more highly-available JetStream assets. |
| CLUSTER_004 | Cluster Name Whitespace | warning | Cluster | operational | Flags servers whose cluster name contains whitespace characters. |
| CLUSTER_005 | Route Count Low | warning | Cluster | operational | Flags servers with fewer cluster routes than expected based on cluster size. |
| CLUSTER_006 | Connection Count Change | warning | Cluster | operational | Flags servers where the connection count changed dramatically between epochs, indicating a significant increase or decrease in connected clients. |
| CLUSTER_007 | Gateway Disconnection | critical | Cluster | operational | Flags servers that lost a gateway connection since the previous epoch. |
| CLUSTER_008 | Gateway Config Mismatch | warning | Cluster | operational | Flags servers whose set of gateway connections differs from the cluster majority. |
| CONN_001 | High Client RTT | warning | Connection | operational | Flags client connections with round-trip time exceeding 100 ms. |
| CONN_002 | Client Pending Pressure | warning | Connection | operational | Flags client connections with more than 1 MiB of pending bytes. |
| CONN_003 | Connection Stopped | info | Connection | operational | Flags connections that disconnected with a non-empty reason. |
| CONSUMER_001 | Consumer Replica Offline | critical | Consumer | operational | Flags consumer replicas that are reported as offline. |
| CONSUMER_002 | Consumer Replica Lag | warning | Consumer | operational | Flags consumer replicas lagging by more than 1000 operations behind the leader. |
| CONSUMER_003 | Consumer Quorum Lost | critical | Consumer | operational | Flags replicated consumers where enough replicas are offline to lose quorum. |
| CONSUMER_004 | Consumer Delivered Below Stream First Sequence | critical | Consumer | operational | Flags consumers whose last delivered position is below the stream's first sequence after a purge or truncation. |
| CONSUMER_005 | Consumer Sequence Ahead of Stream Sequence | critical | Consumer | operational | Flags consumers whose delivered position is ahead of the stream's last sequence. |
| CONSUMER_006 | Outstanding Ack Critical | critical | Consumer | operational | Flags consumers where num_ack_pending exceeds the operator-defined threshold. |
| CONSUMER_007 | Waiting Critical | critical | Consumer | operational | Flags consumers where num_waiting exceeds the operator-defined threshold. |
| CONSUMER_008 | Unprocessed Critical | critical | Consumer | operational | Flags consumers where num_pending exceeds the operator-defined threshold. |
| CONSUMER_009 | Last Delivery Critical | critical | Consumer | operational | Flags consumers where the time since the last delivery exceeds the operator-defined threshold. |
| CONSUMER_010 | Last Ack Critical | critical | Consumer | operational | Flags consumers where the time since the last acknowledgment exceeds the operator-defined threshold. |
| CONSUMER_011 | Redelivery Critical | critical | Consumer | operational | Flags consumers where num_redelivered exceeds the operator-defined threshold. |
| CONSUMER_012 | Pinned Consumer Policy Mismatch | critical | Consumer | operational | Flags consumers with io.nats.monitor.pinned metadata that are not using the overflow priority policy. |
| JETSTREAM_001 | Stream Replica Lag | warning | JetStream | operational | Flags stream replicas whose last sequence number is more than 10% behind the leader. |
| JETSTREAM_002 | High Subject Cardinality | warning | JetStream | operational | Flags streams with one million or more unique subjects. |
| JETSTREAM_003 | Stream Message Limit | warning | JetStream | operational | Flags streams where message count is at or above 90% of the limit. |
| JETSTREAM_004 | JS API Request Rate High | warning | JetStream | operational | Flags when the JetStream API request rate exceeds the threshold. |
| JETSTREAM_005 | JS API Pending High | warning | JetStream | operational | Flags servers where JetStream API inflight requests exceed the threshold. |
| JETSTREAM_006 | Consumer Count Change | warning | JetStream | operational | Flags when the total consumer count change between epochs exceeds the threshold, indicating a significant increase or decrease. |
| JETSTREAM_007 | JetStream Memory Utilization Critical | critical | JetStream | operational | Flags servers where JetStream memory usage exceeds the critical threshold. |
| JETSTREAM_008 | Stream Quorum Lost | critical | JetStream | operational | Flags replicated streams where enough replicas are offline to lose quorum. |
| JETSTREAM_009 | JS API Error Rate High | warning | JetStream | operational | Flags servers where JetStream API errors exceed a percentage of total requests. |
| JETSTREAM_010 | Stream Byte Limit | warning | JetStream | operational | Flags streams where byte usage is at or above 90% of the limit. |
| JETSTREAM_011 | Stream Consumer Limit | warning | JetStream | operational | Flags streams where consumer count is at or above 90% of the limit. |
| JETSTREAM_012 | JetStream Storage Utilization Critical | critical | JetStream | operational | Flags servers where JetStream storage usage exceeds the critical threshold. |
| JETSTREAM_013 | Stream Subject/Message Count Inconsistency | warning | JetStream | operational | Flags streams where the number of unique subjects exceeds the total message count. An invariant violation indicating filestore corruption. |
| JETSTREAM_014 | Stream Replica Message Count Divergence | critical | JetStream | operational | Flags replicated streams where all replicas report current but have significantly different message counts, indicating filestore corruption or raft state reset. |
| JETSTREAM_015 | Mirror Last Seen Staleness | warning | JetStream | operational | Flags mirror streams where the mirror consumer has stalled. Zero lag but no activity while the source stream continues receiving messages. |
| JETSTREAM_016 | JetStream Storage vs Configured Limit Critical | critical | JetStream | operational | Flags servers where JetStream storage usage critically exceeds the configured max_store limit, risking imminent Raft failures. |
| JETSTREAM_017 | Mirror Lag Critical | critical | JetStream | operational | Flags mirror streams where mirror lag exceeds the operator-defined io.nats.monitor.lag-critical threshold. |
| JETSTREAM_018 | Mirror Seen Critical | critical | JetStream | operational | Flags mirror streams where the time since the mirror was last active exceeds the operator-defined io.nats.monitor.seen-critical threshold. |
| JETSTREAM_019 | Min Sources | critical | JetStream | operational | Flags streams where the source count is below the operator-defined io.nats.monitor.min-sources threshold. |
| JETSTREAM_020 | Max Sources | critical | JetStream | operational | Flags streams where the source count exceeds the operator-defined io.nats.monitor.max-sources threshold. |
| JETSTREAM_021 | Peer Expect | critical | JetStream | operational | Flags streams where the actual peer count does not match the operator-defined io.nats.monitor.peer-expect threshold. |
| JETSTREAM_022 | Peer Lag Critical | critical | JetStream | operational | Flags stream replicas where lag exceeds the operator-defined io.nats.monitor.peer-lag-critical threshold. |
| JETSTREAM_023 | Peer Seen Critical | critical | JetStream | operational | Flags stream replicas where the time since the replica was last active exceeds the operator-defined io.nats.monitor.peer-seen-critical threshold. |
| JETSTREAM_024 | Message Count Threshold | warning | JetStream | operational | Flags streams where message count exceeds operator-defined thresholds. Direction is inferred from threshold ordering. |
| JETSTREAM_025 | Subject Count Threshold | warning | JetStream | operational | Flags streams where subject count exceeds operator-defined thresholds. Direction is inferred from threshold ordering. |
| LEAF_001 | Leafnode Name Whitespace | warning | Leafnode | operational | Flags leafnode connections whose remote server name contains whitespace. |
| LEAF_002 | High Leaf RTT | warning | Leafnode | operational | Flags leafnode connections with round-trip time exceeding the threshold. |
| LEAF_003 | Leafnode Subscription Count High | warning | Leafnode | operational | Flags leafnode connections carrying a large number of subscriptions, which can cause hub processing to exceed the stale connection timeout. |
| META_001 | Offline Replica | critical | Meta Cluster | operational | Flags meta cluster replicas that are reported as offline. |
| META_002 | Leader Disagreement | critical | Meta Cluster | operational | Flags when multiple servers report themselves as the meta cluster leader. |
| META_003 | Meta Leader Flapping | warning | Meta Cluster | operational | Flags when the meta cluster leader has changed more than the allowed number of times in the recent time window. |
| META_004 | Meta Snapshot Slow | warning | Meta Cluster | operational | Flags when the meta cluster snapshot duration exceeds the warning or critical threshold. |
| META_005 | Meta State Growth | warning | Meta Cluster | operational | Flags when the total number of JetStream asset replicas exceeds the threshold. |
| META_006 | Meta Quorum Lost | critical | Meta Cluster | operational | Flags when enough meta cluster peers are offline to lose quorum. |
| META_007 | Even Cluster Size | warning | Meta Cluster | operational | Flags when the meta cluster has an even number of peers. |
| META_008 | Meta Pending High | warning | Meta Cluster | operational | Flags when the meta cluster leader has a high number of pending Raft operations. |
| META_009 | Meta Cluster Size Decreased | critical | Meta Cluster | operational | Flags when the meta cluster size has decreased between consecutive epochs, indicating a peer was removed or lost. |
| OPT_ACCT_001 | Account Storage Quota Approaching Limit | warning | Account | optimization | Flags accounts where JetStream storage reservations approach the configured quota. |
| OPT_ACCT_002 | Excessive JWT Size | warning | Account | optimization | Flags accounts with unusually large JWT claims, indicating excessive permissions or revocations. |
| OPT_BALANCE_001 | Uneven Leader Distribution | info | Balance | optimization | Flags servers hosting disproportionately many stream and consumer leaders. |
| OPT_BALANCE_002 | Connection Hotspot | info | Balance | optimization | Flags servers with more than double the cluster average connections. |
| OPT_BALANCE_003 | Subscription Hotspot | info | Balance | optimization | Flags servers with more than double the cluster average subscriptions. |
| OPT_BALANCE_004 | Stream Replica Count Imbalance | info | Balance | optimization | Flags servers hosting disproportionately many stream replicas. |
| OPT_BALANCE_005 | JetStream Storage Skew | info | Balance | optimization | Flags servers whose JetStream storage exceeds double the cluster average. |
| OPT_BALANCE_006 | Account Connection Concentration | info | Balance | optimization | Flags servers hosting more than 70% of an account's connections. |
| OPT_BALANCE_007 | Stream-Consumer Leader Co-location | info | Balance | optimization | Flags streams where the stream leader's server also hosts a disproportionate share of consumer leaders. |
| OPT_BALANCE_008 | JetStream Storage Saturation with Skew | warning | Balance | optimization | Flags servers with high JetStream storage utilization where the cluster also exhibits significant storage skew between nodes. |
| OPT_COST_001 | Over-Replicated Inactive Stream | info | Cost | optimization | Flags R3+ streams with no new messages across the selected time range. |
| OPT_COST_002 | Memory Storage Large Stream | info | Cost | optimization | Flags memory-backed streams using more than 100 MiB. |
| OPT_COST_003 | Wasted JetStream Memory Reservation | info | Cost | optimization | Flags servers where JetStream memory usage is below 20% of reserved capacity. |
| OPT_COST_004 | Uncompressed Large Stream | info | Cost | optimization | Flags file-backed streams exceeding 1 GiB with no compression enabled. |
| OPT_COST_005 | Wasted JetStream Storage Reservation | info | Cost | optimization | Flags servers where JetStream storage usage is below 20% of reserved capacity. |
| OPT_IDLE_001 | Underutilized Server | info | Idle Resources | optimization | Flags servers that remained nearly idle across the selected time range. |
| OPT_IDLE_002 | Inactive Stream | info | Idle Resources | optimization | Flags unsealed streams that received no new messages across the time range. |
| OPT_IDLE_003 | Inactive Consumer | info | Idle Resources | optimization | Flags consumers that made no delivery progress across the time range. |
| OPT_IDLE_004 | Drained Consumer | info | Idle Resources | optimization | Flags consumers fully caught up with zero pending on an inactive stream. |
| OPT_IDLE_005 | Inactive Account | info | Idle Resources | optimization | Flags non-system accounts with no connections or throughput for the configured inactivity threshold (default 24h). |
| OPT_IDLE_006 | Disconnected Users | info | Idle Resources | optimization | Flags non-system account users with no active connections at the current epoch. |
| OPT_IDLE_007 | Idle Client Connections | info | Idle Resources | optimization | Flags client connections idle for more than 5 minutes with zero messages. |
| OPT_PLACE_001 | Cross-Cluster Stream Access | info | Placement | optimization | Flags accounts with clients in clusters that have no local stream leaders. |
| OPT_PLACE_002 | Consumer Leader Not Co-located | info | Placement | optimization | Flags consumers whose leader is in a different cluster than the majority of connections. |
| OPT_PLACE_003 | High Gateway Traffic Ratio | info | Placement | optimization | Flags accounts where more than 30% of traffic is cross-cluster gateway traffic. |
| OPT_PLACE_004 | Gateway Interest Mode | info | Placement | optimization | Flags gateway account combinations still using optimistic interest mode. |
| OPT_SYS_001 | Streams Without Limits | info | System Improvement | optimization | Flags streams with no message, byte, or age retention limits. |
| OPT_SYS_002 | High Consumer Redelivery | warning | System Improvement | optimization | Flags consumers with a redelivery rate exceeding 10%. |
| OPT_SYS_003 | Ack Pending Buildup | warning | System Improvement | optimization | Flags consumers approaching their maximum ack pending limit. |
| OPT_SYS_004 | Unbound Push Consumer | warning | System Improvement | optimization | Flags push consumers with no subscriber currently bound. |
| OPT_SYS_005 | Route Pending Pressure | warning | System Improvement | optimization | Flags route connections with more than 1 MiB of pending data. |
| OPT_SYS_006 | Leaf Compression Disabled | info | System Improvement | optimization | Flags leaf connections with compression disabled. |
| OPT_SYS_007 | Raft Apply Lag | warning | System Improvement | optimization | Flags Raft groups where committed-applied gap exceeds 100 entries. |
| OPT_SYS_008 | Unlimited JetStream Account | info | System Improvement | optimization | Flags non-system accounts with JetStream enabled but no storage limits. |
| OPT_SYS_009 | Leaderless Raft Group | critical | System Improvement | optimization | Raft group has no elected leader and cannot process writes. |
| OPT_SYS_010 | Raft IPQ Backpressure | warning | System Improvement | optimization | Internal queue lengths for a raft group exceed threshold, indicating processing backlog. |
| OPT_SYS_011 | Subscription Fanout Anomaly | info | System Improvement | optimization | Flags servers where max fanout is disproportionately higher than average fanout. |
| OPT_SYS_012 | Subscription Churn | info | System Improvement | optimization | Flags servers with excessive subscription insert and remove operations since the previous epoch. |
| OPT_SYS_013 | Raft Sustained Catching Up | warning | System Improvement | optimization | Flags Raft groups with a member in catching-up state. |
| OPT_SYS_014 | Gateway Pending Pressure | warning | System Improvement | optimization | Flags gateway connections with more than 1 MiB of pending data. |
| OPT_SYS_015 | Consumer ACK Floor Divergence | warning | System Improvement | optimization | Flags consumers where the gap between delivered position and ACK floor is disproportionately large relative to max_ack_pending, indicating interleaved acknowledgments. |
| OPT_SYS_016 | Direct Gets Disabled | info | System Improvement | optimization | Flags streams with allow_direct disabled, forcing read operations through the Raft consensus pipeline. |
| OPT_SYS_017 | Leafnode Auto Compression with High Count | info | System Improvement | optimization | Flags servers with many leafnode connections using s2_auto compression, which can create a CPU feedback loop under load. |
| OPT_SYS_018 | High Interior Deletes on Stream | warning | System Improvement | optimization | Flags streams with a very high number of interior deletes, causing disproportionate memory pressure during recovery and catch-up. |
| OPT_SYS_019 | Large Deduplication Window | warning | System Improvement | optimization | Flags streams with a deduplication window exceeding the threshold and active message flow, risking high memory consumption from the in-memory dedup map. |
| OPT_SYS_020 | KV Buckets Without max_age | info | System Improvement | optimization | Flags KV buckets with no max_age configured that have accumulated a large number of interior deletes (tombstones). |
| OPT_SYS_021 | R1 Streams in Multi-Node Clusters | info | System Improvement | optimization | Flags R1 (single-replica) streams in multi-node clusters that have no redundancy. |
| OPT_SYS_022 | Subscription Count Growth | info | System Improvement | optimization | Flags servers where subscriptions are growing monotonically without a corresponding increase in connections, indicating a subscription leak. |
| OPT_SYS_023 | Raft WAL Size Excessive | warning | System Improvement | optimization | Flags Raft groups with an excessively large write-ahead log, risking disk exhaustion and cascading OOM failures. |
| OPT_SYS_024 | WorkQueue Discard New with Aggressive Consumer Settings | warning | System Improvement | optimization | Flags WorkQueue streams using discard_policy=new where consumers have aggressive ack_wait or max_deliver settings, risking message loss. |
| OPT_SYS_025 | Sustained Consumer Growth on Stream | warning | System Improvement | optimization | Flags streams where consumer count has been growing steadily, indicating a consumer leak from ephemeral consumers. |
| OPT_SYS_026 | Raft Group Peer Count Mismatch | warning | System Improvement | optimization | Flags Raft groups where the observed peer count exceeds the expected replica count from stream or consumer configuration. |
| SERVER_001 | Connection Readiness Failure | critical | Server | operational | Flags servers reporting connection readiness failures via the healthz endpoint. |
| SERVER_002 | Server Version Mismatch | warning | Server | operational | Identifies servers running a different software version than the cluster majority. |
| SERVER_003 | High CPU Usage | warning | Server | operational | Flags servers where per-core CPU usage meets or exceeds the threshold. |
| SERVER_004 | Slow Consumers | critical | Server | operational | Flags servers with new slow consumer events since the previous epoch. |
| SERVER_005 | JetStream Memory Pressure | warning | Server | operational | Flags servers where JetStream memory usage is at or above 90% of reserved. |
| SERVER_006 | JetStream Domain Whitespace | warning | Server | operational | Flags servers whose JetStream domain name contains whitespace characters. |
| SERVER_007 | Authentication Not Required | critical | Server | operational | Flags servers that do not require client authentication. |
| SERVER_008 | Unexpected Server Restart | critical | Server | operational | Detects servers that restarted without an accompanying version upgrade. Compares start times across consecutive epochs and excludes restarts where the server version changed (planned upgrade). |
| SERVER_010 | High Route RTT | warning | Server | operational | Flags route connections with round-trip time exceeding the threshold. |
| SERVER_011 | Connection Count High | warning | Server | operational | Flags servers where active connections approach the configured maximum. |
| SERVER_012 | Stale Connections | warning | Server | operational | Flags servers with new stale connection events since the previous epoch. |
| SERVER_013 | Stalled Clients | warning | Server | operational | Flags servers with new stalled client events since the previous epoch. |
| SERVER_014 | JetStream Subsystem Unhealthy | critical | Server | operational | Flags servers with JETSTREAM-type healthz errors. |
| SERVER_015 | Stream Recovery Failure | critical | Server | operational | Flags servers with STREAM or CONSUMER-type healthz errors. |
| SERVER_016 | Account Resolution Failure | warning | Server | operational | Flags servers with ACCOUNT-type healthz errors. |
| SERVER_017 | JetStream Storage Pressure | warning | Server | operational | Flags servers where JetStream storage usage is at or above 90% of reserved. |
| SERVER_018 | High Gateway RTT | warning | Server | operational | Flags gateway connections with round-trip time exceeding the threshold. |
| SERVER_019 | JetStream Storage vs Configured Limit | warning | Server | operational | Flags servers where JetStream storage usage approaches the configured max_store limit, which when exceeded causes Raft failures. |
| SERVICE_001 | Service Version Mismatch | warning | Service | operational | Flags services where instances report different client versions or languages. |
| SERVICE_002 | Service Down | critical | Service | operational | Flags services that had instances in the previous epoch but zero in the current epoch. |
| USER_001 | Bearer Token User | warning | User | operational | Flags bearer token users with active connections. |
| USER_002 | Excessive User Connections | warning | User | operational | Flags users with more than 100 active connections. |
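
ACCOUNTS_004 and ACCOUNTS_005 both rely on NATS wildcard subject matching. As a rough illustration of those semantics (`*` matches exactly one token, `>` matches one or more trailing tokens and must come last), here is a minimal Python sketch. The function names `subject_matches` and `orphaned_exports` are hypothetical, imports are simplified to literal subjects, and this is not how Insights itself implements the checks.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a literal subject matches a NATS-style pattern.

    Assumption: both arguments are well-formed, dot-separated subjects;
    no validation of empty tokens is attempted here.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least
            # one remaining subject token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # pattern is longer than the subject
        if token != "*" and token != s_tokens[i]:
            return False  # literal token mismatch
    # Without a trailing '>', token counts must be equal.
    return len(p_tokens) == len(s_tokens)


def orphaned_exports(exports: list[str], imports: list[str]) -> list[str]:
    """Hypothetical helper: exports that no imported subject matches."""
    return [
        export
        for export in exports
        if not any(subject_matches(export, imported) for imported in imports)
    ]


if __name__ == "__main__":
    print(subject_matches("orders.*", "orders.new"))      # True
    print(subject_matches("orders.*", "orders.new.eu"))   # False
    print(subject_matches("orders.>", "orders.new.eu"))   # True
    print(orphaned_exports(["orders.>", "billing.*"], ["orders.new"]))
```

A real check would also have to compare two wildcard patterns against each other (an import can itself contain wildcards), which this sketch leaves out.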
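
Many checks in the table follow one of two threshold shapes: usage "at or above 90%" of a configured limit (for example ACCOUNTS_001, JETSTREAM_003, SERVER_005) or a value more than double the cluster average (OPT_BALANCE_002, OPT_BALANCE_003). The Python sketch below illustrates both. The function names, the input shapes, and the choice to treat a non-positive limit as "unlimited" are assumptions made for the example, not documented Insights behavior.

```python
def at_or_above_limit(used: int, limit: int, percent: int = 90) -> bool:
    """True when `used` is at or above `percent`% of `limit`.

    Assumption: a limit of zero or less means "no limit configured",
    so the check never fires in that case.
    """
    if limit <= 0:
        return False
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return used * 100 >= limit * percent


def hotspots(per_server: dict[str, int], multiplier: float = 2.0) -> list[str]:
    """Servers whose value is more than `multiplier` times the cluster average."""
    if not per_server:
        return []
    average = sum(per_server.values()) / len(per_server)
    return sorted(
        name for name, value in per_server.items() if value > multiplier * average
    )


if __name__ == "__main__":
    print(at_or_above_limit(90, 100))   # True: exactly at the 90% boundary
    print(at_or_above_limit(89, 100))   # False
    print(hotspots({"n1": 10, "n2": 10, "n3": 100}))  # ['n3'] (average is 40)
```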
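
The quorum checks (CONSUMER_003, JETSTREAM_008, META_006) and the even-size warning (META_007) rest on standard Raft majority arithmetic: a group of n peers needs floor(n/2) + 1 of them online. The following Python sketch works through that arithmetic; the function names are hypothetical and the code only illustrates the reasoning, not the product's implementation.

```python
def quorum_size(peers: int) -> int:
    """Smallest majority of a Raft group with `peers` members."""
    return peers // 2 + 1


def quorum_lost(peers: int, offline: int) -> bool:
    """True when too few peers remain online to form a majority."""
    return (peers - offline) < quorum_size(peers)


def tolerated_failures(peers: int) -> int:
    """How many peers can go offline before quorum is lost."""
    return peers - quorum_size(peers)


if __name__ == "__main__":
    print(quorum_lost(3, 1))  # False: 2 of 3 online is still a majority
    print(quorum_lost(3, 2))  # True
    # An even-sized group tolerates no more failures than the next
    # smaller odd size, which is why an even peer count is worth flagging.
    print(tolerated_failures(3), tolerated_failures(4), tolerated_failures(5))  # 1 1 2
```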
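
Several checks compare state between consecutive epochs. SERVER_008 is the most detailed: it compares start times and excludes restarts where the server version changed. A minimal Python sketch of that comparison follows. The `ServerSnapshot` type and its field names are hypothetical stand-ins for whatever per-epoch data is collected, and the version strings in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSnapshot:
    """Hypothetical per-epoch view of one server."""
    server_id: str
    start_time: str  # any value that changes when the process restarts
    version: str


def unexpected_restarts(
    previous: list[ServerSnapshot], current: list[ServerSnapshot]
) -> list[str]:
    """IDs of servers that restarted between epochs without a version change."""
    previous_by_id = {snap.server_id: snap for snap in previous}
    flagged = []
    for snap in current:
        before = previous_by_id.get(snap.server_id)
        if before is None:
            continue  # server is new in this epoch; nothing to compare
        restarted = snap.start_time != before.start_time
        upgraded = snap.version != before.version
        if restarted and not upgraded:
            flagged.append(snap.server_id)
    return flagged


if __name__ == "__main__":
    prev = [
        ServerSnapshot("n1", "2026-01-01T00:00:00Z", "v1"),
        ServerSnapshot("n2", "2026-01-01T00:00:00Z", "v1"),
    ]
    curr = [
        ServerSnapshot("n1", "2026-01-02T08:00:00Z", "v1"),  # restart, same version
        ServerSnapshot("n2", "2026-01-02T08:00:00Z", "v2"),  # restart with upgrade
    ]
    print(unexpected_restarts(prev, curr))  # ['n1']
```

The same previous-versus-current lookup pattern would apply to the CHANGE_* checks, with a different field being compared.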
© 2026 Synadia Communications, Inc.