This is Part 2 of the Monitoring NATS series. Part 1 covers HTTP monitoring endpoints in detail.
Part 1 covered the HTTP monitoring endpoints built into every NATS server—/varz, /connz, /jsz, and the rest. These endpoints provide accurate, point-in-time snapshots of server state, and they integrate well with tools like Prometheus and Grafana.
But HTTP monitoring is inherently pull-based. A scraper periodically polls an endpoint and records whatever metrics are available at that moment. If your scrape interval is 15 seconds, any transient event that happens and resolves within that window may never be observed.
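To make the blind spot concrete, here is a minimal sketch that models a scraper polling at fixed instants (0 s, 15 s, 30 s, ...) and checks which events it would ever see. The event names and timings are hypothetical and chosen only for illustration.

```python
def observed_by_scraper(event_start, event_end, scrape_interval):
    """Return True if a scrape instant (0, interval, 2*interval, ...) falls in [start, end)."""
    # Ceiling division: the first scrape at or after the event starts.
    first_scrape = -(-event_start // scrape_interval) * scrape_interval
    return first_scrape < event_end


# Hypothetical events as (start_seconds, end_seconds).
events = {
    "storm": (16, 20),       # 4 s connection storm
    "outage": (25, 45),      # 20 s outage
    "auth_burst": (31, 33),  # 2 s burst of failed logins
}

missed = [
    name
    for name, (start, end) in events.items()
    if not observed_by_scraper(start, end, scrape_interval=15)
]
print(missed)  # events that start and end between two scrapes
```

Only the event that spans a scrape instant is observed; the two short-lived ones fall entirely between polls.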
NATS offers a different approach: the system account. Instead of relying only on external HTTP scraping, NATS servers can publish system events and respond to monitoring queries over the same secure NATS protocol your applications already use.
This article explains how the system account works, what it provides beyond HTTP monitoring, and when to use each approach.
| Aspect | HTTP Monitoring | System Account ($SYS) |
|---|---|---|
| Access method | HTTP/HTTPS on a separate port | NATS protocol over existing connections |
| Data model | Pull-based (point-in-time snapshots) | Push-based advisories + pull-based request/reply |
| Authentication | None built-in (requires network isolation or reverse proxy) | Full NATS auth via NKeys or JWTs |
| Real-time events | Not available | Advisories for connects, disconnects, auth errors, etc. |
| Firewall requirements | Requires exposing an additional port | Works over existing NATS client port |
| Best for | Prometheus scraping, load balancer health checks, nats-top | Event-driven alerting, audit trails, edge deployments |
Neither approach replaces the other—they’re complementary. HTTP monitoring is simpler to set up and integrates directly with standard observability tooling. The system account provides capabilities that HTTP cannot: real-time event streams, authenticated access, and operation in environments where exposing additional ports is impractical.
The NATS system account enables a set of system services and advisories that let operators observe and interact with servers using NATS subjects instead of HTTP endpoints.
With the system account enabled, you gain:
This model works especially well in environments where opening or scraping additional ports is difficult—such as edge deployments, leaf nodes, or locked-down networks.
The $SYS account is a special account configured on each NATS server to carry system-level traffic. It is not automatically usable—you must explicitly enable it and decide who can access it.
Conceptually, it acts as a control-plane channel:
This separation improves visibility isolation, but it’s important to understand the boundary:
The system account isolates subjects and permissions, not CPU, memory, or I/O. A severely overloaded server can still impact all traffic, including system traffic.
Enabling system monitoring requires a small configuration change.
Local / static config example

    system_account: SYS

JWT-based (operator mode) example

    system_account: <SYS_ACCOUNT_PUBLIC_NKEY>

In both cases, you then create users within that account who are authorized to subscribe to system subjects or issue monitoring requests. These credentials are separate from application users and should be treated as administrative access.
This is where the system account diverges most significantly from HTTP monitoring. Instead of polling for data, you subscribe to advisory subjects and receive events the moment they occur.
Two commonly used advisory subjects are:
    $SYS.ACCOUNT.<account>.CONNECT
    $SYS.ACCOUNT.<account>.DISCONNECT

These fire when a client connects or disconnects and include metadata such as client ID, server ID, and timing information. Disconnect advisories include a reason field describing why the connection ended.
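As a rough sketch, a subscriber for these two subjects could look like the following. It assumes the nats-py client is installed and that you have a credentials file for a system-account user. The function names, the `"reason"` key lookup, and the idea of printing each event are illustrative choices, not something prescribed by NATS.

```python
import asyncio
import json


def account_event_subject(event, account="*"):
    """Build $SYS.ACCOUNT.<account>.CONNECT or .DISCONNECT ('*' matches any account)."""
    if event not in ("CONNECT", "DISCONNECT"):
        raise ValueError(f"unsupported event: {event}")
    return f"$SYS.ACCOUNT.{account}.{event}"


async def watch_connections(url, creds_path):
    """Print connect/disconnect advisories until the task is cancelled."""
    import nats  # nats-py, imported lazily so the helper above stays dependency-free

    nc = await nats.connect(url, user_credentials=creds_path)

    async def on_event(msg):
        advisory = json.loads(msg.data)
        # A reason is only expected on disconnect advisories; key name is an assumption.
        print(msg.subject, advisory.get("reason", ""))

    await nc.subscribe(account_event_subject("CONNECT"), cb=on_event)
    await nc.subscribe(account_event_subject("DISCONNECT"), cb=on_event)
    await asyncio.Event().wait()  # block forever; cancel the task to stop
```

To try it, call `asyncio.run(watch_connections("nats://localhost:4222", "sys.creds"))`, where both the URL and the credentials path are placeholders for your own environment.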
This is powerful for scenarios that HTTP monitoring cannot address:
Authentication failures are reported via:
    $SYS.SERVER.<server>.CLIENT.AUTH.ERR

This distinction is important: auth errors should be monitored via the AUTH.ERR subject, not inferred from disconnects.
Authentication error advisories provide real-time insight into:
Each message includes structured data such as client IP, attempted user/account, and rejection reason. Because this data is already in JSON form and delivered over NATS, it can be streamed directly into a SIEM or alerting pipeline without parsing log files.
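One way to feed these advisories into a log or SIEM pipeline is to flatten each message into a single JSON line. The sketch below does only that transformation. The payload keys it reads (`client.host`, `client.user`, `client.acc`, `reason`) are assumptions about the advisory layout and should be checked against a real message from your server; the sample payload is made up.

```python
import json


def auth_error_to_record(subject, payload):
    """Flatten an auth error advisory into one JSON line for downstream tooling."""
    advisory = json.loads(payload)
    client = advisory.get("client", {})
    record = {
        # Subject layout from the article: $SYS.SERVER.<server>.CLIENT.AUTH.ERR
        "server_id": subject.split(".")[2],
        "client_ip": client.get("host"),   # assumed key name
        "user": client.get("user"),        # assumed key name
        "account": client.get("acc"),      # assumed key name
        "reason": advisory.get("reason"),  # assumed key name
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical payload, for illustration only.
sample = json.dumps({
    "client": {"host": "203.0.113.7", "user": "svc-orders"},
    "reason": "Authentication Failure",
}).encode()
print(auth_error_to_record("$SYS.SERVER.NABC123.CLIENT.AUTH.ERR", sample))
```

Missing keys become `null` rather than raising, so a schema difference shows up in the output instead of crashing the pipeline.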
NATS servers also publish periodic statistics summaries:
    $SYS.SERVER.<server>.STATSZ

These messages include CPU usage, memory usage, connection counts, message rates, and slow-consumer statistics, similar to what you’d get from the /varz HTTP endpoint, but delivered as a push event.
While often informally called “heartbeats,” they are best thought of as periodic snapshots, not strict liveness guarantees. They are useful for trend analysis and alerting, but they are not instantaneous signals. For liveness checks, continue using /healthz via HTTP.
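For trend-style alerting on these summaries, a small check like the one below may be enough. It is a sketch under assumptions: the `statsz`, `cpu`, and `slow_consumers` keys are my reading of the payload layout, and the 90% CPU limit is an arbitrary example threshold.

```python
import json


def check_stats(payload, prev_slow_consumers=0, cpu_limit=90.0):
    """Return (alerts, slow_consumers) for one periodic statistics summary."""
    stats = json.loads(payload).get("statsz", {})  # assumed top-level key
    alerts = []
    cpu = stats.get("cpu", 0.0)                    # assumed key name
    if cpu > cpu_limit:
        alerts.append(f"cpu at {cpu}%")
    slow = stats.get("slow_consumers", 0)          # assumed key name
    # Compare against the previous summary so only new slow consumers alert.
    if slow > prev_slow_consumers:
        alerts.append(f"{slow - prev_slow_consumers} new slow consumer(s)")
    return alerts, slow


# Hypothetical summary, for illustration only.
sample = json.dumps({"statsz": {"cpu": 95.5, "slow_consumers": 3}})
alerts, slow_total = check_stats(sample, prev_slow_consumers=1)
print(alerts, slow_total)
```

Passing the returned counter back in as `prev_slow_consumers` on the next summary keeps the check stateless apart from that one number.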
In addition to push-style advisories, the system account exposes the same monitoring endpoints covered in Part 1 via request/reply. Instead of making an HTTP request, you send a NATS request and receive the JSON response as a reply.
Supported endpoints include:
Instead of querying https://localhost:8222/connz, you can send a request to:
    $SYS.REQ.SERVER.<server-id>.CONNZ

The server replies with the same JSON payload you would receive from the HTTP interface.
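A minimal sketch of that request, assuming an already-connected nats-py client `nc` authenticated as a system-account user. The helper names are hypothetical, and the empty request body is an assumption that works for the default query.

```python
import json


def server_request_subject(server_id, endpoint):
    """Build $SYS.REQ.SERVER.<server-id>.<ENDPOINT>, e.g. endpoint='CONNZ'."""
    return f"$SYS.REQ.SERVER.{server_id}.{endpoint.upper()}"


async def fetch_connz(nc, server_id, timeout=2.0):
    """Request connection info from one server and return the decoded JSON reply."""
    subject = server_request_subject(server_id, "CONNZ")
    reply = await nc.request(subject, b"", timeout=timeout)
    return json.loads(reply.data)
```

Because `fetch_connz` only relies on `nc.request`, it is easy to exercise in tests with a stand-in object that returns canned replies.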
This allows you to build internal tooling that uses only the NATS protocol, without HTTP clients or additional firewall rules. It’s particularly valuable for:
If you don’t know all server IDs in advance, you can send a request to:
    $SYS.REQ.SERVER.PING

Every server that receives the request responds with its ID and basic health information. Because this request returns multiple replies, your client must be prepared to collect responses until a timeout occurs.
This pattern enables dynamic discovery without maintaining static host lists—essential for auto-scaling clusters and dynamic infrastructure.
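The collect-until-timeout step could be sketched as follows. `collect_replies` is a generic helper with no NATS dependency. `ping_servers` shows how it might be wired to an existing nats-py connection `nc`, and relies on my understanding that nats-py's timeout error is a subclass of `asyncio.TimeoutError`. Both function names and the one-second window are illustrative.

```python
import asyncio
import json


async def collect_replies(next_reply, window=1.0):
    """Call next_reply(timeout) until `window` seconds elapse or a call times out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + window
    replies = []
    while (remaining := deadline - loop.time()) > 0:
        try:
            replies.append(await next_reply(remaining))
        except asyncio.TimeoutError:
            break  # nobody else answered within the window
    return replies


async def ping_servers(nc, window=1.0):
    """Publish one PING with a reply inbox and gather every server's answer."""
    inbox = nc.new_inbox()
    sub = await nc.subscribe(inbox)
    await nc.publish("$SYS.REQ.SERVER.PING", b"", reply=inbox)

    async def next_reply(timeout):
        msg = await sub.next_msg(timeout=timeout)
        return json.loads(msg.data)

    try:
        return await collect_replies(next_reply, window)
    finally:
        await sub.unsubscribe()
```

A plain `nc.request` would return after the first reply, which is why this sketch uses a dedicated inbox subscription instead. The window is a trade-off: too short and distant servers are missed, too long and discovery feels slow.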
On NATS 2.10 and newer, the system account can also trigger certain administrative actions.
For example:
    $SYS.REQ.SERVER.<server-id>.RELOAD

This instructs the server to reload its configuration file, allowing permission or account updates without restarting the server or disconnecting clients.
JetStream publishes its own advisories under the $JS.EVENT.ADVISORY.* namespace. These cover stream and consumer lifecycle events, leader elections, delivery failures, and more.
Examples include:
- $JS.EVENT.ADVISORY.STREAM.CREATED.<stream>: stream created
- $JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.<stream>.<consumer>: message exceeded delivery threshold
- $JS.EVENT.ADVISORY.STREAM.LEADER_ELECTED.<stream>: new stream leader elected
- $JS.EVENT.ADVISORY.API: API audit trail

JetStream also publishes metrics under $JS.EVENT.METRIC.*, such as consumer ack latency data.
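Since the stream and consumer names are encoded in the subject itself, a handler can route on the subject before it even parses the payload. The helper below is a small illustration for the MAX_DELIVERIES subject; the function name and the ORDERS/shipper names in the usage line are hypothetical.

```python
MAX_DELIVERIES_PREFIX = "$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES."


def parse_max_deliveries_subject(subject):
    """Return {'stream': ..., 'consumer': ...} or None if the subject does not match."""
    if not subject.startswith(MAX_DELIVERIES_PREFIX):
        return None
    parts = subject[len(MAX_DELIVERIES_PREFIX):].split(".")
    # Expect exactly two trailing tokens: <stream>.<consumer>
    if len(parts) != 2 or not all(parts):
        return None
    stream, consumer = parts
    return {"stream": stream, "consumer": consumer}


print(parse_max_deliveries_subject(
    "$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.ORDERS.shipper"
))
```

Returning `None` for non-matching subjects lets the same callback sit behind a broader wildcard subscription and ignore everything else.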
These advisories enable:
The exact set of advisories and request subjects evolves as new NATS features are added.
The authoritative list of core system advisories is maintained in the official documentation: System Accounts. For the complete reference, the subject constants are defined in events.go in the nats-server source.
The complete list of JetStream advisory subjects is defined in jetstream_api.go.
The NATS CLI is the fastest way to explore system monitoring.
To observe system traffic (with appropriate credentials):
    nats sub '$SYS.>'

This subscribes to all system subjects and is useful for learning and debugging. In production, it’s best to narrow subscriptions to specific subjects to avoid excessive volume.
Example:
    nats sub '$SYS.ACCOUNT.*.DISCONNECT'

System events do not replace Prometheus, Grafana, Datadog, or similar tools. Instead, they extend what’s possible:
Because system events are normal NATS messages, you can also persist them using JetStream to create an audit log or replay incidents after the fact.
Tools like NATS Surveyor implement this pattern, collecting system account data and exposing it as Prometheus metrics without requiring HTTP monitoring ports on each server.
For most deployments, use both approaches:
Use HTTP monitoring for:
- Health checks (/healthz)
- nats-top

Use the system account for:
NATS makes traditional monitoring easier—no sidecars or probes required—and extends it by embedding observability directly into the messaging system.
By using the system account:
The system account enables robust, NATS-native observability pipelines.
If you want to go deeper, the NATS documentation on system accounts is the best next stop—and the NATS community Slack is an excellent place to ask real-world operational questions.
The team at Synadia are the creators and maintainers of NATS. If you need help architecting, monitoring, or scaling your deployment, get in touch.