Series: NATS Edge Eventing Architecture

Mirror, Merge, or Consume: How to Choose Your Edge-to-Core Streaming Pattern

This is the final post in the series. I want to end where the rubber meets the road: the moment when data actually has to move from edge to core, and you have to decide how.

Everything I’ve written in this series — the four hard constraints of edge-to-core systems, store-and-forward resilience, realm-based security, declarative flow control, hybrid eventing, full-mesh clustering, end-to-end observability — all of it exists to support the moment when a durable event stream at an edge node needs to make it reliably to a core system that can act on it.

That moment has a shape. And the shape you choose matters more than most architects realize when they’re designing it.

In the Living on the Edge white paper, I describe three edge-to-core streaming patterns: core consumes edge streams directly, mirror edge streams into core, and merge many edge streams into one core stream. On the surface, these look like three variants of the same thing — data moving from edge to core. In practice, they represent three different relationships between producers and consumers, with different tradeoffs around fidelity, coupling, complexity, and operational cost.

Most teams don’t choose between them deliberately. They default to whichever pattern is easiest to implement first and then spend months working around its limitations. The patterns are not interchangeable. Each exists because a different problem demands it.

Pattern 1: Consume — core pulls from edge streams directly

The consume pattern is the one that looks simplest and usually gets implemented first. A core consumer connects to an edge JetStream stream — via a leaf node connection or a hub — and reads from it directly. The edge stream stays at the edge. The core consumer pulls batches when it’s ready, applies filtering or transformation logic, and routes the results to downstream systems.

This is the right pattern when the core system’s relationship with the data is selective. Not everything on the edge stream is relevant to every core consumer. A predictive maintenance service doesn’t need all telemetry — it needs specific signals from specific machines that match specific conditions. A compliance system doesn’t need raw sensor readings — it needs filtered threshold-breach events. The consume pattern puts the selection logic where it belongs: at the consumer, at the point of ingest, rather than pushing everything across the boundary and filtering downstream.
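A minimal sketch of what consumer-side selection looks like, assuming hypothetical subject names such as `telemetry.press-7.vibration`. The `consumer_config` helper builds a dictionary using JetStream consumer configuration field names (`durable_name`, `ack_policy`, `filter_subjects`), and `subject_matches` is a toy reimplementation of NATS wildcard matching so the effect of the filters can be seen without a running server. It is an illustration, not the server's matching code.

```python
def subject_matches(filter_subject: str, subject: str) -> bool:
    """Toy NATS-style matcher: `*` is one token, `>` is one or more trailing tokens."""
    f_tokens = filter_subject.split(".")
    s_tokens = subject.split(".")
    for i, tok in enumerate(f_tokens):
        if tok == ">":
            # `>` must be the last filter token and needs at least one token left.
            return i == len(f_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if tok != "*" and tok != s_tokens[i]:
            return False
    return len(f_tokens) == len(s_tokens)


def consumer_config(durable: str, filters: list) -> dict:
    """Build a consumer configuration; names passed in are hypothetical."""
    return {
        "durable_name": durable,
        "ack_policy": "explicit",
        "filter_subjects": filters,
    }


# Hypothetical predictive-maintenance consumer: one specific signal plus
# threshold-breach events from any machine.
cfg = consumer_config(
    "predictive-maintenance",
    ["telemetry.press-7.vibration", "telemetry.*.threshold.>"],
)

edge_subjects = [
    "telemetry.press-7.vibration",
    "telemetry.press-7.temperature",
    "telemetry.lathe-2.threshold.temp.high",
]

# Only subjects matching at least one filter would cross the boundary.
selected = [
    s for s in edge_subjects
    if any(subject_matches(f, s) for f in cfg["filter_subjects"])
]
print(selected)
```

The point of the sketch is that the raw temperature reading never leaves the edge for this consumer: the selection is expressed in the consumer's configuration rather than in downstream application code.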

The tradeoff is coupling. A consumer pulling directly from an edge stream depends on the edge node being reachable or the stream being mirrored somewhere accessible. In a NATS leaf node topology, this is manageable — once the stream is replicated to the hub during a connected window, consumers can read from the hub copy — but the architectural dependency is real and needs to be designed for explicitly.

When to use it: when consumer logic needs to be applied at the ingest boundary, when different core consumers need different subsets of the same edge data, and when the goal is to move only what’s needed rather than replicating raw data in full.

Pattern 2: Mirror — exact replica of the edge stream in core

A mirror creates an exact 1:1 replica of an edge stream in the core. Sequence numbers are preserved. Timestamps are preserved. The mirror is a faithful copy of the origin, synchronized asynchronously and designed to recover from connectivity gaps automatically — when the leaf node reconnects after a disconnect, the mirror catches up from where it left off without external intervention.

Critically, clients cannot write to a mirror directly. It is read-only by design. This makes mirrors appropriate for a specific set of use cases where the integrity and provenance of the data are what matter: compliance and audit, backup and disaster recovery, distributing read load away from the origin stream without risking writes, and any scenario where downstream systems need guaranteed access to the exact sequence as it occurred at the edge.
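As a rough sketch of the two properties above, the following builds a mirror stream configuration and models catch-up with a toy in-memory class. The stream name `TELEMETRY`, the JetStream domain `plant-a`, and the `ToyMirror` class are all assumptions made for illustration. The `mirror` and `external.api` fields and the `$JS.<domain>.API` prefix follow the JetStream stream configuration format. `ToyMirror` is not how the server implements mirroring; it only shows origin sequence numbers being kept and sync resuming after a gap.

```python
def mirror_stream_config(name: str, origin_stream: str, origin_domain: str) -> dict:
    """Configuration for a core stream that mirrors an edge stream.

    Note there is no "subjects" key: a mirror takes its messages from the
    origin stream rather than from client publishes.
    """
    return {
        "name": name,
        "storage": "file",
        "mirror": {
            "name": origin_stream,
            # Reach the origin through its JetStream domain (hypothetical name).
            "external": {"api": f"$JS.{origin_domain}.API"},
        },
    }


class ToyMirror:
    """In-memory stand-in for a mirror, keyed by ORIGIN sequence number."""

    def __init__(self):
        self.messages = {}

    @property
    def last_seq(self) -> int:
        return max(self.messages, default=0)

    def sync(self, origin: dict) -> None:
        """Copy everything after the last mirrored sequence, in order."""
        start = self.last_seq
        for seq in sorted(origin):
            if seq > start:
                self.messages[seq] = origin[seq]


edge = {1: "reading-1", 2: "reading-2"}
mirror = ToyMirror()
mirror.sync(edge)             # connected window: mirror holds 1 and 2

edge[3] = "reading-3"         # written at the edge while disconnected
edge[4] = "reading-4"

mirror.sync(edge)             # reconnect: resumes after sequence 2
print(mirror_stream_config("TELEMETRY_PLANT_A", "TELEMETRY", "plant-a"))
print(mirror.messages == edge)
```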

The distinction from the consume pattern is fidelity vs. selectivity. Consume applies logic and filters. Mirror preserves everything. If a regulatory requirement says you must retain a complete, unaltered record of sensor readings from a specific facility, a mirror is the right instrument — not a consumer that might apply filtering logic that inadvertently drops records.

The tradeoff is data volume and storage cost. Mirroring is inherently a full-copy operation. If the edge stream is high-volume and most of the data isn’t needed in core, you’re paying the cost of moving and storing it anyway. The consume pattern handles that case better. Mirrors earn their cost when completeness is the requirement.

When to use it: compliance and audit requirements, backup and disaster recovery, distributing read load from origin streams, and any case where downstream consumers must have access to the complete, unaltered sequence from the edge.

Pattern 3: Merge — multiple edge streams sourced into a single core stream

The third pattern, which JetStream calls sourcing, aggregates multiple edge streams into a single unified core stream. A core stream is configured with multiple sources: leaf node A’s telemetry stream, leaf node B’s telemetry stream, leaf node C’s, and so on. Messages from all sources flow into the merged stream as they arrive, interleaved in arrival order rather than sorted by origin timestamp. A single consumer reading the merged stream sees all events from all edge sites without needing to coordinate across geographically distributed sources.

This is the pattern for global analytics and cross-site correlation. Consider a manufacturer with plants in three regions, each running its own edge cluster, each producing a local telemetry stream. A central analytics pipeline that needs to detect cross-plant patterns — anomalies that only become visible when correlated across sites — needs a unified view. The merge pattern provides that unified view without requiring the analytics pipeline to manage multiple subscriptions, reconnect to multiple leaf nodes, or deduplicate records from overlapping consumers.

The important distinction from mirroring is sequence handling. A sourced stream does not preserve origin sequence numbers — messages from different sources are interleaved, and global ordering is not guaranteed across sources (though per-source ordering is preserved). This matters for consumers that depend on strict sequence. For analytics workloads that operate on time-windowed aggregations rather than strict sequence, it’s not a constraint.
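The sequence behavior can be illustrated with a small simulation. This is a toy model, not JetStream itself: `merge_arrivals` and the `Stored` record are hypothetical names, and the arrival order is made up. It shows the merged stream assigning its own sequence numbers while each source's messages stay in their original relative order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stored:
    seq: int         # sequence assigned by the merged (core) stream
    source: str      # which edge stream the message came from
    origin_seq: int  # sequence the message had at the edge
    payload: str


def merge_arrivals(arrivals):
    """Store (source, origin_seq, payload) tuples in the order they arrive."""
    return [
        Stored(seq, source, origin_seq, payload)
        for seq, (source, origin_seq, payload) in enumerate(arrivals, start=1)
    ]


def per_source_order_preserved(merged) -> bool:
    """True if origin sequences increase within each source."""
    last_seen = {}
    for msg in merged:
        if msg.origin_seq <= last_seen.get(msg.source, 0):
            return False
        last_seen[msg.source] = msg.origin_seq
    return True


# Hypothetical arrival order at the core: plant-b reconnects and delivers
# two messages before plant-a's second message gets through.
arrivals = [
    ("plant-a", 1, "a1"),
    ("plant-b", 1, "b1"),
    ("plant-b", 2, "b2"),
    ("plant-a", 2, "a2"),
]

merged = merge_arrivals(arrivals)
for msg in merged:
    print(msg)
print(per_source_order_preserved(merged))
```

In this model, plant-a's second message has origin sequence 2 but merged sequence 4, which is why a consumer that needs the edge's exact sequence numbers should read a mirror instead.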

Subject mapping becomes especially useful here: edge streams that use local subject naming conventions can be transformed as they enter the merged core stream, normalizing to a global taxonomy that makes cross-site analysis composable. As the NATS documentation on subject mapping describes, transforms can be applied as part of the source configuration, meaning the normalization lives in topology, not in application code.
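A sketch of how that normalization might be expressed, under several assumptions: each site publishes under a local `telemetry.>` hierarchy, the global taxonomy is `global.<site>.telemetry.>`, and the stream and domain names are hypothetical. The `sources`, `external`, and `subject_transforms` (`src` / `dest`) fields follow the JetStream stream configuration format. The prefix-to-prefix form with a trailing `>` on both sides should be checked against the subject mapping documentation for your server version. `apply_prefix_transform` is a toy function that handles only that one form.

```python
def source_entry(origin_stream: str, origin_domain: str, site: str) -> dict:
    """One entry in a merged stream's `sources` list (names are hypothetical)."""
    return {
        "name": origin_stream,
        "external": {"api": f"$JS.{origin_domain}.API"},
        "subject_transforms": [
            {"src": "telemetry.>", "dest": f"global.{site}.telemetry.>"}
        ],
    }


def merged_stream_config(name: str, site_domains: dict) -> dict:
    """Core stream sourcing the same-named edge stream from every site."""
    return {
        "name": name,
        "sources": [
            source_entry("TELEMETRY", domain, site)
            for site, domain in site_domains.items()
        ],
    }


def apply_prefix_transform(src: str, dest: str, subject: str):
    """Toy transform for the 'prefix.>' to 'newprefix.>' form only.

    Returns the rewritten subject, or None if `subject` does not match `src`.
    """
    if not (src.endswith(".>") and dest.endswith(".>")):
        raise ValueError("this sketch only handles trailing '>' on both sides")
    src_prefix = src[:-1]   # e.g. "telemetry."
    if not subject.startswith(src_prefix) or len(subject) == len(src_prefix):
        return None
    return dest[:-1] + subject[len(src_prefix):]


config = merged_stream_config(
    "GLOBAL_TELEMETRY",
    {"plant-a": "plant-a", "plant-b": "plant-b"},
)
transform = config["sources"][0]["subject_transforms"][0]
print(apply_prefix_transform(
    transform["src"], transform["dest"], "telemetry.press-7.vibration"
))
```

With the transform in the source configuration, the central consumer can subscribe to `global.*.telemetry.>` and the edge publishers never need to know the global naming scheme.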

When to use it: global analytics and cross-site correlation, unified monitoring across geographically distributed edge deployments, and any case where multiple edge sources contribute to a shared processing pipeline.

Why these patterns are not interchangeable — and why teams pick the wrong one

The failure pattern I’ve observed most often is defaulting to mirror when consume would be more appropriate, because mirror feels safer. “We’ll replicate everything to core and let consumers figure out what they need” has the appealing property of feeling complete. No data gets lost. Every consumer has access to the full history.

The cost of this choice compounds over time. Full-replica streams in core for every edge site, regardless of how much of that data any core consumer actually needs. Storage costs that scale with edge volume rather than with actual consumption requirements. And downstream consumers that end up doing in application code the filtering that should have happened at the ingest boundary — exactly the firehose antipattern I described in the flow control post.

The inverse failure is defaulting to consume when merge is needed for analytics. Individual consumers connecting to individual edge streams, each managing their own offset tracking, their own reconnection logic, their own deduplication. What should be a simple unified analytics query becomes a distributed coordination problem that the application has to solve.

The pattern choice isn’t about which is best. It’s about which relationship between producer and consumer is actually true — and encoding that relationship in the topology explicitly rather than working around it in code.
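One way to make that relationship explicit is to write the requirements down per data path and derive the pattern from them. The sketch below is a hypothetical helper, not part of any NATS tooling. The three flags are a simplification of the "when to use it" criteria above, and the data path names are invented.

```python
from enum import Enum


class Pattern(Enum):
    CONSUME = "consume"
    MIRROR = "mirror"
    MERGE = "merge"


def patterns_for(*, selective: bool, complete_record: bool, cross_site: bool) -> set:
    """Map what a data path actually requires to the patterns that fit it."""
    chosen = set()
    if selective:
        # Consumers need only a filtered subset, selected at the ingest boundary.
        chosen.add(Pattern.CONSUME)
    if complete_record:
        # Consumers need the exact, unaltered sequence from the edge.
        chosen.add(Pattern.MIRROR)
    if cross_site:
        # Consumers need one unified view across many edge streams.
        chosen.add(Pattern.MERGE)
    return chosen


# Hypothetical data paths for one deployment.
data_paths = {
    "real-time anomaly signals": dict(selective=True, complete_record=False, cross_site=False),
    "regulatory sensor archive": dict(selective=False, complete_record=True, cross_site=False),
    "cross-plant analytics": dict(selective=False, complete_record=False, cross_site=True),
}

for path, needs in data_paths.items():
    names = sorted(p.value for p in patterns_for(**needs))
    print(f"{path}: {names}")
```

The function returns a set rather than a single answer because one deployment can legitimately need more than one pattern on different data paths.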

Patterns in combination

In a mature edge-to-core architecture, all three patterns often appear simultaneously on different data paths — and that’s the correct design, not a sign of confusion.

The MachineMetrics deployment — collecting high-frequency machine data from manufacturing floors for real-time anomaly detection and predictive maintenance — is an example of this in practice. Real-time signals processed directly (consume). Complete telemetry records preserved for compliance and retrospective analysis (mirror). Cross-plant pattern detection running against unified streams (merge). Each data path designed for the specific requirement of its consumers, not for a single universal pattern applied everywhere.

Rivitt’s oil and gas deployment follows the same shape: operational data that needs to reach specific field systems immediately (consume), complete drilling records that need preservation and audit (mirror), and aggregated cross-rig analytics for fleet-level optimization (merge). Same platform, three patterns, three different consumer relationships with the data.

What changes the implementation cost is a platform that makes these patterns composable without bridge services. Because JetStream implements mirrors and sources natively (no external connector processes, no separate mirror-maker jobs, no dedicated replication infrastructure), choosing between these patterns is a configuration decision, not a development project. The operational complexity doesn’t scale with the number of patterns in use. Synadia’s platform layer extends this with control-plane tooling that makes the configuration manageable across environments at scale.

The question that closes the series

I started this series with a framing I want to return to at the end: the edge is not a place. It’s an operating dimension — one with different connectivity, security, distribution, and observability characteristics than the core systems it serves.

The architecture decisions I’ve described across these eight posts are not independent. They form a stack. Get the mental model wrong (post 1) and every decision that follows is solving the wrong problem. Get resilience wrong (post 2) and the streaming patterns don’t matter because the data didn’t survive the disconnect. Get security wrong (post 3) and the topology you’ve designed becomes an attack surface. Get flow control wrong (post 4) and the consumers drown in the data you’re working so hard to move. Get the platform wrong (post 5), the clustering model wrong (post 6), the observability wrong (post 7), and the streaming patterns wrong (post 8) — and you have a system that works in the demo and fails in production at 2am.

The good news — and I mean this genuinely — is that these are all solvable problems. The patterns are documented. The platforms exist. The real work is accepting that edge-to-core architecture is not cloud-native architecture with a longer cable, and designing for it from the start with that truth in mind.

For anyone who wants to go deeper on any of these patterns, the full white paper — Living on the Edge: Eventing for a New Dimension — is the complete reference. The Synadia education resources cover NATS subjects, consumer patterns, and JetStream configuration in depth. And Synadia’s platform is where the production-grade version of this stack lives.

Thank you for following along. Build things that hold together when the network doesn’t.


This is the final post in the Living on the Edge series. The full series covers: The edge as an operating reality · Why retry logic fails at the edge · Why edge security is a topology problem · Why flow control is a day-one architecture decision · Why platform consolidation matters at the edge · Why your clustering model is your cost model · Why approximate observability costs you · And this post: mirror, merge, or consume. All patterns are drawn from Synadia’s white paper Living on the Edge: Eventing for a New Dimension.
