A community member asked how to design a NATS-based multi-tenant SaaS backend where events must be handled FIFO per tenant, while different tenants should be processed in parallel.
The short answer: a stream per tenant can be a reasonable JetStream design for this requirement, especially at roughly hundreds to low thousands of tenants, as long as the streams and consumers are relatively stable and the system is load-tested with realistic traffic. The main design choices are subject layout, stream boundaries, consumer sharing, and future tenant isolation.
For this kind of workload, there are two different goals: strict FIFO ordering within each tenant, and parallel processing across tenants.
A tenant-scoped stream aligns well with that model. If each tenant has its own stream, then each tenant has an independent sequence of events. A slow or blocked consumer for one tenant does not inherently stall the stream for another tenant.
That is often preferable to putting all tenants into one large stream and then relying heavily on many filtered consumers. A single stream with many sparse subject filters can become harder to reason about and may not scale as cleanly as more linear consumption from tenant-local streams.
A good subject structure starts with a stable application or domain token, not the tenant ID.
For example:
    {app}.tenants.{tenant}.{eventType}
or a more concrete shape such as:
    billing.tenants.tenant-123.invoice.created
    billing.tenants.tenant-123.payment.received
The important part is that the first token is fixed for the application or domain. Avoid a design where the first token is the tenant ID and common subscriptions need broad first-token wildcards such as:
    *.orders.created
Keeping the first token stable gives you a cleaner subject hierarchy for routing, permissions, stream definitions, imports/exports, and future operational changes.
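The subject shape above can be sketched as a small helper that builds subjects and rejects tokens that would break the hierarchy. This is a minimal sketch: the function names and the validation rule are illustrative assumptions, not part of any NATS client library.

```python
import re

# Hypothetical helpers; only the subject layout comes from the text above.
# A token must be non-empty and free of '.', '*', '>' and whitespace, because
# those characters act as separators or wildcards in NATS subjects.
_TOKEN = re.compile(r"[^\s.*>]+")


def _check_tokens(*tokens: str) -> None:
    for token in tokens:
        if not _TOKEN.fullmatch(token):
            raise ValueError(f"invalid subject token: {token!r}")


def tenant_subject(app: str, tenant: str, event_type: str) -> str:
    """Build '{app}.tenants.{tenant}.{eventType}'; event_type may be dotted."""
    _check_tokens(app, tenant, *event_type.split("."))
    return f"{app}.tenants.{tenant}.{event_type}"


def tenant_prefix(app: str, tenant: str) -> str:
    """Wildcard subject covering every event published for one tenant."""
    _check_tokens(app, tenant)
    return f"{app}.tenants.{tenant}.>"


print(tenant_subject("billing", "tenant-123", "invoice.created"))
print(tenant_prefix("billing", "tenant-123"))
```

Validating the tenant token at this boundary keeps a tenant ID containing a dot or wildcard from silently changing the shape of the hierarchy.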
A tenant stream can capture that tenant’s subject prefix:
    billing.tenants.tenant-123.>
This gives each tenant an independent stream sequence. If FIFO is required per tenant, that independence is useful: each tenant can proceed, pause, retry, or accumulate backlog separately.
A simplified stream model might look like this:
    Stream: TENANT_tenant-123
    Subjects: billing.tenants.tenant-123.>
Then repeat that pattern for each tenant.
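Repeating the pattern per tenant can be sketched as generating one stream definition for each tenant ID. The `name` and `subjects` keys mirror the JetStream stream configuration fields; the function name and the second tenant ID are hypothetical, and settings such as retention, storage, and replicas are deliberately left out because they depend on the deployment.

```python
import json


def tenant_stream_config(app: str, tenant: str) -> dict:
    """Return a minimal per-tenant stream definition (illustrative only)."""
    return {
        "name": f"TENANT_{tenant}",
        "subjects": [f"{app}.tenants.{tenant}.>"],
    }


# "tenant-456" is a made-up second tenant used to show the repetition.
configs = [tenant_stream_config("billing", t) for t in ("tenant-123", "tenant-456")]
print(json.dumps(configs, indent=2))
```

Because each stream captures only its own tenant prefix, the subject sets do not overlap between streams.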
This design is most attractive when tenants are relatively stable. Creating a few hundred or a few thousand streams and consumers is a different operational profile from rapidly creating and deleting them at high frequency.
If multiple backend services need to observe the same tenant’s events, each service should have its own durable consumer on that tenant stream.
For example, for tenant tenant-123:
    Stream: TENANT_tenant-123
      Consumer: invoice-service
      Consumer: notification-service
      Consumer: analytics-service
Each service's durable consumer tracks its own delivery state. That means invoice-service can fall behind without changing the position of notification-service.
If a service has multiple replicas, those replicas can share the same durable consumer so the service can distribute work across replicas. The exact mechanics depend on the consumer type: multiple replicas can bind to and fetch from a shared pull consumer, while a push consumer needs a deliver group (a queue group on the consumer’s delivery subject) for replicas to share its messages. Either way, the principle is the same: replicas of the same service share that service’s durable consumer state.
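The idea of independent per-service state, shared by replicas of the same service, can be illustrated with a toy in-memory model. This is not how JetStream is implemented and it ignores acknowledgements and redelivery; the `DurableConsumer` class and the event names are assumptions made only for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DurableConsumer:
    """Toy stand-in for a durable consumer: one cursor shared by all replicas."""
    name: str
    next_seq: int = 1  # 1-based stream sequence to hand out next

    def fetch(self, stream: List[str]) -> Optional[str]:
        """Return the next message for this consumer, or None if caught up."""
        if self.next_seq > len(stream):
            return None
        message = stream[self.next_seq - 1]
        self.next_seq += 1
        return message


stream = ["invoice.created", "payment.received", "invoice.paid"]
invoice = DurableConsumer("invoice-service")
notification = DurableConsumer("notification-service")

# Two replicas of notification-service bind to the same durable consumer,
# so each fetch advances the one shared cursor.
replica_a = replica_b = notification
replica_a.fetch(stream)
replica_b.fetch(stream)

# invoice-service has not fetched anything, so its cursor is unchanged.
print(invoice.next_seq, notification.next_seq)
```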
If you need strict processing order for a given service and tenant, do not allow multiple messages from that tenant/service consumer to be processed concurrently unless your application can tolerate out-of-order completion.
JetStream can deliver messages in stream order, but once multiple service replicas are handling messages concurrently, message 2 may finish before message 1. Redeliveries after timeouts can also affect the apparent order in which handlers see work.
For strict per-tenant FIFO, limit each tenant/service consumer to one unacknowledged message at a time: set the consumer's max ack pending to 1, so the next message is not delivered until the current one is acknowledged.
That approach intentionally gives up intra-tenant parallelism. The scalability comes from processing many tenants in parallel, not many events for the same tenant in parallel.
If your application only requires ordered delivery but not ordered completion, you can allow more concurrency per consumer. That is a throughput/semantics tradeoff and should be made explicitly.
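The difference between ordered delivery and ordered completion can be shown with a small deterministic simulation. It is a sketch under stated assumptions: messages are handed out in stream order, `max_in_flight` stands in for the number of unacknowledged messages allowed at once, and the processing durations are made up.

```python
import heapq
from typing import List


def completion_order(durations: List[float], max_in_flight: int) -> List[int]:
    """Return 1-based message sequences in the order their handlers finish.

    Messages start in stream order, and a new one starts only when fewer
    than max_in_flight handlers are still running.
    """
    running: list = []  # min-heap of (finish_time, seq)
    finished: List[int] = []
    now = 0.0
    for seq, duration in enumerate(durations, start=1):
        if len(running) == max_in_flight:
            # Wait for the earliest running handler to finish before starting more.
            now, done = heapq.heappop(running)
            finished.append(done)
        heapq.heappush(running, (now + duration, seq))
    while running:
        _, done = heapq.heappop(running)
        finished.append(done)
    return finished


durations = [3.0, 1.0, 1.0]  # hypothetical: the first message is slow
print(completion_order(durations, max_in_flight=1))  # completes in stream order
print(completion_order(durations, max_in_flight=2))  # later messages overtake the first
```

With one message in flight the completion order matches the stream order; with two, the slow first message finishes last even though it was delivered first.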
For a design with 100 to 1,000 tenants, a stream per tenant is a plausible starting point. The total object count matters:
    number of streams = tenants
    number of consumers = tenants × services that need durable consumption
So, for 1,000 tenants and 5 services, you may be operating around:
    1,000 streams
    5,000 durable consumers
That is a materially different design from 1 stream and 5 consumers, but it can be reasonable when the objects are stable and the event rate is modest. Avoid assuming the same approach will comfortably extend to tens of thousands of tenants without additional design work, benchmarking, and operational planning.
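The arithmetic above is simple enough to keep in a planning script. A minimal sketch, where the function name is hypothetical and the tenant counts other than 1,000 are example inputs:

```python
from typing import Tuple


def jetstream_object_counts(tenants: int, services: int) -> Tuple[int, int]:
    """Return (streams, durable consumers) for a stream-per-tenant layout."""
    streams = tenants
    consumers = tenants * services  # one durable consumer per service per tenant
    return streams, consumers


for tenants in (100, 1_000, 10_000):
    streams, consumers = jetstream_object_counts(tenants, services=5)
    print(f"{tenants:>6} tenants: {streams:>6} streams, {consumers:>6} durable consumers")
```

The consumer count grows with both tenants and services, so adding a sixth service at 1,000 tenants adds another 1,000 durable consumers.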
There is no single universal limit, because capacity depends on the deployment and the workload.
The practical recommendation is to load-test the shape you intend to run: realistic tenant count, service count, event rate, message size, retention, and failure scenarios.
A tenant-per-stream model is most comfortable when tenants are long-lived. It is less attractive if your application constantly creates and deletes tenant streams at a high rate.
If tenants are stable SaaS customers, this is usually manageable. If tenants are short-lived sessions, jobs, devices, or temporary workloads, consider whether the tenant is really the correct stream boundary.
Even if authentication and authorization are simple today, it is worth designing for stronger isolation before tenant-connected leaf nodes are introduced.
A few practical considerations:
Static accounts are simpler for a small, fixed set of tenants; operator mode with decentralized JWT authentication, typically managed with the nsc tool, scales better when tenants or leaf nodes need to manage their own users and credentials. If you expect the latter, adopting operator mode early avoids a disruptive migration.
One account per tenant is not always necessary at the start. It adds operational complexity, especially if shared services need access to tenant-scoped messages. But designing so that tenant isolation can be increased later gives you more deployment flexibility.
This is especially relevant for leaf nodes running in tenant environments. A tenant leaf node that publishes only under that tenant’s subject prefix is much easier to secure and reason about if the subject hierarchy and account model were designed with isolation in mind.
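To see why a tenant-scoped prefix is easy to reason about, here is a minimal sketch of NATS-style subject matching applied to a publish allow-list. The matcher is written from the wildcard rules (`*` matches exactly one token, `>` matches one or more trailing tokens); the `publish`/`allow` dictionary mirrors the shape of permissions in server configuration, but the helper names are assumptions and the code does not talk to a server.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if subject is covered by a NATS-style wildcard pattern."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the last pattern token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def tenant_publish_permissions(app: str, tenant: str) -> dict:
    """Hypothetical helper: allow publishing only under the tenant's prefix."""
    return {"publish": {"allow": [f"{app}.tenants.{tenant}.>"]}}


def may_publish(permissions: dict, subject: str) -> bool:
    return any(subject_matches(p, subject) for p in permissions["publish"]["allow"])


perms = tenant_publish_permissions("billing", "tenant-123")
print(may_publish(perms, "billing.tenants.tenant-123.invoice.created"))
print(may_publish(perms, "billing.tenants.tenant-456.invoice.created"))
```

A single allow entry per tenant is enough here because the tenant token sits at a fixed position below a stable first token.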
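For a multi-tenant SaaS backend with FIFO per tenant and cross-tenant parallelism, a practical starting design is: a subject hierarchy with a stable first token; one stream per tenant that captures that tenant's subject prefix; one durable consumer per service on each tenant stream, shared by that service's replicas; at most one message in flight per tenant and service where strict FIFO is required; and load testing at the tenant and service counts you intend to run.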
A stream per tenant is a reasonable JetStream pattern when the business requirement is independent FIFO processing per tenant and parallelism across tenants. It keeps tenant backlogs isolated and avoids some of the scaling concerns of a single stream with many sparse filtered consumers.
The key tradeoff is concurrency: strict FIFO requires limiting per-tenant processing concurrency, while higher throughput within a tenant can weaken ordered-completion guarantees. Keep stream and consumer counts stable, test with realistic traffic, and design the subject and account model so stronger tenant isolation remains possible later.