Simplifying the Edge: Building a Connected Future
In this talk, Byron Ruth, VP of Engineering at Synadia, recaps RethinkConn 2025 and provides a look ahead at priorities for NATS and Synadia in 2025 and beyond.
He outlines NATS' strategic focus on enabling edge-first distributed computing while maintaining simplicity for developers, architects, and operators. Key roadmap priorities include enhancing observability capabilities, introducing schema registry for better data governance, developing fleet management tools for platform components and user devices, and creating an improved "workbench" UI with collaboration features.
Big picture, the team is committed to supporting critical industries with location-independent technologies that work seamlessly across cloud and edge environments, while preserving NATS' core principles of simplicity and performance at scale.
"There's a whole mess of industries and use cases where we're trying to push everything down to the edge because we're trying to optimize the user experience. That's the end game. Right?
To do that, though, the technology that supports that is actually quite challenging to develop in a simple way. That's why this is a fundamentally interesting problem to us."
— Byron Ruth, VP Engineering, Synadia
Go Deeper
Full Transcript
Byron Ruth:
So I have the fun pleasure of trying to close this out, and we're already over time, but I hope everybody is having a good time. I've been fortunate to do RethinkConn the past couple of years. The team this year has been fantastic. Nate has been fantastic as the emcee. I hope you enjoyed the talks.
So I'm gonna try to keep this fairly quick. I know it's late for a lot of people. NATS is used globally, and there are a lot of people staying up late, including our own team, to watch, so we just wanna make sure everything is going well. Alright. First immediate call-out, obviously, is our guest speakers, both customers and various users.
Just a huge shout-out to them. We wanna thank them for participating and highlighting all the different use cases. One of the things we at Synadia are so impassioned by is understanding and hearing about the wide variety of dispersed use cases that NATS and other Synadia tech can actually influence and impact. And increasingly, even from Synadia's inception, that was the bet with NATS: that we're gonna become a very critical piece of use cases across a variety of industries and companies.
So I hope, at a minimum, alongside everything the Synadia folks have spoken to, these use cases and examples have resonated in some way, because they're amazing. Again, that's one of the most interesting things we get to experience every day at Synadia, hearing all these diverse customer use cases. So, as people who know me know, I always like to step back just to reframe things. I won't take too much time because I know we're already over our four-hour limit.
This will be on YouTube. If you have to drop, that's fine, but I'm gonna riff for a little bit. This is something I always post internally and try to reference. It's a quote from Derek from many years ago, and I always like to piece it apart.
I think it captures the essence of what Synadia is trying to do, why NATS is such a core component of it, and what we're trying to enable, frankly, for our users and customers going forward. So let's break it down. We have the distributed, edge-first piece, and we're kinda going in reverse order. Why is this interesting? Why are these hard problems?
As Derek said at the beginning, the world is just generally going more distributed. That's inevitable. The world is going more edge-first. That's inevitable. The reason, again, is pretty simple at a basic computer-science level.
People just want lower latency. People want access to data locally. That's pretty basic from a conceptual standpoint; doing it in practice at scale is hard. And so we heard about PowerFlex, Akamai, and MachineMetrics.
There's a whole mess of industries and use cases where we're trying to push everything down to the edge because we're trying to optimize the user experience. That's the end game. Right? To do that, though, the technology that supports it is actually quite challenging to develop in a simple way. That's why this is fundamentally interesting to us.
So the second piece is: alright, this is a hard problem, and we wanna tackle it as a company. This is why Synadia was fundamentally formed. So how are we reimagining the architecture for this?
For us, we always target and think about three core personas. There are others, of course, but these are the key ones we always keep in mind. Developers just want a simple way to develop applications. Right? They want the primitives.
They want those primitives, whether it's real-time messaging, key-value, or request-reply. They want data primitives to store their data when they need to store it, and so on. And we obviously heard about workloads, which I'll get into in a minute.
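To make those primitives concrete, here is a minimal sketch using the Go client, assuming a local NATS server with JetStream enabled; the subject and bucket names are purely illustrative.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to a local NATS server.
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Request-reply: a responder plus a request against it.
	nc.Subscribe("greet", func(m *nats.Msg) {
		m.Respond([]byte("hello, " + string(m.Data)))
	})
	reply, err := nc.Request("greet", []byte("world"), time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(reply.Data))

	// Key-value: a JetStream-backed bucket as a data primitive.
	js, _ := nc.JetStream()
	kv, _ := js.CreateKeyValue(&nats.KeyValueConfig{Bucket: "profiles"})
	kv.Put("alice", []byte(`{"region":"edge-1"}`))
	entry, _ := kv.Get("alice")
	fmt.Println(string(entry.Value()))
}
```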
Then there are the architects, and I don't wanna blur the lines; I know there are opinions about that term, and that's fine. Architecture from this perspective is just asking: where do these services and this data actually need to be located? This is where we tell the story about multiple clouds, multiple geographies, the edge. This is where the architect is thinking about optimizing for latency or availability in particular places. And with that, for a lot of people who've been in the space for some time, you know you can cluster NATS in a variety of different ways.
You have simple clusters, stretch clusters, and superclusters that can span multiple geographies and clouds. And then there's one of our special capabilities, leaf nodes, which gives us that hub-and-spoke kind of topology, and we're gonna be expanding on that this year for sure. And then operators say: distributed systems are difficult, and I don't wanna have to run multiple NATS systems. Well, NATS comes with multitenancy out of the box.
Right? You have an account, and you can send and receive messages over subjects in that account that will never bleed across any other account boundary. So you don't have to deploy many, many systems just to achieve multitenancy; it's all baked into the security model. And then you have the opt-in ability to share across accounts if you choose to.
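As a rough illustration of that account model, here is a minimal server configuration sketch; the account names, users, and subjects are made up. Two accounts are fully isolated unless one explicitly exports a subject space and the other imports it.

```
accounts {
  TENANT_A: {
    users: [ { user: "svc-a", password: "change-me" } ]
    # Opt in to sharing: expose only this subject space to other accounts.
    exports: [ { stream: "telemetry.public.>" } ]
  }
  TENANT_B: {
    users: [ { user: "svc-b", password: "change-me" } ]
    # Explicitly pull in what TENANT_A chose to export; everything else stays isolated.
    imports: [ { stream: { account: TENANT_A, subject: "telemetry.public.>" } } ]
  }
}
```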
And then there's monitoring, and I'll get into this a little more. We've put tremendous effort over the past two releases, 2.10 and now 2.11, into increasing the degree of visibility and monitoring. The NATS Server has so much information, and we're trying to expose more and more of it because it's very valuable; it makes building distributed systems much easier. People hear microservices, hear distributed systems, and think, oh my gosh, all these technologies and all these monitoring solutions. But because all the data is flowing through NATS, we can expose everything through it and make it visible where necessary, and that's really powerful.
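As one small taste of that visibility, the server's HTTP monitoring endpoint already exposes a lot of this information. Here is a minimal sketch that reads a few fields from /varz, assuming monitoring is enabled on the conventional port 8222.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// A few of the many fields exposed by the /varz monitoring endpoint.
type varz struct {
	ServerName  string `json:"server_name"`
	Version     string `json:"version"`
	Connections int    `json:"connections"`
	InMsgs      int64  `json:"in_msgs"`
	OutMsgs     int64  `json:"out_msgs"`
}

func main() {
	resp, err := http.Get("http://localhost:8222/varz")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var v varz
	if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
		panic(err)
	}
	fmt.Printf("%s (v%s): %d connections, %d msgs in, %d msgs out\n",
		v.ServerName, v.Version, v.Connections, v.InMsgs, v.OutMsgs)
}
```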
So, wrapping up this particular point: what Synadia is trying to do, leveraging NATS, is ask, what's the opportunity here? We're trying to enable industry modernization, and we're always trying to focus on keeping things simple. With the technology today and all the projects we learn about in the cloud space, I respect all of them, and I speak only for myself, but there's still a lot of complexity when you're trying to bridge everything together. So we're focusing on critical use cases and critical industries, decreasing time to value for customers so they can move fast and just do things. Our roadmap and everything we're trying to do is focused around that.
So, a quick recap of what we saw today. NATS 2.11: Neil did a fantastic job introducing some of the key points. We were really focused on visibility, patterns, and correctness, and you're gonna see that as a theme for 2.12 too, which I'll touch on in a second.
A quick note on client releases, tooling releases, and documentation: we're going to be rolling those out over the next week or so. I know everyone wants to use this immediately, learn more, and try it out, so we're gonna be rolling that out over the next few days. Orbit: Tomasz really nailed it, I think. We have a clear segregation between the two boundaries, and it's going to be beneficial both for client developers, who can introduce these patterns, and for the consumers of that, the users and customers, who can consume and try them.
Then, once they get solidified, we can port that functionality into the core clients if it makes sense. Otherwise, these will just become stabilized packages within the Orbit libraries. Control Plane: a quick call-out to Seth. He is, technically, a lawyer in a past life; he has his JD.
So he's an amazing developer, but he can speak fast; I saw that as a comment. He's amazing, and I know a lot of people, users of Synadia Cloud, have interacted with him. Control Plane is really evolving, and I'll touch on this in a second, but it's evolving into a more and more critical component that is simply going to ease onboarding. Even if you're the developer who knows NATS very well and you're trying to onboard it onto your team, we believe, and we want to keep pushing, Control Plane to be the easiest path for onboarding people to understand and learn about NATS.
We actually had a very interesting use case a while back where people said the stream viewer and the KV viewer are fantastic, because they have QA engineers who just want to come in. We're pushing messages, pushing data, and they just need to go in and validate that the data looks correct. They don't know anything about NATS; they don't need to care about NATS at all.
They just have a particular task they need to do, and this was one of the use cases that actually enabled them to do it. With Scott and the ML inference, I hope that was exciting. That team joined a little while ago, and they're amazing; they bring so much insight into the space generally, AI and ML. We're definitely gonna be focused on a lot of things in that area, but we want to make sure it's done in the right way, that we're not just following a trend.
We're being very deliberate about where we actually integrate AI and ML into our product. And then the two key demos, Workloads and Connectors. I'm not gonna go too much into these, but the kicker with both, the essence of these projects, is that we wanted to imbue them with the same principles NATS has: location independence, the ability to just move things around and schedule things where they need to be scheduled. And the other reality, as Derek and Justina mentioned earlier, is that going from cloud to edge is a whole different paradigm.
When you go to an edge location, it's a whole different paradigm in the sense that the edge is messy. And we're not talking about the cloud edge. We're not talking about CDNs. We're not talking about the Vercels and Cloudflares of the world and things like that.
We're talking about the user-defined edge, the customer's edge, wherever you wanna run it. With Jordan, you saw that firsthand; he was just running it in his house. It's a user-defined edge. I think that is a key differentiator: you can run NATS and Nex in particular as, like, two things, and then obviously piggyback off that with Connectors and so on.
You can run that software where you need to run it to define your own edge. It could be your factory, your car; again, we heard talks about this. So the kicker with the architecture behind Connectors and Workloads is really that we wanna make it flexible and pluggable enough that we have the ability to create the runtimes, the agents, the Nexlets, whatever your terminology, to adapt and run in those environments.
And so we're trying to homogenize these various places. Whether you're scheduling workloads in ECS or Docker or Podman, or as some random subprocess on a Windows machine, you still wanna have a purview of all those workloads, no matter where they're running. And that's just something that doesn't exist today. So, two quick announcements, and thank you everybody for sticking with me, two quick announcements for Synadia Cloud and Synadia Platform.
Workloads and Connectors are in preview. We're gonna be rolling this out over the next two to three weeks, providing access to our paid Synadia Cloud customers first so we can get feedback; we have fixed capacity and things like that. But we really want to encourage these users to test out Workloads, try services, try functions, and we're gonna be working closely with you as you experience it. Similarly with Connectors. For Synadia Platform, we have our managed-platform and self-managed-platform customers.
And the most valuable thing we can do right now is essentially this: let's schedule calls, let's understand the use cases. If you have ideas after hearing this and you're thinking, this is exciting, how can Workloads and Connectors work in my environment or for my use case? That's really impactful.
And again, that's the reason we developed Workloads and Connectors the way we did: so we can adapt and create things that fit the environments they have to run in, the heterogeneous use cases we actually serve. In the cloud era, which I feel like we are going beyond now, thankfully, the cloud is still very valuable, but people now have edge use cases, and we're moving in that direction. So how do you transition to that? How do you build technology that can actually service it?
How can it bridge back to the cloud environment? With Workloads and Connectors, we have designed for that from the ground up. Alright, I won't belabor this because I know I'm the last presentation here. So what's next?
NATS 2.12. 2.11 just got released, and it's a big milestone. I know it's been a long time coming, and everybody has been waiting for it.
With 2.12, we're doing more of the same. The interesting thing with NATS, and what we've realized at Synadia, is that we always have to try to stay six to twelve to eighteen months ahead of the curve on demand, scale, all of that. It's a fun problem to have. And we're always trying to say: great, we know the customers we have.
Justina showed a handful of logos up there: we're in vehicles, we're in retail stores, and there's a certain scale that we are constantly being pushed to satisfy. So in our minds, we always want to work on performance and scale. We're always focusing on that. That's just inevitable.
And then, to Tomasz's point from before, we're focusing on higher-order patterns. Some of these require native NATS Server capabilities so that we can build client-side higher-order patterns. That's a key focus, because NATS and the clients we have today are very good at the primitives, but we have to keep going up the stack and keep improving from that standpoint. For people who have been around NATS community releases for quite some time, I promise you that's not a joke.
I know. NATS 2.10, NATS 2.11 at this point, it's been a long haul, but we're transitioning to be much more diligent about scoping down the timeline. We know that if something doesn't land in this release, it just gets kicked over to the next release, and that's another six months. So we're always actively trying to get better at that.
We have a new docs site coming soon. And then, in terms of the planning list here, these are a couple of quick bullets. The QR code goes to the GitHub issues pinned with the 2.12 label, so you can scan it or just go to the GitHub issues directly and see the laundry list of things.
We are not gonna do all of those things, just to be very clear. We are scoping that down, but we're actively planning and designing around the key issues and key features we're gonna focus on. Alright, I promise, final two slides. On the Synadia roadmap side, I'm just gonna speak very briefly to these areas.
So these are some areas. Again, going back to the theme of that stepping-back section: NATS had very humble beginnings, and it has evolved. We have connectivity. We have the data.
Nex introduces the workloads side of things. Synadia is driving this forward, but we're seeing collectively and holistically what the potential is and what we're trying to enable for users and customers: simplifying things, keeping things simple, not growing into the complexity that exists in the tech ecosystem today. And we realize that if we're trying to satisfy distributed systems, cloud-to-edge systems, that's a hard problem. But we have the tools. We have the ability to deliver really good tooling and really good products to support that without making it complex for end users.
So, the first two points here are areas we're gonna be focusing on; these are high level and coming later this year, and I'm not gonna define specific timelines around them. Observability as a term is a very weighted, heavy term for people, but it means something to us and to how we're gonna manifest it in Control Plane in particular. There are a lot of metrics, a lot of data, that flow out of the NATS Server in particular.
We wanna make this not just visible but more useful, in terms of being able to say: hey, I see my NATS cluster, my NATS supercluster, my leaf nodes, but I also get to see my application tier. I get to see the clients that are connected. I get to see my microservices, meaning services built with the micro package. I get to see my workloads.
I get to see my connectors, all running in what we call the application tier. So we have the system tier and the application tier, and you can visually see that at a high level, then drill down into it and see whether things are going wrong.
I should be able to see that. I should be able to see alerts, so integration with alerting tooling, Prometheus for example, since that's the default monitoring system we support. There's a lot of opportunity there to provide more accessibility around observability for these big, large-scale systems, the kind you don't otherwise get from out-of-the-box observability products.
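As a concrete taste of that application tier, here is a minimal sketch of a service built with the nats.go micro package; it registers a name and version and answers discovery and stats requests over NATS, which is the kind of metadata a control-plane view can surface. The service name and endpoint are made up for illustration.

```go
package main

import (
	"github.com/nats-io/nats.go"
	"github.com/nats-io/nats.go/micro"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Register a discoverable service with a name and semantic version.
	svc, err := micro.AddService(nc, micro.Config{
		Name:        "echo",
		Version:     "1.0.0",
		Description: "Echoes request payloads back to the caller.",
	})
	if err != nil {
		panic(err)
	}

	// One endpoint: requests to "echo" get their payload echoed back.
	svc.AddEndpoint("echo", micro.HandlerFunc(func(req micro.Request) {
		req.Respond(req.Data())
	}))

	select {} // keep the service running
}
```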
Schema registry is an interesting one. We've always said, and many who have been around for quite some time have heard this, that NATS doesn't care about the payload; if you've been in the Slack, you've probably heard that from someone at some point. We're trying to go up the stack a little bit on the data side: data governance, schemas, and so on. Schema registry is sort of a first pass at that.
Tomasz referenced this a little, and we're gonna be going up the stack with schema registry and the codex, as he mentioned. We have some other ideas coming, but we are gonna start going into that space as well. And then the two final things. We internally call this one fleet management, and that might be a subtly different interpretation of what fleet means to different people.
There's fleet management of, let's say, platform components. With Synadia Platform, just to level-set: we have NATS Servers, we have Control Plane, now we have Nex nodes, now we have Connectors.
We have all these types of things, and the value that brings is that, because of the design, people can distribute and place these processes wherever they need to be placed, whether in the cloud, at the edge, what have you. So there's value in being able to support a fleet of these components generally. Then there's the other definition of fleet, which is: I'm a customer, I'm a user.
I have a fleet of my own devices: cars or vehicles or distribution sites, things like that. How do I represent and model that in a UI, heartbeat those things, and check their health? Both of these are being considered in this general effort; we see them as two sides of the same coin.
They're not fundamentally different problems. It's basically knowing: how do you deploy and lifecycle that component? How do you monitor it? How do you redeploy it? How do you health-check it? Right?
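Purely as an illustration of that second kind of fleet, here is a minimal sketch of a heartbeat pattern over plain NATS subjects; the subject layout, device ID, and intervals are made up, and a real product would add persistence and richer health data.

```go
package main

import (
	"fmt"
	"sync"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Monitor side: remember when each device was last heard from.
	var mu sync.Mutex
	lastSeen := map[string]time.Time{}
	nc.Subscribe("fleet.heartbeat.*", func(m *nats.Msg) {
		mu.Lock()
		lastSeen[m.Subject] = time.Now()
		mu.Unlock()
	})

	// Device side: publish a heartbeat on its own subject every few seconds.
	go func() {
		for range time.Tick(5 * time.Second) {
			nc.Publish("fleet.heartbeat.vehicle-42", []byte("ok"))
		}
	}()

	// Flag anything that has gone quiet for too long.
	for range time.Tick(15 * time.Second) {
		mu.Lock()
		for subject, seen := range lastSeen {
			if time.Since(seen) > 15*time.Second {
				fmt.Println("stale:", subject)
			}
		}
		mu.Unlock()
	}
}
```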
And then finally, the workbench effort. We have a couple of other efforts floating around, but I just wanted to highlight a number of them here. On workbench: during the live chat I saw a number of questions about whether there's gonna be a NATS UI, local-first, and things like that. We do have an effort coming for this, and it's being built and driven from a very local-first perspective: being able to say, I wanna see what I wanna see.
This is where things like non-modal layouts and sharing come in. Those might be fancy terms, but think of it as: I wanna see what I wanna see, I wanna have panes laid out as I need them, and then I wanna be able to collaborate with a coworker or another developer, to share, or to do live sharing. So we have a lot of ideas and focus on this area as well. I hope that was useful, and I think we're well over time, so I'm gonna end it there.