NATS JetStream: Streams, Storage, and Data Lifecycle
Liked this content? Check out the related Consumer screencast
Or you can check out the repo and learn at your own pace!
NATS JetStream is a powerful data persistence capability that can be activated in your NATS cluster, providing guaranteed at-least-once delivery and sophisticated data retention policies.
In this screencast, Colin shows us how to configure JetStream Streams and implement various retention strategies to manage your data effectively.
What is NATS JetStream?
JetStream is the underlying data persistence capability in NATS that allows you to store messages with guaranteed delivery semantics. Unlike traditional NATS pub/sub messaging where messages are fire-and-forget, JetStream provides:
- Persistent storage: Messages are stored on disk or in memory; file-backed streams survive server restarts and failures
- Replication: Data can be replicated across multiple NATS servers for high availability and fault tolerance
- Flexible retention policies: Automatically manage data lifecycle with configurable limits on message count, storage size, and age
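- At-least-once delivery guarantees: Messages are retained while consumers are offline, so a consumer can catch up once it reconnects, as long as the messages have not yet been removed by the stream's retention limits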
Architecture Overview
In the demonstration setup, Colin is using a NATS Super Cluster spanning two Kubernetes clusters:
- EKS cluster on the US East Coast
- Azure cluster on the US West Coast
The architecture includes four microservices (adder, multiplier, subtractor, divider) that process mathematical operations, with a requester service that:
- Sends random number pairs to each service
- Collects responses from all services
- Publishes results to JetStream streams based on business logic
Enabling JetStream
JetStream must be enabled at two levels:
1. Cluster Level Configuration
    jetstream:
      enabled: true
2. Account Level Configuration
    accounts:
      teamA:
        jetstream: enabled
Once both configurations are set and your infrastructure has persistent storage (like PVCs in Kubernetes), JetStream is ready to use.
Creating Your First Stream
A NATS Stream is dedicated data storage for a series of subjects. Here’s how to configure one using OpenTofu/Terraform:
    resource "jetstream_stream" "answers" {
      name     = "answers"
      subjects = ["answers.significant", "answers.throwaway"]
      storage  = "file"

      # Retention policies (we'll explore these below)
      max_msgs  = 1000
      max_bytes = 1048576 # 1MB
      max_age   = 432000  # 5 days in seconds
    }
Key Configuration Options
Storage Type: Choose between file (disk-based, durable) or memory (faster but volatile).
Subjects: Define which message subjects this stream will capture. You can use wildcards like answers.* to capture multiple related subjects.
Replication: When running clustered NATS servers, you can set replication factors for high availability.
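To make the wildcard behaviour under Subjects concrete, here is a minimal sketch of NATS-style subject matching in plain Python. It models only the two standard wildcards: `*` for exactly one token and `>` for one or more trailing tokens. The `subject_matches` helper is a hypothetical name for illustration, not part of any NATS client library.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style pattern matches a concrete subject.

    '*' matches exactly one token; '>' matches one or more trailing tokens.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the last pattern token and needs at least one token left
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # subject ran out of tokens before the pattern did
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)


print(subject_matches("answers.*", "answers.significant"))     # True
print(subject_matches("answers.*", "answers.significant.eu"))  # False
print(subject_matches("answers.>", "answers.significant.eu"))  # True
```

A stream configured with answers.* would therefore capture both answers.significant and answers.throwaway, but not a deeper subject such as answers.significant.eu.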
Data Retention Strategies
JetStream provides multiple retention policies that can work together to automatically manage your data lifecycle:
1. Size-Based Retention (max_bytes)
Limit total storage space used by the stream:
    max_bytes = 1024 # 1KB limit
When the size limit is reached, the oldest messages are automatically removed to make room for new ones. This is the behaviour under the default discard policy (old).
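The eviction behaviour can be pictured with a small in-memory model. This is a simplified sketch, not JetStream's implementation: it counts payload bytes only (a real server also accounts for headers and per-message overhead), and `append_with_byte_limit` is a hypothetical helper.

```python
from collections import deque


def append_with_byte_limit(stream: deque, payload: bytes, max_bytes: int) -> None:
    """Append a payload, evicting the oldest ones until the total size fits."""
    if len(payload) > max_bytes:
        # Assumption for this model: a single oversized message is rejected
        raise ValueError("payload alone exceeds max_bytes")
    stream.append(payload)
    while sum(len(m) for m in stream) > max_bytes:
        stream.popleft()  # discard the oldest message first


stream = deque()
for i in range(5):
    # Five 400-byte messages against a 1024-byte limit
    append_with_byte_limit(stream, bytes([i]) * 400, max_bytes=1024)

print(len(stream))                  # 2
print(sum(len(m) for m in stream))  # 800
```

Only the two most recent messages remain, because a third 400-byte message would push the total past 1024 bytes.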
2. Count-Based Retention (max_msgs)
Limit the total number of messages stored:
    max_msgs = 30 # Keep only latest 30 messages
This creates a sliding window of the most recent messages, perfect for keeping only the latest state or recent events.
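The sliding-window effect is easy to reproduce with a bounded deque. This is an analogy in plain Python rather than a JetStream client call; the sequence numbers are made up for illustration.

```python
from collections import deque

# A deque with maxlen behaves like a stream with max_msgs = 30:
# appending to a full deque silently drops the oldest entry.
window = deque(maxlen=30)

for seq in range(1, 101):  # publish 100 messages, numbered 1..100
    window.append(seq)

print(len(window), window[0], window[-1])  # 30 71 100
```

After 100 publishes, only messages 71 through 100 are still stored.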
3. Time-Based Retention (max_age)
Automatically expire messages after a specified duration:
    max_age          = 432000 # 5 days (60 * 60 * 24 * 5 seconds)
    duplicate_window = 120    # Must not exceed max_age
This ensures data doesn’t accumulate indefinitely and is ideal for regulatory compliance or storage cost management.
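As a quick check of the arithmetic and of how age-based expiry behaves, the sketch below models messages as (timestamp, payload) pairs with timestamps in seconds. The `expire` helper and the sample data are hypothetical, and the exact boundary handling in a real server may differ.

```python
MAX_AGE = 60 * 60 * 24 * 5  # 5 days in seconds
DUPLICATE_WINDOW = 120      # seconds

# The duplicate window cannot be longer than the retention period
assert DUPLICATE_WINDOW <= MAX_AGE


def expire(messages, now, max_age=MAX_AGE):
    """Keep only (timestamp, payload) pairs younger than max_age seconds."""
    return [(ts, payload) for ts, payload in messages if now - ts < max_age]


messages = [(0, "old"), (100_000, "mid"), (431_000, "new")]
kept = expire(messages, now=432_000)

print(MAX_AGE)                # 432000
print([p for _, p in kept])   # ['mid', 'new']
```

The message published at time 0 is exactly five days old at `now` and is dropped in this model; the other two are still within the window.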
Practical Example: Message Classification
In Colin’s demonstration, the requester service classifies responses based on their values:
    # Pseudocode for message classification
    for response in math_responses:
        if response.value > 1:
            publish_to("answers.significant", response)
        else:
            publish_to("answers.throwaway", response)
This creates two logical data streams within the same JetStream Stream, allowing for different processing strategies downstream.
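A runnable version of the classification logic might look like the following. It is a sketch under assumptions: the `Response` type, `subject_for`, and `classify` are hypothetical names, and `publish` is a stand-in callback that collects results in a list instead of talking to a NATS server.

```python
from dataclasses import dataclass


@dataclass
class Response:
    service: str  # which math service produced the answer
    value: float  # the computed result


def subject_for(response: Response) -> str:
    """Pick the subject using the same threshold as the pseudocode (value > 1)."""
    return "answers.significant" if response.value > 1 else "answers.throwaway"


def classify(responses, publish) -> None:
    """Route each response to its subject through the supplied publish callback."""
    for response in responses:
        publish(subject_for(response), response)


published = []
classify(
    [Response("adder", 7.0), Response("divider", 0.5), Response("multiplier", 1.0)],
    lambda subject, r: published.append((subject, r.service)),
)
print(published)
# [('answers.significant', 'adder'), ('answers.throwaway', 'divider'), ('answers.throwaway', 'multiplier')]
```

Passing the publish function in as a parameter keeps the routing rule testable on its own; in a real service the callback would wrap the client's publish call.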
Monitoring Your Streams
Use the NATS CLI to monitor stream status:
    # List all streams
    nats stream ls

    # Get detailed stream information
    nats stream info answers

    # View subject-specific message counts
    nats stream subjects answers
Example output shows message distribution:
- answers.throwaway: 92 messages
- answers.significant: 308 messages
- Total: 400 messages
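To put those counts in proportion, a few lines of Python can compute each subject's share. The dictionary is typed in by hand from the figures above; it is not produced by parsing real CLI output.

```python
# Per-subject message counts copied from the example output
counts = {"answers.throwaway": 92, "answers.significant": 308}

total = sum(counts.values())
for subject, n in counts.items():
    print(f"{subject}: {n} ({n / total:.0%})")
print(f"total: {total}")
# answers.throwaway: 92 (23%)
# answers.significant: 308 (77%)
# total: 400
```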
Configuration Management Options
JetStream configuration can be managed through multiple approaches:
- NATS CLI: Great for development and testing
- Terraform/OpenTofu: Infrastructure as Code approach
- GitHub Actions: CI/CD integration
- Kubernetes JetStream Controller: Native Kubernetes resources
Best Practices
Security
- Use TLS for all connections in production
- Implement proper authentication and authorization
- Consider using certificate-based authentication for services
Retention Policies
- Don’t use unlimited retention in production: always set appropriate limits
- Combine multiple retention strategies for comprehensive data management
- Consider downstream consumers when setting retention periods
- Test retention policies in non-production environments first
Performance
- Choose file storage for durability, memory for speed
- Set appropriate replication factors based on availability requirements
- Monitor disk usage and adjust retention policies as needed
Conclusion
NATS JetStream Streams provide a robust foundation for building resilient, distributed systems with guaranteed message delivery. The flexible retention policies ensure you can manage data lifecycle automatically while the multi-subject capability allows for sophisticated message routing and classification.
The combination of simple configuration, powerful retention policies, and guaranteed delivery makes JetStream an excellent choice for building modern, cloud-native applications that need reliable messaging with persistence.
Ready to dive deeper? Check out the official NATS documentation.