NATS JetStream: Streams, Storage, and Data Lifecycle

Episode 3

Liked this content? Check out the related Consumer screencast

Or you can check out the repo and learn at your own pace!


NATS JetStream is a powerful data persistence capability that can be activated in your NATS cluster, providing guaranteed at-least-once delivery and sophisticated data retention policies.

In this screencast, Colin shows us how to configure JetStream Streams and implement various retention strategies to manage your data effectively.

What is NATS JetStream?

JetStream is the underlying data persistence capability in NATS that allows you to store messages with guaranteed delivery semantics. Unlike traditional NATS pub/sub messaging where messages are fire-and-forget, JetStream provides:

  • Persistent storage: Messages are stored on disk or in memory, ensuring data survives server restarts and failures
  • Replication: Data can be replicated across multiple NATS servers for high availability and fault tolerance
  • Flexible retention policies: Automatically manage data lifecycle with configurable limits on message count, storage size, and age
  • At-least-once delivery guarantees: Messages are kept while consumers are offline and can be delivered once they reconnect, as long as the messages are still within the stream's retention limits

Architecture Overview

In the demonstration setup, Colin is using a NATS Super Cluster spanning two Kubernetes clusters:

  • EKS cluster on the US East Coast
  • Azure cluster on the US West Coast

The architecture includes four microservices (adder, multiplier, subtractor, divider) that process mathematical operations, with a requester service that:

  1. Sends random number pairs to each service
  2. Collects responses from all services
  3. Publishes results to JetStream streams based on business logic
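The request flow above can be sketched in plain Python. This is a minimal in-process stand-in, not the code from the screencast: the four services are modelled as local functions rather than NATS request/reply endpoints, and the names `SERVICES` and `collect_responses` are hypothetical.

```python
import operator
import random

# Hypothetical in-process stand-ins for the four microservices.
SERVICES = {
    "adder": operator.add,
    "multiplier": operator.mul,
    "subtractor": operator.sub,
    "divider": operator.truediv,
}


def collect_responses(a, b):
    """Send the same number pair to every service and collect the answers."""
    return {name: service(a, b) for name, service in SERVICES.items()}


if __name__ == "__main__":
    # Pick operands from 1..10 so the divider never sees a zero divisor.
    a, b = random.randint(1, 10), random.randint(1, 10)
    print(a, b, collect_responses(a, b))
```

In the real setup each dictionary entry would be a request to a separate service over NATS; the dictionary only keeps the sketch runnable without a cluster.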

Enabling JetStream

JetStream must be enabled at two levels:

1. Cluster Level Configuration

jetstream:
  enabled: true

2. Account Level Configuration

accounts:
  teamA:
    jetstream: enabled

Once both configurations are set and your infrastructure has persistent storage (like PVCs in Kubernetes), JetStream is ready to use.


Creating Your First Stream

A NATS Stream is dedicated data storage for a series of subjects. Here’s how to configure one using OpenTofu/Terraform:

resource "jetstream_stream" "answers" {
  name     = "answers"
  subjects = ["answers.significant", "answers.throwaway"]
  storage  = "file"

  # Retention policies (we'll explore these below)
  max_msgs  = 1000
  max_bytes = 1048576 # 1MB
  max_age   = 432000 # 5 days in seconds
}

Key Configuration Options

Storage Type: Choose between file (disk-based, durable) and memory (faster but volatile).

Subjects: Define which message subjects this stream will capture. You can use wildcards like answers.* to capture multiple related subjects.
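To make the wildcard behaviour concrete, here is a small sketch of NATS subject matching in plain Python. It is an illustration only, not the server's implementation, and `subject_matches` is a hypothetical helper name. It assumes the standard NATS rules: subjects are dot-separated tokens, `*` matches exactly one token, and `>` matches one or more trailing tokens.

```python
def subject_matches(pattern, subject):
    """Return True if a subject matches a pattern that may use * or > wildcards."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # the subject ran out of tokens before the pattern did
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    # Without '>', the pattern and subject must have the same number of tokens.
    return len(p_tokens) == len(s_tokens)
```

With this sketch, `answers.*` matches both `answers.significant` and `answers.throwaway`, but not `answers.significant.extra`, because `*` covers a single token.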

Replication: When running clustered NATS servers, you can set a replication factor (the number of servers that each hold a copy of the stream) for high availability.


Data Retention Strategies

JetStream provides multiple retention policies that can work together to automatically manage your data lifecycle:

1. Size-Based Retention (max_bytes)

Limit total storage space used by the stream:

max_bytes = 1024 # 1KB limit

When the size limit is reached, the oldest messages are automatically removed to make room for new ones.
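The eviction behaviour can be modelled with a toy in-memory stream. This is a simplified sketch, not JetStream's storage engine: the class name `ByteLimitedStream` is hypothetical, it counts payload bytes only (the server's own size accounting is likely to include more than the payload), and it assumes the oldest messages are the ones discarded.

```python
from collections import deque


class ByteLimitedStream:
    """Toy model of max_bytes: evict the oldest messages once over the limit."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.messages = deque()
        self.size = 0

    def publish(self, payload: bytes):
        self.messages.append(payload)
        self.size += len(payload)
        # Drop from the front (oldest first) until the stream fits again.
        while self.size > self.max_bytes and self.messages:
            self.size -= len(self.messages.popleft())


if __name__ == "__main__":
    stream = ByteLimitedStream(max_bytes=1024)
    for i in range(5):
        stream.publish(bytes([i]) * 300)  # five 300-byte messages
    print(len(stream.messages), stream.size)
```

Publishing five 300-byte messages against a 1024-byte limit leaves only the three newest in the model, since a fourth would push the total to 1200 bytes.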

2. Count-Based Retention (max_msgs)

Limit the total number of messages stored:

max_msgs = 30 # Keep only latest 30 messages

This creates a sliding window of the most recent messages, perfect for keeping only the latest state or recent events.
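The sliding window can be illustrated with a bounded queue from the Python standard library. This is an analogy for how a count limit behaves, not a JetStream client call; the sequence numbers are made up for the example.

```python
from collections import deque

# Keep only the latest 30 messages, mirroring max_msgs = 30.
window = deque(maxlen=30)

for seq in range(1, 101):  # publish 100 messages with sequence numbers 1..100
    window.append(seq)

# Only the 30 newest sequence numbers remain: 71 through 100.
print(len(window), window[0], window[-1])  # prints: 30 71 100
```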

3. Time-Based Retention (max_age)

Automatically expire messages after a specified duration:

max_age = 432000 # 5 days (60 * 60 * 24 * 5 seconds)
duplicate_window = 120 # Must not be larger than max_age

This ensures data doesn’t accumulate indefinitely and is ideal for regulatory compliance or storage cost management.
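Because both values are given in seconds, it can help to derive them from readable durations rather than typing raw numbers. The sketch below does that arithmetic and models the expiry check; `is_expired` is a hypothetical helper for illustration, not part of any NATS API.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = int(timedelta(days=5).total_seconds())  # 432000
DUPLICATE_WINDOW = int(timedelta(minutes=2).total_seconds())  # 120


def is_expired(stored_at, now, max_age_seconds=MAX_AGE):
    """Return True once a message has been stored for longer than max_age."""
    return (now - stored_at).total_seconds() > max_age_seconds


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_expired(now - timedelta(days=6), now))  # older than 5 days
    print(is_expired(now - timedelta(days=4), now))  # still within 5 days
```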


Practical Example: Message Classification

In Colin’s demonstration, the requester service classifies responses based on their values:

# Pseudocode for message classification
for response in math_responses:
    if response.value > 1:
        publish_to("answers.significant", response)
    else:
        publish_to("answers.throwaway", response)

This creates two logical data streams within the same JetStream Stream, allowing for different processing strategies downstream.
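A runnable version of the same rule might look like the sketch below. It is an assumption-laden illustration rather than the requester's actual code: `classify` and `route` are hypothetical names, and instead of publishing to NATS the values are grouped by the subject they would be sent to, so the example runs without a server.

```python
def classify(value):
    """Pick the subject for a response value, using the threshold from the pseudocode."""
    return "answers.significant" if value > 1 else "answers.throwaway"


def route(responses):
    """Group response values by the subject they would be published to."""
    routed = {"answers.significant": [], "answers.throwaway": []}
    for value in responses:
        routed[classify(value)].append(value)
    return routed


if __name__ == "__main__":
    print(route([9, 18, 3, 0.5, 1]))
```

Note that a value of exactly 1 lands in `answers.throwaway`, because the comparison is strictly greater than 1.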


Monitoring Your Streams

Use the NATS CLI to monitor stream status:

# List all streams
nats stream ls
# Get detailed stream information
nats stream info answers
# View subject-specific message counts
nats stream subjects answers

Example output showing the message distribution per subject:

  • answers.throwaway: 92 messages
  • answers.significant: 308 messages
  • Total: 400 messages

Configuration Management Options

JetStream configuration can be managed through multiple approaches:

  1. NATS CLI: Great for development and testing
  2. Terraform/OpenTofu: Infrastructure as Code approach
  3. GitHub Actions: CI/CD integration
  4. Kubernetes JetStream Controller: Native Kubernetes resources

Best Practices

Security

  • Use TLS for all connections in production
  • Implement proper authentication and authorization
  • Consider using certificate-based authentication for services

Retention Policies

  • Don’t use unlimited retention in production: always set appropriate limits
  • Combine multiple retention strategies for comprehensive data management
  • Consider downstream consumers when setting retention periods
  • Test retention policies in non-production environments first

Performance

  • Choose file storage for durability, memory for speed
  • Set appropriate replication factors based on availability requirements
  • Monitor disk usage and adjust retention policies as needed

Conclusion

NATS JetStream Streams provide a robust foundation for building resilient, distributed systems with guaranteed message delivery. The flexible retention policies ensure you can manage data lifecycle automatically while the multi-subject capability allows for sophisticated message routing and classification.

The combination of simple configuration, powerful retention policies, and guaranteed delivery makes JetStream an excellent choice for building modern, cloud-native applications that need reliable messaging with persistence.

Ready to dive deeper? Check out the official NATS documentation.

