NATS Weekly #34

Week of July 4 - 10, 2022

🗞 Announcements, writings, and projects

A short list of announcements, blog posts, project updates, and other news.

⚡ Releases

Official releases from NATS repos and others in the ecosystem.

nats.rs - async v0.16.0 and sync v0.22.0
nats.java - v2.15.5
nats.py - v2.1.4

📖 Articles

Blog posts, tutorials, or any other text-based content about NATS.

Distributed Message Streaming in Golang using Nats JetStream - by Ebubekir Yiğit

💬 Discussions

GitHub Discussions from various NATS repositories.

nats.go - Fetch timeout

💡 Recently asked questions

Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.

What exactly is the $G account?

NATS supports multi-tenancy using accounts. When no accounts are configured, a default account is created called $G along with the $SYS system account. This can be observed on the accountz monitoring endpoint.
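As a rough illustration, the sketch below decodes an accountz response and lists the accounts it reports. The sample payload is hand-written rather than captured from a server, and the struct covers only two fields whose JSON names (`system_account`, `accounts`) are assumed from the server's monitoring output. With monitoring enabled, the same body would typically come from something like http://localhost:8222/accountz (the port is an assumption).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// accountzResponse mirrors only the fields of the accountz payload used here.
// The JSON field names are assumptions based on the server's monitoring output.
type accountzResponse struct {
	SystemAccount string   `json:"system_account"`
	Accounts      []string `json:"accounts"`
}

// parseAccountz decodes an accountz response body.
func parseAccountz(body []byte) (accountzResponse, error) {
	var resp accountzResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

func main() {
	// Hand-written sample for a server with no accounts configured.
	sample := []byte(`{"system_account":"$SYS","accounts":["$G","$SYS"]}`)

	resp, err := parseAccountz(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("system account:", resp.SystemAccount)
	fmt.Println("accounts:", resp.Accounts)
}
```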

Likewise, if JetStream is enabled, you will see a $G directory inside the JetStream storage directory, since stream and consumer data are organized and stored by account.

How can you check if a message has been re-delivered?

For JetStream, which provides at-least-once quality of service (QoS), a message will be redelivered if the server doesn’t receive an ACK (or TERM) from the client. This could be because:

the client failed to process the message within the ack wait duration
the client crashed or hit an error while processing the message
the ACK message to the server was dropped due to a connection issue

Every JetStream message has metadata encoded in its reply subject, which is the subject the client uses to acknowledge the message to the server. Each client library has a method or function to parse this metadata from a message. The metadata contains the field NumDelivered, which indicates the number of times the message has been delivered. A value greater than 1 means the message is a redelivery.
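In nats.go the parsed form is available through `msg.Metadata()`, whose result exposes `NumDelivered`. To show what that parsing amounts to, here is a minimal stdlib-only sketch that pulls the delivery count out of a reply subject. It assumes the nine-token layout `$JS.ACK.<stream>.<consumer>.<delivered>.<stream seq>.<consumer seq>.<timestamp>.<pending>`; servers may also emit a longer form with extra tokens, which this sketch deliberately rejects. The stream and consumer names are made up.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// numDelivered extracts the delivery count from a JetStream ACK reply subject.
// Only the assumed nine-token layout is handled; anything else is an error.
func numDelivered(reply string) (uint64, error) {
	tokens := strings.Split(reply, ".")
	if len(tokens) != 9 || tokens[0] != "$JS" || tokens[1] != "ACK" {
		return 0, errors.New("not a nine-token JetStream ACK subject")
	}
	// Token 4 (zero-based) is the delivery count in the assumed layout.
	return strconv.ParseUint(tokens[4], 10, 64)
}

func main() {
	// Hypothetical reply subject for stream ORDERS and consumer worker.
	reply := "$JS.ACK.ORDERS.worker.3.120.45.1657400000000000000.7"

	n, err := numDelivered(reply)
	if err != nil {
		panic(err)
	}
	if n > 1 {
		fmt.Printf("redelivered message, delivery attempt %d\n", n)
	} else {
		fmt.Println("first delivery")
	}
}
```

In application code, prefer the client library's own metadata method over hand parsing, since it tracks changes to the subject layout.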

How large of a deduplication window is supported?

Thanks to Brent Dillingham for the question and analysis of what needs to be considered!

Streams can be configured to deduplicate messages when the Nats-Msg-Id header is present. The docs state (emphasis mine):

The default window to track duplicates in is 2 minutes, this can be set on the command line using --dupe-window when creating a stream, though we would caution against large windows.

The takeaways from Brent include:

Write performance may take a hit when the dedupe set is periodically pruned, but only if it’s a unique write; duplicate write detection would be unaffected because the mutex lock isn’t needed
Write performance should otherwise be fairly constant since Go map lookup and insertion are both O(1) on average
Server memory usage will increase
Restoring streams on reboot will take longer because the dedupe data structures have to be rebuilt

So how large can the window be? Ultimately, it is a function of message rate and available server memory. Given a fixed amount of memory, a low message rate allows a larger window. With a high message rate, memory grows faster, so the window needs to be smaller.

The other consideration is the use case of the stream and the origin of the published messages being deduplicated. In other words, ask “how likely is it for a duplicate message to be published after the initial attempt?” If you simply want to handle the case of a retry after a brief network interruption, the default window is likely fine.

However, if the published messages come from an upstream source that may itself contain duplicates (possibly separated by a long period of time), then you may want to consider an alternative strategy for deduplicating.