NATS Weekly #34
Week of July 4 - 10, 2022
🗞 Announcements, writings, and projects
A short list of announcements, blog posts, project updates, and other news.
⚡ Releases
Official releases from NATS repos and others in the ecosystem.
- nats.rs - async v0.16.0 and sync v0.22.0
- nats.java - v2.15.5
- nats.py - v2.1.4
📖 Articles
Blog posts, tutorials, or any other text-based content about NATS.
- Distributed Message Streaming in Golang using Nats JetStream - by Ebubekir Yiğit
💬 Discussions
GitHub Discussions from various NATS repositories.
- nats.go - Fetch timeout
💡 Recently asked questions
Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.
What exactly is the $G account?
NATS supports multi-tenancy using accounts. When no accounts are configured, a default account called $G is created along with the $SYS system account. This can be observed on the accountz monitoring endpoint.
Likewise, if JetStream is enabled, you will see a $G directory in the storage directory, since stream and consumer data are organized and stored by account.
How can you check if a message has been re-delivered?
For JetStream, which provides at-least-once quality of service (QoS), a message will be redelivered if the server doesn’t receive an ACK (or TERM) from the client. This can happen when:
- the client fails to process the message within the ack wait duration
- the client crashed or had an error while processing the message
- the ACK message to the server was dropped due to a connection issue
Every JetStream message has metadata encoded in its reply subject, which the client uses to acknowledge the message to the server. Each client library has a method or function to parse this metadata from a message. The metadata contains the field NumDelivered, which indicates the number of times the message has been delivered. A value greater than 1 means the message has been redelivered.
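As a rough illustration of where the delivery count lives, the sketch below pulls it straight out of an ACK reply subject using only the Go standard library. The nine-token layout shown in the comment and the helper name numDelivered are assumptions made for this example (servers may use a longer layout with extra tokens, which this sketch simply rejects). In real code, prefer the client's own parser, such as msg.Metadata() in nats.go.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// numDelivered extracts the delivery count from a JetStream ACK reply subject.
// Assumed layout (9 tokens):
//   $JS.ACK.<stream>.<consumer>.<delivered>.<stream seq>.<consumer seq>.<timestamp>.<pending>
// numDelivered is a hypothetical helper, not part of any NATS client.
func numDelivered(reply string) (uint64, error) {
	tokens := strings.Split(reply, ".")
	if len(tokens) != 9 || tokens[0] != "$JS" || tokens[1] != "ACK" {
		return 0, errors.New("not a 9-token JetStream ACK reply subject")
	}
	// The fifth token (index 4) carries the delivery count.
	return strconv.ParseUint(tokens[4], 10, 64)
}

func main() {
	// Example subject; the stream and consumer names are made up.
	reply := "$JS.ACK.ORDERS.worker.3.120.45.1657300000000000000.0"

	n, err := numDelivered(reply)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Anything above 1 means the message was delivered before.
	fmt.Printf("delivered %d time(s), redelivered: %t\n", n, n > 1)
}
```

The first delivery carries a count of 1, so the check for a redelivery is simply whether the count is greater than 1.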
How large of a deduplication window is supported?
Thanks to Brent Dillingham for the question and analysis of what needs to be considered!
Streams can be configured to deduplicate messages when the Nats-Msg-Id header is present. The docs state (emphasis mine):
“The default window to track duplicates in is 2 minutes, this can be set on the command line using --dupe-window when creating a stream, though we would caution against large windows.”
The takeaways from Brent include:
- Write performance may take a hit when the dedupe set is periodically pruned, but only if it’s a unique write; duplicate write detection would be unaffected because the mutex lock isn’t needed
- Write performance should otherwise be fairly constant since Go map lookup and insertion are both O(1) on average
- Server memory usage will increase
- Restoring streams on reboot will take longer because the dedupe data structures have to be rebuilt
So how large can the window be? Ultimately, it is a function of message rate and available server memory. Given a fixed size of memory, if the message rate is low, you can have a larger window. With a high message rate, the memory will grow faster so the window needs to be smaller.
The other consideration is the use case of the stream and the origin of the published messages being deduplicated. In other words, ask “how likely is it for a duplicate message to be published after the initial attempt?” If you simply want to handle the case of a retry after a brief network interruption, the default window is likely fine.
However, if the published messages come from some upstream source that itself may contain duplicates (after a long period of time), then you may want to consider an alternative strategy for deduplicating.