How Form3 built a multi-cloud low-latency payments service with NATS.io + JetStream
May 1, 2022
Location: UK Remote workforce
Customers: Square, Mastercard, Lloyds, Barclays
Tech Stack: Logz, Linkerd, Docker, NATS, Kubernetes, K3s, HashiCorp Vault, AWS, CockroachDB, PostgreSQL, Linux
Application Language: Go
Founded in 2016, Form3 is revolutionizing the world of payment processing with an always-on, cloud-native, Payments-as-a-Service platform. Today, Form3 is trusted by some of the UK and Europe’s biggest Tier 1 banks and fastest-growing fintechs to handle their critical payments architecture. Back-end and inter-bank payment processing is a highly regulated business that is expensive, infrastructure-intensive, and suffers from a slow pace of innovation. Form3 sought to radically improve on the old payments processing technology model by moving the entire operation into the cloud and designing a cloud-native application structure with microservices. “When we are talking about banks, it's not the most modern stack,” explains Form3 evangelist Adelina Simion. “To have a platform fully hosted in the cloud, that is cloud native, is a huge step forward. We don’t have actual hosted servers running our services.”
How it Works: A Single Payments API to Remove Complexity
Form3’s service sits between banks and external payments processing systems such as SWIFT, BACS, Faster Payments and SEPA. Each of these systems has its own schema for formatting messages and retries, and its own security requirements. The multiple standards and requirements generate complexity and require considerable maintenance and troubleshooting. Form3 provides a single payment processing API and abstracts away the complexity, allowing banks to consume payment processing as a service. “We take away the burden of multiple integrations and we can do this at scale and in the cloud,” explains Simion.
Architecturally, putting payments processing into the cloud makes it much simpler for Form3 to stand up and expand a truly distributed application infrastructure. Doing so moves the service closer to end users, reduces latency, and improves availability. Form3 deploys a cloud-native architecture running on Kubernetes, with loosely coupled microservices.
The Form3 Application Architecture
The application is written in Go and composed of multiple services running on Amazon Elastic Kubernetes Service (EKS), with the Linkerd service mesh handling microservice routing, observability, and performance guarantees. Customers (financial services institutions and fintechs) send a payment processing request to the Form3 API, which is hosted in AWS. Form3 translates the request into its internal request format and uses AWS Simple Notification Service (SNS) and Simple Queue Service (SQS) to create a paired pub-sub event bus for processing messages. In this system, payment requests “fan out” to one of the four supported external payment processing systems (SWIFT, BACS, SEPA, Faster Payments). Form3 uses SQS to define rules for processing the request, including recipient, retries, and more. For SWIFT, BACS and SEPA, Form3 validates each payment request and provides a secure gateway connecting to those services.
For Faster Payments, the regulations require a separate physical instance. To process Faster Payments requests, which have a guaranteed SLA of a two-hour transaction time, Form3 is required to lease physical lines and hardware. The Faster Payments processing cannot communicate with the cloud except using these dedicated circuits and dedicated hardware. “We previously had a third-party doing that for us but as we were growing, it was not good enough so we made our own physical data centers with leased lines in it.” Both in AWS and the FPS-dedicated data center, the Form3 application runs on Kubernetes.
What Form3 Needed: A Multi-Cloud, Lightweight, Lightning-Fast Event Bus
The company guarantees it will complete a payment request in no more than 500 milliseconds. The overall AWS uptime SLA is 99.9%, which was not sufficient for critical payments infrastructure. Says Simion, “Some of our customers were asking what would happen if AWS goes down? We can’t lose our processing capability.” Creating a higher-availability service would require a multi-cloud capability with instant switching from one cloud to another in an outage. To summarize, Form3 needed an event bus that was faster, to reduce latency, and lighter weight, to run in physical data centers. To back up its tight 500-millisecond customer SLAs, Form3 needed to run replicas of its applications in multiple public clouds in an active-active, immediately consistent configuration. For compliance, Form3 needed to run its full application in low-resource physical data centers.
NATS + JetStream Benefits and Results for Form3
Form3 chose to replace its legacy pub-sub event bus and message services with NATS and the JetStream persistence layer, which is part of the NATS architecture. Because NATS is simple to set up and offers low latency and high capacity per server, it works equally well in almost any environment and in any cloud.
Because it is a complete pub-sub system, NATS can replace both SQS and SNS with a single service. NATS also includes NATS Streaming, which enables NATS to act as a data streaming service with configurable message persistence. This is important for mission-critical message streams (such as payment status) that require “at-least-once” delivery rather than NATS Core’s “at-most-once” delivery. However, NATS Streaming is being replaced by JetStream, a more advanced, more flexible and easier-to-deploy persistence layer that also includes a key/value store, support for data-at-rest encryption, and horizontal scalability.
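The difference between the two delivery guarantees can be shown with a small simulation in Go. This is a sketch only and does not use the NATS client library: the handler type and both delivery functions are hypothetical, and redelivery here is driven by the handler's return value, whereas a real persistence layer typically redelivers when an acknowledgement does not arrive in time.

```go
package main

import "fmt"

// handler processes one delivery attempt and returns true to acknowledge it.
type handler func(msg string, attempt int) bool

// deliverAtMostOnce hands the message over a single time and never retries,
// so a failed attempt means the message is lost.
func deliverAtMostOnce(msg string, h handler) bool {
	return h(msg, 1)
}

// deliverAtLeastOnce redelivers until the handler acknowledges or the
// attempt budget runs out. It returns the attempts used and whether the
// message was acknowledged.
func deliverAtLeastOnce(msg string, h handler, maxAttempts int) (int, bool) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if h(msg, attempt) {
			return attempt, true
		}
	}
	return maxAttempts, false
}

// failFirst returns a handler that fails its first n attempts, standing in
// for a consumer that is briefly unavailable.
func failFirst(n int) handler {
	return func(_ string, attempt int) bool { return attempt > n }
}

func main() {
	fmt.Println("at-most-once processed:", deliverAtMostOnce("payment.status", failFirst(2)))

	attempts, acked := deliverAtLeastOnce("payment.status", failFirst(2), 5)
	fmt.Println("at-least-once attempts:", attempts, "acknowledged:", acked)
}
```

Because at-least-once delivery can hand the same message to a consumer more than once, consumers of such streams are usually written to be idempotent.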
Form3 Benefits and Results
The Form3 engineering team tested NATS and NATS Streaming as a replacement event bus for their SQS and SNS pairing. The team quickly determined that NATS met their expectations in terms of latency, reliability, resilience and compute requirements. Form3 worked closely with the Synadia team to ensure that the planned architecture would meet the sub-500ms latency requirements. After this, Form3 made the architectural switch, replacing SQS and SNS with NATS. The top latency for messages immediately dropped to less than 50 milliseconds, a more than 6x improvement that offered Form3 strong assurances it could meet SLA requirements for transaction processing, even under heavy loads.
NATS in Production at Form3
The Form3 engineering team was also impressed that NATS instances with far less CPU and RAM could process even more transactions than the company's previous architecture. For the Faster Payments physical data center and leased lines, Form3 has also deployed NATS JetStream because it needed more configurable persistence. NATS fit into the limited-footprint environment of the data center without difficulty and actually enabled Form3 to handle a greater volume of payments than the previous architecture, something that will simplify future scaling and reduce its cost in this more expensive environment.
Form3 is in the process of deploying NATS to create a cloud-agnostic architecture leveraging JetStream and its highly configurable persistence layer. Workloads could easily shift from one cloud to the next, using NATS JetStream and Leaf Nodes, a method that NATS uses to easily bridge NATS servers. “It lets us break free from the SLAs of any one cloud and create our own,” says Simion. “None of this would happen if we couldn’t have a proper event bus across the clouds.”
6x decline in average latency, from 300ms to 50ms
Higher messaging throughput on smaller cloud footprint
High reliability — never requires restarts
Lightweight - no JVM, no ZooKeeper
Multi-cloud ready in active/active with immediate consistency
Flexibility to handle streaming or message bus
Highly configurable persistence layer
Runs in low resource environments
Supports active-active architectures with minimal configuration
Download the entire Form3 case study to see the full in-depth technical detail.