Transactional Outbox Pattern: Architecture Pattern
The outbox pattern lets you keep two disparate data sources in an eventually consistent state while keeping them loosely coupled!
As we deal with more complex distributed systems, we’ll often come across use cases where we need to atomically perform operations on multiple data sources in a consistent manner.
Let’s assume that we are persisting order data in an RDBMS and the ML team wants to run analytics on this data. We have the following options:
1. Grant the ML team access to our DB. This tightly couples the Order Service to the ML team: any change to the Order schema has to be coordinated across both teams, so it isn’t the preferred approach.
2. Have the Order Service write to another DB owned by the ML team using the two-phase commit (2PC) protocol. 2PC is a blocking protocol and performs poorly because every transaction has to be coordinated across multiple nodes, so it isn’t the preferred approach either.
3. Push the Order data onto a message broker like Kafka. The ML team can then run a consumer that reads the data off Kafka, persists it in their own DB, and runs analytics against that DB.
We’ve happily decoupled the Order Service from the Analytics Service and everyone is happy! (Or so you think!)
There are multiple failure scenarios here:
1. Order Service successfully persisted the order in the database but crashed before it could send the message to the broker. The message is lost, which means the ML team will not have all the orders to run their analytics on (making the analytics wrong or skewed).
2. Order Service successfully sent the message to the broker but the transaction on the database failed. The ML team now holds orphaned or false records, which again skews their analytics.
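To make the first failure scenario concrete, here is a minimal Go sketch that simulates a dual write with in-memory stand-ins. The `store`, `broker` and `placeOrderDualWrite` names are made up for illustration, and nothing here talks to a real database or to Kafka.

```go
package main

import (
	"errors"
	"fmt"
)

// store and broker are in-memory stand-ins (hypothetical) for the orders
// database and the message broker.
type store struct{ orders []string }
type broker struct{ events []string }

var errCrash = errors.New("simulated crash between the two writes")

// placeOrderDualWrite writes to the database and then to the broker as two
// independent steps. crashAfterDB simulates the process dying in between.
func placeOrderDualWrite(db *store, b *broker, orderID string, crashAfterDB bool) error {
	db.orders = append(db.orders, orderID) // step 1: the order is committed
	if crashAfterDB {
		return errCrash // step 2 never runs, so the event is lost
	}
	b.events = append(b.events, orderID) // step 2: the event is published
	return nil
}

func main() {
	db, mq := &store{}, &broker{}
	err := placeOrderDualWrite(db, mq, "order-1", true)
	fmt.Println("error:", err)
	fmt.Println("orders in DB:", len(db.orders), "events on broker:", len(mq.events))
}
```

After the simulated crash the database holds an order the broker never hears about. Swapping the two steps does not help: it produces the second scenario instead.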
Outbox Pattern
The Outbox Pattern comes to the rescue here. We make use of an Outbox table, which stores a record of each operation we perform on the database. The Order Service writes to both the Order table and the Outbox table as part of the same transaction, so the two writes are always atomic (1).
Once the record is inserted into the Outbox table, an asynchronous process (2) reads it and publishes it to the Message Broker (3).
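The asynchronous process in step (2) is often called a message relay. Below is a rough Go sketch of a single polling pass, using in-memory stand-ins so it runs without a database or broker. `OutboxRecord`, `Publisher`, `memoryBroker`, `relayOnce` and the `Published` flag are all assumptions made for illustration; a real relay would read pending rows from the Outbox table and call an actual broker client.

```go
package main

import (
	"errors"
	"fmt"
)

// OutboxRecord mirrors one row of the Outbox table. The Published flag is
// hypothetical; deleting the row after publishing would work as well.
type OutboxRecord struct {
	ID        string
	EventType string
	Payload   []byte
	Published bool
}

// Publisher is a hypothetical abstraction over the message broker client.
type Publisher interface {
	Publish(rec OutboxRecord) error
}

// memoryBroker is an in-memory stand-in for a broker such as Kafka.
type memoryBroker struct {
	delivered []string // IDs in the order they were accepted
	failOn    string   // ID for which Publish fails exactly once
}

func (b *memoryBroker) Publish(rec OutboxRecord) error {
	if rec.ID == b.failOn {
		b.failOn = "" // fail only the first attempt
		return errors.New("broker unavailable")
	}
	b.delivered = append(b.delivered, rec.ID)
	return nil
}

// relayOnce publishes every pending record in insertion order and marks it
// as published. It stops at the first failure so ordering is preserved; the
// failed record stays pending and is retried on the next pass.
func relayOnce(outbox []*OutboxRecord, pub Publisher) (int, error) {
	sent := 0
	for _, rec := range outbox {
		if rec.Published {
			continue
		}
		if err := pub.Publish(*rec); err != nil {
			return sent, err
		}
		rec.Published = true
		sent++
	}
	return sent, nil
}

func main() {
	outbox := []*OutboxRecord{
		{ID: "1", EventType: "order_created"},
		{ID: "2", EventType: "order_created"},
	}
	mq := &memoryBroker{failOn: "2"}
	for pass := 1; pass <= 2; pass++ {
		sent, err := relayOnce(outbox, mq)
		fmt.Println("pass", pass, "sent", sent, "err", err)
	}
}
```

Because a record is only marked after the broker accepts it, a crash between the two steps leads to a re-publish on the next pass, not a lost message.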
Quick question: What does the Outbox Pattern remind you of? Hint: the write-ahead log (WAL).
Advantages of Outbox Pattern
The Outbox Pattern provides several benefits over other messaging patterns. Some of the major advantages of the Outbox Pattern are as follows:
Reliability: With the Outbox Pattern, messages are persisted in the database in the same transaction as the business change. A committed change therefore always has a corresponding outbox record, and the relay can keep retrying until the broker accepts it, even across system failures or network issues. Delivery is at-least-once, so consumers should tolerate duplicates.
Scalability: The Outbox Pattern can absorb bursts of messages without overwhelming the message broker. Since messages are persisted in the database first, the relay process can publish them to the broker at a controlled rate.
Performance: The Outbox Pattern can be faster than other synchronous messaging patterns because it eliminates the need for synchronous communication between microservices. The microservice that produces the message can quickly complete the business transaction and return a response, while the message is sent asynchronously in the background.
Decoupling: The Outbox Pattern allows microservices to be loosely coupled. Each microservice can focus on its specific business logic and ignore the details of how messages are sent and received.
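One practical consequence of the reliability point: a relay that crashes after publishing a message but before recording that it did so will publish that message again, so consumers usually need to be idempotent. Here is a small Go sketch of a consumer that de-duplicates by message ID. The `Event` and `dedupConsumer` types are hypothetical, and the in-memory map stands in for a table that a real consumer would update in the same transaction as its own writes.

```go
package main

import "fmt"

// Event is a hypothetical message as seen by the consumer.
type Event struct {
	ID      string
	Payload string
}

// dedupConsumer remembers which message IDs it has already processed.
// In a real service this set would live in the consumer's database.
type dedupConsumer struct {
	seen    map[string]bool
	applied []string // payloads that were actually processed
}

func newDedupConsumer() *dedupConsumer {
	return &dedupConsumer{seen: make(map[string]bool)}
}

// Handle processes the event once and reports whether it was applied.
// A redelivered event with a known ID is skipped.
func (c *dedupConsumer) Handle(e Event) bool {
	if c.seen[e.ID] {
		return false
	}
	c.seen[e.ID] = true
	c.applied = append(c.applied, e.Payload)
	return true
}

func main() {
	c := newDedupConsumer()
	events := []Event{
		{ID: "evt-1", Payload: "order 1 created"},
		{ID: "evt-1", Payload: "order 1 created"}, // duplicate delivery
		{ID: "evt-2", Payload: "order 2 created"},
	}
	for _, e := range events {
		fmt.Println(e.ID, "applied:", c.Handle(e))
	}
}
```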
Alternatives to Outbox Pattern
If the Outbox Pattern is not suitable for your use case, there are a few alternative messaging patterns you can consider:
Direct Messaging: This pattern involves a direct synchronous request between microservices. It can be a good option for low-latency, low-volume communication.
Database Trigger: Another option is to use a database trigger that detects changes in the database and writes the corresponding messages to the messaging infrastructure.
CDC (Change Data Capture): Much like database triggers, CDC reacts to changes in the database, but it reads them from the transaction log. Only committed transactions show up in the CDC stream, so you can treat it as your source of truth. The caveat is that you might not have direct access to the transaction log (for example, the MySQL binlog), or you might need a third-party system like Debezium to read from it.
Publish-Subscribe Pattern: This pattern involves a message broker that allows multiple microservices to subscribe to specific message types. It can be a good option for high-volume, low-latency communication. So in the above case, there could be a common message broker that can be used by Order Service as well as the Analytics Service.
Sample Implementation
Here is a simple example of how you can implement the Outbox Pattern in Golang using a PostgreSQL database:
1. Create a message struct that contains the message data:
type Message struct {
	ID        string `json:"id"`
	EventType string `json:"event_type"`
	Payload   []byte `json:"payload"`
}
2. Create an Outbox table in the database:
CREATE TABLE outbox (
    id uuid PRIMARY KEY,
    event_type text NOT NULL,
    payload bytea NOT NULL, -- PostgreSQL's binary type is bytea, not bytes
    created_at timestamp NOT NULL DEFAULT NOW()
);
3. Insert a message into the outbox table in a database transaction:
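import "database/sql"

// sendMessage writes the order row and its outbox record in one transaction,
// so either both are committed or neither is.
func sendMessage(db *sql.DB, message *Message) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	// Rollback does nothing once Commit has succeeded, so deferring it covers
	// every early return (and any panic) below.
	defer tx.Rollback()

	// Replace "..." with the order's id, value and quantity.
	_, err = tx.Exec("INSERT INTO orders(id, order_value, order_qty) VALUES ($1, $2, $3)", ...)
	if err != nil {
		return err
	}
	// Replace "..." with the message fields, e.g. message.ID, message.EventType, message.Payload.
	_, err = tx.Exec("INSERT INTO outbox(id, event_type, payload) VALUES ($1, $2, $3)", ...)
	if err != nil {
		return err
	}
	// Return the commit error instead of panicking so the caller can handle it.
	return tx.Commit()
}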
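Before calling sendMessage you need a Message to hand to it. The sketch below shows one way to build it from an order, under a few assumptions: the `Order` struct simply mirrors the columns in the orders insert above, the `order_created` event type is a made-up name, the payload is JSON, and the ID is a random version 4 UUID generated with the standard library only.

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
)

// Message matches the struct from step 1.
type Message struct {
	ID        string `json:"id"`
	EventType string `json:"event_type"`
	Payload   []byte `json:"payload"`
}

// Order is hypothetical; its fields mirror the columns of the orders table.
type Order struct {
	ID    string  `json:"id"`
	Value float64 `json:"order_value"`
	Qty   int     `json:"order_qty"`
}

// newUUID returns a random (version 4) UUID string for the outbox id column.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// newOrderCreatedMessage serializes the order as JSON and wraps it in a
// Message that can be inserted into the outbox table.
func newOrderCreatedMessage(o Order) (*Message, error) {
	payload, err := json.Marshal(o)
	if err != nil {
		return nil, err
	}
	id, err := newUUID()
	if err != nil {
		return nil, err
	}
	return &Message{ID: id, EventType: "order_created", Payload: payload}, nil
}

// decodeOrder is what a consumer could use to read the payload back.
func decodeOrder(m *Message) (Order, error) {
	var o Order
	err := json.Unmarshal(m.Payload, &o)
	return o, err
}

func main() {
	msg, err := newOrderCreatedMessage(Order{ID: "order-1", Value: 49.5, Qty: 2})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s %s\n", msg.ID, msg.EventType, msg.Payload)
}
```

Giving each message its own unique ID also gives consumers something to de-duplicate on if the same message is ever delivered twice.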
This brings us to the end of this article. We talked about the problem the outbox pattern solves, its advantages, and the alternatives to it. We also looked at a sample snippet showing how you could implement a transactional outbox in Golang & Postgres. Please post any questions in the comments and I will be happy to discuss them!