ByteByteGo Newsletter

Share this post

Data Sharing Between Microservices

blog.bytebytego.com

Data Sharing Between Microservices

ByteByteGo
Oct 17, 2024
∙ Paid
220
  • Share this post with a friend

    Since you liked this post, why not share it to help spread the word?
Share this post

Data Sharing Between Microservices

blog.bytebytego.com
12
Share

Microservices architecture has become popular for building complex, scalable software systems.

This architectural style structures an application as a collection of loosely coupled, independently deployable services. Each microservice is focused on a specific business capability and can be developed, deployed, and scaled independently.

While microservices offer numerous benefits, such as improved scalability, flexibility, and faster time to market, they also introduce significant challenges in terms of data management.

One of the fundamental principles of microservices architecture is that each service should own and manage its data. This principle is often expressed as "don't share databases between services” and it aims to ensure loose coupling and autonomy among services, allowing them to evolve independently. 

However, it's crucial to distinguish between sharing a data source and sharing data itself. While sharing a data source (e.g., a database) between services is discouraged, sharing data between services is often necessary and acceptable.

In this post, we’ll look at different ways of sharing data between microservices and the various advantages and disadvantages of specific approaches.

Sharing Data Source vs Sharing Data

As mentioned earlier, it's crucial to understand the difference between sharing a data source and sharing data itself when working with microservices. 

This distinction has significant implications for service coupling, independence, and overall system design.

Sharing a data source means multiple services directly access the same database, potentially leading to tight coupling and dependencies. It violates the principle of service data ownership in microservices.

On the other hand, sharing data involves services exchanging data through well-defined APIs or messaging patterns, maintaining their local copies of the data they need. This helps preserve service independence and loose coupling.

Let's consider an example scenario involving an Order service and a Product service to illustrate the data-sharing dilemma.

  • Order Service:

    • Manages order information.

    • Stores order details in its database.

  • Product Service:

    • Handles product information.

    • Maintains a separate database for product details.

  • Relationship:

    • Each order is associated with a set of products

The challenge arises when the Order service needs to display product information alongside order details. 

In a monolithic architecture, this would be straightforward since all the data resides in a single database. However, in a microservices architecture, the Order service should not directly access the Product service database. Instead, it should fetch the data via the Product Service.

This scenario highlights the need for effective data-sharing strategies in microservices:

  • Data Independence: Each service owns its data, preventing direct database access between services.

  • Data Retrieval: The Order service must find a way to retrieve product information without violating service boundaries.

  • Performance Considerations: The chosen data-sharing approach should not significantly impact system performance or introduce excessive latency.

  • Consistency: Ensuring data consistency between services becomes crucial when sharing data across service boundaries.

Let us now look at different ways of sharing data.

Synchronous Data Sharing in Microservices

Synchronous data sharing in microservices involves real-time communication between services, where the requesting service waits for a response before proceeding. This approach ensures immediate consistency but introduces challenges in terms of scalability and performance.

Request/Response Model

The request/response model is the most straightforward synchronous approach:

  • Service A sends a request to Service B for data.

  • Service B processes the request and sends back a response.

  • Service A waits for the response before continuing its operation.

For example, whenever the Order Service is called to provide order details (which includes product information), it has to fetch product data by calling the Product Service.

As expected, the synchronous data-sharing approaches face several challenges:

  • Increased Response Time: Each synchronous call adds to the overall response time of the system.

  • Cascading Failures: If one service is slow or down, it can affect the entire chain of dependent services.

  • Resource Utilization: Services may need to keep connections open while waiting for responses, potentially leading to resource exhaustion.

  • Network Congestion: As the number of inter-service calls increases, network traffic can become a bottleneck.

Due to these challenges, services can share data using duplication. For example, storing the product name with order details within the order table. This way the Order Service does not have to contact the Product Service to fetch the product’s name.

However, this creates consistency issues during data updates. For example, the product name is now duplicated across two different services. If the product name gets changed, the updates have to be done in both the product table and the order table to maintain consistency.

Note that this is just a simple example for explanation purposes. Even more critical consistency requirements can exist in a typical application. 

The Gateway Approach

The Gateway approach is a simple approach to ensure data consistency while sharing data across multiple services. 

Here’s an example scenario of how it can work:

  • A coordinator that acts like a gateway initiates the update process.

  • The coordinator sends update requests to all participating services.

  • Each service performs the update and sends back a success or failure response.

  • If all services succeed, the transaction is considered complete. If any service fails, the coordinator must initiate a rollback on all services.

See the diagram below where a possible change in the product name requires updates in the order service where a copy of the product name is shared. The coordinator ensures that both the updates are successful before informing the user.

Some advantages of the gateway approach are as follows:

  • Simpler to implement and reason about

  • Maintains consistency

However, there are also disadvantages:

  • Less reliable in failure scenarios.

  • Poor experience for the users in case one of the services fails.

  • Difficult to handle partial failures.

As evident, while synchronous approaches offer strong consistency and simplicity, they come with significant trade-offs in terms of scalability, performance, and resilience. 

This is also the reason why asynchronous approaches are usually more popular.

Asynchronous Data Sharing in Microservices

Asynchronous data sharing in microservices architectures enables services to exchange data without waiting for immediate responses. This approach promotes service independence and enhances overall system scalability and resilience. 

Let's explore the key components and concepts of asynchronous data sharing.

Event-Driven Architecture

In an event-driven architecture, services communicate through events:

  • When a service performs an action or experiences a state change, it publishes an event to a message broker.

  • Other services interested in that event can subscribe to it and react accordingly.

  • This loosely coupled approach allows services to evolve independently.

The example below describes such an event-driven approach

  • The Order Service publishes an "OrderCreated" event when a new order is placed.

  • The Inventory Service subscribes to the "OrderCreated" event and updates stock levels.

  • The Shipping Service also subscribes to the event and initiates the shipping process.

Message Queues

Message queues, such as RabbitMQ, serve as the backbone of asynchronous communication in microservices:

  • They provide a reliable and scalable way to exchange messages between services.

  • Services can publish messages to specific topics or queues.

  • Other services can consume those messages at their own pace.

Eventual Consistency

Asynchronous data sharing often leads to eventual consistency. 

Services maintain their local copies of data and update them based on received events. It means that data across services may be temporarily inconsistent but will eventually reach a consistent state.

This approach allows services to operate independently and improves performance by reducing synchronous communication.

Here’s an example of the impact of eventual consistency on the earlier example about data sharing between Product and Order Service:

  • The Product Service maintains a local copy of the product data.

  • When a product’s details (such as name) are updated, the Product Service publishes a "ProductUpdated" event.

  • The Order Service consume the event and updates their local copies of the product data asynchronously.

  • There may be a short period where product data is inconsistent across services, but it will eventually become consistent.

See the diagram below:

The advantages of asynchronous data sharing are as follows:

  • Loose Coupling: Services can evolve independently without tight dependencies on each other.

  • Scalability: Services can process messages at their own pace, allowing for better scalability and resource utilization.

  • Resilience: If a service is temporarily unavailable, messages can be buffered in the queue and processed later, improving system resilience.

  • Improved Performance: Asynchronous communication reduces the need for synchronous requests, resulting in faster response times and improved overall performance.

However, there are also some disadvantages:

  • Eventual Consistency: Data may be temporarily inconsistent across services, which can be challenging to handle in certain scenarios.

  • Increased Complexity: Implementing event-driven architectures and handling message queues adds complexity to the system.

  • Message Ordering: Ensuring the correct order of message processing can be challenging, especially in scenarios where message ordering is critical.

  • Error Handling: Dealing with failures and errors in asynchronous communication requires careful design and implementation of retry mechanisms and compensating actions.

Hybrid Approach

A hybrid approach to data sharing in microservices combines elements of both synchronous and asynchronous methods. This approach aims to balance the benefits of local data availability with the need for up-to-date and detailed information from other services.

Let's consider a ride-sharing application with two microservices: Driver and Ride.

  • The Ride service stores basic driver information (ID, name) locally.

  • For detailed information (e.g., driver rating, car details), the Ride service makes a synchronous call to the Driver service.

This hybrid approach allows the Ride service to have quick access to essential driver data while still retrieving more comprehensive information when needed.

See the diagram below:

The hybrid approach offers a couple of benefits such as:

  • Reduces Redundant Calls: By storing basic information locally, the Ride service can reduce the frequency of service-to-service calls. This improves performance and minimizes the load on the network.

  • Smaller Database Footprint: Only essential data is duplicated across services, resulting in a smaller database footprint. This saves storage space and reduces the overhead of managing redundant data.

However, it also has some drawbacks:

  • Still Affects Response Time: Although basic information is available locally, retrieving detailed information through synchronous calls to the Driver service still impacts the overall response time of the Ride service.

  • Some Data Duplication: Even though only essential data is duplicated, there is still some level of data duplication across services. This requires careful management to ensure data consistency and avoid discrepancies.

​​When implementing a hybrid approach, consider the following:

  • Data Consistency: Ensure that the duplicated data remains consistent across services. Implement mechanisms to update the local copy of data when changes occur in the source service.

  • Caching Strategies: Employ caching strategies to store frequently accessed detailed information locally, reducing the need for repetitive synchronous calls to other services.

  • Asynchronous Updates: Consider using asynchronous methods, such as event-driven architecture or message queues, to propagate updates to the duplicated data asynchronously, minimizing the impact on response times.

Trade-offs and Decision Making

When designing a microservices architecture that involves data sharing, it's crucial to carefully evaluate the trade-offs between consistency, performance, and scalability. 

This decision-making process requires a deep understanding of your system's requirements and constraints.

1 - Evaluating Consistency Requirements

One of the key factors in choosing an appropriate solution is the consistency requirement.

Strong Consistency

Strong consistency ensures that all services have the most up-to-date data at all times. It is suitable for systems where data accuracy is critical, such as financial transactions or medical records.

The main characteristics are as follows:

  • Often implemented using synchronous communication patterns like the gateway pattern or request-response approach.

  • Ensures immediate data consistency across all services.

  • Drawbacks include higher latency and reduced availability.

Eventual Consistency

Eventual consistency allows for temporary data inconsistencies but guarantees that data will become consistent over time. It is appropriate for systems that can tolerate temporary inconsistencies, such as social media posts or product reviews.

The primary characteristics of a system based on eventual consistency are as follows:

  • Implemented using asynchronous communication patterns like event-driven architecture.

  • Allows for faster response times and higher availability.

  • Requires careful design to handle conflict resolution and compensating transactions.

2 - Performance Considerations

The key performance-related considerations are as follows:

  • Latency

    • Strong consistency often leads to higher latency due to synchronous communication.

    • Eventual consistency can provide lower latency but may serve stale data.

  • Throughput

    • Eventual consistency generally allows for higher throughput as services can operate more independently.

    • Strong consistency may limit throughput due to the need for coordination between services.

  • Resource Utilization

    • Strong consistency may require more computational and network resources to maintain data integrity.

    • Eventual consistency can be more efficient in terms of resource usage but requires additional storage for local data copies.

3 - Scalability 

The data-sharing solution has a significant impact on the application's scalability. Some of the key aspects are as follows:

  • Horizontal Scaling: Eventual consistency models often scale better horizontally as services can be added without significantly impacting overall system performance. Strong consistency models may face challenges in horizontal scaling due to increased coordination overhead.

  • Data Volume: As data volume grows, maintaining strong consistency across all services becomes challenging. Eventual consistency can handle large data volumes more gracefully but requires careful management of data synchronization.

  • Service Independency: Eventual consistency allows services to evolve more independently, facilitating easier scaling. Strong consistency models create tighter coupling between services.

Summary

In this article, we’ve taken a detailed look at the important topic of sharing data between microservices. This is a critical requirement when it comes to building robust and high-performance microservices.

Let’s summarize the key learnings:

  • The principle of each service owning its data aims to ensure loose coupling and autonomy among services, allowing them to evolve independently. 

  • However, it's crucial to distinguish between sharing a data source and sharing data itself.

  • Synchronous data sharing in microservices involves real-time communication between services, where the requesting service waits for a response before proceeding.

  • Some approaches for synchronous data sharing are the request/response model and the gateway approach

  • Asynchronous data sharing in microservices architectures enables services to exchange data without waiting for immediate responses.

  • The key components of the asynchronous data-sharing approach are event-driven architecture, message queues, and the concept of eventual consistency.

  • A hybrid approach to data sharing in microservices combines elements of both synchronous and asynchronous methods. This approach aims to balance the benefits of local data availability with the need for up-to-date and detailed information from other services.

  • Multiple trade-offs need to be taken into consideration while choosing the appropriate data-sharing approach. Some key factors are consistency requirements, performance needs, and the desired scalability of the application.




220 Likes
·
12 Restacks
220
  • Share this post with a friend

    Since you liked this post, why not share it to help spread the word?
Share this post

Data Sharing Between Microservices

blog.bytebytego.com
12
Share

Discussion about this post

Understanding Database Types
The success of a software application often hinges on the choice of the right databases. As developers, we're faced with a vast array of database…
Apr 19, 2023 • 
Alex Xu
953
Share this post

Understanding Database Types

blog.bytebytego.com
12
A Crash Course in Networking
The Internet has become an integral part of our daily lives, shaping how we communicate, access information, and conduct business.
Jan 18 • 
ByteByteGo
885
Share this post

A Crash Course in Networking

blog.bytebytego.com
5
System Design PDFs
High Resolution PDFs/Images Big Archive: System Design Blueprint: Kuberntes tools ecosystem: ByteByteGo Newsletter is a reader-supported publication. To…
May 17, 2022 • 
Alex Xu
2,169
Share this post

System Design PDFs

blog.bytebytego.com
102
A Crash Course in CI/CD
Introduction
Apr 4 • 
ByteByteGo
542
Share this post

A Crash Course in CI/CD

blog.bytebytego.com
4
Software Architecture Patterns
Software architects often encounter similar goals and problems repeatedly throughout their careers.
Sep 26 • 
ByteByteGo
342
Share this post

Software Architecture Patterns

blog.bytebytego.com
6
HTTP1 vs HTTP2 vs HTTP3 - A Deep Dive
What has powered the incredible growth of the World Wide Web? There are several factors, but HTTP or Hypertext Transfer Protocol has played a…
May 9 • 
ByteByteGo
504
Share this post

HTTP1 vs HTTP2 vs HTTP3 - A Deep Dive

blog.bytebytego.com
5
Netflix: What Happens When You Press Play?
This week's newsletter features a chapter from one of my favorite books, Explain the Cloud Like I’m 10.
Jan 4 • 
ByteByteGo
693
Share this post

Netflix: What Happens When You Press Play?

blog.bytebytego.com
5
EP128: The Ultimate Software Architect Knowledge Map
This week’s system design refresher:
Sep 7 • 
ByteByteGo
428
Share this post

EP128: The Ultimate Software Architect Knowledge Map

blog.bytebytego.com
8
API Gateway
API Gateways are essential components in modern software architectures, particularly microservices-based systems.
Oct 3 • 
ByteByteGo
353
Share this post

API Gateway

blog.bytebytego.com
Netflix: What Happens When You Press Play - Part 2
Remember how we said a CDN has computers distributed all over the world? Netflix developed its own computer system for video storage. Netflix calls them…
Jan 11 • 
ByteByteGo
645
Share this post

Netflix: What Happens When You Press Play - Part 2

blog.bytebytego.com
2
© 2024 ByteByteGo
Privacy ∙ Terms ∙ Collection notice
Start WritingGet the app
Substack is the home for great culture
Share

Update your profile

undefined subscriptions will be displayed on your profile (edit)

Skip for now

Only paid subscribers can comment on this post

Check your email

For your security, we need to re-authenticate you.

Click the link we sent to wenranlu@gmail.com, or click here to sign in.