Microservices in Practice (3): Inter-Process Communication in a Microservices Architecture

[Editor's Note] This is the third article in our series on building applications with a microservices architecture. The first article introduces the microservices architecture pattern, compares it with the monolithic pattern, and discusses the advantages and disadvantages of using microservices. The second describes how the clients of a microservices-based application communicate with it through an API Gateway. In this article, we discuss how the services within a system communicate with one another.

 

Introduction

 

In a monolithic application, modules invoke one another through language-level method or function calls. A microservices-based application, by contrast, is a distributed system running on multiple machines, and each service instance is typically a process. Consequently, as shown in the figure below, services must interact using an inter-process communication (IPC) mechanism.

 

[Figure: Monolith vs. microservices — in a microservices architecture, services interact via inter-process communication]

 

 

We will look at specific IPC technologies in detail later; first, let's consider some design issues.

 

Interaction Modes

 

When selecting an IPC mechanism for a service, the first thing to consider is how the services will interact. There are many interaction patterns between client and server, which can be categorized along two dimensions.

The first dimension is whether the interaction is one-to-one or one-to-many:

  • One-to-one: each client request is processed by exactly one service instance.
  • One-to-many: each client request is processed by multiple service instances.

The second dimension is whether these interactions are synchronous or asynchronous:

  • Synchronous mode: the client expects a timely response from the service and may even block while waiting for it.
  • Asynchronous mode: the client does not block while waiting, and the response, if any, is not necessarily sent immediately.

 

The following table shows the different interaction modes:

 

               One-to-one                       One-to-many
Synchronous    Request/response                 —
Asynchronous   Notification                     Publish/subscribe
               Request/asynchronous response    Publish/asynchronous responses

 

One-to-one interactions come in several forms:

 

  • Request/response: the client makes a request to the service and waits for a response, which it expects to arrive promptly. In a thread-based application, the thread making the request may block while waiting.
  • Notification (also known as a one-way request): the client sends a request to the service, but no reply is expected or sent.
  • Request/asynchronous response: the client sends a request to the service, and the service replies asynchronously. The client does not block, and it is designed with the assumption that the response may not arrive for a while (see the sketch after this list).
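
To make the synchronous/asynchronous distinction concrete, here is a minimal Java sketch (the client interface and method names are hypothetical, not from the article) contrasting a blocking request/response call with a request whose response is delivered asynchronously via a CompletableFuture:

```java
import java.util.concurrent.CompletableFuture;

public class InteractionStyles {

    // Hypothetical client for a remote Passenger service.
    interface PassengerClient {
        String getPassengerName(String passengerId);                          // blocking call
        CompletableFuture<String> getPassengerNameAsync(String passengerId);  // async call
    }

    static void requestResponse(PassengerClient client) {
        // Synchronous request/response: the calling thread blocks until
        // the remote service replies (or a timeout fires).
        String name = client.getPassengerName("p-42");
        System.out.println("got " + name);
    }

    static void requestAsyncResponse(PassengerClient client) {
        // Request/asynchronous response: the call returns immediately and
        // the callback runs whenever the response eventually arrives.
        client.getPassengerNameAsync("p-42")
              .thenAccept(name -> System.out.println("got " + name));
        // The calling thread is free to do other work here.
    }
}
```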

One-to-many interactions come in several forms:

 

  • Publish/Subscribe: The client publishes notification messages that are consumed by zero or more interested services.
  • Publish/asynchronous responses: the client publishes a request message and then waits a certain amount of time for responses from interested services.

Each service typically uses a combination of these interaction modes. For some services a single IPC mechanism is sufficient; others need a combination of mechanisms. The following diagram shows how services might interact when a passenger requests a ride.

 

[Figure: How the services in the ride-hailing application interact when a passenger requests a trip]

 

The services in the figure above communicate using notifications, request/response, and publish/subscribe. For example, the passenger's mobile app sends a notification to the Trip Management service to request a pickup. The Trip Management service uses request/response to call the Passenger Service and confirm that the passenger's account is active. It then creates the trip and uses publish/subscribe to notify other services, including the Dispatcher, which locates an available driver.

 

Now that we understand interaction patterns, let's take a look at how to define an API.

 

Defining APIs

 

A service's API is a contract between the service and its clients. Regardless of which IPC mechanism you choose, it is important to precisely define the API using some kind of interface definition language (IDL). There are even good arguments for taking an API-first approach: before implementing the service, you define its interface and review it in detail with the client developers. This kind of up-front discussion and design greatly improves the usability of the API and the satisfaction of its consumers.

 

As you will see later in this article, the nature of the API definition depends on which IPC mechanism you choose. If you use messaging, the API consists of message channels and message types; if you use HTTP, the API consists of URLs plus the request and response formats. IDLs are described in more detail later.

 

Evolution of APIs

 

A service's API inevitably changes over time. In a monolithic application it is common to change the API and then update all of its callers. In a microservices-based application this is much harder: even if only other services in the same application consume the API, you cannot force them to upgrade in lockstep with the service. In addition, new versions of a service are usually rolled out incrementally, so old and new versions run side by side for a while. You need a strategy for dealing with these problems.

 

How you handle an API change depends on how big the change is. Some changes are minor and backward compatible with the previous version; for example, you might simply add an attribute to a request or response. In that case it makes sense to design clients and services around the robustness principle: clients using the old API should continue to work with the new version, the service supplies default values for missing request attributes, and clients ignore any extra response attributes. Choosing an IPC mechanism and message format that make this kind of evolution easy helps a great deal.
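
For example, with a JSON-based API the client can configure its deserializer to tolerate attributes it does not recognize. A minimal sketch using Jackson (the class and attribute names are hypothetical):

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class TripResponseParser {

    // Hypothetical DTO: only the fields this client version cares about.
    public static class Trip {
        public String tripId;
        public String passengerId;
        // A newer API version may add fields (e.g. "surgeMultiplier");
        // this client simply ignores them.
    }

    private static final ObjectMapper MAPPER = new ObjectMapper()
            // Robustness principle: unknown attributes are ignored, not fatal.
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

    public static Trip parse(String json) throws Exception {
        return MAPPER.readValue(json, Trip.class);
    }
}
```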

 

Sometimes, however, an API needs major changes that are incompatible with previous versions. Because you cannot force all clients to upgrade at once, the service must support older clients for some period of time. If you are using an HTTP-based mechanism such as REST, one approach is to embed a version number in the URL and have each service instance handle multiple versions at the same time. Alternatively, you can deploy separate instances, each responsible for a single version.
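
As an illustration of URL versioning, here is a minimal Spring MVC-style sketch (the resource and class names are hypothetical) in which one service instance serves both a v1 and an incompatible v2 representation of the same resource:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical controller: one instance handles two incompatible API versions.
@RestController
public class VersionedTripController {

    // v1 representation kept around for older clients.
    record TripV1(String id, String passenger) {}

    // v2 representation with an incompatible shape.
    record TripV2(String tripId, String passengerId, String state) {}

    @GetMapping("/v1/trips/{id}")
    public TripV1 getTripV1(@PathVariable("id") String id) {
        return new TripV1(id, "passenger-123");
    }

    @GetMapping("/v2/trips/{id}")
    public TripV2 getTripV2(@PathVariable("id") String id) {
        return new TripV2(id, "passenger-123", "CREATED");
    }
}
```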

 

Handling Partial Failure

 

As mentioned in the previous article about the API Gateway, partial failure is an inherent problem of distributed systems. Because clients and services are separate processes, a service may not be able to respond to a client's request in time: it may be down because of a failure or for maintenance, or it may be overloaded and responding extremely slowly.

 

Consider the partial-failure scenario described in that article. Suppose the Recommendation Service cannot respond to requests. A naive client would block indefinitely waiting for a response. Not only does this make for a poor user experience, but in many applications it also ties up precious resources such as threads; eventually, as more and more clients block waiting for responses, the thread pool is exhausted and the client itself can no longer serve requests. This is shown below:

 

[Figure: Client threads blocked while waiting for an unresponsive service]

 

To prevent this problem, services must be designed with partial failure in mind.

 

Netflix describes a good set of strategies for dealing with partial failure, including:

 

  • Network timeouts: never block indefinitely while waiting for a response; always use timeouts. This ensures that resources are not tied up forever.
  • Limiting the number of outstanding requests: impose an upper bound on the number of outstanding requests a client can have to a particular service. Once the limit is reached, additional requests fail immediately.
  • Circuit breaker pattern: track the number of successful and failed requests. If the error rate exceeds a threshold, trip the circuit breaker so that further attempts fail immediately; if a large number of requests are failing, the service is probably unavailable and sending more requests is pointless. After a timeout period, the client tries again and, if successful, closes the circuit breaker.
  • Providing fallbacks: perform fallback logic when a request fails, such as returning cached data or a default value.

Netflix Hystrix is an open source library that implements these patterns. If you are running on the JVM, it is well worth considering Hystrix; if you are in a non-JVM environment, you can use an equivalent library.
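
As a rough illustration of how these ideas fit together, here is a minimal sketch using Hystrix (the service name and fallback value are hypothetical). The command wraps a remote call with a timeout, a circuit breaker, and a fallback:

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

import java.util.Collections;
import java.util.List;

// Wraps the call to a (hypothetical) Recommendation Service.
public class GetRecommendationsCommand extends HystrixCommand<List<String>> {

    private final String customerId;

    public GetRecommendationsCommand(String customerId) {
        // The command group is used to group metrics and thread pools.
        super(HystrixCommandGroupKey.Factory.asKey("RecommendationService"));
        this.customerId = customerId;
    }

    @Override
    protected List<String> run() throws Exception {
        // The actual remote call goes here (REST, Thrift, ...).
        // Hystrix runs it with a timeout and a circuit breaker;
        // failures and timeouts trigger getFallback().
        return callRecommendationService(customerId);
    }

    @Override
    protected List<String> getFallback() {
        // Fallback: return an empty (or cached) list instead of failing.
        return Collections.emptyList();
    }

    private List<String> callRecommendationService(String customerId) {
        throw new UnsupportedOperationException("remote call not shown");
    }
}
```

A caller would then invoke `new GetRecommendationsCommand("c42").execute()`, and Hystrix applies the timeout, circuit breaker, and fallback automatically.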

 

IPC Technologies

 

There are many different IPC technologies to choose from. Services can communicate using a synchronous request/response mechanism such as HTTP-based REST or Thrift. Alternatively, they can use an asynchronous, message-based mechanism such as AMQP or STOMP. There is also a variety of message formats: JSON and XML are readable, text-based formats, while Avro and Protocol Buffers are more efficient binary formats. We will discuss both asynchronous and synchronous IPC, starting with asynchronous.

Asynchronous, message-based communication

 

When using asynchronous, message-based inter-process communication, a client makes a request by sending a message to the service. If a reply is expected, the service sends a separate message back to the client. Because the communication is asynchronous, the client does not block waiting for a reply; instead, it is written assuming that the reply will not arrive immediately.

 

A message consists of headers (metadata such as the sender) and a message body. Messages are exchanged over channels: any number of producers can send messages to a channel, and any number of consumers can receive messages from it. There are two kinds of channels, point-to-point and publish/subscribe. A point-to-point channel delivers each message to exactly one of the consumers reading from the channel; services use point-to-point channels for the one-to-one interaction modes described above. A publish/subscribe channel delivers each message to all attached consumers; services use publish/subscribe channels for the one-to-many interaction modes described above.

 

The following diagram shows how a ride-hailing software uses pub/sub:

 

[Figure: Publish/subscribe channels in the ride-hailing application]

 

The Trip Management service writes a Trip Created message to a publish/subscribe channel to notify other services, such as the Dispatcher, of a new trip request. The Dispatcher finds an available driver and writes a Driver Proposed message to a publish/subscribe channel to notify other services.

 

There are many messaging systems to choose from; it is best to pick one that supports a variety of programming languages. Some messaging systems support standard protocols such as AMQP and STOMP, while others use proprietary protocols. There is a large number of open source messaging systems available, including RabbitMQ, Apache Kafka, Apache ActiveMQ, and NSQ. At a high level they all support some form of messages and channels, and they all strive to be reliable, high-performance, and scalable; however, the details of their messaging models differ significantly.
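
For example, with RabbitMQ (one of the brokers mentioned above), the Trip Created notification from the earlier diagram could be published to a fanout exchange acting as a publish/subscribe channel. A minimal sketch, with hypothetical exchange and message names:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;

public class TripCreatedPublisher {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumes a local broker for the sketch

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {

            // A fanout exchange behaves like a publish/subscribe channel:
            // every bound queue (i.e. every subscriber) receives a copy.
            channel.exchangeDeclare("trip-events", "fanout", true);

            String message = "{\"type\":\"TripCreated\",\"tripId\":\"trip-123\"}";
            channel.basicPublish("trip-events", "", null,
                    message.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

A subscriber such as the Dispatcher would declare its own queue, bind it to the same exchange, and consume messages from that queue.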

 

There are many advantages to using messaging:

 

  • Decouples the client from the service: the client simply sends a message to the appropriate channel. It is completely unaware of the individual service instances, so it does not need a discovery mechanism to locate them.
  • Message buffering: with a synchronous request/response protocol such as HTTP, both client and service must be available for the duration of the exchange. With messaging, the message broker queues up messages written to a channel until they can be processed by the consumer. An online store, for example, can keep accepting orders even when the order-fulfillment system is slow or unavailable; the order messages simply wait in the queue.
  • Flexible client-service interaction: messaging supports all of the interaction modes described above.
  • Explicit inter-process communication: RPC-based mechanisms try to make invoking a remote service look the same as calling a local one. However, because of the laws of physics and the possibility of partial failure, the two are in fact very different. Messaging makes these differences explicit, so developers are not lulled into a false sense of security.

However, messaging also has some drawbacks:

 

  • Additional operational complexity: the messaging system is yet another component that must be installed, configured, and operated. The message broker must be highly available, otherwise overall system reliability suffers.
  • Complexity of implementing request/response interaction: request/response-style interaction requires extra work. Each request message must contain a reply channel identifier and a correlation ID; the service writes a response containing the correlation ID to the reply channel, and the client uses the correlation ID to match the response with the original request (a minimal sketch follows this list). It is often easier to use an IPC mechanism that supports request/response directly.
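
To make the correlation-ID idea concrete, here is a minimal sketch of the client side of request/response over RabbitMQ (the queue names are hypothetical); the server side would copy the correlation ID into its reply and publish the reply to the client's reply queue:

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class RequestResponseClient {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumes a local broker for the sketch

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {

            // Exclusive, auto-named queue on which this client expects replies.
            String replyQueue = channel.queueDeclare().getQueue();

            // The correlation ID lets the client match a reply to its request.
            String correlationId = UUID.randomUUID().toString();
            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .correlationId(correlationId)
                    .replyTo(replyQueue)
                    .build();

            channel.basicPublish("", "passenger-service.requests", props,
                    "{\"passengerId\":\"p-42\"}".getBytes(StandardCharsets.UTF_8));

            // The client would then consume from replyQueue and discard any
            // message whose correlationId does not match.
        }
    }
}
```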

Now that we have learned about message-based IPC, let's take a look at IPC based on the request/response pattern.

 

Synchronous, request/response based IPC

 

When using a synchronous, request/response-based IPC mechanism, the client sends a request to the service, the service processes the request, and it returns a response. Some clients block waiting for the response, while others use asynchronous, event-driven client code (for example, wrapping the response in a Future or an Rx Observable). However, unlike with messaging, the client assumes that the response will arrive in a timely fashion. There are many protocols to choose from in this pattern, but the two most common are REST and Thrift. Let's look at REST first.

 

REST

 

RESTful APIs are extremely popular these days. REST is an architectural style that is based on HTTP. A key concept in REST is the resource, which typically represents a business object such as a customer or a product, or a collection of such objects. REST uses the HTTP verbs to manipulate resources, which are referenced by URLs. For example, a GET request returns a representation of a resource, usually as an XML document or a JSON object; a POST request creates a new resource; and a PUT request updates an existing one. To quote Roy Fielding, the creator of REST:

 

REST provides a set of architectural constraints that, when applied as a whole, emphasizes scalability of component interactions, generality of interfaces, independent deployment of components, and intermediary components to reduce interaction latency, enforce security, and encapsulate legacy systems.

The diagram below shows how a ride-hailing software uses REST.

 

[Figure: Using REST in the ride-hailing application]

 

The passenger's mobile app submits a POST request to the /trips resource of the Trip Management service. That service handles the request by sending a GET request for the passenger's information to the Passenger Management service. Once the passenger is verified, the Trip Management service creates the trip and returns a 201 (Created) response to the mobile app.
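
A minimal sketch of what the Trip Management service's endpoint might look like with Spring MVC (the resource shape and class names are hypothetical, not taken from the article):

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import java.net.URI;
import java.util.UUID;

@RestController
public class TripController {

    // Hypothetical request/response shapes for the /trips resource.
    record CreateTripRequest(String passengerId, String pickupLocation) {}
    record TripResponse(String tripId, String state) {}

    @PostMapping("/trips")
    public ResponseEntity<TripResponse> createTrip(@RequestBody CreateTripRequest request) {
        // In the real flow, the service would first GET the passenger from the
        // Passenger Management service to verify the account, then create the trip.
        String tripId = UUID.randomUUID().toString();
        return ResponseEntity
                .created(URI.create("/trips/" + tripId))   // 201 Created + Location header
                .body(new TripResponse(tripId, "CREATED"));
    }
}
```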

 

Many developers describe their HTTP-based APIs as RESTful. However, as Fielding points out in a blog post, not all of them actually are. Leonard Richardson defines a maturity model for REST that comprises the following four levels (summary from IBM):

  • Level 0: Web services at this level simply use HTTP as a transport for what is essentially remote procedure call (RPC). SOAP and XML-RPC fall into this category.
  • Level 1: Web services at this level introduce the concept of resources. Each resource has its own identifier and representation.
  • Level 2: Web services at this level use different HTTP verbs for different operations and use HTTP status codes to indicate the outcome, for example GET to retrieve a resource and DELETE to delete one.
  • Level 3: Web services at this level are based on HATEOAS: the representation of a resource contains links to the operations that can be performed on it, which clients can discover and follow.

Using an HTTP-based protocol has the following benefits:

 

  • HTTP is simple and familiar to everyone.
  • You can test an HTTP API from a browser using an extension such as Postman, or from the command line using curl.
  • It directly supports request/response style communication.
  • HTTP is firewall friendly.
  • It does not require an intermediate broker, which simplifies the system architecture.

Shortcomings include:

 

  • It only directly supports the request/response interaction style. You can use HTTP for notifications, but the server must still return an HTTP response.
  • Because clients and services communicate directly (without an intermediary buffering messages), both must be running for the duration of the exchange.
  • The client must know the URL of each service instance. As discussed in the previous article about the API Gateway, this is a nontrivial problem; clients must use a service discovery mechanism to locate service instances.

The developer community has recently rediscovered the value of interface definition languages for RESTful APIs, and there are a few to choose from, including RAML and Swagger. Some IDLs, such as Swagger, let you define the format of request and response messages; others, such as RAML, rely on a separate specification such as JSON Schema. In addition to describing APIs, these IDLs typically come with tools that generate client stubs and server skeletons from the interface definitions.

 

Thrift

 

Apache Thrift is an interesting alternative to REST. It is a framework, originally developed at Facebook, for efficient cross-language remote procedure calls. Thrift uses a C-style IDL to define your APIs, and the Thrift compiler generates client-side stubs and server-side skeletons. The compiler can generate code for a variety of languages, including C++, Java, Python, PHP, Ruby, Erlang, and Node.js.

 

A Thrift interface consists of one or more services. A service definition is analogous to a Java interface: a set of strongly typed methods. A Thrift method can either return a value or be declared one-way. Methods that return a value implement the request/response interaction style: the client waits for a response, and an exception may be thrown. One-way methods correspond to the notification interaction style: the server does not send a response.

 

Thrift supports several message formats: JSON, binary, and compact binary. Binary is more efficient than JSON because it is faster to decode, and the compact binary format compresses messages further still, while JSON remains human-readable. Thrift also lets you choose a transport, including raw TCP and HTTP; raw TCP is likely to be more efficient, but HTTP is friendlier to firewalls, browsers, and humans.
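
To give a feel for what calling a Thrift service looks like, here is a minimal client-side sketch in Java, assuming a hypothetical TripService that has been defined in a Thrift IDL file and run through the Thrift compiler:

```java
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class TripServiceClient {

    public static void main(String[] args) throws Exception {
        // Raw TCP transport to the (hypothetical) Trip Management service.
        TTransport transport = new TSocket("trip-management", 9090);
        transport.open();
        try {
            // Binary protocol: faster to decode than JSON.
            TProtocol protocol = new TBinaryProtocol(transport);

            // TripService.Client is generated by the Thrift compiler from the IDL.
            TripService.Client client = new TripService.Client(protocol);

            // A value-returning method: request/response style, the call blocks.
            String tripId = client.createTrip("passenger-123", "1 Main St");

            // A one-way method: notification style, no response is awaited.
            client.notifyTripCancelled(tripId);
        } finally {
            transport.close();
        }
    }
}
```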

Message Formats

 

Having looked at HTTP and Thrift, let's consider message formats. If you use a messaging system or REST, you get to pick the message format. Other IPC mechanisms, such as Thrift, may only support a small number of formats, perhaps only one. Either way, it is important to use a cross-language message format, because sooner or later some of your services will be written in other languages.

 

There are two main kinds of message formats: text and binary. Examples of text-based formats include JSON and XML. Their advantage is that they are not only human-readable but also self-describing: in JSON, an object is a set of name-value pairs, and similarly, in XML, elements have names and values. A consumer can pick out the values it is interested in and ignore the rest, so minor changes to the message format are easily backward compatible.

 

The structure of an XML document is specified by an XML schema. Over time, the developer community has come to realize that JSON needs a similar mechanism. One option is to use JSON Schema, either on its own or as part of an IDL such as Swagger.

 

The biggest drawback of text-based formats is that messages tend to be verbose, especially with XML: because the messages are self-describing, every message carries the names of its attributes in addition to their values. Another drawback is the overhead of parsing text. Consequently, you may want to consider a binary format.

 

There are also several binary formats to choose from. If you are using Thrift RPC, you can use binary Thrift; if you get to pick the message format, popular options include Protocol Buffers and Apache Avro. Both provide a typed IDL for defining the structure of your messages. One difference is that Protocol Buffers uses tagged fields, whereas an Avro consumer needs to know the schema in order to interpret messages; as a result, API evolution is easier with Protocol Buffers than with Avro. This blog post is a good comparison of Thrift, Protocol Buffers, and Avro.
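
As a rough illustration of the tagged-field idea, here is how a hypothetical Protocol Buffers message generated for Java might be serialized and read back; a consumer compiled against an older schema simply skips field tags it does not recognize:

```java
import com.google.protobuf.InvalidProtocolBufferException;

public class TripMessageExample {

    public static void main(String[] args) throws InvalidProtocolBufferException {
        // TripCreated is a hypothetical message type generated by protoc from:
        //   message TripCreated { string trip_id = 1; string passenger_id = 2; }
        TripCreated event = TripCreated.newBuilder()
                .setTripId("trip-123")
                .setPassengerId("p-42")
                .build();

        // Each field is written to the wire with its numeric tag, so consumers
        // built against an older schema ignore tags they do not know about.
        byte[] wireFormat = event.toByteArray();

        TripCreated decoded = TripCreated.parseFrom(wireFormat);
        System.out.println(decoded.getTripId());
    }
}
```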

 

Summary

 

Microservices must communicate using an inter-process communication mechanism. When designing how your services communicate, you need to consider several issues: how services interact, how to specify each service's API, how to evolve APIs, and how to handle partial failure. There are two kinds of IPC mechanisms that microservices can use: asynchronous messaging and synchronous request/response. In the next article, we will look at the problem of service discovery in a microservices architecture.
