The difference between microservice architecture and the SOA framework: an architect's study notes

Difference Between Microservice Architecture and SOA

Microservices are very hot right now, but the popular comparison in the industry is against so-called Monolithic applications. A large number of systems were already distributed systems more than ten years ago, so what is the difference between microservices, as a new concept, and those original distributed systems, or SOA (Service Oriented Architecture)?

Let's look at the similarities first:

  • Both require a registry to implement dynamic service registration and discovery;
  • Both must consider transaction consistency in a distributed environment. Under the CAP principle, two-phase commit cannot guarantee performance, so a transaction compensation mechanism has to be considered;
  • Both use synchronous calls or asynchronous message delivery, so how is message reliability ensured? SOA integrates all messages through an ESB;
  • Both require a unified Gateway to aggregate and orchestrate interfaces, implement a unified authentication mechanism, and expose RESTful interfaces for apps;
  • Both need to solve the problem of locating faults in a distributed system and doing log tracing, much like the signaling tracing we have done in the telecommunications field for more than ten years;

So where is the difference?

  • Is it continuous integration and continuous deployment? CI and CD are intertwined with Agile and DevOps; I think they belong more to the field of software engineering than to microservice technology itself;
  • Is it the use of different communication protocols? The benchmark communication protocol for microservices is RESTful, while traditional SOA generally uses SOAP, but nowadays there are many lightweight RPC frameworks such as Dubbo, Thrift, and gRPC, and Spring Cloud also offers the Feign framework, which turns standard RESTful APIs into RPC-like calls in code. Communication protocols should therefore not be the core difference between microservice architecture and SOA;
  • Is it a container-based framework versus a virtual-machine-based one? Docker, virtual machines, and physical machines are all ways of implementing the architecture, not the core difference;

The essence of microservice architecture is segmentation

  • There is a big difference in how services are split. SOA originally appeared as an "integration" technology: many of its solutions encapsulate an enterprise's existing internal services as independent processes so that new business development can reuse them, and these services are often very coarse-grained, such as an entire supply chain or CRM. The "micro" in microservices shows that it is particular about splitting and does not compromise. Countless cases have proved that if your split is wrong, you will not get the "low coupling, independently upgradable, highly reliable" advantages promised by microservices; instead you will have more trouble than with a Monolithic application.
  • Microservices that do not split storage are pseudo-services. In practice we often see an architecture in which all back-end storage sits in one database and only the front-end business logic is split into different service processes. In essence this is the same as a Monolithic application, except that in-process calls between modules have become inter-process calls. This kind of split is undesirable and violates the first principle of distributed design: the coupling between modules is not resolved, while performance suffers.

First Principle of Distributed Design — "Don't Distribute Your Objects"

  • The "Micro" in microservices does not mean the smaller the better; rather, compared with the coarse-grained services of SOA, we need a smaller and more suitable granularity. This "Micro" is not infinitely small.

If we combine two-way (synchronous) communication with small/microservices and follow principles like "1 class = 1 service", then we are actually back in the 1990s with CORBA, J2EE, and distributed objects. Sadly, the new generation of developers has no experience with distributed objects and therefore does not realize how bad the idea is; they are about to repeat history, only this time with new technologies such as HTTP instead of RMI or IIOP.

Microservices and Domain Driven Design

A simple book management system certainly does not require a microservice architecture. If a microservice architecture is adopted, the problem space must be relatively large, such as an entire e-commerce system or CRM.

How should services be split?

What method should be used to split services? Rules of thumb such as "1 class = 1 service", "1 method = 1 service", "a 2-pizza team", or "rewritable within 2 weeks" are popular in the industry, but they lack a basis for implementation. We must look to software design methods: object-oriented design and design patterns apply to the problem space of a module, and the ideas of functional programming live more at the code level.

The book Domain-Driven Design by Eric Evans is a great reference for microservice architecture. It presents a technique for breaking a large problem space down into domains, entities, and the relationships and behaviors between them, and at present it is the most reasonable solution to the splitting problem. Through the concept of the Bounded Context (hereinafter referred to as BC), we can encapsulate implementation details so that each BC realizes the SRP (Single Responsibility Principle). Each microservice is the physical mapping of a BC in the real world, and microservices that follow the BC idea are independent of and loosely coupled with each other.

Microservice architecture is a good thing because it forces everyone to pay attention to the rationality of software design. If the domain analysis and object-oriented design in a Monolithic application are not done well, switching to microservices will magnify the problems exponentially.

Take the two domains of order and commodity in e-commerce as an example. According to DDD they should be split into two independent Bounded Contexts, but an order must contain commodities. If we rashly split them into two BCs, the query and call relationships become coupled together, and there may even be troublesome distributed transaction problems. How do we untangle this association? BC theory holds that the same term has different concerns in different BCs. In the commodity BC the focus is on attributes, specifications, details, and so on (to simplify the example, we treat the commodity BC as the basic information of the commodity), while in the order BC we care more about the inventory and the price of the commodity. Therefore, in the actual coding design, the order service often keeps redundant copies of the commodity attributes it cares about, such as name and price, inside the order. This design relieves the strong association with the commodity BC, and the two BCs can provide services and store data independently.
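As a minimal sketch of this idea (the class and field names here are illustrative, not from the original text), the order BC can embed a small value object that redundantly stores only the commodity attributes the order cares about, instead of calling the commodity BC:

import java.util.ArrayList;
import java.util.List;

// Order BC: a value object holding the redundant commodity snapshot the order needs.
// Only name and price are copied; the full commodity definition stays in the commodity BC.
class OrderedProduct {
    private final String productId;   // reference back to the commodity BC
    private final String name;        // copied when the order is placed
    private final long priceInCents;  // price captured when the order is placed

    OrderedProduct(String productId, String name, long priceInCents) {
        this.productId = productId;
        this.name = name;
        this.priceInCents = priceInCents;
    }

    String getProductId() { return productId; }
    String getName() { return name; }
    long getPriceInCents() { return priceInCents; }
}

// The Order aggregate in the order BC refers only to the snapshot, not to the commodity service.
class Order {
    private String orderId;
    private final List<OrderedProduct> items = new ArrayList<>();

    void addItem(OrderedProduct item) { items.add(item); }
}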

Summary

When looking at microservice architecture, the first focus should not be on concepts such as RPC, Service Discovery, and Circuit Breaker, nor on technical frameworks such as Eureka, Docker, Spring Cloud, and Zipkin, but on the boundaries of services and the division of responsibilities. If the division is wrong, you will fall into a large number of inter-service calls and distributed transactions, and in that situation microservices bring not convenience but hassle.

DDD gives us a reasonable means of division, but DDD has many concepts that are obscure and difficult to understand. How do we grasp the key points and apply it reasonably to a microservice architecture?

I think the following architectural ideas are the most important:

  • The rich domain model (as opposed to the anemic model)
  • Event-driven architecture

We talked about the relationship between microservices and DDD above, but many people still feel it is vague. DDD is such a complex theory, with aggregate roots, value objects, and event sourcing. Where do we start?

In fact, DDD is inextricably linked with theories such as object-oriented analysis, object-oriented design, and design patterns; if you are not familiar with OOA and OOD, DDD cannot be used well. However, when learning these OO theories, people often feel they are useless, because most Java programmers start their development careers with the classic J2EE layering (Action, Service, Dao), and in this layered approach we basically have no chance to use the so-called "behavioral" design patterns. The core reason is that the classic J2EE layered development style is an "anemic model".

Martin Fowler describes two development approaches, "Transaction Script" and "Domain Model", in his book Patterns of Enterprise Application Architecture; they correspond to the "anemic model" and the "rich domain model" respectively.

Transaction Script Development Mode

The core of a transaction script is the procedure. Most business processing can be thought of as a series of SQL statements; the transaction script organizes individual SQL statements into a piece of business logic and uses a transaction to guarantee ACID while the logic executes. The most typical example is the stored procedure. In the usual classic J2EE layered architecture, we often write transaction scripts in the Service layer.

With this development style, objects are only used to transfer data between layers. The objects here form the "anemic model": they have only data fields and Get/Set methods, with no logic inside the object.

Let's take an example of an inventory deduction scenario:

  • Business scenario

First, the business scenario: placing an order deducts inventory (locks inventory). This is very simple.

First determine whether the inventory is sufficient, then deduct the saleable inventory, increase the inventory occupied by the order, and then record an inventory change log (as a voucher)

  • Design of the anemic model

First design an inventory table Stock; its fields mirror the Stock object below.

Design a Stock object (getters and setters omitted)

public class Stock {
    private String spuId;          // SPU id of the commodity
    private String skuId;          // SKU id of the commodity
    private int stockNum;          // saleable stock
    private int orderStockNum;     // stock occupied by orders
}

 

  • Service entry

Design a StockService and write logic in its lock method

Input parameters are (spuId, skuId, num)

Implementation pseudocode:

count = select stockNum from stock where spuId=xx and skuId=xx
if count >= num {
    update stock set stockNum = stockNum - num, orderStockNum = orderStockNum + num where skuId=xx and spuId=xx
} else {
    // insufficient stock, deduction fails
}
insert into stock_log set xx=xx, date = new Date()
  • OK, job done. To do better, you can combine the update and the select count so that a single statement completes the check-and-set and solves the concurrency problem (the expert version).
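As a minimal Java sketch of this transaction-script style (plain JDBC, with hypothetical table and column names taken from the pseudocode above, and error handling kept to a minimum), note that all logic lives in the Service and the Stock object never appears:

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

public class StockService {
    private final DataSource dataSource;

    public StockService(DataSource dataSource) { this.dataSource = dataSource; }

    public boolean lock(String spuId, String skuId, int num) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try {
                // check-and-deduct in one statement: the WHERE clause guards against overselling
                PreparedStatement update = conn.prepareStatement(
                    "UPDATE stock SET stockNum = stockNum - ?, orderStockNum = orderStockNum + ? " +
                    "WHERE spuId = ? AND skuId = ? AND stockNum >= ?");
                update.setInt(1, num);
                update.setInt(2, num);
                update.setString(3, spuId);
                update.setString(4, skuId);
                update.setInt(5, num);
                boolean locked = update.executeUpdate() == 1;

                // record the inventory change log as a voucher
                PreparedStatement log = conn.prepareStatement(
                    "INSERT INTO stock_log (spuId, skuId, num, success) VALUES (?, ?, ?, ?)");
                log.setString(1, spuId);
                log.setString(2, skuId);
                log.setInt(3, num);
                log.setBoolean(4, locked);
                log.executeUpdate();

                conn.commit();
                return locked;
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}

This corresponds to the "combine update and select" variant mentioned in the bullet above: the row count returned by executeUpdate tells the caller whether the deduction succeeded.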

To summarize:

Have you noticed that in the process of placing an order and deducting inventory, a very important piece of core logic in this business domain, the Stock object does not appear at all? Everything is database SQL, and the so-called business logic is composed of multiple SQL statements. Stock is just a CRUD data object, with no logic at all.

  • The "anemic model" as defined by Martin Fowler is an anti-pattern. It is no problem to develop a simple small system with transaction scripts, but once the business logic becomes complicated, the logic and various states are scattered across a large number of functions, and the cost of maintenance and extension rises immediately. The anemic model also provides no basis for implementing microservices.
  • Although we use an object-oriented language such as Java for development, what we write is effectively procedural code, so in many cases it would even be better to write the logic in database stored procedures instead of Java (PS: using Spring Boot does not make it a microservice either).

Domain Model Development Patterns

  • A domain model encapsulates data and behavior and maps to a real-world business object. Each type has a clear division of responsibilities, so that logic is distributed to the appropriate objects. Such an object is the "rich domain model".
  • In concrete practice, we need to be clear about one concept: the domain model is stateful and represents an actual thing. Continuing the example above, we design an Inventory object that represents the actual inventory of a commodity and add business logic methods to this object.

When placing an order and locking inventory, we must first load the Inventory object from the Repository by its primary key, then execute its lock(num) method to change the state of the Inventory object (attributes are also a kind of state), and finally persist the object back to storage through the Repository's save method.

It is the Application service that carries out this series of operations, and the Application provides an interface for this composite operation.

The most important thing in the domain model style is to put the details of the state change caused by the deduction inside the Inventory object; this is the encapsulation of business logic.

The lock method of the Application object can be compared with the lock of the StockService in the transaction-script style. The StockService knows all the details; once something changes (for example, whether inventory may be deducted down to 0), the Service method has to change accordingly. In the domain-model style, the Application method does not need to change, as long as the calculation happens inside the Inventory object. The code is in the right place, the calculations are at the right level, and everything is reasonable. This design can make full use of OOD and OOP theory to implement elegant business logic.
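A minimal sketch of this style (class and method names are my own, assuming a simple Repository interface), showing where the logic now lives:

// Rich domain model: the Inventory aggregate owns the deduction rules.
public class Inventory {
    private final String skuId;
    private int stockNum;        // saleable stock
    private int orderStockNum;   // stock occupied by orders

    public Inventory(String skuId, int stockNum, int orderStockNum) {
        this.skuId = skuId;
        this.stockNum = stockNum;
        this.orderStockNum = orderStockNum;
    }

    // Business logic is encapsulated here; callers never touch the fields directly.
    public void lock(int num) {
        if (num <= 0 || stockNum < num) {
            throw new IllegalStateException("insufficient stock for sku " + skuId);
        }
        stockNum -= num;
        orderStockNum += num;
    }

    public String getSkuId() { return skuId; }
}

interface InventoryRepository {
    Inventory load(String skuId);   // restore the aggregate by primary key
    void save(Inventory inventory); // persist the changed state
}

// The Application service only orchestrates: load, call the domain method, save.
class InventoryApplication {
    private final InventoryRepository repository;

    InventoryApplication(InventoryRepository repository) { this.repository = repository; }

    public void lock(String skuId, int num) {
        Inventory inventory = repository.load(skuId);
        inventory.lock(num);             // the rule lives in the domain object
        repository.save(inventory);
    }
}

If the rule changes (for example, allowing stock to be locked down to exactly 0), only Inventory.lock changes; the Application method stays the same.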

  • Disadvantages of the rich domain model

In the example above, it takes a certain amount of time to load the object from the Repository, execute the business method, and save it back. If multiple threads request lock on the same Inventory during this window, the state becomes inconsistent. The trouble is that concurrent access to inventory is not only hard to handle but also very common.

The anemic model relies entirely on the database to handle concurrency, which simplifies the implementation a lot, but the rich domain model has to handle it itself. Whether through a lock object in memory or a remote lock mechanism such as Redis, this is more complicated and less reliable than the anemic model; it is a challenge posed by the rich domain model. Even better, concurrency can be eliminated altogether through an event-driven architecture.

The relationship between the domain model and microservices

The implementation of the domain model was discussed above, but what is its relationship with microservices? In practice, this Inventory is an aggregate root of a bounded context, and we can roughly treat one aggregate root as one microservice process.

But the problem comes again. An Inventory must be related to the commodity information; the redundant commodity ID inside the inventory is not enough, because the business logic needs the on-shelf/off-shelf status of the commodity. Should we introduce a heavy object such as the commodity SKU into this microservice? Two heavy objects in one service? Then the microservice cannot be split, or do we still have to depend on the commodity database?!

Continuing from the above: we adopted the domain-driven development method, used the rich domain model, and enjoyed its benefits, but we also have to face its drawbacks, and these drawbacks are magnified under a distributed microservice architecture.

Transactional Consistency

Transaction consistency is not a big problem in a Monolithic application, but it is fatal under microservices. Let's review the so-called ACID principles:

  • Atomicity - changes to data state either all complete or all fail together
  • Consistency - the data state remains completely consistent
  • Isolation - even concurrent transactions do not affect each other
  • Durability - once a transaction is committed, it cannot be revoked

With a monolithic service and a relational database, we can easily achieve ACID through the database's own features. But once you split aggregate roots into microservices according to DDD, their databases are separated; you have to face distributed transactions and satisfy ACID in your own code.

For distributed transactions, people generally think of the old JTA standard and 2PC two-phase commit. I remember that in the Dubbo group, someone would ask every week when Dubbo would support distributed transactions. In fact, according to the CAP principle of distributed systems, when P (partition tolerance) occurs, forcibly pursuing C (consistency) causes A (availability) and throughput to drop. At this point we generally use eventual consistency to preserve the system's AP capability. Of course, this does not mean giving up C; in general, even in the case of a partition we can still guarantee data consistency through eventual consistency.

example:

Consider the scenario of placing an order and freezing inventory in e-commerce: whether the order can be placed depends on the stock situation.

Suppose you have adopted a distributed system in which the order module and the inventory module are two services, each with its own storage (a relational database).

In a single database, one transaction can modify two tables, but in microservices this is no longer possible.

In DDD, a transaction can only change the internal state of one aggregate. If state needs to be consistent across multiple aggregates, eventual consistency must be used. Orders and inventory are clearly aggregates in two different bounded contexts, so an event-driven architecture is required to achieve eventual consistency.

Event-driven for eventual consistency

An event-driven architecture synchronizes state between domain objects through asynchronous messages. Some messages can also be published to multiple services at the same time. After a message causes one service to synchronize its state, it may trigger another message, and the event spreads outward. Strictly speaking, event-driven means no synchronous calls.

example:

After the order service creates an order, the order's status is "Created", and it publishes an Order Created event to the message queue.

After receiving the Order Created event, the inventory service subtracts the saleable stock of the SKU in the inventory table, increases the stock occupied by the order, and then sends an Inventory Locked event to the message queue.

The order service receives the Inventory Locked event and changes the order's status to "Confirmed".

Some people ask: what if the inventory is insufficient and locking fails? Simple: the inventory service sends a Lock Failed event, and when the order service receives it, it sets the order to "Cancelled".
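A minimal sketch of this flow (the event types, the EventBus interface, and the handler methods are my own placeholders for whatever message queue is actually used; Java records are used for brevity):

// Hypothetical event types carried on the message queue.
record OrderCreated(String orderId, String skuId, int num) {}
record InventoryLocked(String orderId) {}
record LockFailed(String orderId) {}

interface EventBus {            // stand-in for the real MQ client
    void publish(Object event);
}

class OrderService {
    private final EventBus bus;
    OrderService(EventBus bus) { this.bus = bus; }

    void placeOrder(String orderId, String skuId, int num) {
        // persist the order with status "Created", then publish the event
        bus.publish(new OrderCreated(orderId, skuId, num));
    }

    void on(InventoryLocked event) { /* set the order status to "Confirmed" */ }
    void on(LockFailed event)      { /* set the order status to "Cancelled" */ }
}

class InventoryService {
    private final EventBus bus;
    InventoryService(EventBus bus) { this.bus = bus; }

    void on(OrderCreated event) {
        // deduct saleable stock and increase order-occupied stock in the inventory BC only
        boolean locked = tryLock(event.skuId(), event.num());
        bus.publish(locked ? new InventoryLocked(event.orderId())
                           : new LockFailed(event.orderId()));
    }

    private boolean tryLock(String skuId, int num) { return true; /* simplified */ }
}

Each handler only touches its own aggregate and its own database; consistency between the two services is reached only eventually, through the events.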

Good news, we can do without locks!

A big advantage of event-driven is that it removes concurrency: all requests are queued and processed one by one, which is very helpful for implementing the rich domain model, and we no longer need to manage locks in memory ourselves. Without locks, queue processing is very efficient, so event-driven can be used in high-concurrency scenarios such as flash sales.

Yes, but the user experience changes.

With event-driven, the user experience may change. In the original synchronous structure, when there is no inventory the user is immediately told the order cannot be placed and no order is generated. With the event mechanism, the order is generated immediately, and the system may notify you a while later that the order has been cancelled, just like rushing to buy a Xiaomi phone: hundreds of thousands of people queue up, and after a long wait you are told it is out of stock, come back tomorrow. If you want the user to get the result immediately, you can find a way in the front end, for example using a lock such as CountDownLatch in the BFF (Backend For Frontend) to convert the asynchrony of the back end into synchrony for the front end, though of course the cost to the BFF is relatively high.
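A minimal sketch of that BFF trick (the class, the timeout value, and the outcome callback are illustrative): the BFF publishes the command, then blocks on a CountDownLatch until the confirming or cancelling event for that order arrives, or a timeout expires.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class OrderBff {
    // one latch per in-flight order, completed when the outcome event arrives
    private final Map<String, CountDownLatch> waiting = new ConcurrentHashMap<>();
    private final Map<String, Boolean> results = new ConcurrentHashMap<>();

    // called by the front end; blocks until the asynchronous outcome is known or the timeout expires
    boolean placeOrderAndWait(String orderId, Runnable publishOrderCreated) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        waiting.put(orderId, latch);
        publishOrderCreated.run();                       // fire the asynchronous command
        boolean done = latch.await(3, TimeUnit.SECONDS); // hold the HTTP request open
        waiting.remove(orderId);
        // entries for timed-out orders would need separate cleanup in a real system
        return done && Boolean.TRUE.equals(results.remove(orderId));
    }

    // called by the event listener when Inventory Locked / Lock Failed arrives
    void onOutcome(String orderId, boolean confirmed) {
        results.put(orderId, confirmed);
        CountDownLatch latch = waiting.get(orderId);
        if (latch != null) {
            latch.countDown();
        }
    }
}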

No way, the product manager doesn't accept it.

The product manager says the user experience must be that no order is generated when there is no inventory; this solution keeps generating cancelled orders, and he cannot accept it. What to do? Then skip these cancelled orders when querying the order list; maybe an additional view is needed to do it. I am not an idealist, and solving the current problem is my first consideration. The purpose of designing microservices is to handle the business concurrency, and what I face now is a user-experience problem, so the architecture design also needs to compromise :( But at least after the analysis I know where I compromised and why, and it may change in the future.

Multi-domain, multi-table join queries

  • I personally think the aggregate-root pattern is particularly suitable for modifying state, but it is really inconvenient for querying data. For example, to filter out a batch of eligible orders, the aggregate root object itself cannot take on batch query tasks, because that is not its responsibility. Then you must rely on the "Domain Service" facility.

When a method cannot be placed on an entity or value object, using a domain service is the best solution. Make sure that the domain service is stateless.

  • Our query tasks are often very complex, such as querying a list of products sorted by last month's sales, or sorted by return rate, and so on. But after microservices and DDD, our storage model has been split, and these queries involve data from multiple domains: orders, users, and commodities. How do we do it? At this point we introduce the concept of a view. For example, for a query of orders together with the user's name, directly calling the two services and joining the results in memory is undoubtedly very inefficient; add some filter conditions and paging and it simply cannot be done. So we broadcast the events, have a separate view service receive them, and build a materialized view in which the data is already joined and processed, stored in a separate query database, waiting to be queried. This is a typical way of trading space for time.

After analysis, apart from simple primary-key Find or List queries without many associations, most of our query tasks can be placed in a separate query database, which can be a read-only replica of a relational database or a NoSQL database. In fact, we used ElasticSearch as a dedicated query view in our project, and the effect is very good.
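A minimal sketch of such a view service (the event payload, the OrderView document, and the QueryStore interface are hypothetical placeholders; in our project the store behind it happened to be ElasticSearch):

// Flattened, already-joined document kept only for queries.
class OrderView {
    String orderId;
    String userName;     // denormalized from the user domain
    String productName;  // denormalized from the commodity domain
    long priceInCents;
    String status;
}

interface QueryStore {           // stand-in for ElasticSearch or a read-only database
    void upsert(OrderView view);
}

class OrderViewService {
    private final QueryStore store;
    OrderViewService(QueryStore store) { this.store = store; }

    // Listens to broadcast events from the order, user, and commodity services.
    void onOrderCreated(String orderId, String userName, String productName, long priceInCents) {
        OrderView view = new OrderView();
        view.orderId = orderId;
        view.userName = userName;
        view.productName = productName;
        view.priceInCents = priceInCents;
        view.status = "Created";
        store.upsert(view);      // the join is done at write time, not at query time
    }

    void onOrderCancelled(String orderId) {
        // load the view by id and update its status; omitted for brevity
    }
}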

Bounded Context and Data Coupling

Besides the multi-domain join problem, we often encounter scenarios like this in the business. In e-commerce, commodity information is basic data and belongs to a separate BC, while other BCs (marketing service, price service, shopping cart service, order service) all need to reference this commodity information. However, the commodity information each of them needs is only a small part of the whole: the marketing service needs the commodity id, name, and on-shelf/off-shelf status; the order service needs the commodity id, name, catalog, price, and so on. Compared with the definition of a commodity in the commodity center (id, name, specifications, specification values, details, etc.), these are only small subsets. This illustrates that the same term refers to different concepts in different bounded contexts. Mapped to our implementation, having the order and marketing modules query the commodity module directly every time is definitely inappropriate, because:

  • The commodity center has to adapt to the data required by each service and provide different interfaces
  • The concurrency it faces would be huge
  • The coupling between services is severe: once the commodity center goes down or is upgraded, the affected scope is very large

Especially the last point, which seriously limits the advantage microservices promise: "loose coupling; each service can be updated frequently without affecting other modules". This requires us to make some data appropriately redundant in different BCs through event-driven means, to break this coupling. Sometimes the redundancy is embedded in an entity as a Value Object and copied when the entity is created; for example, commodity information is copied into the order when the order is generated. Sometimes it is an additional list of Value Objects: the marketing center keeps a redundant list of the relevant commodity data, listens for commodity on-shelf/off-shelf events at all times, and synchronously updates the commodity list within its own bounded context.
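A minimal sketch of the second case (the event handlers and class names are illustrative): the marketing BC keeps only the commodity fields it cares about and updates them when it receives events broadcast by the commodity center.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The marketing BC's own, deliberately small view of a commodity.
class MarketingProduct {
    final String productId;
    final String name;
    volatile boolean onShelf;

    MarketingProduct(String productId, String name, boolean onShelf) {
        this.productId = productId;
        this.name = name;
        this.onShelf = onShelf;
    }
}

class MarketingProductCatalog {
    private final Map<String, MarketingProduct> products = new ConcurrentHashMap<>();

    // Handlers for events broadcast by the commodity center.
    void onProductPublished(String productId, String name) {
        products.put(productId, new MarketingProduct(productId, name, true));
    }

    void onProductOffShelf(String productId) {
        MarketingProduct p = products.get(productId);
        if (p != null) {
            p.onShelf = false;   // keep the redundant copy in sync with the commodity BC
        }
    }
}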

Take the ordering scenario as an example: in an e-commerce system, we can regard members and commodities as the basic data of all businesses. Their changes should be published to the various domains via broadcast, and each domain keeps only the information it needs.

Guaranteeing eventual consistency

Whether eventual consistency succeeds depends on many conditions:

  • It depends on the reliability of message delivery: system A may change its state, but the message to system B gets lost, leaving A and B inconsistent;
  • It depends on the reliability of the service: system A may change its state and crash before it can send the message, which also leads to inconsistent state.

I remember that JMS in the Java EE specification had requirements addressing both problems. First, JMS ensures reliable message delivery through various acknowledgement modes (Client Acknowledge, etc.); second, a JMS send operation can participate in the database transaction, that is, if no message is sent, the database transaction rolls back (I have not checked the details, so this description may not be precise; corrections welcome). However, very few MQs implement the JMS specification, especially because guaranteeing consistency costs performance; today's MQs that boast high throughput leave the problem to our own applications to solve. So here are a few common methods to improve eventual consistency.

Using local transactions

Take the inventory-deduction example from the order above:

  • The order service starts a local transaction and first inserts the order;
  • Then it inserts the Order Created event into a dedicated Event table, and the transaction is committed;
  • A separate scheduled task thread periodically scans the Event table, picks out the events that need to be sent, publishes them to the MQ, and marks each Event as "sent" (see the sketch below).
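A minimal sketch of this local-transaction approach (plain JDBC; the table names, columns, and MessageQueue interface are hypothetical):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

interface MessageQueue {              // stand-in for the real MQ client
    void send(String payload);
}

class OrderEventPublisher {
    private final DataSource dataSource;
    private final MessageQueue mq;

    OrderEventPublisher(DataSource dataSource, MessageQueue mq) {
        this.dataSource = dataSource;
        this.mq = mq;
    }

    // Steps 1 and 2: the order row and its Order Created event are written in ONE local transaction.
    void createOrder(String orderId, String payload) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement insertOrder = conn.prepareStatement(
                     "INSERT INTO orders (id, status) VALUES (?, 'Created')");
                 PreparedStatement insertEvent = conn.prepareStatement(
                     "INSERT INTO event (order_id, payload, sent) VALUES (?, ?, false)")) {
                insertOrder.setString(1, orderId);
                insertOrder.executeUpdate();
                insertEvent.setString(1, orderId);
                insertEvent.setString(2, payload);
                insertEvent.executeUpdate();
                conn.commit();        // if the event insert fails, the order is rolled back too
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }

    // Step 3: a scheduled task scans unsent events, publishes them, and marks them as sent.
    void scanAndSend() throws Exception {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement select = conn.prepareStatement(
                 "SELECT id, payload FROM event WHERE sent = false");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                mq.send(rs.getString("payload"));   // may resend after a crash: consumers must be idempotent
                try (PreparedStatement mark = conn.prepareStatement(
                         "UPDATE event SET sent = true WHERE id = ?")) {
                    mark.setLong(1, rs.getLong("id"));
                    mark.executeUpdate();
                }
            }
        }
    }
}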

The advantage of this scheme is that it uses the local database transaction: if the event cannot be inserted, the order will not be created; and since the scanner thread marks the event as sent only after publishing, no message is missed (our goal is to prefer retransmission over loss, because event handling is designed to be idempotent).

The disadvantages are that the event has to be handled and published separately inside the business logic, which is cumbersome and easy to forget; the sending of the event has some delay; and the periodic scan has a performance cost and creates the hidden risk of pushing up the database's high-water mark.

We made a slight improvement: by using database-specific log tracking such as MySQL Binlog tailing (Alibaba's Canal) or Oracle's GoldenGate, we can obtain change notifications from the database's Event table and avoid scanning with scheduled tasks.

However, tools that rely on these database logs are bound to a specific database implementation (and even a specific version), so be careful when making this decision.

Using Event Sourcing

Event sourcing is a rather special idea for us: it does not persist the Entity object itself, but only records the initial state and the Event of each change, and restores the latest state of the Entity object in memory by replaying the Events. It is very similar to a database's redo log, but it puts this mechanism at the application layer.
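A minimal sketch of the replay idea (the InventoryChanged event and the starting state are illustrative; real frameworks such as AxonFramework add event storage, snapshots, and much more):

import java.util.List;

// Each change is stored as an event instead of overwriting the entity row.
class InventoryChanged {
    final int lockedNum;            // how much stock this event locked
    InventoryChanged(int lockedNum) { this.lockedNum = lockedNum; }
}

class InventoryState {
    private int stockNum;
    private int orderStockNum;

    InventoryState(int initialStock) { this.stockNum = initialStock; }

    private void apply(InventoryChanged event) {
        stockNum -= event.lockedNum;
        orderStockNum += event.lockedNum;
    }

    // Rebuild the latest state by replaying the initial state plus every recorded event.
    static InventoryState replay(int initialStock, List<InventoryChanged> events) {
        InventoryState state = new InventoryState(initialStock);
        for (InventoryChanged event : events) {
            state.apply(event);
        }
        return state;
    }

    int getStockNum() { return stockNum; }
}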

Although event sourcing has many claimed advantages, special care should be taken when introducing this technique. First of all, it is not necessarily suitable for most business scenarios: once there are many changes, efficiency is indeed a big problem, and querying also becomes troublesome.

We have only used Event Sourcing and AxonFramework exploratively in individual businesses. Because the implementation is complex, the specific conclusions need to be summarized after a period of practice, and a separate article may be required to describe them in detail.
