Comparing the Advantages and Disadvantages of the DDD CQRS Architecture and Traditional Architectures

In recent years, in the DDD community, we often see the concept of the CQRS architecture. I personally wrote the ENode framework specifically to implement this architecture. The idea behind CQRS itself is actually very simple: the separation of reads and writes. It is an easy idea to grasp, much like using MySQL with a primary and replicas: data is written to the primary and queries go to a replica, with the MySQL database itself responsible for synchronizing the two. That is read-write separation at the database level. There are already many introductions to the CQRS architecture online, which you can find via Baidu or Google. Today, I mainly want to compare this architecture with traditional architectures (the three-tier architecture and the classic four-layer DDD architecture) in terms of data consistency, extensibility, availability, scalability, and performance, and summarize some of the advantages and disadvantages, as a reference for anyone doing architecture selection.

Foreword

The CQRS architecture itself is only the idea of read-write separation, and there are various ways to implement it. For example, keeping a single data store but separating reads and writes at the code level is already one embodiment of CQRS. Separating the data stores as well, so that the C side is responsible for data storage while the Q side is responsible for queries, with the Q side's data synchronized through the events generated by the C side, is another implementation of the CQRS architecture, and it is the one I discuss today. Another very important point: I will also bring in two further architectural ideas on the C side, Event Sourcing and In-Memory. I think these two ideas combine perfectly with the CQRS architecture and let it deliver its maximum value.
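As a concrete illustration of this implementation style, here is a minimal Python sketch (all names are hypothetical) in which a C-side aggregate handles a Command, emits a domain event, and a Q-side projection consumes that event to update a separate read store:

```python
from dataclasses import dataclass

# --- C side: Commands express write intent; Events record facts ---
@dataclass
class ChangeNoteTitle:          # Command
    note_id: str
    title: str

@dataclass
class NoteTitleChanged:         # Domain Event (past tense: it already happened)
    note_id: str
    title: str

class NoteAggregate:
    """C-side aggregate root: holds the authoritative, up-to-date state."""
    def __init__(self, note_id: str, title: str = ""):
        self.note_id, self.title = note_id, title

    def handle(self, cmd: ChangeNoteTitle) -> NoteTitleChanged:
        self.title = cmd.title                     # strongly consistent write
        return NoteTitleChanged(cmd.note_id, cmd.title)

# --- Q side: a read model updated from C-side events ---
read_db: dict[str, str] = {}    # stand-in for the separate query store

def on_note_title_changed(evt: NoteTitleChanged) -> None:
    read_db[evt.note_id] = evt.title   # eventually consistent projection

# In production the event would travel through a message queue;
# here the two sides are wired together directly:
agg = NoteAggregate("n1")
on_note_title_changed(agg.handle(ChangeNoteTitle("n1", "CQRS notes")))
```

In a real system the projection runs asynchronously, which is exactly why the Q side is only eventually consistent.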

Data Consistency

In the traditional architecture, data is generally strongly consistent. We usually use database transactions to ensure that all data modifications within one operation happen inside a single database transaction, thus guaranteeing strong consistency. In distributed scenarios we may also want strong consistency, which means using distributed transactions. However, as we all know, the difficulty and cost of distributed transactions are very high, and systems that use them tend to have relatively low throughput and relatively low availability. Therefore, in many cases we give up strong consistency and adopt eventual consistency; in terms of the CAP theorem, we give up consistency and choose availability.

The CQRS architecture fully embraces eventual consistency. It rests on an important assumption: the data the user sees is always old. In a multi-user system this phenomenon is common. For example, in a flash-sale (seckill) scenario, you may still see items in stock on the interface before you place an order, yet when you submit the order the system tells you the product is sold out. If we think about it carefully, this is indeed the case: the data we see on the interface was read from the database, and once displayed it no longer changes, while someone else may well have modified the data in the database in the meantime. This is especially common in most systems, particularly high-concurrency web systems.

So, based on this assumption, we know that even if our system achieves strong consistency, users will still often see old data. This suggests a new way to design the architecture: can we guarantee only that every create, update, and delete is based on up-to-date data, while the data being queried does not have to be up-to-date? This leads naturally to the CQRS architecture: the C side keeps its data up-to-date and strongly consistent; the Q side's data does not need to be, and can be updated asynchronously from the C side's events. From this idea we can then work out how to implement each of the two sides. At this point you may still wonder: why must the C-side data be up-to-date? This is easy to understand: when you modify data, business rules usually have to be evaluated first, and if the data those rules are judged against is not the latest, the judgment is meaningless or inaccurate, so any modification based on the old data would be meaningless too.
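To make this concrete, here is a minimal sketch (with hypothetical names) of a C-side aggregate whose business rule is always evaluated against its own up-to-date state, regardless of how stale the count shown on the user's screen was:

```python
class Product:
    """C-side aggregate root for a flash-sale product (hypothetical model)."""
    def __init__(self, product_id: str, stock: int):
        self.product_id, self.stock = product_id, stock

    def place_order(self, quantity: int) -> dict:
        # The rule is checked against the latest C-side state, however
        # old the stock count the user saw on the interface was.
        if quantity > self.stock:
            raise ValueError("sold out")
        self.stock -= quantity                      # strongly consistent update
        return {"type": "OrderPlaced",
                "product_id": self.product_id, "quantity": quantity}
```

Two users who both saw "1 left" on screen can both submit an order, but only the first Command succeeds; the second is rejected against the up-to-date state.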



Extensibility

In the traditional architecture, components depend strongly on one another through direct method calls between objects; the CQRS architecture is instead built on an event-driven idea. At the micro level of aggregate roots, the traditional architecture has the application layer coordinate multiple aggregate roots with procedural code, completing the whole business operation in a single transaction; the CQRS architecture, following the Saga idea, realizes the interaction between multiple aggregate roots through events. In addition, the C and Q sides of the CQRS architecture also synchronize data asynchronously through events, another embodiment of event-driven design. Rising to the architectural level, the former is the SOA idea and the latter is the EDA idea. In SOA, one service calls another to complete an interaction, so the services are tightly coupled; in EDA, a component subscribes to another component's event messages and updates its own state according to the event information, so no component depends directly on any other: components are associated only through topics, and the coupling is very low.

The comparison above shows the coupling of the two architectures, and obviously the lower-coupled one extends better. With the SOA idea, adding a new function means modifying existing code: if service A originally calls services B and C, and we now want it to also call service D, we must change A's logic. In the EDA architecture we do not touch existing code at all: B and C already subscribe to the messages produced by A, and we simply add a new subscriber D.
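The difference can be sketched with a toy in-process event bus (names are illustrative; a real system would use a distributed message queue). Adding the new subscriber D requires no change to the publisher A or to the existing subscribers B and C:

```python
from collections import defaultdict

subscribers = defaultdict(list)          # topic -> list of handlers

def subscribe(topic: str, handler) -> None:
    subscribers[topic].append(handler)

def publish(topic: str, message: dict) -> None:
    # The publisher knows only the topic, never its subscribers.
    for handler in subscribers[topic]:
        handler(message)

log = []
# B and C already subscribe to A's "order_placed" events:
subscribe("order_placed", lambda m: log.append(("B:inventory", m["order_id"])))
subscribe("order_placed", lambda m: log.append(("C:billing", m["order_id"])))
# Extending the system: add D without touching A, B, or C.
subscribe("order_placed", lambda m: log.append(("D:notification", m["order_id"])))

publish("order_placed", {"order_id": 42})
```

In the SOA version, the same extension would mean editing A's call sequence and redeploying A.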

From the CQRS perspective there is another very obvious example: the extensibility of the Q side. Suppose we originally implemented the Q side with a database, but system traffic grew and the database could no longer keep up with updates or serve the highly concurrent queries, so we want to add a cache. With the CQRS architecture that is easy: we just add a new event subscriber that updates the cache. In fact, we can easily add new kinds of Q-side storage at any time: databases, caches, search engines, NoSQL stores, logs, and so on, choosing whatever fits our query scenarios. All of this is possible because the C side records every model change event; when we want to add a new view store, we can derive its latest state from those events. This kind of extensibility is hard to achieve under the traditional architecture.
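The "derive a new view store from the recorded events" idea can be sketched like this (a plain dict stands in for the new cache; the event shapes are made up):

```python
# The C side records every model change event (hypothetical shapes):
event_log = [
    {"seq": 1, "id": "p1", "type": "ProductCreated", "price": 10},
    {"seq": 2, "id": "p1", "type": "PriceChanged",   "price": 12},
    {"seq": 3, "id": "p2", "type": "ProductCreated", "price": 7},
]

def rebuild(store: dict) -> dict:
    """Bootstrap a brand-new Q-side store by replaying the event log in order."""
    for evt in sorted(event_log, key=lambda e: e["seq"]):
        store[evt["id"]] = evt["price"]     # project each event into the store
    return store

# A cache added long after the events occurred still reaches the latest state:
cache = rebuild({})
```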

Availability

Whether it is a traditional architecture or a CQRS architecture, high availability can be achieved, as long as no node in our system is a single point of failure. By comparison, though, I feel the CQRS architecture offers more room for maneuver and more choices with respect to availability.

In the traditional architecture, because reads and writes are not separated, it is harder to address their availability together. If a system's peak concurrent writes are large, say 20,000, and its concurrent reads are also large, say 100,000, then the system must be optimized to support that many concurrent writes and queries at the same time, or it will fall over at peak times. This is the weakness of any design based on synchronous calls: there is nothing to shave the peaks and fill the valleys, nothing to buffer the momentary excess of requests, so the system must process however many requests arrive, in time, or an avalanche effect will bring it down. Yet a system is not at its peak all the time; the peak may last only half an hour or an hour. To guarantee the system survives the peak, we must provision enough hardware for it, hardware that is unnecessary most of the time, which wastes resources. That is why we say a system built on synchronous calls and the SOA idea is very expensive to implement.

Under the CQRS architecture, because reads and writes are separated, availability is considered in two parts: how the C side achieves write availability, and how the Q side achieves read availability. I think the C side solves availability more easily, because it is message-driven. To modify any data we send a Command to a distributed message queue; a back-end consumer then processes the Command, generates domain events, persists the events, and publishes them to the distributed message queue, where they are finally consumed by the Q side. This whole chain is message-driven, so compared with the direct service method invocation of the traditional architecture, availability is much higher: even if the back-end consumer processing Commands goes down temporarily, the front-end Controller can still send Commands and remains available. From this point of view, the CQRS architecture is more available for data modification. But you might ask: what if the distributed message queue goes down? Yes, that is possible. However, a message queue is middleware, and middleware generally has very high availability (supporting clustering and active-standby failover), so its availability is much higher than that of our applications. Moreover, because Commands go to the message queue first, we can take full advantage of it: asynchrony, pull-based consumption, peak shaving, and queue-based horizontal scaling. These features ensure that even if the front-end Controller sends a flood of Commands during peak hours, the back-end application processing them will not be overwhelmed, because we pull Commands according to our own processing capacity. This is the C side's advantage in availability.
In fact, it is an advantage brought by the distributed message queue itself. From this we can see that the EDA (event-driven architecture) idea is very valuable, and it also reflects the currently popular idea of Reactive Programming.
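The pull-based, peak-shaving behavior described above can be sketched with Python's in-process `queue.Queue` standing in for a distributed message queue (a deliberate simplification):

```python
import queue

command_queue: "queue.Queue[dict]" = queue.Queue()  # stand-in for a distributed MQ

def send_command(cmd: dict) -> None:
    # The Controller stays available: it only needs the queue to accept the Command.
    command_queue.put(cmd)

def consume(batch_size: int) -> list:
    # The back-end consumer pulls Commands at its own pace (peak shaving).
    handled = []
    for _ in range(batch_size):
        try:
            handled.append(command_queue.get_nowait())
        except queue.Empty:
            break
    return handled

for i in range(10_000):                    # a burst of Commands at peak time
    send_command({"type": "PlaceOrder", "n": i})
first_batch = consume(batch_size=100)      # drain only what we can handle now
```

The burst is absorbed by the queue rather than by the consumer, so the consumer never sees more load than it asked for.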

As for the Q side, it is essentially no different from the traditional architecture, since both must handle high-concurrency queries; whatever optimizations applied before still apply now. But as I emphasized under extensibility above, the CQRS architecture makes it much more convenient to provide additional view stores: databases, caches, search engines, NoSQL, all of which can be updated in parallel without dragging one another down. Ideally, if your application needs complex queries such as full-text search, you can use a search engine such as Elasticsearch on the Q side; if a key-value data structure satisfies your query scenario, you can use Redis, a distributed NoSQL cache, on the Q side. In short, I think the CQRS architecture lets us solve query problems more easily than the traditional architecture, because we have more options. You may say that your scenario can only be solved with a relational database and that query concurrency is also very high; then the only way is to spread the query I/O: shard the database (sub-database and sub-table), run one primary with multiple replicas, and query the replicas. At that point the solution is the same as in the traditional architecture.


Performance and Scalability

Originally I wanted to discuss performance and scalability separately, but since the two are closely related, I decided to cover them together.

Scalability means that if a system performs well (in throughput and response time) when 100 people access it, it also performs well when 1,000,000 people access it. The pressure on the system in those two cases is obviously very different. If, architecturally, we can improve the system's service capacity simply by adding machines, then we say the architecture scales well. So let us consider the performance and scalability of the traditional architecture and the CQRS architecture.

When it comes to performance, people generally ask where a system's performance bottleneck lies. Once the bottleneck is resolved, the system can achieve scalability through horizontal expansion (leaving aside, for now, the horizontal scaling of data storage). So we only need to analyze where the bottlenecks of the traditional architecture and the CQRS architecture are.

In traditional architectures, the bottleneck is usually the underlying database. The usual approach is: for reads, a cache solves most query problems; for writes, there are many options, such as sharding (sub-database and sub-table) or NoSQL. Alibaba, for example, uses sharding extensively, and in the future will presumably replace the sharding scheme with OceanBase. Through sharding, where one database server might have had to absorb 100,000 concurrent writes at peak, spreading the data across ten database servers means each machine only has to absorb 10,000 writes, which is much easier than absorbing 100,000. So it should be said that data storage is no longer a bottleneck for the traditional architecture.
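A deterministic routing function is the heart of such a sub-database scheme; here is a minimal sketch (the shard count and key format are assumptions, and real schemes also split tables within each database):

```python
import zlib

NUM_SHARDS = 10   # assumed: ten database servers

def shard_for(key: str) -> int:
    # Deterministic routing: the same key always maps to the same database.
    return zlib.crc32(key.encode()) % NUM_SHARDS

# 100,000 peak writes spread across ten servers -> roughly 10,000 each:
writes_per_shard = [0] * NUM_SHARDS
for i in range(100_000):
    writes_per_shard[shard_for(f"order-{i}")] += 1
```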

A data modification in the traditional architecture takes three steps: 1) fetch the data from the DB into memory; 2) modify the data in memory; 3) write the data back to the DB. In total, two database I/Os are involved.
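Those steps can be sketched in a few lines (a dict stands in for the database; each read or write of it represents one database I/O):

```python
db = {"account:1": {"balance": 100}}   # stand-in for a relational table

def update_balance(key: str, delta: int) -> None:
    row = dict(db[key])        # IO 1: fetch the data from the DB into memory
    row["balance"] += delta    # modify the data in memory (no IO)
    db[key] = row              # IO 2: write the data back to the DB

update_balance("account:1", -30)
```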

With the CQRS architecture, the total time spent across the C and Q sides is certainly more than in the traditional architecture, because CQRS involves at most three database I/Os: 1) persist the Command; 2) persist the events; 3) update the read database according to the events. Why "at most"? Because persisting the Command is not always necessary. The purpose of persisting Commands in the CQRS architecture is idempotent handling: we want to prevent the same Command from being processed twice. In which scenario is persisting the Command unnecessary? When the Command creates an aggregate root: the event generated by creating an aggregate root always has version number 1, so when persisting the event we can detect this kind of duplicate from the event version number alone.
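Here is a sketch of that duplicate-detection idea, keyed on (aggregate id, version) in the event store (a dict stands in for the store; a real implementation would rely on a unique-key constraint in the database):

```python
event_store: dict = {}   # unique key: (aggregate_id, version)

def persist_event(aggregate_id: str, version: int, evt: dict) -> bool:
    """Append an event; return False if this (id, version) already exists.
    A creation event always has version 1, so a duplicated 'create'
    Command is caught here without the Command itself being persisted."""
    key = (aggregate_id, version)
    if key in event_store:
        return False           # duplicate -> idempotent no-op
    event_store[key] = evt
    return True

first = persist_event("note-1", 1, {"type": "NoteCreated"})
retry = persist_event("note-1", 1, {"type": "NoteCreated"})  # same Command redelivered
```

For Commands that modify an existing aggregate, the expected version is not fixed at 1, which is why those Commands are persisted for idempotency instead.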

Therefore, we say that if you want to use the CQRS architecture, you must accept the eventual consistency of the CQ data, because if you measure completion by the read library being updated, the time taken for that business operation is likely to be longer than in the traditional architecture. If, however, we measure completion by the C side finishing its processing, the CQRS architecture may be faster, because the C side may need only one database I/O. I think this point matters: with the CQRS architecture, we care most about how long the C side takes to complete; it does not matter if the Q side is slightly slower, because the Q side only serves queries (eventual consistency). When we choose the CQRS architecture, we must accept the drawback that the Q-side data lags a little; otherwise, this architecture should not be used. I hope everyone fully recognizes this when selecting an architecture for their business scenario.

So, in general, both architectures can overcome their performance bottlenecks, and once the bottleneck is overcome, scalability is no longer a problem (I am not considering unavailability caused by data loss here; that problem faces every architecture, and the only solution is data redundancy, which I will not expand on). Both architectures bottleneck on data persistence, but the traditional architecture, since most systems store data in a relational database, can only use the sharding scheme. In the CQRS architecture, if we focus on the C side's bottleneck, the things to be saved there are very simple: Commands and events. If you trust a mature NoSQL store (I think a document database such as MongoDB is well suited to storing Commands and events) and you have the ability and experience to operate it, you can use NoSQL for persistence. If NoSQL feels unreliable or out of your control, you can use a relational database, but then you must do the work of sharding the Command and event storage yourself, because the data volume of Commands and events is very large. Some cloud services, however, such as Alibaba Cloud with DRDS, directly provide a database storage solution that supports sharding, which greatly reduces that cost. Personally, I think I would still use the sharding scheme, for a simple reason: it is mature, reliable, and controllable. So, through this comparison, we see that the traditional architecture must use sharding (unless, like Alibaba, you can use OceanBase), while the CQRS architecture gives us more choices.
Because persisting Commands and events is very simple, and they are all immutable, read-only data, they are friendly to key-value storage, and document-oriented NoSQL is also a fine choice; the C side only ever appends new data and never modifies or deletes anything. Finally, regarding the Q side's bottleneck: if your Q side also uses a relational database, it is the same as the traditional architecture, and you can optimize it however you like; but since the CQRS architecture lets you implement the Q side with other storage, there are comparatively more ways to optimize.

Concluding Remarks

I think both the traditional architecture and the CQRS architecture are good architectures. The traditional architecture has a low barrier to entry and many people who understand it, and since most projects do not involve large volumes of concurrent writes or data, it serves most projects perfectly well. However, as the analysis in this article shows, the traditional architecture is indeed weaker than the CQRS architecture in its answers to extensibility, availability, and performance bottlenecks. So, if your application scenario involves highly concurrent writes, highly concurrent reads, and big data, and you want to do better in extensibility, availability, performance, and scalability, I think the CQRS architecture is worth trying. But there is still a problem: the barrier to entry of the CQRS architecture is very high, and I think it is difficult to use well without the support of a mature framework. As far as I know, there are not many mature CQRS frameworks in the industry: the Java platform has the Axon Framework and the Jdon framework; on the .NET platform, the ENode framework is working hard in this direction. I suspect this is one reason there are so few mature cases of the CQRS architecture. Another reason is that using the CQRS architecture requires developers to have some understanding of DDD, without which it is hard to practice, and DDD itself takes years of study before it can be applied well. A further reason is that the core of the CQRS architecture depends heavily on high-performance distributed message middleware, so selecting one is also a hurdle (the Java platform has RocketMQ; for the .NET platform I personally developed a distributed message queue, EQueue). If you have other opinions, feedback is welcome; exchange brings progress.
In addition, without the support of a mature CQRS framework, coding becomes very complex: Event Sourcing, message retries, idempotent message handling, ordered event processing, and concurrency control are not easy problems to solve. But with framework support, where the framework solves these purely technical problems for us, developers only need to focus on how to model, implement the domain model, update the read library, and implement queries. Then developing with the CQRS architecture may even be simpler than with the traditional architecture, while reaping its many benefits.
