Distributed Architecture Principles and Practice Reading Notes

Changes in IT software architecture: from monolithic architecture, to cluster architecture, to the current distributed and microservice architecture.

Distributed architecture has the characteristics of distribution, autonomy, parallelism, and globality.

In order to cope with the high concurrency of requests and the complexity of the business, application services need to be reasonably split from the original large and centralized to small and decentralized;

In order for these distributed services to complete computing tasks together, it is necessary to solve the communication and collaboration problems between them;

Like services, the database responsible for storage will also be dispersed, so distributed storage needs to be considered;

If all services and databases require hardware resources as support, then resource management and scheduling are also essential;

In addition, after the software system is launched, key indicators need to be monitored.

High concurrency, high availability, scalability, extensibility, and sufficient security have always been the goals pursued by architecture design.

The evolution of the architecture

1. Application and data integration model

Insert image description here

2. Application and data separation mode

As the business develops, the number of users and requests gradually increases, and the single server starts to run into performance problems. A simple solution is to add resources and deploy the business application and the data storage separately.

Insert image description here

3. Introduction of caching technology

With the development of information systems and the growth of Internet users, business volume, user volume, and data volume are all increasing. At the same time, we find that certain data, such as news articles, product information, and other hot content, is requested far more often than the rest. In the previous model this information could only be obtained from the database, so access was limited by the database's I/O performance; over time the database became the bottleneck of the whole system, and even adding more servers could hardly solve the problem. This is where caching technology comes in.

Caching technology is divided into client browser caching, application server local caching and cache server caching.

● Client browser cache:

If the responses to HTTP requests are cached in the browser, repeated requests no longer have to reach the application server, which greatly reduces its pressure.

● Application server local cache:

This type of cache is an in-process cache, also called an in-heap cache. Taking Java as an example, this part of the cache is stored on the JVM heap and is therefore affected by the garbage collection algorithm. Because it lives in memory and responds very quickly, it is often used to hold hot data. When the in-process cache misses, the information is fetched from the cache server; if that also misses, it is fetched from the database.
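To make the lookup order above concrete, here is a minimal sketch (not from the book) of the read path: in-process cache first, then the cache server, then the database. The `remoteCache` and `database` lookups are passed in as plain functions and are purely illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Two-level read path: in-process cache -> cache server -> database. */
public class TwoLevelCache {

    // In-process (on-heap) cache; lives inside the JVM and is subject to GC.
    private final Map<String, String> localCache = new ConcurrentHashMap<>();

    private final Function<String, String> remoteCache; // e.g. a Redis GET (assumed)
    private final Function<String, String> database;    // e.g. a SQL lookup (assumed)

    public TwoLevelCache(Function<String, String> remoteCache,
                         Function<String, String> database) {
        this.remoteCache = remoteCache;
        this.database = database;
    }

    public String get(String key) {
        // 1. In-process cache: fastest, but limited by heap size.
        String value = localCache.get(key);
        if (value != null) return value;

        // 2. Cache server (out-of-process): slower than local memory, faster than disk.
        value = remoteCache.apply(key);
        if (value == null) {
            // 3. Database: the source of truth, hit only on a double miss.
            value = database.apply(key);
        }
        if (value != null) localCache.put(key, value); // backfill the local cache
        return value;
    }
}
```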

● Cache server cache:

Compared with the local cache of the application server, this cache is an out-of-process cache, which can be deployed on the same server as the application service or on a different server. Generally speaking, in order to facilitate management and rational utilization of resources, it will be deployed on a dedicated cache server. Since the cache takes up memory space, this type of server is often configured with relatively large memory.

Insert image description here

After adding caching technology, system performance has been improved. This is because the cache is in memory, and memory can be read much faster than disk and can respond to user requests very quickly. Especially for some hot data, the advantages are particularly obvious. At the same time, there has also been a significant improvement in availability. Even if the data server fails for a short period of time, the hotspot data or core data saved in the cache server can still satisfy users' temporary access.

4. Server cluster: handling concurrency

But as the number of user requests increases, another problem arises, that is, concurrency.

Let's look at the two characters of the Chinese word for concurrency, 并发: 并 can be understood as "together, at the same time", and 发 as "to issue", that is, to send a request.

Put together, concurrency means multiple users sending requests to the application server at the same time.

If the original system only faced large amounts of data, then it now needs to face simultaneous requests from multiple users.

To put it simply, a server cluster means grouping multiple servers together and using more machines to share the load of a single server, improving performance and availability. Put more plainly, it increases the number of requests the service can handle per unit of time. Where one server used to handle requests from many users, a group of servers now does, just like a bank counter serving more customers by adding more tellers.

Insert image description here

At this point, attention must be paid to the balancing algorithm used by the load balancer (such as round robin and weighted round robin), so that user requests are distributed evenly across the servers, all requests belonging to the same session are handled by the same server, and traffic can be adjusted dynamically according to the strengths and weaknesses of each server's resources.
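As an illustration of the round-robin and weighted round-robin algorithms mentioned above, here is a minimal weighted round-robin selector; the addresses and weights are made up, and a production balancer (for example Nginx's smooth weighted round robin) would interleave the choices more evenly.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal weighted round robin: servers with higher weight receive more requests. */
public class WeightedRoundRobin {

    private final List<String> expanded = new ArrayList<>();
    private int index = 0;

    /** Each server appears in the ring as many times as its weight. */
    public void addServer(String address, int weight) {
        for (int i = 0; i < weight; i++) expanded.add(address);
    }

    public synchronized String next() {
        String server = expanded.get(index);
        index = (index + 1) % expanded.size();
        return server;
    }

    public static void main(String[] args) {
        WeightedRoundRobin lb = new WeightedRoundRobin();
        lb.addServer("10.0.0.1:8080", 3); // stronger machine, three shares of traffic
        lb.addServer("10.0.0.2:8080", 1);
        for (int i = 0; i < 8; i++) System.out.println(lb.next());
    }
}
```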

5. Separation of database reading and writing

Adding a cache solves the problem of reading some hot data, but cache capacity is limited after all, and non-hot data still has to be read from the database. Database performance also differs between writes and reads: writing data takes row locks or table locks, so concurrent writes end up queuing, while reads are not only faster than writes but can also be accelerated through indexing, database caching, and other means. This is why the read-write separation of the database was introduced:

A master-slave setup is used: the master database handles writes and synchronizes the updated data to the slave databases via the binlog. The application server only accesses the master when writing data and only accesses the slaves when reading data.
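A minimal sketch of how an application might route connections under read-write separation: all writes go to the master, reads are spread over the slaves. The JDBC URLs are placeholders, and middleware such as Sharding-JDBC can take over this routing, replication-lag handling, and failover in practice.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Routes writes to the master and reads to a slave, mirroring the binlog-based setup above. */
public class ReadWriteRouter {

    private final String masterUrl;       // receives all writes
    private final List<String> slaveUrls; // receive reads, kept in sync via binlog
    private final AtomicInteger counter = new AtomicInteger();

    public ReadWriteRouter(String masterUrl, List<String> slaveUrls) {
        this.masterUrl = masterUrl;
        this.slaveUrls = slaveUrls;
    }

    public Connection writeConnection() throws SQLException {
        return DriverManager.getConnection(masterUrl);
    }

    public Connection readConnection() throws SQLException {
        // Simple round robin over the slaves; a real router would also check replication lag.
        String url = slaveUrls.get(Math.abs(counter.getAndIncrement()) % slaveUrls.size());
        return DriverManager.getConnection(url);
    }
}
```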

Insert image description here

While reaping the benefits of database read-write separation, the architecture design also needs to consider reliability. For example, if the master database goes down, how does a slave take over its work? After the original master recovers, should it become a slave or resume as the master? And how do master and slave keep their data in sync?

6. Reverse proxy and CDN

As the Internet gradually becomes more popular, people have higher and higher requirements for network security and user experience.

In the past, users directly accessed the application server to obtain services through the client, which exposed the application server to the Internet and made it vulnerable to attacks.

If a reverse proxy server is added between the application server and the Internet, the proxy server receives the user's request and forwards it to the application server on the intranet. It acts as a buffer between the external network and the intranet, which solves the problem above.

The reverse proxy server only forwards requests and does not run any applications, so when it is attacked, the application servers on the intranet are not affected. This effectively shields the application servers and improves security. At the same time, the reverse proxy server also adapts protocols and network speeds between the Internet and the intranet.

Insert image description here

CDN stands for Content Delivery Network, that is, a content distribution network. If you imagine the Internet as a large net, then every server and every client is a node in it. The distance between nodes varies, and a user request hops from node to node until it finally reaches the application server and obtains the information. The fewer the hops, the faster the information is obtained, so the information can be stored on nodes closer to the client; the client then needs fewer hops to reach it.

Since this kind of information is not updated frequently, the CDN is best suited to static data such as JavaScript files, static HTML, and image files.

In this way, the client can obtain resources from the network node closest to itself, which greatly improves user experience and transmission efficiency.

Insert image description here

The addition of CDN significantly speeds up users' access to the application server, and at the same time reduces the pressure on the application server. Requests that originally had to directly access the application server now do not need to go through layers of networks. Resources can be obtained as long as the nearest network node is found. However, from the perspective of requesting resources, this method also has limitations, that is, it only works on static resources, and requires regular resource updates to the CDN server. The addition of reverse proxy and CDN solves the problems of security, availability and high performance.

7. Distributed database and sub-database and sub-table

As the system running time increases, more and more data accumulates in the database. At the same time, the system also records some process data, such as operational data and log data, which will also increase the burden on the database. Even if the database is equipped with indexes and caches, it will still be stretched when querying massive data.

If read-write separation is to allocate database resources from the read-write level, then distributed databases need to allocate databases from the business and data levels.

● For data tables, when the table contains too many records, it can be divided into multiple tables for storage.

● For databases, each database has an upper limit on the number of connections and the size of its connection pool. To improve the efficiency of data access, the database is split according to business needs so that different businesses access different databases; different data belonging to the same business can also be stored in different databases.

**If the database resources are placed on different database servers, this is a distributed database design.** Since the data is stored in different tables/databases, or even on different servers, the code for database operations becomes more complex. At this point, database middleware can be introduced to handle data synchronization and routing, smoothing over the differences between the different storage carriers.
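The following toy example shows the kind of routing rule a sharding middleware applies: a sharding key (here the order id) decides which database and which table a row lives in. The 2-database x 4-table layout and the naming scheme are assumptions for illustration only.

```java
/** Sketch of hash-based sharding: the order id decides which database and table to use. */
public class ShardRouter {

    private static final int DB_COUNT = 2;       // illustrative numbers
    private static final int TABLES_PER_DB = 4;

    /** e.g. orderId 1234 -> "db_0.t_order_2" (the exact layout is an assumption, not from the book). */
    public static String route(long orderId) {
        long hash = Long.hashCode(orderId) & Integer.MAX_VALUE; // non-negative hash
        long dbIndex = hash % DB_COUNT;                          // which database
        long tableIndex = (hash / DB_COUNT) % TABLES_PER_DB;     // which table inside it
        return "db_" + dbIndex + ".t_order_" + tableIndex;
    }

    public static void main(String[] args) {
        System.out.println(route(1234L));
        System.out.println(route(98765L));
    }
}
```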

Insert image description here

The distributed design of the database and the sub-tables and sub-databases will improve the performance of the system and also increase the difficulty of database management and access. It used to be that you only needed to access one table and one library to obtain data, but now you need to span multiple tables and multiple libraries.

From a software programming perspective, there are database middlewares that provide best practices, such as MyCat and Sharding JDBC.

Additionally, from a database server management perspective, server availability needs to be monitored.

From the perspective of data governance, issues of data expansion and data governance need to be considered.

8. Business split

From the previous stages we can see that system improvement basically relies on trading space for time, using more resources and space to handle more user requests. As business complexity and concurrency kept growing, some large companies began to split their business application systems and deploy the parts separately.

If the previous server cluster mode is to copy the same application to different servers, then business splitting is to split an application into multiple and deploy them to different servers. In addition, there are also plans to horizontally expand core applications and deploy them to multiple servers.

Although the applications have been split, they are still related, and there are problems with calling, communicating and coordinating between them.

This introduces middleware such as queues, service registration discovery, and message centers. These middleware can assist the system in managing applications distributed to different servers and network nodes.

Insert image description here

After the business is split, application services will be formed one by one, including business-based services, such as commodity services and order services, as well as basic services, such as message push and permission verification. These application services, together with the database server, are distributed in different containers, servers, and network nodes. The communication, coordination, management, and monitoring between them are all problems we need to solve.

9. Distributed and microservices

The microservice architecture cuts business applications into even finer pieces, turning them into smaller business modules and achieving high cohesion and low coupling between modules. Each module can exist independently and be maintained by an independent team.

Each module can adopt its own unique technology without caring about the technical implementation of other modules.

Modules run in containers, and calls between modules are made through interfaces and protocols. Any module can be exposed for other modules to call, and hotspot modules can be scaled horizontally to improve the overall performance of the system. In this way, when one module has a problem, other identical modules can take over its work, which enhances availability.

To sum up, microservices have the following characteristics: refined business splitting, autonomy, technical heterogeneity, high performance, and high availability. It is very similar to a distributed architecture. From a conceptual understanding, both have performed the "dismantling" action, but there are differences in the following aspects:

The difference between microservice architecture and distributed architecture
1. Different purposes of splitting

The distributed design was proposed to solve the problem of the limited resources of a single application: one server cannot support more user access, so an application is split into different parts and deployed to different servers to share the pressure of high concurrency.

Microservices refine service components with the purpose of better decoupling and allowing services to achieve high performance, high availability, scalability, and extensibility through combination.

2. Different splitting methods

The distributed service architecture splits the system according to business and technical classifications, with the purpose of allowing the split services to load the original single service business.

Microservices are more detailed on the basis of distribution. It splits services into smaller modules, which are not only more professional, but also have a more refined division of labor, and each small module can run independently.

3. Different deployment methods

After the distributed architecture splits the service, the split parts are usually deployed on different servers.

Microservices can not only deploy different service modules to different servers, but also deploy multiple microservices or multiple backups of the same microservice on one server, and often use containers for deployment.

Insert image description here

The difference between distributed and microservices

Although distributed and microservices have the above differences, from a practical point of view, they are both built based on the idea of distributed architecture.

It can be said that microservices are an evolved version of distributed and a subset of distributed.

A simple example of distributed architecture

Insert image description here

Order business architecture diagram under distributed architecture

This order business structure is divided into four layers:

Client, load balancer (can be called access layer), application server (can be called application layer), data server (can be called storage layer)

Client: The interface between the user and the system, where users browse products, place orders, etc.

Access layer: The load balancer can route user requests to different server clusters by user IP. In addition, operations such as flow control and identity authentication can be performed.

Application layer: used to deploy main application services, such as commodity services, order services, payment services, inventory services and notification services.

Storage layer: data reading and writing, primary and backup databases.

Distributed storage can be used at this layer. In an e-commerce system, the volume of product data is relatively large, so to improve access efficiency the data is usually sharded; after splitting, the product tables are distributed across different databases or servers.

Insert image description here

Read-write separation and master-slave replication

Characteristics of distributed architecture

Distribution:

Looking at the two characters of the word 分布 ("distribution") separately: 分 means splitting, which can be understood as the splitting of services, of stored data, and of hardware resources.

布 means deployment, that is, the deployment of resources, both computing resources and storage resources.

Simply put, distribution means splitting things apart and deploying them separately.

Autonomy:

Distribution leads to autonomy. Simply put, autonomy means that each application service has the ability to manage and control its own tasks and resources.

Parallelism/Concurrency:

Autonomy results in each application service being an independent individual, with independent technology and business, and occupying independent physical resources. This independence can reduce the coupling between services, enhance the scalability of the architecture, and lay the foundation for parallelism.

Global:

Distribution enables services and resources to be deployed separately. Autonomy means that a single service has its own business and resources, and multiple services complete large tasks in parallel.

When multiple service applications distributed on different network nodes jointly complete a task, global considerations are required.

To put it bluntly, if dispersed resources want to accomplish a big thing together, they need communication and collaboration, that is, having an overall view.

Distributed architecture issues

The order of problems that distributed architecture needs to solve:

(1) Distribution replaces centralized services and resources with dispersed ones, so the first step is to split application services according to the business.

(2) Since services are distributed on different servers and network nodes, the problem of distributed calls must be solved.

(3) After services can perceive and call each other, they need to complete some tasks together. These tasks are either performed together or in sequence, so the distributed collaboration problem needs to be solved.

(4) When working together, you will encounter large-scale computing situations, and you need to consider using a variety of distributed computing algorithms to deal with it.

(5) The results of any service need to be saved, which requires consideration of storage issues. Like services, the distribution of storage can also improve storage performance and availability, so distributed storage issues need to be considered.

(6) All services and storage can be regarded as resources, so distributed resource management and scheduling need to be considered.

(7) The distributed architecture is designed to achieve high performance and availability. To achieve this goal, let's take a look at the best practices for high performance and availability, such as caching applications, request throttling, service degradation, etc.

(8) Finally, after the system goes online, effective monitoring of performance indicators is required to ensure stable operation of the system. At this time, indicators and monitoring are the issues we need to pay attention to.

Insert image description here

Structural diagram of the problems that distributed architecture needs to solve

1. Split application services

The implementation of technology comes from business, so the analysis of business needs to be put first. We can use the DDD (Domain-Driven Design) method to define the domain model, determine the boundaries of business and application services, and ultimately guide the implementation of the technology. Application services designed according to the DDD method meet the standards of "high cohesion and low coupling".

DDD is not an architecture, but an architecture design methodology. It transforms business into domain models through boundary division. The domain model in turn forms the boundaries of application services and assists in the implementation of the architecture.

DDD is a design approach that focuses on complex domains. It builds domain models around business concepts, separates out complex business, and then maps the separated business onto code.

The main topics include:

● Model structure of domain-driven design: including the introduction of domain, domain classification, subdomain, domain events, aggregation, aggregate root, entity and value object.

● Analyze business requirements to form application services: including business scenario analysis, abstract domain objects, and delimited context.

● Domain-driven design layered architecture: including layering principles, content and characteristics of each layer, and layering examples.

A. Business splitting practice based on DDD thinking

Three steps are required to complete the split of the entire application service, namely analysis, extraction and construction.

Any software architecture is established to complete business requirements, and business requirements are used to achieve business goals. Here we take building a student course selection system as an example to explain the entire process of service analysis and extraction. The business background of the student course selection system is as follows.

● Students can choose elective courses through the system and submit an application for elective course selection, which will then be reviewed by the Academic Affairs Office.

● After teachers from the Academic Affairs Office receive the application for elective courses, they will check it according to the approval rules, and finally produce the approval result: pass or fail.

● Students who are qualified to take elective courses need to sign in when they go to class. The teacher will check the sign-in status and generate sign-in details at the end of the course. At the same time, students can also check their check-in status.

Basically it can be summarized as follows: students apply for elective courses, the Academic Affairs Office approves elective courses, students sign in and the teacher checks the sign-in record.

1. Split ideas

(1) Create business processes according to different business scenarios, and mark participants, commands and event information on the nodes of each business process.

(2) Generate domain objects from the annotated participant, command, and event information, including entities, value objects, aggregations, domain events, etc. Domain experts and the technical team use a common language to further group related domain objects, form aggregations, and find the aggregate roots.

(3) Delineate bounded contexts through the aggregations. Here you need to rely on the common language, because the same thing may have different content and meaning in different bounded contexts. The bounded context is the boundary of a service, and services or applications are created on this basis.

2. Split process

Through the above description of business needs, we can divide the business needs into three scenarios, namely the application for elective courses scenario, the elective course approval scenario and the elective course sign-in scenario. Next, we draw the business process diagrams corresponding to these three scenarios, and mark the participants, commands, and event information.

Insert image description here

Application scenario for elective courses

Insert image description here

Scenario for approving elective courses

Insert image description here

Elective course sign-in scene
3. Extract domain objects and generate aggregations

By analyzing the business, the requirements are divided into participants, business processes, commands and events. Then they correspond to domain objects, and the relationships between domain objects are generated.

The purpose of extraction is to observe the correlations and commonalities between domain objects, and finally aggregate them and divide them into bounded contexts.

Use different shapes to represent the domain objects in the three scenarios: circles represent entities, rectangles represent commands, and pentagons represent events.

Note that here we only roughly divide domain objects and do not subdivide them, because the purpose is to divide the boundaries of services and aggregations.

Insert image description here

Extract domain objects from business processes

By extracting domain objects, you can see:

1. The elective course application entity exists in both the elective course application scenario and the elective course approval scenario, with the same meaning.

Similarly, the approval rule entity and the login command also appear in both the elective course application scenario and the elective course approval scenario.

2. The sign-in detail entity exists only in the elective course sign-in scenario, while the student and teacher entities appear in all three scenarios and are therefore universal entities.

Although the elective course entity exists in the three scenarios, the elective courses in the elective course application scenario and the elective course approval scenario describe the course itself, including course content and credits;

The elective courses in the elective course sign-in scenario are more concerned with information such as the time for starting and ending classes, and the location of classes. This is caused by contextual inconsistency, where the meaning of the same thing in different contexts deviates.

Aggregation is a logical boundary that provides the basis for the division of bounded contexts. To generate an aggregate, you first need to consider the logical independence of the aggregate, that is, whether a complete business logic can be completed within the aggregate.

For the aforementioned elective course application scenarios, elective course approval scenarios, and elective course sign-in scenarios, three aggregations can of course be generated. However, considering that the first two scenarios are both completing the business process of application approval, they can be combined into one aggregation.

Of course, if the business becomes more complicated after being merged together, it can also be split again.

At the same time, the elective course sign-in scenario can generate an aggregation by itself, in which the student and teacher entities belong to the organizational relationship, which is relatively common. This concept should be used in other places in the system, so it can be extracted as a separate aggregation.

Insert image description here

From extracting domain objects to generating aggregates
4. Delineate bounded context

The aggregations that have been generated are used to delineate bounded contexts, which become the boundaries of the services.

If the aggregation is the logical boundary of the service, then the bounded context is the physical boundary of the service.

From the perspective of business completion, elective course application aggregation and sign-in aggregation belong to different semantic environments.

Divide the three aggregations into three bounded contexts: elective course application, sign-in, and personnel organization. The personnel organization context can serve as a generic domain that assists the other two subdomains.

The elective application serves as a separate bounded context that hosts most entities, commands, and events, and can be considered a core domain.

Check-in can be used as a supporting domain to support the core domain.

Insert image description here

Bounded context based on aggregation

The three bounded contexts shown above can be implemented correspondingly by three application services, namely elective course application service, sign-in service, and personnel organization service.

This also reflects the concept of splitting distributed application services that we want to express.

Of course, this division is not the only option. For example, the application for elective courses itself is an aggregation, and this aggregation can be further split into two aggregations: application and approval.

As the business develops and changes, new bounded contexts may also be derived. These require constant iteration using the idea of domain-driven design.

Communication between bounded contexts can be done through domain events.

B. Domain-driven design layering theory and its corresponding project code structure

Domain-driven design layering can help us transform domain objects into software architecture.

Layering is the most commonly used method when decomposing complex software systems.

In the idea of domain-driven design, layering represents the software framework and is the "skeleton" of the entire distributed architecture;

Domain objects are the mapping of business in software, like "flesh and blood".

Layering not only allows us to view software design from a higher position, but also brings advantages such as high cohesion, low coupling, scalability, and reusability to the entire architecture.

1. How to layer

Architecture layering looks like dividing and stacking each layer according to its function. However, it is necessary to consider clearly the responsibilities of each layer and the dependencies between layers when implementing it.

Some architectures are divided into three layers, others into four or five.

Depending on the business situation, technical background, and team structure, the layering will also be different. Here, the hierarchical approach of domain-driven design is used to provide hierarchical ideas for distributed architecture.

Insert image description here

Traditional four-layer architecture of domain-driven design

From top to bottom are the user interface layer, application layer, domain layer, and base layer. Arrows represent the dependency relationships between layers; for example, an arrow pointing from the user interface layer to the application layer indicates that the user interface layer depends on the application layer. As can be seen from the figure, the base layer is depended on by all the other layers and sits at the core.

However, this division conflicts with the idea that the business should lead the technology. When building a distributed architecture, one must first understand the business, then break the business down, and finally map the business onto the software architecture. From this point of view, the domain layer is the core of the architecture, so the dependencies of the four-layer architecture in the figure above are problematic.

So DIP (the Dependency Inversion Principle) comes in. DIP states that high-level modules should not depend on low-level modules; both should depend on abstractions. Abstractions should not depend on details; details should depend on abstractions. Therefore the base layer, as the bottom layer, should depend on the interfaces provided by the user interface layer, application layer, and domain layer. The higher layers are developed around the business and produce interfaces by abstracting the business; the bottom layer depends on these interfaces and provides services to the higher layers through them.

Insert image description here

Redefine the four-layer architecture of domain-driven design using dependency inversion
1. User interface layer

The user interface layer, also called the presentation layer, includes three parts: the user interface, Web services, and remote calls. This layer is responsible for displaying information to users and interpreting user instructions. Its main responsibility is to interact with external users and systems, receive feedback, and display data.

2. Application layer

The application layer is relatively simple and does not contain business logic. It is used to coordinate the tasks and work of the domain layer. The application layer is responsible for organizing the entire application process and is designed for use cases.

Usually, application services run at the application layer and are responsible for service composition, service orchestration and service forwarding, assembling business execution sequences and assembling results.

It cannot be said that the application layer has nothing to do with the business. It simply combines the business in a coarse-grained way.

Specific functions include information security authentication, user permission verification, transaction control, message sending and message subscription, etc.

3. Domain layer

The domain layer implements the core business logic of application services and ensures the correctness of the business. This layer reflects the business capabilities of the system and is used to express business concepts, business status and business rules.

The domain layer contains domain objects in domain-driven design, such as aggregates, aggregate roots, entities, value objects, and domain services. The business logic of the domain model is implemented by entities and domain services.

Domain services describe the process of business operations and can convert domain objects, process multiple domain objects, and produce a result.

The difference between domain services and application services is that domain services are more closely tied to the business.

4. Base layer

The base layer provides common technologies and basic services for the other three layers, including data persistence, tools, message middleware, cache, etc.

For example, the database access implemented in the base layer is written against interfaces defined by the domain layer.

The domain layer only issues commands to the base layer based on the business, telling it the data specifications that need to be provided (data specifications include user name, ID card, gender, age, etc.). The base layer is responsible for obtaining the corresponding data and passing it to the domain layer. Specifically how to obtain data and where to obtain data from, these issues all need to be considered by the basic layer, and the domain layer does not care.

The domain layer always programs against the same abstract interface, that is, the data specification. When the database implementation is changed, for example from an Oracle database to a MySQL database, the base layer only needs to modify how it obtains the data; the domain layer still acquires data according to the same data specification and is not affected at all.
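A small sketch of the dependency inversion described above: the domain layer owns the interface and the "data specification", while the base layer provides the concrete implementation. All names here (UserRepository, UserRecord, MySqlUserRepository) are illustrative and not taken from the book.

```java
// Domain layer: states WHAT data it needs, as an abstract specification.
interface UserRepository {
    UserRecord findById(String id);
}

/** The "data specification" mentioned above: name, id card, gender, age (Java 16+ record). */
record UserRecord(String name, String idCard, String gender, int age) {}

// Base (infrastructure) layer: decides HOW and WHERE the data is fetched.
// Switching Oracle -> MySQL only means swapping this class; the domain layer is untouched.
class MySqlUserRepository implements UserRepository {
    @Override
    public UserRecord findById(String id) {
        // Illustrative only: a real implementation would run SQL against MySQL here.
        return new UserRecord("Alice", "110101...", "F", 30);
    }
}
```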

2. Hierarchical structure diagram

Insert image description here

Domain-driven design hierarchical structure diagram

Looking at the diagram from top to bottom:

First comes the user interface layer, including the user interface, Web services, and information communication functions; it is the entrance to the system. Below the user interface layer is the application layer, which mainly contains application services but no concrete business logic; it is only responsible for combining, orchestrating, and forwarding the domain services of the domain layer.

Below the application layer is the domain layer. This layer includes domain objects such as aggregations, entities, and value objects, and is responsible for completing the main business logic of the system. Domain services are responsible for operating one or more domain objects to complete business logic that needs to span domain objects.

Below and to the right of the user interface layer, application layer, and domain layer is the base layer. As its name suggests, this layer provides basic services for the other three layers, including the API gateway, message middleware, databases, caches, basic services, general tools, and so on. Besides providing basic services, the base layer also serves to decouple common technologies.

3. Hierarchical calls within services and calls between services

Implementing the layered idea into distributed architecture or microservice architecture, each split application or service includes a user interface layer, application layer, and domain layer. So how are calls completed within services and between services? You can see the picture below:

Insert image description here

Calls within and between services
4. Map layers to code structure

Code structure is the mapping of hierarchical structure in the code implementation dimension. Good hierarchical design helps design code structure, and good code structure design makes it easier for people to have a clear understanding of the overall software architecture.

1. Code structure of user interface layer

Insert image description here

Code structure of user interface layer

After the VO (View Object) from the presentation layer is passed to the user interface layer, it is first converted into a DTO by the Assembler and then passed down through the Facade.

● Assembler: plays the role of format conversion. The format of the data passed into the user interface layer and the data in the user interface layer may be different. For example, the presentation layer submits a form, which we call VO (View Object). After this VO is passed to the user interface layer, it needs to be converted by Assembler to form data in DTO format that the user interface layer can recognize.

● DTO (Data Transfer Object): It is the carrier of data transmission at the user interface layer. It does not contain business logic and is converted by Assembler. DTO can isolate the user interface layer from the outside world.

● Facade: Facade is the interface provided by services to external systems and the entrance for calling domain services. Facade provides a coarse-grained calling interface, usually does not contain business logic, and just transfers user requests to application services for processing. Generally, the Controller that provides API services is a Facade.
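A compact sketch of how VO, Assembler, DTO, and Facade fit together in the user interface layer. The class and field names are invented for the example; in a real project the Facade would typically be a REST controller delegating to the application layer.

```java
/** Presentation-layer view object (VO): exactly what the form submits. */
class OrderVO {
    String productId;
    int quantity;
}

/** DTO: the carrier the user interface layer passes down; contains no business logic. */
class OrderDTO {
    String productId;
    int quantity;
}

/** Assembler: pure format conversion between VO and DTO. */
class OrderAssembler {
    static OrderDTO toDTO(OrderVO vo) {
        OrderDTO dto = new OrderDTO();
        dto.productId = vo.productId;
        dto.quantity = vo.quantity;
        return dto;
    }
}

/** Facade: coarse-grained entry point; holds no business logic, only forwards requests. */
class OrderFacade {
    private final OrderApplicationService applicationService; // assumed application-layer service

    OrderFacade(OrderApplicationService applicationService) {
        this.applicationService = applicationService;
    }

    String createOrder(OrderVO vo) {
        return applicationService.createOrder(OrderAssembler.toDTO(vo));
    }
}

/** Minimal stand-in so the sketch is self-contained. */
class OrderApplicationService {
    String createOrder(OrderDTO dto) { return "order-id-placeholder"; }
}
```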

Insert image description here

User interface layer code structure
2. Application layer code structure

Insert image description here

Application layer code structure

The incoming messages from the user interface layer are first converted into Commands and then handed over to the Application Service for processing.

Application Service is responsible for connecting the domain layer, calling domain objects such as domain services and aggregation (root), and arranging and assembling business logic. At the same time, Application Service also assists the domain layer in subscribing and publishing Events.

● Command: Command, which can be understood as an operation performed by the user, such as placing an order, paying, etc., is an incoming parameter of the application service.

● Application Service: Application service will call and encapsulate the Aggregate, Service, Entity, Repository, and Factory of the domain layer. It mainly realizes combination and orchestration. It does not realize the business itself, but only combines the business.

● Event: mainly stores event-related code and is responsible for event subscription and publishing; the initiation of events and the response to them are handled at the domain layer. Taking a newspaper subscription as an analogy, the application layer's Event does the concrete work of subscribing to and delivering the newspaper, while producing the newspaper and reading it are completed by the domain layer's Event.

Insert image description here

Application layer code structure
3. Domain layer code structure

The code structure of the domain layer includes one or more Aggregates. Each Aggregate also includes Entity, Event, Repository, Service, Factory, etc. These domain models jointly complete the core business logic.

Insert image description here

Domain layer code structure

The application layer depends on Aggregate and Service in the domain layer.

Aggregate contains Entity and value objects.

Service combines domain objects to complete complex business logic.

Methods in Aggregate and actions in Service will generate Events.

The persistence and query of all domain objects are implemented by Repository.

● Aggregate: Aggregation. The root directory of an aggregate is usually represented by the name of an entity, such as orders and products. Since aggregation defines the logical boundaries within the service, the entities, value objects, and methods in the aggregation all revolve around a certain logical function. For example, order aggregation includes order item information, order placing methods, order modification methods, payment methods, etc. The main purpose is to achieve high cohesion of the business. Since a service is composed of multiple aggregates, the splitting and expansion of the service can be re-orchestrated based on the aggregates.

For example, when aggregate C in service 1 becomes a business bottleneck, it can be expanded to service 3. Or due to business reorganization, aggregate A can be migrated from service 1 to service 2.

Insert image description here

Aggregated code structure facilitates service expansion and reorganization

● Entity: entity, including business fields and business methods. Cross-entity business logic code can be placed in Service.

● Event: Domain event, including logic code related to business activities, such as order created and order paid. As a tool responsible for communication between aggregates, Event needs to implement the functions of sending and listening for events. It is recommended to store the code for listening events separately in the listener directory.

● Service: Domain services include services that need to be completed by one or more entities, or functions that need to be completed by calling external services. For example, the order system needs to call external payment services to complete payment operations. If the business logic of the Service is relatively complex, you can design a class for each Service separately. Wherever you need to call an external system, it is best to use a proxy class to achieve maximum decoupling.

● Repository: warehouse, its role is to persist objects. Operations on data are placed here, mainly reading and querying. Generally speaking, one Aggregate corresponds to one Repository.

Insert image description here

Domain layer code directory
4. Base layer

The code structure of the base layer mainly includes tools, algorithms, caches, gateways, and some basic general-purpose classes. The directory layout at this level is relatively flexible and depends on the specific situation; there are no hard rules here, only an example for reference. At the top is the infrastructure directory, and below it are the config and util folders, which store configuration-related and tool-related code respectively.

Insert image description here

Base layer code directory
5. Complete hierarchical structure and its code directory

Insert image description here

Hierarchical code structure diagram

Insert image description here

Layered code directory
C. Code layering example

Let's first introduce the business background. We need to implement a function for creating orders. Each order has multiple order items, each order item corresponds to a product, and each product has a price. The total price of the order can be calculated from the prices of its order items, and each order has a corresponding delivery address.

Insert image description here

Segment the order business from three dimensions: entities, events, and commands

Insert image description here

Order aggregation relationship diagram

Insert image description here

Relationship diagram of order aggregation at the domain level

Insert image description here

Relationship diagram of order aggregation at the application layer

Insert image description here

A complete view of the code directory of the order business

Below the userinterface folder is the content of the user interface layer. It is relatively simple here. It is a Web API controller that is responsible for providing external access interfaces. Since there is no object conversion, the assembler and dto folders are empty.

The infrastructure directory stores the content of the base layer. Since the aggregate root needs to be defined, the base class of the aggregate is stored in the aggregate directory. The event directory stores event-related basic classes. Similarly, the exception directory stores basic classes defined for exceptions, the jackson directory stores basic classes for serialization and deserialization, and the repository directory stores basic classes for data warehouses.

Insert image description here

First, the request is passed in from the user interface layer. Since it is an order creation operation, the CreateOrderCommand command will be passed into the OrderController class as a parameter.

After receiving the command, the user interface layer will call the OrderApplicationService in the application layer, and the createOrder method will call the OrderFactory and OrderRepository in the domain layer respectively.

The create method of OrderFactory can generate the aggregate root Order, and then call the create method in Order to generate the order.

Afterwards, Order will call the raiseEvent method to send OrderCreatedEvent to other services to notify other services that the order has been created.

createOrder will call the save method in OrderRepository, passing in the parameter Order, and save the Order to the database.

Code process for creating orders

Insert image description here

The application layer service calls the domain layer method to generate the order and save it

The service in the application layer is only responsible for generating the aggregate root Order and then saving it.

Insert image description here

Domain layer aggregate root creates order

In the aggregate root Order of the domain layer, the order is created through the create method, and the message is sent through raiseCreatedEvent after the order is generated.
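Since the original code screenshots are not reproduced here, the following is a hedged reconstruction of the flow just described. Only the names that appear in the text are used (CreateOrderCommand, OrderApplicationService, OrderFactory, Order, raiseCreatedEvent, OrderCreatedEvent, OrderRepository); the method bodies are guesses, not the book's actual code.

```java
import java.util.ArrayList;
import java.util.List;

/** Command passed in from the user interface layer. */
class CreateOrderCommand {
    final String orderId;
    CreateOrderCommand(String orderId) { this.orderId = orderId; }
}

/** Domain event raised when an order is created (name taken from the text). */
class OrderCreatedEvent {
    final String orderId;
    OrderCreatedEvent(String orderId) { this.orderId = orderId; }
}

/** Aggregate root: holds the business rule for creating an order and raises the event. */
class Order {
    private final String orderId;
    private final List<Object> pendingEvents = new ArrayList<>();

    Order(String orderId) { this.orderId = orderId; }

    /** Business logic for order creation lives here, inside the aggregate. */
    void create() {
        // ... validate items, compute the total price, etc. (omitted in this sketch)
        raiseCreatedEvent();
    }

    private void raiseCreatedEvent() {
        pendingEvents.add(new OrderCreatedEvent(orderId)); // published to other services later
    }
}

/** Factory: builds the aggregate root from the incoming command. */
class OrderFactory {
    Order create(CreateOrderCommand command) { return new Order(command.orderId); }
}

/** Repository: persists the aggregate; the implementation lives in the base layer. */
interface OrderRepository {
    void save(Order order);
}

/** Application layer service: orchestration only, no business rules of its own. */
class OrderApplicationService {
    private final OrderFactory factory;
    private final OrderRepository repository;

    OrderApplicationService(OrderFactory factory, OrderRepository repository) {
        this.factory = factory;
        this.repository = repository;
    }

    void createOrder(CreateOrderCommand command) {
        Order order = factory.create(command); // 1. build the aggregate root
        order.create();                        // 2. domain logic + OrderCreatedEvent
        repository.save(order);                // 3. persist the aggregate
    }
}
```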

D. Some additions to domain-driven design

Insert image description here

Division of Domains and Subdomains

Insert image description here

The two sides of the domain model: business (problem space) and technical (solution space)

If the concepts of domain and subdomain tell us how to define business boundaries from a business perspective, then how should we divide this boundary and how should we define business boundaries in terms of technology?

The answer is the bounded context (限界上下文). Breaking the term down: 限 means restriction, 界 means boundary, so 限界 is the limiting boundary; 上下文 is the context of a conversation. Take a product as an example: it is "raw materials and parts" in the production stage, a "commodity" in the sales stage, and "goods" in the logistics stage. The same thing takes on different meanings depending on the environment, and this environment is the context.

Insert image description here

2. Distributed calling

Once services and resources are dispersed, calling them is no longer so simple. Different user requests must be routed to the corresponding service module; for example, when a user places an order, the order service must be called. When a large number of users request the same service and multiple instances of it exist, requests need to be distributed evenly across them according to the resource distribution. For example, when a user browses products and there are several instances of the product service, which one should handle the request?

Regarding the calling problem, there are different ways to deal with it at different architectural levels:

Before user requests enter the application server through the Internet, they need to go through load balancing and reverse proxy;

API gateway calls are required between application servers on the intranet;

Services can call each other through service registration center, message queue, remote call, etc.

Therefore, distributed calls can be summarized into two parts,

The first part is to perceive the other party, including load balancing, API gateway, service registration and discovery, and message queue;

The second part is information transfer, including RPC, RMI, and NIO communication.

3. Distributed collaboration

Distributed collaboration, as the name suggests, means that everyone completes one thing together, and it is a big thing.

In the process of completing this important task, it is inevitable that you will encounter many problems.

For example, an inventory service responding to multiple requests at the same time will "deduct" stock for the same product. To guarantee exclusive access to a critical resource such as product inventory, the concept of a distributed lock is introduced so that multiple "deduct" requests are executed serially.
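A minimal sketch of the idea: every "deduct stock" request for the same product must first acquire the same lock key. A ConcurrentHashMap stands in for the shared lock store so the example runs in one JVM; in a real distributed deployment the same tryLock/unlock pair would be backed by Redis (SET key value NX EX ttl) or a ZooKeeper ephemeral node, as discussed below.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Serializes "deduct stock" requests for the same product behind a lock key. */
public class InventoryService {

    private final Map<String, String> lockStore = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, Integer> stock = new ConcurrentHashMap<>(Map.of("sku-1", 10));

    boolean tryLock(String key, String owner) {
        return lockStore.putIfAbsent(key, owner) == null; // atomic "set if absent"
    }

    void unlock(String key, String owner) {
        lockStore.remove(key, owner);                     // only the owner may release
    }

    /** Deducts one unit of stock; concurrent requests for the same sku are executed serially. */
    public boolean deduct(String sku, String requestId) {
        String lockKey = "lock:stock:" + sku;
        if (!tryLock(lockKey, requestId)) {
            return false;                                 // someone else holds the lock; retry later
        }
        try {
            int remaining = stock.getOrDefault(sku, 0);
            if (remaining <= 0) return false;             // oversell protection
            stock.put(sku, remaining - 1);
            return true;
        } finally {
            unlock(lockKey, requestId);
        }
    }
}
```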

For another example, when a user performs the "place an order" operation, "record order" (order service) and "deduct inventory" (inventory service) need to be processed in the same transaction: either both operations complete, or neither does.

For another example, after separating the reading and writing of the product table, a master-slave database is generated. When the master database fails, a new master database will be elected through distributed election to replace the work of the original master database. We summarize these issues into the following points.

● Characteristics and mutual exclusion issues of distributed systems: centralized mutual exclusion algorithm, permission-based mutual exclusion algorithm, and token ring mutual exclusion algorithm.

● Distributed locks: The origin and definition of distributed locks, cache implementing distributed locks, ZooKeeper implementing distributed locks, and segmented locking.

● Distributed transactions: Introduces the principles and solutions of distributed transactions. Including the principles of CAP, BASE, ACID, etc.; DTP model; 2PC, TCC scheme.

● Distributed election: Introduces several distributed election algorithms, including Bully algorithm, Raft algorithm, and ZAB algorithm.

● Distributed system practice: ZooKeeper

4. Distributed computing

For the calculation of massive data, distributed architecture usually adopts horizontal expansion method to deal with the challenge. The calculation methods will be different in different computing scenarios, and the calculation modes are divided into two types:

MapReduce mode for batch static data calculation,

and the Stream mode for computing on dynamic data streams.
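As a toy, single-process illustration of the MapReduce pattern (map, then group/shuffle, then reduce) applied to a static batch of data; real frameworks distribute exactly these phases across many nodes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Toy single-process word count: map -> shuffle/group -> reduce. */
public class WordCount {
    public static void main(String[] args) {
        List<String> lines = List.of("a b a", "b c", "a"); // the static "batch" input

        Map<String, Long> counts = lines.stream()
                // map: split each input record into individual words
                .flatMap(line -> Arrays.stream(line.split(" ")))
                // shuffle + reduce: group identical keys and aggregate their counts
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));

        System.out.println(counts); // {a=3, b=2, c=1}
    }
}
```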

5. Distributed storage

To simply understand, storage is the persistence of data. From the perspective of participants, data producers produce data and then store it on the media, and data users consume data through data indexing.

From the perspective of data type, data is divided into structured data, semi-structured data, and unstructured data. In a distributed architecture, data will be fragmented according to rules, and data synchronization operations need to be completed for the master-slave database. If you want to build a good data storage solution, you need to pay attention to data uniformity, data stability, node heterogeneity, and fault isolation.

● Problems faced by data storage and solutions: RAID disk array.

● Classification of distributed storage elements and data types.

● Distributed relational database: split tables and databases, master-slave replication, and data expansion.

● Distributed cache: cache sharding algorithm, Redis cluster solution, communication between cache nodes, routing of requests for distributed cache, expansion and contraction of cache nodes, discovery and recovery of cache failures.
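One common choice for the cache sharding algorithm mentioned above is consistent hashing; the sketch below builds a hash ring with virtual nodes so that adding or removing a cache node only remaps a small portion of the keys. The node names and the 100-virtual-node figure are arbitrary.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

/** Minimal consistent-hash ring with virtual nodes for cache sharding. */
public class ConsistentHashRing {

    private static final int VIRTUAL_NODES = 100;       // smooths out the key distribution
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) ring.put(hash(node + "#" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) ring.remove(hash(node + "#" + i));
    }

    /** A key is served by the first node clockwise from its hash position on the ring. */
    public String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF); // first 8 bytes as position
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addNode("cache-1");
        ring.addNode("cache-2");
        ring.addNode("cache-3");
        System.out.println(ring.nodeFor("user:42")); // which cache node holds this key
    }
}
```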

6. Distributed resource management and scheduling

If each user request is regarded as a task that the system needs to complete, then what the distributed architecture needs to do is to match tasks with resources.

● The origin and process of distributed scheduling.

● Resource division and scheduling strategies.

● Distributed scheduling architecture.

● The characteristic of centralized scheduling is that a single network node takes charge of resource management and scheduling.

● Two-level scheduling splits resource management and scheduling into two layers on top of centralized scheduling, namely the resource management layer and the task allocation layer.

● Shared state scheduling, which completes scheduling work through shared cluster state, shared resource state, and shared task state.

● Resource scheduling practice: such as the architecture of Kubernetes and the operating principles of its components.

7. High performance and availability

High performance and availability are themselves the goals of distributed architecture. The ideas of splitting and dividing and conquering distributed architecture also revolve around this purpose. This part mainly focuses on two aspects: caching and availability.

Caching technology can be applied at every level of the distributed architecture to improve system performance, although its use is scattered across the layers.

For availability, in order to ensure the normal operation of the system, intervention will be carried out through means such as current limiting, downgrading, and circuit breakers.

● Caching applications: HTTP cache, CDN cache, load balancing cache, in-process cache, distributed cache.

● Availability strategies: request current limiting, service degradation, and service circuit breaker.
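Below is a minimal token-bucket limiter, one common way to implement the request throttling strategy listed above. The capacity and refill rate are illustrative; libraries such as Guava's RateLimiter or Sentinel provide production-ready versions.

```java
/** Minimal token bucket: requests proceed while tokens remain, otherwise they are rejected or degraded. */
public class TokenBucketLimiter {

    private final long capacity;          // maximum burst size
    private final double refillPerMillis; // tokens added per millisecond
    private double tokens;
    private long lastRefill;

    public TokenBucketLimiter(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMillis = tokensPerSecond / 1000.0;
        this.tokens = capacity;
        this.lastRefill = System.currentTimeMillis();
    }

    /** Returns true if the request may proceed, false if it should be throttled. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMillis); // refill
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```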

8. Indicators and monitoring

When judging whether an architecture is good or bad, there are two reference standards, namely performance indicators and availability indicators. The same is true for distributed architectures.

Performance indicators are further divided into throughput, response time and completion time.

Due to the distributed nature of the system, services will be distributed to different servers and network nodes, so the monitoring program needs to monitor services on different servers and network nodes. In distributed monitoring, the classification and layering of monitoring systems and the best practices of Zabbix, Prometheus, and ELK will be mentioned.

● Performance indicators: latency, traffic, errors, saturation.

● Distributed monitoring system: steps to create a monitoring system, classification of monitoring systems, and layering of monitoring systems.

● Best practices for popular monitoring systems: including Zabbix and Prometheus.

Insert image description here

A breakdown of 8 issues in distributed architecture

Architecture design summary

Architectural design thinking, models

Insert image description here

Architectural Designer's Rational Model

Insert image description here

Architectural design cutting

Insert image description here

Communication between architectural designers, system designers, system security designers, and user experience designers

Refactoring

As the workload grows, the amount of code in the system grows with it, and so does the number of problems the code exposes. The "pits" left in the code during the rush to go live eventually have to be filled in by the same people who dug them. To improve code quality and let the business go further, designers invest more effort in code review and refactoring.

● When to refactor is an interesting question. Ideally, the code architecture and component model should be designed before coding starts. In practice, for various reasons, many of us end up with cowboy-style programming, writing wherever our thoughts lead, and only after stepping into a pit do we realize that refactoring should happen continuously. The following guidelines can serve as a reference:

​ ● The rule of three (don't repeat the same thing more than three times):

​ The first time you write code to implement a business function, you simply write it without any design; the second time you encounter a similar function, you notice it seems to have appeared before, but you write it again; the third time you meet the same function, you should tell yourself it is time to refactor. This principle applies in a great many scenarios, especially when extracting common business components or system components while developing an application.

Insert image description here

The rule of three

​ ● Refactor when adding functionality:

​ When you add a new function to an old module and find that the new function could be completed by combining several existing functions, but those functions are not general enough, you refactor them so that they become more reusable.

Insert image description here

Refactor as functionality is added

​ ● Refactor when fixing bugs:

Programmers often feel a deep sense of satisfaction after fixing a bug. If you can analyze the cause of the bug after repairing it, check whether the same bug exists in other places, and whether the bug can be completely solved by extracting common components, then code refactoring will be very meaningful.

Insert image description here

Refactor when fixing bugs

​ ● Refactor when reviewing code:

​ This is more common in extreme programming and pair programming: one programmer writes the code and another reviews it, and the two learn from each other and improve together. Different people have different backgrounds, ideas, and depths of understanding, so working on the same code together gives it more dimensions, and refactoring at this point is efficient.

Insert image description here

Refactor during code review

Performance testing and stress testing

If the results of the performance test are the system's baseline, then the results of the stress test are its upper limit, or high-voltage line. The range between the baseline and the high-voltage line is the system's scalable range, and we keep a close eye on the system's load through these two lines.

Source: blog.csdn.net/yinweimumu/article/details/134899133