Advanced skills in system architecture design · Cloud native architecture design theory and practice

Table of Contents of Series Articles

Advanced skills in system architecture design · Software architecture concepts, architectural styles, ABSD, architecture reuse, DSSA (1) [System Architect]
Advanced skills in system architecture design · System quality attributes and architecture evaluation (2) [System Architect]
Advanced skills in system architecture design · Software reliability analysis and design (3) [System Architecture Designer]

Everything we do now weaves wings for our future dreams, so that those dreams can soar in reality.


1. The connotation of cloud native architecture

1.1 Definition

Cloud native architecture is a collection of architectural principles and design patterns based on cloud native technology. It aims to strip the non-business code out of cloud applications to the greatest extent possible, letting cloud facilities take over the many non-functional features that applications used to implement themselves (such as elasticity, resilience, security, observability, and grayscale release), so that the business is no longer troubled by non-functional concerns and becomes lightweight, agile, and highly automated.

1.2 Features

Application features based on cloud native architecture include:

(1) The code structure changes dramatically: developers no longer need to master file storage and its distributed processing technology, nor various complex network technologies. This simplification makes business development more agile and faster.
(2) A large number of non-functional features are delegated to the cloud native architecture: for example high availability, disaster recovery, security, operability, ease of use, testability, and grayscale release capability.
(3) Highly automated software delivery: automated software delivery based on cloud native can deploy applications to thousands of nodes automatically.

1.3 Principles of cloud native

Cloud native has the following principles:

(1) Servitization principle: split modules with different life cycles apart through a service-oriented architecture so that each can iterate on its business independently.
(2) Elasticity principle: the deployment scale of the system can automatically expand and contract as the business volume changes.
(3) Observability principle: through logging, distributed tracing, and metrics, the latency, return values, and parameters of a chain of service calls are clearly visible.
(4) Resilience principle: the software's ability to withstand failures when the software and hardware components it depends on behave abnormally.
(5) All-process automation principle: let automation tools understand delivery goals and environmental differences, so that the entire software delivery and operations process is automated.
(6) Zero trust principle: no person, device, or system inside or outside the network is trusted by default; the basis of access control must be rebuilt on authentication and authorization.
(7) Continuous architecture evolution principle: the architecture retains the ability to keep evolving.

1.4 Main architectural patterns

The main architectural patterns involved in cloud native are described below.

1.4.1 Service-oriented architecture pattern

This pattern divides application software at the granularity of application modules, defines the business relationships between modules with interface contracts (e.g., an IDL), ensures interconnection with standard protocols (HTTP, gRPC, etc.), and combines Domain Driven Design (DDD), Test Driven Development (TDD), and containerized deployment to improve the code quality and iteration speed of each service.
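As a minimal sketch of what "interface contract plus standard protocol" means in practice, the Go program below defines a contract as an interface (an IDL such as Protocol Buffers would serve the same cross-language purpose) and exposes it over plain HTTP. All service and type names here are hypothetical, not from the original article.

```go
// Sketch: an interface contract for a hypothetical order service, served over HTTP.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// OrderService is the contract that other modules depend on.
type OrderService interface {
	GetOrder(id string) (Order, error)
}

type Order struct {
	ID     string `json:"id"`
	Status string `json:"status"`
}

// inMemoryOrders is a stand-in implementation for this sketch.
type inMemoryOrders struct{}

func (inMemoryOrders) GetOrder(id string) (Order, error) {
	return Order{ID: id, Status: "PAID"}, nil
}

func main() {
	var svc OrderService = inMemoryOrders{}
	// Expose the contract with a standard protocol so any language can call it.
	http.HandleFunc("/orders", func(w http.ResponseWriter, r *http.Request) {
		o, err := svc.GetOrder(r.URL.Query().Get("id"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		json.NewEncoder(w).Encode(o)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```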

1.4.2 Mesh architecture pattern

The mesh architecture separates middleware frameworks (such as RPC, caching, and asynchronous messaging) from the business process, further decoupling the middleware SDK from business code, so that middleware upgrades have no impact on the business process, and even migrating to another platform's middleware is transparent to the business.

1.4.3 Serverless pattern

When business traffic arrives or a business event occurs, the cloud starts a business process, or schedules one that is already running, to handle it; when processing completes, the cloud automatically shuts the process down or schedules it away and waits for the next trigger. Developers do not need to worry about where the application runs, the operating system, network configuration, CPU performance, and so on; the entire operation of the application is entrusted to the cloud. The serverless pattern is suitable for event-driven data computing tasks, request/response applications with short computation times, and long-cycle tasks without complex mutual calls.
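The sketch below illustrates this programming model: the developer owns only a stateless handler, while the platform decides when to start it, invokes it per event, and shuts it down. The `Event` type and handler signature are illustrative assumptions, not any real FaaS API.

```go
// Sketch of the serverless model: main simulates the platform's trigger.
package main

import (
	"context"
	"fmt"
)

type Event struct {
	Source  string
	Payload string
}

// HandleEvent is the only code the developer owns; it keeps no local state.
func HandleEvent(ctx context.Context, e Event) (string, error) {
	return fmt.Sprintf("processed %q from %s", e.Payload, e.Source), nil
}

func main() {
	// A real cloud platform would invoke the handler on demand;
	// here we simulate a single business event arriving.
	out, err := HandleEvent(context.Background(), Event{Source: "upload", Payload: "image.png"})
	if err != nil {
		fmt.Println("handler failed:", err)
		return
	}
	fmt.Println(out)
}
```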

1.4.4 Storage-compute separation pattern

The difficulty of CAP in a distributed environment mainly concerns stateful applications: consistency (C), availability (A), and partition tolerance (P) cannot all be satisfied at the same time, so at most two of them can be. Stateless applications have no consistency dimension, so they can achieve good availability and partition tolerance, and therefore better elasticity.

1.4.5 Distributed transaction pattern

Since a business operation needs to access multiple microservices, distributed transaction problems arise; otherwise the data would become inconsistent. Architects therefore need to choose an appropriate distributed transaction mode for each scenario. Commonly used modes are:
(1) XA mode: the XA specification is the standard for distributed transaction processing and usually uses two-phase commit (2PC). It provides strong consistency but requires two network round trips, so performance is poor.
(2) Message-based eventual consistency (BASE): when availability and consistency conflict, BASE trades between them by accepting Basic Availability (BA), Soft state (S), and Eventual consistency (E) to prioritize performance, so such systems usually perform well. However, because the compromise between availability and consistency depends on the characteristics of the application, generality is poor.
(3) TCC mode: uses the Try-Confirm-Cancel two-phase model; transaction isolation is controllable and efficiency is high, but application code must split the business model into two phases, so it is highly intrusive to the business, and the cost of design, development, and maintenance is high.
(4) SAGA mode: each forward transaction corresponds to a compensating transaction; if a forward transaction fails, the compensating transactions are executed to roll back (see the sketch after this list), so development and maintenance costs are high.
(5) AT mode of the open source project Seata: it delegates the second phase of the TCC mode to the underlying code framework and removes row locks, so performance is very high, there is no extra coding workload, and rollback is performed automatically, but there are some restrictions on usage scenarios.
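The following minimal sketch shows the SAGA idea from item (4): every forward step carries a compensating step, and a failure triggers the compensations of the completed steps in reverse order. The business steps ("reserve-stock", "charge-card") are hypothetical stand-ins.

```go
// Sketch of a SAGA: forward transactions with compensating rollbacks.
package main

import (
	"errors"
	"fmt"
)

type step struct {
	name       string
	forward    func() error
	compensate func()
}

func runSaga(steps []step) error {
	var done []step
	for _, s := range steps {
		if err := s.forward(); err != nil {
			// Roll back the already-completed steps in reverse order.
			for i := len(done) - 1; i >= 0; i-- {
				done[i].compensate()
			}
			return fmt.Errorf("saga aborted at %s: %w", s.name, err)
		}
		done = append(done, s)
	}
	return nil
}

func main() {
	err := runSaga([]step{
		{"reserve-stock", func() error { return nil }, func() { fmt.Println("compensate: release stock") }},
		{"charge-card", func() error { return errors.New("card declined") }, func() { fmt.Println("compensate: refund") }},
	})
	fmt.Println(err)
}
```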

1.4.6 Observable architecture

An observable architecture covers Logging, Tracing, and Metrics. Logging provides multiple log levels, such as INFO/DEBUG/WARNING/ERROR; Tracing aggregates the access logs of one request from front end to back end into a complete call-chain trace; Metrics provides multi-dimensional quantitative measurements of the system, including concurrency, latency, uptime, capacity, and so on.
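As a minimal sketch of how the three signals meet around one HTTP request, the wrapper below emits a log line, measures latency (a metric), and propagates a trace ID via a header. The header name and log fields are illustrative assumptions; a production system would use OpenTelemetry or a comparable stack.

```go
// Sketch: one middleware producing a log line, a latency metric, and a trace ID.
package main

import (
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"time"
)

func observed(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// Tracing: reuse the caller's trace ID, or start a new trace.
		traceID := r.Header.Get("X-Trace-Id") // hypothetical header name
		if traceID == "" {
			traceID = fmt.Sprintf("%08x", rand.Uint32())
		}
		start := time.Now()
		next(w, r)
		// Logging + Metrics: one structured line carrying the latency measurement.
		log.Printf("trace=%s path=%s latency=%s", traceID, r.URL.Path, time.Since(start))
	}
}

func main() {
	http.HandleFunc("/ping", observed(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("pong"))
	}))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```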

1.4.7 Event-driven architecture

Event Driven Architecture (EDA) is an integration architecture pattern between applications/components. It is suitable for enhancing service resilience, data change notification, building open interfaces, event stream processing, and Command Query Responsibility Segregation (CQRS): commands that change service state are initiated via events, while queries, which do not affect service state, use synchronously called API interfaces.
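A minimal sketch of that CQRS split follows: the state-changing command is published as an event onto a channel (standing in for a message broker), and the query is a plain synchronous read of a separate read model. Event and model names are illustrative, and the read model is, as in real CQRS systems, only eventually consistent.

```go
// Sketch: command via event, query via synchronous read of a read model.
package main

import (
	"fmt"
	"sync"
	"time"
)

type OrderPlaced struct{ OrderID string }

var (
	mu        sync.Mutex
	readModel = map[string]string{} // query-side view, updated by events
)

func main() {
	events := make(chan OrderPlaced, 16) // stand-in for a message broker

	// Event consumer: updates the query-side read model asynchronously.
	go func() {
		for e := range events {
			mu.Lock()
			readModel[e.OrderID] = "PLACED"
			mu.Unlock()
		}
	}()

	// Command side: the state change is initiated by publishing an event.
	events <- OrderPlaced{OrderID: "42"}

	// Query side: a synchronous read; the view is eventually consistent.
	time.Sleep(50 * time.Millisecond)
	mu.Lock()
	fmt.Println("order 42:", readModel["42"])
	mu.Unlock()
}
```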

1.5 Typical anti-patterns of cloud native architecture

Architecture design sometimes requires choosing different methods based on different business scenarios. Common cloud native anti-patterns include:

(1) Huge monolithic application: lack of dependency isolation, coupled code, unclear boundaries between responsibilities and modules, ungoverned interfaces between modules, changes that ripple outward, difficulty coordinating development progress and release times across modules, and one unstable sub-module dragging down the whole application. When scaling, capacity can only be expanded as a whole; the modules that hit a bottleneck cannot be scaled individually.
(2) "Hard splitting" a monolith into microservices: forcibly splitting highly coupled, low-quality code into services leaves the services' data tightly coupled after the split, and calls that used to be in-process become distributed calls, which seriously hurts performance.
(3) Microservices lacking automation capability: the number of modules each person is responsible for grows, per-person workload rises, and software development costs increase.

2. Cloud native architecture related technologies

2.1 Container technology

As the standardized unit of software, a container packages the application together with its dependencies for release. Because its dependencies are complete, the application is no longer restricted by its environment and can be delivered quickly and run reliably across different computing environments.

Figure: comparison of traditional, virtualization, and container deployment modes.

2.2 Container orchestration technology

Container orchestration technology covers resource scheduling, application deployment and management, automatic repair, service discovery and load balancing, elastic scaling, declarative APIs, architectural extensibility, and portability.

2.3 Microservices

The microservice pattern splits a back-end monolithic application into multiple loosely coupled sub-applications, each responsible for a set of sub-functions. These sub-applications, called "microservices", together form a distributed microservice system that is physically separate but logically complete. Each microservice is relatively independent, and decoupling the development, testing, and deployment processes improves overall iteration efficiency.

Microservice design constraints are as follows:

(1) Constraints on an individual microservice. In a well-designed microservice application, the functions each service completes should be independent of one another in terms of business domain division. Compared with a monolith forcibly bound to one language and technology stack, the advantage is that different business domains can make different technology choices; for example, a recommendation system built in Python may be far more efficient than one in Java. From an organizational point of view, microservices correspond to smaller teams and higher development efficiency. "A microservice team should be small enough to be fed with two pizzas" and "a microservice application should be able to complete an iteration at least every two weeks" are rules of thumb for drawing microservice boundaries correctly within a business domain. In short, the "micro" in microservices is not micro for its own sake, but a reasonable split of a monolithic application according to the problem domain. Furthermore, microservices should exhibit orthogonal decomposition: in the division of responsibilities, each should focus on one specific business and do it well, which is the Single Responsibility Principle (SRP) of the SOLID principles. If a microservice is modified or released, it should not affect the business interactions of any other microservice in the same system.

(2) Horizontal relationships between microservices. Once the boundaries between microservices are reasonably drawn, the horizontal relationships between services are handled mainly through the discoverability and interactivity of microservices. Discoverability means that when service A is released, scaled out, or scaled in, service B, which depends on it, perceives the changes automatically without being re-released; this requires introducing a third-party service registration center, and for large-scale microservice clusters, the push and scaling capabilities of the registration center are especially critical. Interactivity refers to how service A can call service B. Because of the service autonomy constraint, calls between services should use language-independent remote call protocols; the REST protocol, for example, satisfies the two important requirements of "language independence" and "standardization" well, while in high-performance scenarios an IDL-based binary protocol may be a better choice. In addition, most current industry practice stops short of HATEOAS-style hypermedia-driven REST calls, so services complete calls through pre-agreed interfaces. To decouple services further, a microservice system needs an independent metadata center to store service metadata, which services query to learn the details of a call. As service call chains grow longer, the whole microservice system becomes more fragile, so the principle of failure-oriented design is particularly important in microservice systems. For an individual microservice application, resilience mechanisms such as rate limiting, circuit breaking, isolation, and load balancing have become standard (a circuit-breaker sketch follows below). To further improve system throughput and make full use of machine resources, coroutines, the Rx model, asynchronous calls, back pressure, and other means can be employed.
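Below is a minimal sketch of one of those resilience mechanisms, a circuit breaker: after a threshold of consecutive failures, the breaker opens and rejects calls immediately, giving the downstream service time to recover. Real breakers add half-open probing and time windows; the threshold and types here are illustrative only.

```go
// Sketch: a consecutive-failure circuit breaker that fails fast when open.
package main

import (
	"errors"
	"fmt"
)

type Breaker struct {
	failures  int
	threshold int
}

var ErrOpen = errors.New("circuit open: call rejected")

func (b *Breaker) Call(f func() error) error {
	if b.failures >= b.threshold {
		// Fail fast instead of piling more load on an unhealthy service.
		return ErrOpen
	}
	if err := f(); err != nil {
		b.failures++
		return err
	}
	b.failures = 0 // a success closes the breaker again
	return nil
}

func main() {
	b := &Breaker{threshold: 2}
	flaky := func() error { return errors.New("timeout") }
	for i := 0; i < 4; i++ {
		fmt.Println(b.Call(flaky)) // two timeouts, then fast rejections
	}
}
```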

(3) Vertical constraints between microservices and the data layer

In the microservice field, the Data Storage Segregation (DSS) principle applies: data is the private asset of a microservice, and access to that data must go through the APIs the microservice provides. Otherwise, coupling arises at the data layer, violating the principle of high cohesion and low coupling. For performance reasons, read-write separation (CQRS) is usually adopted as well. Likewise, because container scheduling has an unpredictable impact on the stability of the underlying facilities, microservice design should follow the stateless design principle, which decouples the application from the underlying infrastructure and lets microservices be scheduled freely between containers. For microservices that do access data (that is, stateful ones), separation of computing and storage is usually used to sink the data into distributed storage, thereby achieving a degree of statelessness.
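As a minimal sketch of the stateless principle under these assumptions, the handler below keeps no session data in process memory; the `Store` interface stands in for distributed storage reached over the network, so any replica on any container can serve any request. All names are hypothetical.

```go
// Sketch: a stateless handler whose state is sunk into an external store.
package main

import (
	"fmt"
	"sync"
)

// Store abstracts the distributed storage that state is sunk into.
type Store interface {
	Get(key string) (string, bool)
	Put(key, value string)
}

// memStore simulates the external store for this sketch.
type memStore struct {
	mu sync.Mutex
	m  map[string]string
}

func (s *memStore) Get(k string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.m[k]
	return v, ok
}

func (s *memStore) Put(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// handleRequest holds no state of its own; two replicas sharing the
// same Store behave identically, so the scheduler can place them anywhere.
func handleRequest(store Store, user string) string {
	if v, ok := store.Get(user); ok {
		return "welcome back, " + v
	}
	store.Put(user, user)
	return "hello, " + user
}

func main() {
	shared := &memStore{m: map[string]string{}}
	fmt.Println(handleRequest(shared, "alice")) // served by "replica 1"
	fmt.Println(handleRequest(shared, "alice")) // "replica 2" sees the same state
}
```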

(4) Distributed constraints on microservices from a global perspective. From the very beginning of microservice system design, the following factors must be considered: efficient operation and maintenance of the entire system, with a fully automated CI/CD pipeline prepared technically to meet development-efficiency demands, and on that basis support for release strategies such as blue-green and canary deployments to meet the demand for stable business releases. Facing a complex system, full-link, real-time, multi-dimensional observability becomes standard. To prevent operational risks promptly and effectively, relevant data must be gathered from multiple event sources in the microservice system, analyzed, and then displayed multi-dimensionally in a centralized monitoring system. As microservices keep being split further, timeliness of fault discovery and accuracy of root-cause identification remain the core demands of developers and operators.

2.4 Serverless technology

Serverless technology has the following characteristics:

(1) Fully managed computing service: customers only need to write code to build applications, without attending to the homogeneous, burdensome work of development, operations, security, and high availability for servers and other infrastructure;

(2) Generality: combined with the capabilities of cloud BaaS APIs, it can support all important types of applications on the cloud;

(3) Automatic elastic scaling: users no longer need to plan resource capacity in advance;

(4) Pay-as-you-go billing: enterprises do not pay for idle resources, effectively reducing usage costs.

2.5 Service mesh

Service Mesh is a new technology developed for distributed applications built on the microservice software architecture. It sinks common inter-service concerns such as connectivity, security, flow control, and observability into the platform infrastructure, decoupling applications from that infrastructure. This decoupling means developers no longer need to attend to microservice governance issues and can focus on the business logic itself, improving application development efficiency and accelerating business exploration and innovation. In other words, because a large amount of non-functional logic is stripped out of business processes into separate processes, the service mesh makes applications lightweight in a non-intrusive way.

Figure: typical architecture of a service mesh.

In this architecture, every request from service A to service B is intercepted by the service proxy deployed alongside A; the proxy carries out service discovery, circuit breaking, rate limiting, and other policies toward service B on A's behalf, and the overall control of these policies is configured on the control plane.
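A minimal sketch of what this means for business code follows: service A sends a plain HTTP request to a local proxy address instead of embedding an RPC/discovery SDK, and the sidecar (not shown) applies the control plane's policies. The proxy address and the use of the Host header to name the logical target are illustrative assumptions.

```go
// Sketch: business code of service A calling service B through a local sidecar.
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// The sidecar listens on localhost and forwards to service B's instances;
	// discovery, circuit breaking, and rate limiting live in the proxy, not here.
	req, err := http.NewRequest("GET", "http://127.0.0.1:15001/api/hello", nil)
	if err != nil {
		panic(err)
	}
	req.Host = "service-b" // tells the proxy which logical service we want

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("call failed (no sidecar is running in this sketch):", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```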
