360° Perspective: Evolution of Cloud-Native Architecture

This article is published by NetEase Cloud.

 

Continued from the previous article: 360° Perspective: Cloud-Native Architecture and Design Principles

 

Internet companies keep evolving as their businesses grow. Each stage brings different needs, so each stage calls for different approaches and priorities. A start-up, for example, has to seize an opportunity, validate a prototype quickly, and prove that its general direction is right before it can develop further. Its technology is therefore usually designed just to support business development and validation, with shared components reused where possible. Only as the business grows, and the organization and headcount grow with it, does the company have a chance to become a medium-sized company or a unicorn. More collaborators also bring new problems: generally speaking, once an enterprise grows beyond about 150 people, management costs rise sharply. At this stage technology needs to be integrated more closely with the product, and it also needs to be standardized and engineered. When the company scales up again, simply adding people no longer solves the coordination problem. The enterprise needs a more efficient structure to support the business and feed back into it, and its tools and processes need more automation to keep the business sustainable. At that scale the marginal effect is very pronounced, and the challenge is greater as well.

 

In short, technical evolution matters at every stage, especially in technology-driven companies. A company whose technology is ahead may gain a first-mover advantage; one whose technology lags may grow slowly and miss opportunities.

 

Whether you are an individual developer or a technology company, operating and growing over the long term requires a sound, stable set of practices. Likewise, on the technical side, a team's continuous progress depends on engineering thinking. Managing the software life cycle of development, testing, and release efficiently relies on engineering practice, which is the foundation for R&D collaboration and cloud operations. This involves many problems that technical staff have to solve. For example, many companies have to weigh the pursuit of advanced technology against a realistic assessment of business needs, in light of their organizational structure. Generally speaking, any company that starts from scratch passes through several stages of technical development: a simple start-up stage, a rapid growth stage, and a distributed, service-oriented architecture stage. Each stage corresponds to different product and business requirements, and technology takes on different responsibilities in each. This requires technical staff to make decisions, and architects in particular need to think these issues through ahead of time.

 

Start-up architecture

 

Generally speaking, a start-up launching a new business is in a trial-and-error or prototype-validation stage. At this stage the main concern is whether the business itself has prospects and a viable business model, not the technology or the system architecture, especially for projects that are not technology-driven or have not yet been firmly established. Many engineers would like to spend considerable time designing the system well up front so that it can support rapid business growth later, but time, staffing, and other constraints often make that impractical.

 

Start-up projects usually have very tight deadlines and basically need to launch the system within 3 to 6 months. If the business cannot go live and be validated quickly, the company may be unable to collect the raw data it needs to validate its next goals, and in more serious cases its funding may run out. Rome was not built in a day, so this stage uses a relatively simple architecture. This section covers only the main points; for more details, please refer to Chapter 3.

 

Monolithic Architecture

 

For start-up companies, constraints such as talent, technology, and capital, combined with the need to deliver the product, lead technical staff to use the simplest possible structure for the earliest stage of development. Among the many users we have worked with, some companies, for cost reasons, run on a single server or container service. Traditional official websites, forums, and similar applications were likewise designed as monoliths early on and need only one server or container. The application server, database, static files, and other resources are all deployed to that same server or container. The simplest architecture model is shown in Figure 1-11.

 

 

Figure 1-11 The simplest architecture model

 

In the early monolithic model, an application service plus a database service is essentially the most basic architecture. Technical staff still have to make technology choices, including programming language, version control, and database type. For example, PHP developers might choose PHP + MySQL, while Java developers might use Tomcat + MySQL.

 

Server separation

 

Operational experience suggests that for typical businesses with daily traffic under about one million page views (PV), simple tuning of web application performance parameters, database index optimization, and the like is basically enough to keep the service running stably. As traffic keeps growing, however, the application and services such as the database, all deployed on the same server, compete for CPU, memory, disk, bandwidth, and other system resources and interfere with each other, so performance bottlenecks appear easily. If that server goes down or cannot be recovered, the whole site may become inaccessible or data may be lost, with very serious consequences. Most products therefore physically separate the web application server from the database server, deploy them independently, and run them with hot backup. This resolves the corresponding performance and data reliability problems at only a small increase in cost.

 

In the initial stage, because of various constraints, the prospects of a new project cannot be foreseen well. If technical staff can keep the architecture reasonable at minimal cost, it can still serve the product's functional requirements well; often a small adjustment to the deployment architecture is enough to prevent catastrophic problems. This, too, involves many considerations of technical architecture.

 

Business model

 

Generally speaking, the business at this stage is relatively simple, the product line is narrow, and the business is adjusted at any time according to operational data. Technical staff therefore need to be able to separate the system into modules. For functionality tied to one particular part of the business, they need to accept that requirements may change at any time. For parts that later projects may reuse or depend on heavily, better design is required; otherwise, when the business takes off, development slows down more and more, holds the business back, and causes intermittent outages. Even if there is manpower or time to redesign the system later, doing so meets resistance among technical staff and introduces higher risk. Design patterns for cloud-native applications therefore strongly influence the architecture even at this most basic stage, including how to use the elasticity of the cloud and how to build the advantages of immutability into the system design. Drawing reasonable boundaries in the business model is also an important step in securing later development.

 

All in all, during early prototype validation or rapid trial and error, a monolithic architecture has real technical advantages: product ideas can be iterated quickly in the initial stage of the project, and release and deployment are more flexible. As the business grows, however, the technical risk rises if the architecture stays the same. A growing code base, for example, raises the learning cost for technical staff and affects the speed of business change, reliability, and security. Every modification must be tested repeatedly; otherwise the whole site may become unavailable at any time, interrupting the business or costing market opportunities. This technical debt therefore has to be paid down by reworking the architecture while the business is growing quickly, so that the architecture can support the business later. Beyond judging how the business will develop, then, the developers' technical skills and architectural vision are also factors to consider.

 

Rapid growth stage architecture

 

As the start-up's business develops further, for example when traffic reaches 100,000 daily active users (UV), the company usually faces its most critical moment: it has to keep the business running stably while also iterating on the product quickly. Because the business model has by now been validated and has received some feedback, competing products and peer companies are likely to appear. On the one hand, with the injection of venture capital, development and operations come to rely on higher-quality data. On the other hand, the emergence of competing products speeds up the market. How well business and technology stay aligned during this period of adjustment is therefore one test of whether the architecture is flexible enough. As before, this section covers only a few points; for more details, please refer to Chapter 4.

 

Front-end acceleration optimization

 

First, for browser-based or mobile applications, as requests keep increasing you will occasionally see performance bottlenecks in web services, resulting in slow or failed requests. Apart from an under-provisioned server, the more likely cause is architectural: the design or the way components are separated causes large numbers of concurrent web requests to block or slow down. At this point the front end and back end need to be decoupled and requests configured or forwarded sensibly. If front-end requests cannot be processed in time or hit a bottleneck, static resources such as images, JS, CSS, HTML, and other application-related files can be served through a local Nginx proxy or an object storage service to speed up delivery, requests can be forwarded under separate domain names, and static resources can be distributed and cached on CDN nodes so that users reach the nearest node, with the CDN cache refreshed actively or passively. If the back end is under heavy dynamic-request load or has hot-spot services, stateless back-end services can be scaled out horizontally to share the load. For stateful services, first determine whether vertical scaling is enough; if not, the only options are to optimize the code and architecture design or to adjust the business plan. Generally speaking, separating dynamic requests from static requests ("dynamic and static separation") effectively relieves pressure on the server's CPU, disk I/O, and bandwidth. Of course, doing so requires corresponding adjustments in the architecture design.
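The dynamic and static separation described above can be sketched as a small routing rule. This is only an illustration in Python: the host names, the extension list, and the function names are assumptions made for the example, and in practice the same rule would normally live in the proxy or CDN configuration rather than in application code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical values for illustration only; not taken from the article.
STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".js", ".css", ".html"}
STATIC_HOST = "static.example.com"   # would sit behind a CDN or object storage
DYNAMIC_HOST = "app.example.com"     # stateless application servers


def is_static(url: str) -> bool:
    """Return True when the URL path ends in a known static-file extension."""
    path = urlsplit(url).path  # drops the query string and fragment
    return PurePosixPath(path).suffix.lower() in STATIC_EXTENSIONS


def route(url: str) -> str:
    """Pick the host that should serve the request."""
    return STATIC_HOST if is_static(url) else DYNAMIC_HOST
```

With this rule, a request for `/img/logo.PNG?v=3` goes to the static host, while `/api/orders/42` goes to the application servers, so static traffic no longer competes with dynamic requests for CPU and bandwidth.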

 

Horizontal expansion

 

As mentioned in the previous section, vertical scaling can solve some of the problems. However, because business and traffic grow rapidly and vertical resources are limited, different application scenarios need different strategies for distributing traffic. For example, applications with long-lived connections rely on layer-4 connections, typical Internet applications are served at layer 7, and game scenarios may even rely on UDP. To spread the load across servers and keep the business highly available, load balancing is the usual solution at this stage: adding multiple back-end servers makes it possible to distribute traffic among them. Designing the distribution involves many principles and techniques, such as routing paths and weights. Load balancing also shapes the back-end application architecture; a stateless design, for instance, is what makes horizontal scaling possible. It is also necessary to consider whether business operations are related to one another, and to run automatic health checks so that when a back-end service becomes abnormal it is taken offline promptly and fails fast.
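To make weights and health checks concrete, here is a minimal weighted round-robin sketch in Python. The `Backend` and `WeightedRoundRobin` names are hypothetical, and the health state is set by hand through `mark`; a real load balancer would probe its back ends itself and would usually spread weighted picks more smoothly.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int = 1
    healthy: bool = True


class WeightedRoundRobin:
    def __init__(self, backends):
        self.backends = list(backends)
        # Expand each backend by its weight: weight 2 means two slots per cycle.
        self._slots = [b for b in self.backends for _ in range(b.weight)]
        self._next = 0

    def mark(self, name, healthy):
        """Record the result of a health check for one backend."""
        for b in self.backends:
            if b.name == name:
                b.healthy = healthy

    def pick(self):
        """Return the next healthy backend, or fail fast if none is left."""
        for _ in range(len(self._slots)):
            backend = self._slots[self._next]
            self._next = (self._next + 1) % len(self._slots)
            if backend.healthy:
                return backend
        raise RuntimeError("no healthy backend available")
```

Because the balancer keeps no per-user state, any healthy back end can serve any request, which is why the stateless design mentioned above is a precondition for this kind of horizontal scaling.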

 

Database and cache optimization

 

Databases and caches are an effective way to handle structured and unstructured data on the back end. Depending on the scenario, you need to understand which data suits structured storage, which suits unstructured storage, and which is worth caching, so that the cost stays acceptable while performance remains good. How the database and the cache work together also needs thought: how to keep the cache consistent when data is updated, how to make sure hot data stays available in the cache, and how to improve the cache hit rate. In addition, when large numbers of users request data that does not exist, the requests can put heavy pressure on the back end and may even trigger an avalanche effect.
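One common way to address the consistency and missing-data concerns above is the cache-aside pattern with short-lived negative entries. The sketch below is an assumption-laden illustration, not the article's implementation: a plain dict stands in for the database, the cache lives in process memory, and the `CacheAside` class and its TTL values are invented for the example.

```python
import time

_MISSING = object()  # sentinel cached for keys that do not exist in the database


class CacheAside:
    def __init__(self, db, ttl=60.0, negative_ttl=5.0, clock=time.monotonic):
        self.db = db                      # dict standing in for the database
        self.ttl = ttl                    # lifetime of normal entries, in seconds
        self.negative_ttl = negative_ttl  # shorter lifetime for "not found" entries
        self.clock = clock
        self.cache = {}                   # key -> (value, expires_at)
        self.db_reads = 0                 # counts how often the database is hit

    def get(self, key):
        entry = self.cache.get(key)
        if entry is not None and entry[1] > self.clock():
            value = entry[0]
            return None if value is _MISSING else value
        # Cache miss or expired entry: fall through to the database.
        self.db_reads += 1
        value = self.db.get(key, _MISSING)
        ttl = self.negative_ttl if value is _MISSING else self.ttl
        self.cache[key] = (value, self.clock() + ttl)
        return None if value is _MISSING else value

    def update(self, key, value):
        """Write to the database first, then invalidate the cached copy."""
        self.db[key] = value
        self.cache.pop(key, None)
```

Caching the "not found" result means repeated requests for a nonexistent key reach the database only once per `negative_ttl` window, and invalidating on update keeps readers from seeing a stale value after a write.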

 

Each service independently takes on its own function and provides different capabilities according to the characteristics of the application. For example, the application server handles user access, the database service stores structured data, and key-value (KV) pairs and similar data are stored by a separate service. If search is to be provided, the data needs to be tokenized, indexed, and made retrievable. Different servers provide the corresponding services according to the functional requirements of the business, as shown in Figure 1-12.

 

At this stage, besides meeting the functional requirements of the business, more attention must go to non-functional requirements. Front-end load balancing distributes traffic and forwards it according to user characteristics. The database runs as a primary with a standby, and data is backed up between the two through data synchronization; when the primary database fails, the application can automatically switch to the standby server and keep serving users. For user experience, basic services such as caching and a CDN may be introduced to improve performance.
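The automatic switch from the primary to the standby database can be sketched as a thin client wrapper. Everything here is hypothetical: `FailoverClient` is an invented name, and the two callables stand in for real database connections that raise `ConnectionError` when unreachable. Replication, lag, and switching back to the primary are deliberately left out.

```python
class FailoverClient:
    """Send queries to the primary and switch to the standby if it fails."""

    def __init__(self, primary, standby):
        # `primary` and `standby` are callables that run a query and may raise
        # ConnectionError; they stand in for real database connections.
        self.primary = primary
        self.standby = standby
        self.active = primary

    def query(self, sql):
        try:
            return self.active(sql)
        except ConnectionError:
            if self.active is self.standby:
                raise  # the standby failed too: nothing left to switch to
            self.active = self.standby  # automatic switch-over
            return self.active(sql)
```

After the first failure, later queries go straight to the standby, so users see at most one retried request rather than a run of errors.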

 

 

Figure 1-12 Database and cache optimization

 

Distributed Service Architecture

 

Generally speaking, after the first two stages, the enterprise has basically settled its business direction; what remains is to deal with competitors following its lead and with large volumes of user requests. These companies also offer a variety of sub-products and modules to diversify the business. For example, product managers design different feature sets, operations staff design different campaigns, customer service staff receive different user feedback, and so on. The accumulation of these requirements makes the whole business more and more complex; some systems become unmaintainable and can no longer meet availability requirements. Whenever traffic peaks, such a system goes down, so many companies face re-architecting. For example, most of Jingdong's user-facing business was rewritten from .NET to Java, and Taobao moved from PHP to Java, upgrading to version 2.0 and then 3.0. Transformation or refactoring at this point may not be a matter of life and death, but it is very costly. Anticipating it from the very beginning is therefore a considerable technical challenge. This section offers some points for reference; for more details, please refer to Chapter 5.

 

Business needs

 

Cloud services can usually solve many architectural problems. For example, an object storage system solves distributed file storage, and a CDN solves the performance of static resource access. In practice, however, as the business keeps developing and access pressure on the system increases, many requests may still slow down or time out, and the load on application servers or database services may fluctuate widely. As new services keep launching, technical debt becomes more and more visible and iteration falls further and further behind product demand. To solve these problems, enterprises often need to split the business, moving different functional modules to different servers for independent deployment: for example, a user module, a product module, a shopping cart module, an order module, and a payment module. Once split and deployed independently, the modules can be subdivided further according to where the system's bottlenecks are. After the split, however, the dependencies between modules become prominent again. The number of database connections, distributed transactions over data, and database performance overhead are all problems that need to be solved urgently.

 

At the same time, splitting business modules raises engineering-practice problems in addition to the technical ones above. For example, across the different branches of the business, developers, testers, and operations staff need to be able to set up development, test, and pre-release environments quickly and to build and release to them. In a fast-growing enterprise the iteration frequency is very high. On the NetEase Koala platform, for example, the total number of daily releases across all systems reaches the thousands, so technical staff have extremely high requirements for efficiency. When this extends to multiple product lines within a company, the whole operation needs to run like a modern factory, with an automation platform to solve the problem. Purely manual work cannot keep the enterprise running efficiently.

 

Elastic expansion

 

As demand and users keep growing, system load shows peaks and troughs. To make better use of resources and budget, elastic scaling becomes a necessity. At peak times the system scales out automatically according to business load and shares the traffic; when load is low it scales in automatically, reducing cost or improving resource utilization, and the released resources can be used for offline computation. In the past, the usual answer to growing demand was to scale vertically or buy more powerful servers. That works to a certain extent, but it is slow and costly. It also means provisioning too many resources in advance, because capacity is planned only from the forecast peak, for example purchasing hardware according to the highest computing capacity required; this is a last resort. During events such as Black Friday abroad and Double 11 in China, request volume on the day is very high and ample resources are needed to serve it, yet the utilization of those servers is very low the rest of the time. Only the elasticity of the cloud can meet the demands of this kind of business scenario.
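The scale-out and scale-in decision can be illustrated with a simple proportional rule: size the fleet so that average utilization returns to a target. The rule, the function name, and the default target and bounds below are assumptions chosen for the example, not values from the article.

```python
def desired_instances(current, cpu_percent, target_percent=60,
                      min_instances=2, max_instances=20):
    """Return how many instances are needed to bring average CPU back to target.

    `current` is the number of running instances and `cpu_percent` their average
    CPU utilization (0-100). All thresholds are illustrative assumptions.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    # Integer ceiling of current * cpu_percent / target_percent.
    wanted = -(-(current * cpu_percent) // target_percent)
    # Clamp so the fleet never drops below a safe floor or exceeds the budget.
    return max(min_instances, min(max_instances, wanted))
```

For instance, 4 instances at 90% CPU call for 6 instances under a 60% target, while 10 instances at 15% CPU shrink to 3, which frees the remaining machines for offline work.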

 

Service orientation

 

Whether an enterprise is an Internet company or a traditional company in transformation, it builds on its existing core business; it cannot stop the business and start from scratch. Carrying out a microservice transformation directly on the original system is therefore difficult and risky. Such enterprises are usually in a period of hybrid architecture: new services are developed from scratch, integrated gradually into the old system, and used step by step to replace the parts of the old system that no longer meet requirements, while still meeting the demand for rapid delivery of new business. Since the well-known architect Martin Fowler published his overview article on microservice architecture in 2014, no unified best-practice solution has emerged. Microservice architectures today are implemented in many different ways. Typically the system is divided into microservices by application type; each service uses a technology stack suited to its business characteristics and runs as an independently deployed, isolated process. The basic building blocks of microservice frameworks are similar, such as service discovery, degradation, and governance, but the technical details of implementation vary and there is no unified solution. For service discovery, for example, some teams build their own infrastructure and others rely on third-party open source. Technical staff need to choose based on their own scenarios. A simplified architecture model is shown in Figure 1-13.
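To show what a self-built service discovery component minimally involves, the following Python sketch keeps an in-memory registry in which instances register, send heartbeats, and expire when heartbeats stop. The `ServiceRegistry` class, its method names, and the TTL are hypothetical; production systems add replication, persistence, and change notifications.

```python
import time


class ServiceRegistry:
    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl          # seconds an instance stays live without a heartbeat
        self.clock = clock
        self._instances = {}    # service name -> {address: last_heartbeat}

    def register(self, service, address):
        """Add an instance, or refresh it if it is already registered."""
        self._instances.setdefault(service, {})[address] = self.clock()

    def heartbeat(self, service, address):
        """A heartbeat simply refreshes the registration timestamp."""
        self.register(service, address)

    def deregister(self, service, address):
        self._instances.get(service, {}).pop(address, None)

    def lookup(self, service):
        """Return the live addresses for a service, dropping expired ones."""
        now = self.clock()
        live = {addr: seen
                for addr, seen in self._instances.get(service, {}).items()
                if now - seen <= self.ttl}
        self._instances[service] = live
        return sorted(live)
```

A caller would combine `lookup` with a load-balancing policy to choose one address per request; degradation and governance then build on the same view of which instances are alive.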

 

 

Figure 1-13 Service architecture model

 

Obviously, this architecture model is only part of the overall business service architecture, and the actual system may be dozens of times more complicated. When the business iterates very quickly and services depend on one another, the path from design, development, and testing through launch and operations becomes a very large and complex undertaking. How to manage service and system dependencies efficiently, diagnose problems, and respond promptly to business feedback is therefore the real test of a service architecture.

 

This article is excerpted from "Practice of Cloud Native Application Architecture", written by the NetEase Cloud Basic Service Architecture Team, and explains the evolution from a monolith to a distributed service architecture. More content will be shared in the next issue. Readers who are interested in cloud native are welcome to read the whole book.

 

 

Learn about NetEase Cloud:
NetEase Cloud Official Website: https://www.163yun.com/
New User Gift Package: https://www.163yun.com/gift
NetEase Cloud Community: https://sq.163yun.com/

 

