Ten diagrams to help you understand the evolution of Alibaba's enterprise architecture design

Please support original work. Original article: https://mp.weixin.qq.com/s/YQG95HxCHuO7WCQW5aND9g

Distributed and microservice architectures are currently prevalent in China, and companies large and small, in e-commerce, the Internet of Things, and other industries, are following these concepts closely. From recent discussions with some architect friends about project development and operations, I found that most companies have not implemented an overall plan, and that some architects even start development without knowing why they are adopting this series of services. This is very dangerous for the software industry.

When discussing the technologies, frameworks, and protocols used in a large-scale platform architecture, I think we should not start with the technology. First figure out what requirements the architecture design is based on, and what standards and characteristics it has to meet.

First, let me explain the core elements of a large-scale platform:

1. Performance: Whatever the product, performance is the first thing customers perceive. If a query takes 10 seconds to return, or a page fails to load after navigation, users will not appreciate your effort no matter how strong the architecture design is. Performance is therefore the first and most important core element of a product.

2. Availability: The availability of a large platform is like a person's reputation, and even a minute or a second of downtime is hard to forgive. This is a hard indicator that is not up for debate. Almost all large websites promise 7×24-hour availability.

3. Scalability: A large platform has to follow the rhythm of its users. As with the Chinese New Year, a platform has peaks and troughs, and these are what test its scalability. How many servers can be added to the cluster, and whether a newly added server can serve requests indistinguishably from the existing ones, are the important scalability indicators.

4. Extensibility: This is the only one of the core elements that focuses on functionality. Extensibility refers to whether, after going live, a complete platform can absorb new business and changing requirements transparently, without affecting existing products, so that new products can be launched with little or no change to existing functionality.

5. Security: Without security, everything else is a joke. Security requirements differ from industry to industry, so I only mention it briefly here.

As mentioned earlier, we need to discuss these core elements, determine which of them we need to achieve and what the specific indicators are, and then choose technical solutions according to those indicators. With that done, the way forward becomes very clear.

To help everyone better understand the architectural thinking, I will use 10 architecture diagrams to explain the main stages in the evolution of Alibaba's architecture design.

1. The initial website architecture

In the initial architecture, applications, databases, and files are all deployed on one server, as shown in the figure:

[Figure: the initial website architecture, with everything on one server]

LAMP (Linux + Apache + MySQL + PHP) is the classic configuration. Many classic sites were built on this architecture. Its advantages are that it is simple, convenient, quick to develop and launch, and low in cost.

2. Separation of applications, data and files

With the expansion of the business, one server can no longer meet the performance requirements. Therefore, applications, databases, and files are deployed on independent servers, and different hardware is configured according to the purpose of the server to achieve the best performance.
[Figure: separation of applications, data, and files]

3. Use cache to improve website performance

While optimizing hardware performance, we also need to optimize performance through software. Most website systems use caching to improve performance. Caching works mainly because hot data exists: most website traffic follows the 80/20 rule (that is, 80% of access requests land on 20% of the data), so we can cache the hot data, shorten the access path to it, and improve the user experience.
[Figure: using a cache to improve website performance]

Caches are commonly implemented as local caches or distributed caches; CDNs and reverse proxies are also forms of caching. A local cache, as the name implies, stores data on the application server itself, either in memory or in files. OSCache is a commonly used local cache component. A local cache is fast, but the amount of data it can hold is limited by local space. A distributed cache can hold massive amounts of data and is easy to scale out; it is often used by portal websites, though it is slower than a local cache. Commonly used distributed caches are Memcached and Redis.
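The lookup order described above (local cache first, then distributed cache, then the database) can be sketched roughly as follows. This is a minimal illustration, not how any particular site implements it: `LocalCache`, `TwoLevelCache`, and the `database` dictionary are hypothetical names, and a plain dictionary stands in for a real Memcached or Redis client.

```python
from collections import OrderedDict


class LocalCache:
    """Small in-process LRU cache; its capacity models limited local space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used key


class TwoLevelCache:
    """Checks the local cache, then the distributed cache, then the database."""

    def __init__(self, local, remote, load_from_db):
        self.local = local
        self.remote = remote              # assumption: a dict standing in for Memcached/Redis
        self.load_from_db = load_from_db  # assumption: any callable taking a key

    def get(self, key):
        value = self.local.get(key)
        if value is not None:
            return value, "local"
        if key in self.remote:
            value = self.remote[key]
            self.local.put(key, value)
            return value, "distributed"
        value = self.load_from_db(key)
        self.remote[key] = value
        self.local.put(key, value)
        return value, "database"


# Hypothetical data standing in for a database table.
database = {"item:1": "phone", "item:2": "laptop", "item:3": "camera"}
cache = TwoLevelCache(LocalCache(capacity=2), {}, database.__getitem__)

first = cache.get("item:1")   # served by the database, then cached
second = cache.get("item:1")  # served by the local cache
```

Because the local cache here holds only two entries, reading two further keys pushes `item:1` out of it, and the next read of `item:1` falls through to the distributed cache rather than the database.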

4. Using clusters to improve application server performance

As the entry point of the website, the application server bears a large number of requests. We usually spread this load across an application server cluster: a load balancing server deployed in front of the application servers schedules user requests and distributes them to the application server nodes according to a distribution policy.

[Figure: using a cluster to improve application server performance]

Commonly used load balancing technologies include F5 on the hardware side, which is relatively expensive, and software such as LVS, Nginx, and HAProxy. LVS is a layer-4 load balancer that selects the internal server by destination address and port. Nginx is a layer-7 load balancer, and HAProxy supports both layer 4 and layer 7; a layer-7 balancer can select the internal server according to the content of the message. LVS therefore has a shorter distribution path than Nginx and HAProxy and higher performance, while Nginx and HAProxy are more configurable; for example, they can be used to separate dynamic and static content. Which one to choose depends on your own needs and characteristics.
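To make the idea of a distribution policy concrete, here is a small sketch of round-robin scheduling combined with content-based (layer-7 style) routing for dynamic and static separation. It is only an illustration: `RoundRobinPool`, `Layer7Balancer`, the server names, and the file suffix list are all assumptions, and it does not reflect how LVS, Nginx, or HAProxy are actually configured.

```python
from itertools import cycle


class RoundRobinPool:
    """Hands out backend servers in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = cycle(self.servers)

    def pick(self):
        return next(self._next)


class Layer7Balancer:
    """Chooses a pool by inspecting request content (here, the URL path)."""

    STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg")  # assumption for the example

    def __init__(self, static_pool, app_pool):
        self.static_pool = static_pool
        self.app_pool = app_pool

    def route(self, path):
        if path.endswith(self.STATIC_SUFFIXES):
            return self.static_pool.pick()
        return self.app_pool.pick()


# Hypothetical server names.
balancer = Layer7Balancer(
    static_pool=RoundRobinPool(["static-1"]),
    app_pool=RoundRobinPool(["app-1", "app-2"]),
)
routes = [balancer.route(p) for p in ["/order/42", "/logo.png", "/cart", "/user/7"]]
```

Dynamic requests alternate between the two application servers, while the request for `/logo.png` goes to the static server. A layer-4 balancer would see only addresses and ports, so it could not make the second kind of decision.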

5. Database read-write separation and database/table sharding

As the number of users grows, the database soon becomes the biggest bottleneck. The common ways to improve database performance are read-write separation and splitting databases and tables (sharding). Read-write separation directs writes to one database and reads to others, with the database's replication function keeping the data synchronized. Sharding can be horizontal or vertical: horizontal splitting breaks up a single very large table, such as the user table, while vertical splitting divides tables by business, for example placing the tables for the user business and the commodity business in different databases.

[Figure: database read-write separation and sharding]
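As a rough sketch of how read-write separation and horizontal splitting fit together, the router below picks a shard from the user ID and then sends writes to that shard's primary and reads to its replicas in turn. All names (`ShardedDatabaseRouter`, the `user_db_*` identifiers) are hypothetical, and the modulo rule is just one simple sharding strategy assumed for the example.

```python
class ShardedDatabaseRouter:
    """Routes a query to a database by user ID and by read/write intent."""

    def __init__(self, shards):
        # Each shard is a dict: {"primary": name, "replicas": [names]}.
        self.shards = shards
        self._replica_turn = [0] * len(shards)

    def shard_index(self, user_id):
        # Horizontal split: rows of the user table are spread by user_id.
        return user_id % len(self.shards)

    def route(self, user_id, is_write):
        index = self.shard_index(user_id)
        shard = self.shards[index]
        if is_write or not shard["replicas"]:
            return shard["primary"]  # writes always go to the primary
        turn = self._replica_turn[index]
        self._replica_turn[index] = (turn + 1) % len(shard["replicas"])
        return shard["replicas"][turn]  # reads rotate across replicas


# Hypothetical database names.
router = ShardedDatabaseRouter([
    {"primary": "user_db_0_primary",
     "replicas": ["user_db_0_replica_a", "user_db_0_replica_b"]},
    {"primary": "user_db_1_primary",
     "replicas": ["user_db_1_replica_a"]},
])

write_target = router.route(7, is_write=True)
read_targets = [router.route(4, is_write=False) for _ in range(3)]
```

Vertical splitting would add one more step in front of this: choosing the database group by business (user, commodity, and so on) before choosing the shard.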

6. Use CDN and reverse proxy to improve website performance

If our servers are deployed in a data center in Chengdu, access is fast for users in Sichuan but slower for users in Beijing. This is because Sichuan and Beijing are regions where different carriers, China Telecom and China Unicom respectively, are dominant. A Beijing user's request has to travel a long path through Internet routers to reach the server in Chengdu, and the response takes an equally long path back, so data transmission takes a relatively long time. A CDN is often used to solve this problem: it caches content in the carriers' data centers, and users fetch data from the nearest one, which greatly shortens the network path.
[Figure: using a CDN and reverse proxy to improve website performance]

The reverse proxy is deployed in the website's own data center. A user request first reaches the reverse proxy server, which returns the cached data to the user if it has it; if not, the request continues on to the application server. This also reduces the cost of fetching data. Common reverse proxies include Squid and Nginx.
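The reverse proxy behaviour described above can be sketched as a cache with an expiry time in front of the application server. This is a simplified illustration under stated assumptions: `CachingReverseProxy` and `app_server` are hypothetical, the expiry (TTL) policy is an addition for the example, and real proxies such as Squid or Nginx decide what to cache in far more detail.

```python
import time


class CachingReverseProxy:
    """Returns cached responses when fresh, otherwise asks the application server."""

    def __init__(self, fetch_from_app_server, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch_from_app_server
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the example can control time
        self._cache = {}    # path -> (body, expiry time)

    def handle(self, path):
        now = self.clock()
        entry = self._cache.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"
        body = self.fetch(path)
        self._cache[path] = (body, now + self.ttl)
        return body, "MISS"


app_server_calls = []


def app_server(path):
    """Hypothetical application server that records each request it receives."""
    app_server_calls.append(path)
    return f"<html>page for {path}</html>"


fake_now = [0.0]  # a controllable clock for the example
proxy = CachingReverseProxy(app_server, ttl_seconds=60, clock=lambda: fake_now[0])

r1 = proxy.handle("/home")  # not cached yet, goes to the application server
r2 = proxy.handle("/home")  # served from the proxy's cache
fake_now[0] = 61.0          # the cached copy has now expired
r3 = proxy.handle("/home")  # fetched from the application server again
```

Only two of the three requests reach the application server, which is the saving the text refers to.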

7. Use a distributed file system

As the number of users and the business volume grow, more and more files are generated, and a single file server can no longer meet demand, so a distributed file system is required. Commonly used options include FastDFS and NFS.

[Figure: using a distributed file system]

8. Use NoSQL and search engines

For queries over the massive data generated by the system, we can achieve better performance by combining a NoSQL database with a search engine. Not all data needs to live in a relational database. Commonly used NoSQL databases include MongoDB and Redis, and search engines include Lucene, Solr, and Elasticsearch.

[Figure: using NoSQL and search engines]
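One reason a search engine answers text queries quickly is that it looks terms up in an inverted index instead of scanning every record. The toy below illustrates only that one idea; `InvertedIndex` is a hypothetical class, it splits text on whitespace, and it is in no way the API of Lucene, Solr, or Elasticsearch.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each term to the set of document IDs that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        """Returns the IDs of documents containing every term in the query."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


# Hypothetical product titles.
index = InvertedIndex()
index.add(1, "red cotton shirt")
index.add(2, "blue cotton jacket")
index.add(3, "red wool jacket")

red_jackets = index.search("red jacket")
cotton_items = index.search("cotton")
```

A query such as "red jacket" is answered by intersecting two small sets of IDs, and the matching records can then be fetched by ID from whichever store holds them.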

9. Split the application server by business

As the business expands further, the application becomes very bloated. At this point we need to split the application by business, for example dividing a site such as Baidu into users, commodities, orders, pictures, and other businesses. Each business application is responsible for relatively independent business operations. The business applications communicate with each other through messages or through a shared database.

[Figure: splitting the application server by business]
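Communication through messages between the split business applications might look roughly like this. It is an in-process sketch only: `MessageBus`, the topic names, and the order and stock handlers are hypothetical, and a real deployment would use a message queue product rather than a Python deque.

```python
from collections import defaultdict, deque


class MessageBus:
    """Minimal publish/subscribe queue shared by the business applications."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        self._queue.append((topic, message))  # the publisher does not wait

    def deliver_all(self):
        """Delivers queued messages, including any published by handlers."""
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(message)


bus = MessageBus()
stock = {"sku-1": 10}  # state owned by a hypothetical commodity application
notifications = []     # state owned by a hypothetical user application


def reserve_stock(order):
    stock[order["sku"]] -= order["quantity"]
    bus.publish("stock.reserved", order["order_id"])


bus.subscribe("order.created", reserve_stock)
bus.subscribe("stock.reserved",
              lambda order_id: notifications.append(f"order {order_id} reserved"))

# The order application announces a new order without calling the others directly.
bus.publish("order.created", {"order_id": 1001, "sku": "sku-1", "quantity": 2})
bus.deliver_all()
```

The order application knows only the topic name, not which applications react to it, which is what keeps the split businesses relatively independent.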

10. Building distributed services

If none of the above architectural optimizations can carry the system's traffic, then a distributed service or microservice framework should be considered. At this point, architects will find that every business application uses some basic business services, such as the user service, order service, payment service, and security service. These services are the basic building blocks supporting each business application. We extract them and use a distributed service framework to build distributed services. Dubbo, which came out of Alibaba's SOA system, and the Spring Cloud microservice framework are both good choices.
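The core idea behind such a service framework, providers registering a shared service and business applications looking it up by name, can be sketched as follows. This is a deliberately tiny, in-process illustration: `ServiceRegistry`, the service name, and the addresses are hypothetical, and none of this is the Dubbo or Spring Cloud API.

```python
class ServiceRegistry:
    """Keeps track of which providers offer each named service."""

    def __init__(self):
        self._providers = {}  # service name -> list of (address, handler)

    def register(self, service_name, address, handler):
        self._providers.setdefault(service_name, []).append((address, handler))

    def unregister(self, service_name, address):
        remaining = [p for p in self._providers.get(service_name, [])
                     if p[0] != address]
        self._providers[service_name] = remaining

    def call(self, service_name, *args):
        providers = self._providers.get(service_name)
        if not providers:
            raise LookupError(f"no provider registered for {service_name!r}")
        address, handler = providers[0]
        providers.append(providers.pop(0))  # rotate so the next call uses another provider
        return address, handler(*args)


def user_service(user_id):
    """Hypothetical basic service shared by several business applications."""
    return {"id": user_id, "name": f"user-{user_id}"}


registry = ServiceRegistry()
registry.register("user-service", "10.0.0.1:20880", user_service)
registry.register("user-service", "10.0.0.2:20880", user_service)

first_call = registry.call("user-service", 7)
second_call = registry.call("user-service", 7)

registry.unregister("user-service", "10.0.0.1:20880")  # one provider goes offline
third_call = registry.call("user-service", 7)
```

Callers refer only to the service name, so providers can be added or removed without changing the business applications that depend on them.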

Summary
The 10 diagrams above trace the evolution of Alibaba's architecture: a very concise series of system upgrades built around the five core elements of large-scale system architecture defined at the start. They also confirm the enduring saying in the architecture community, "Don't try to design a large-scale system architecture", because today's large-scale platform architectures evolved step by step through the efforts of many architects. Trying to reach such an architecture in a single bite is unrealistic. Don't use technology for technology's sake or to follow a trend; the architecture that suits your current business needs is the best architecture.
