How to Design a High-Concurrency Architecture for a Large Site with Ten Million Users

Table of Contents

(1) Monolithic architecture

(2) Initial high-availability architecture

(3) Traffic estimate for ten million users

(4) Server pressure estimate

(5) Vertical business split

(6) Distributed cache to absorb read requests

(7) Read-write separation based on a master-slave database architecture

(8) Summary

This article starts from the early days of a large site's development and explores, step by step, how its architecture evolves from a monolithic architecture to a distributed architecture, and then to a high-concurrency architecture.

1. Monolithic architecture

When a site has just been launched, its user base is usually very small, perhaps tens of thousands or hundreds of thousands of registered users, with only a few hundred or a few thousand daily active users.

At this stage the site is generally built as a monolithic architecture and deployed on three servers in total: one application server, one database server, and one image server.

The development team is usually fewer than ten people. Everyone writes code in a single codebase, merges it, and then publishes directly to the application server. Deployment is likely manual: stop Tomcat, replace the system's WAR package, and restart Tomcat.

The database is deployed on a separate server and stores all of the site's core data.

NFS is deployed on yet another server as a dedicated image server, storing all of the site's images. The code on the application server connects to and operates on both the database and the image server. As shown below:

[Figure: monolithic deployment — one application server connecting to a database server and an NFS image server]

2. Initial high-availability architecture

However, this purely monolithic architecture has a high-availability problem: the application server might fail, or the database might fail.

So at this stage, companies whose budgets allow it will build an initial high-availability architecture.

For the application servers, cluster deployment is the usual approach. Of course, given the small initial user base, this so-called "cluster" is generally just two application servers, with a load balancer such as LVS deployed on a server in front of them to distribute user requests evenly across the two application servers.

If one application server fails at this point, the other is still available, avoiding a single point of failure. As shown below:

[Figure: two application servers behind an LVS load balancer]
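LVS does its balancing at the transport layer, but the policy it applies is easy to illustrate. Below is a minimal sketch, in Java, of round-robin selection across two application servers; the backend addresses are hypothetical placeholders:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Minimal round-robin selector: the distribution policy a load balancer
// such as LVS applies to incoming requests. Backend addresses are
// hypothetical placeholders.
public class RoundRobinBalancer {
    private final List<String> backends;
    private final AtomicLong counter = new AtomicLong();

    public RoundRobinBalancer(List<String> backends) {
        this.backends = backends;
    }

    // Each call returns the next backend in turn, spreading requests
    // evenly across the application servers.
    public String next() {
        long n = counter.getAndIncrement();
        return backends.get((int) (n % backends.size()));
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " -> " + lb.next());
        }
    }
}
```

A real load balancer also health-checks the backends, so a failed application server is taken out of rotation automatically; that is what turns the two-server deployment into a highly available one.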

For the database server, a master-slave architecture is generally used: a slave library continuously synchronizes data from the master library. If the master has a problem, the slave can quickly take over and continue providing database service, preventing a database failure from making the whole system completely unavailable. As shown below:

[Figure: database master-slave replication]

3. Traffic estimate for ten million users

Assume the site has 10 million registered users. By the 80/20 rule, about 20% of them visit on any given day, which means 2 million users visit every day.

Assume further that each visiting user averages 30 clicks, giving a total of 60 million clicks (PV) per day.

A day has 24 hours. By the 80/20 rule again, most of the day's activity is concentrated in about 24 hours × 0.2 ≈ 5 hours, and those hours carry most of the clicks: 60 million × 0.8 = 48 million, roughly 50 million clicks.

In other words, about 50 million clicks arrive within those 5 hours.

Converting that, the active 5-hour period averages roughly 3,000 requests per second (50,000,000 / 18,000 s ≈ 2,800). And within those 5 hours there may well be peak periods when large numbers of users access the site at once.

For example, a peak may form when a large number of users flood in within half an hour. Experience suggests the peak is usually two to three times the active average. Taking three times, there could be short-lived peaks of about 10,000 requests per second within those 5 hours.
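To make the arithmetic easy to check, here is the whole estimate as a small self-contained calculation. All inputs are the assumptions stated above; the article rounds the results up to roughly 3,000 requests/s on average and 10,000 requests/s at peak:

```java
// Back-of-the-envelope traffic estimate for a site with 10 million
// registered users, using the 80/20 assumptions described above.
public class TrafficEstimate {
    public static void main(String[] args) {
        long totalUsers = 10_000_000L;
        long dailyUsers = (long) (totalUsers * 0.2);   // 80/20 rule: 2,000,000 visit daily

        int clicksPerUser = 30;
        long dailyPv = dailyUsers * clicksPerUser;     // 60,000,000 clicks (PV) per day

        double activeHours = 24 * 0.2;                 // ~5 busy hours per day
        double activePv = dailyPv * 0.8;               // ~48,000,000 clicks in that window

        double avgQps = activePv / (activeHours * 3600); // ~2,800/s, rounded to ~3,000/s
        double peakQps = avgQps * 3;                     // 3x burst, rounded to ~10,000/s

        System.out.printf("daily users: %,d  daily PV: %,d%n", dailyUsers, dailyPv);
        System.out.printf("active-period QPS: ~%.0f  peak QPS: ~%.0f%n", avgQps, peakQps);
    }
}
```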

4. Server pressure estimate

Knowing that the peak may reach about 10,000 requests per second, let's estimate the pressure on each server in the system.

An application server is generally deployed on a virtual machine running a Tomcat instance, which supports at most a few hundred requests per second.

Assuming 500 requests per second per application server, supporting a peak of 10,000 requests per second requires deploying about 20 application servers.

Moreover, the application servers amplify the request volume toward the database several times over: to process each request, an application server may need an average of 3 to 5 database accesses. So suppose the application layer receives 10,000 requests per second.

Counting 3 database accesses per request, the database will see about 30,000 requests per second.

If a single database server supports at most about 5,000 requests per second, then about 6 database servers are needed to support those 30,000 requests per second.
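The same kind of arithmetic yields the server counts. The capacities of 500 requests/s per Tomcat and 5,000 requests/s per database server are the assumptions above, not measured figures:

```java
// Capacity estimate: how many servers the assumed peak load requires.
public class CapacityEstimate {
    public static void main(String[] args) {
        int peakQps = 10_000;          // peak requests/s from the traffic estimate
        int qpsPerAppServer = 500;     // assumed capacity of one Tomcat instance
        int appServers = (int) Math.ceil((double) peakQps / qpsPerAppServer);  // 20

        int dbCallsPerRequest = 3;     // each request makes ~3 database accesses
        int dbQps = peakQps * dbCallsPerRequest;                               // 30,000
        int qpsPerDbServer = 5_000;    // assumed capacity of one database server
        int dbServers = (int) Math.ceil((double) dbQps / qpsPerDbServer);      // 6

        System.out.printf("app servers: %d  db QPS: %,d  db servers: %d%n",
                appServers, dbQps, dbServers);
    }
}
```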

The image servers are also under heavy pressure, because pages need to load large numbers of images. This is hard to estimate precisely, but it is safe to assume at least several thousand requests per second, so multiple image servers are also needed to handle image traffic.

5. Vertical business split

Generally, the first thing to do at this stage is a vertical split of the business.

If all the business code is mixed together in a single deployment, it becomes very hard to maintain once many people are developing collaboratively. By the time a site reaches ten million users, the development team usually has dozens or even over a hundred engineers.

Continuing to develop inside one monolithic system at that point is extremely painful. What's needed is a vertical business split: break the monolith into multiple business systems, with a small team of about 10 people dedicated to maintaining each one. As shown below:

[Figure: the monolith split vertically into multiple business systems]

6. Distributed cache to absorb read requests

At this point the application server layer is generally not a big problem, because higher concurrency can be handled simply by adding machines.

With an estimated 10,000 requests per second, deploying twenty or thirty machines is enough.

The biggest pressure in the architecture above is actually at the database layer, because at peak the database may face about 30,000 concurrent read/write requests.

This is where a distributed cache is introduced to absorb the read pressure on the database: a Redis cluster.

Reads and writes to the database also roughly follow the 80/20 rule, so of the 30,000 read/write requests per second, about 24,000 are reads.

Roughly 90% of those reads can be absorbed by the distributed cache cluster; that is, about 20,000 reads per second can be handled by the Redis cluster.

We can keep a copy of hot, frequently accessed data in the Redis cluster as a cache and serve reads from it.

When reading data, try the cache first; only on a cache miss read from the database, then put the result into the cache. This way about 20,000 read requests per second land on Redis, and the remaining 10,000 read/write requests per second still land on the database.
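Here is a minimal cache-aside read sketched in Java with the Jedis client. The Redis address, key scheme, TTL, and the loadUserFromDb helper are hypothetical illustrations, not a prescribed implementation:

```java
import redis.clients.jedis.Jedis;

// Cache-aside read path: try Redis first, fall back to the database on a
// miss, then populate the cache so the next read is served from Redis.
public class UserCache {
    private final Jedis jedis = new Jedis("redis-host", 6379); // hypothetical address

    public String getUserProfile(long userId) {
        String key = "user:profile:" + userId;      // hypothetical key scheme
        String cached = jedis.get(key);
        if (cached != null) {
            return cached;                          // hit: absorbed by the Redis cluster
        }
        String fromDb = loadUserFromDb(userId);     // miss: one of the ~10% that reach the DB
        if (fromDb != null) {
            jedis.setex(key, 300, fromDb);          // cache for 5 minutes (hypothetical TTL)
        }
        return fromDb;
    }

    private String loadUserFromDb(long userId) {
        return "...";                               // placeholder for the real database query
    }
}
```

The write path needs matching care: when the underlying data changes, the cache entry is typically updated or deleted, so stale reads are bounded by the TTL.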

A single Redis server can generally handle tens of thousands of requests per second without trouble, so a Redis cluster of about 3 machines can comfortably absorb 20,000 reads per second. As shown below:

[Figure: Redis cache cluster absorbing read requests in front of the database]

7. Read-write separation based on a master-slave database architecture

At this point the database server still faces 10,000 requests per second, which is too much pressure for a single server.

But databases generally support a master-slave architecture, meaning a slave library continuously synchronizes data from the master. Read-write separation can be built on top of this master-slave setup.

That is, roughly 6,000 write requests per second go to the master library, and roughly 4,000 read requests per second are served from the slave library, spreading the 10,000 read/write requests per second across two servers.

After this split, the master handles at most 6,000 writes per second and the slave at most 4,000 reads per second, which is just about enough to withstand the pressure. As shown below:

[Figure: read-write separation — writes to the master, reads to the slave]
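One way to realize this at the application layer is to hold two JDBC DataSources and route by statement type. A minimal sketch, assuming the master/slave connection details are configured elsewhere:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Read-write separation at the application layer: writes go to the master,
// reads go to the slave that replicates from it.
public class RoutingDataSource {
    private final DataSource master;  // takes the ~6,000 writes/s
    private final DataSource slave;   // takes the ~4,000 reads/s

    public RoutingDataSource(DataSource master, DataSource slave) {
        this.master = master;
        this.slave = slave;
    }

    // INSERT / UPDATE / DELETE statements run against the master.
    public Connection writeConnection() throws SQLException {
        return master.getConnection();
    }

    // SELECT statements run against the slave. Replication is usually
    // asynchronous, so a read here may briefly miss a just-committed write.
    public Connection readConnection() throws SQLException {
        return slave.getConnection();
    }
}
```

In practice this routing is often done transparently by a framework or a database middleware layer, but the principle is the same.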

8. Summary

This article explored the high-concurrency architecture design of a large site at the ten-million-user scale: it estimated the access pressure generated by ten million users, and analyzed how the backend withstands that concurrency through architecture design at the business-system, cache, and database layers.

But remember: the technology involved in large-site architecture goes far beyond this, covering MQ, CDN, static page generation, database and table sharding, NoSQL, search, distributed file systems, reverse proxies, and many other topics. This article cannot touch on them all; it focuses on how a system withstands tens of thousands of requests per second, from the high-concurrency angle.


Source: blog.csdn.net/suifeng629/article/details/93735141