This article is dedicated to the many programmer brothers who fought all night last night over sister Lin Chi-ling's news!

Personal WeChat official account: Shishan's Architecture Notes (ID: shishan100)

Table of Contents

(1) Why use a cache cluster?

(2) 200,000 users simultaneously accessing a hot cache key

(3) Automatic hot key discovery based on stream computing

(4) Automatically loading hot data into the JVM local cache

(5) Rate limiting and circuit breaker protection

(6) Summary

This is an article I wrote before. The reason for pouring old wine into a new bottle is the topic currently burning among programmer brothers across the country: the goddess Lin Chi-ling got married. Facing such a large hot cache, let's see how we engineering brothers should design the system architecture so it can withstand the instantaneous peak traffic from her fans!

I also hope this hot topic helps us review the technical points of hot-cache architecture design!

Without further ado, let's get to the point!

(1) Why use a cache cluster?

In fact, when using a cache cluster, what we fear most are hot keys and large values. So what are hot keys and large values?

In short, a hot key is a key in your cache cluster that is suddenly hammered by tens of thousands or even hundreds of thousands of concurrent requests. A large value is a key whose corresponding value may be gigabytes in size, causing network-related problems and failures whenever that value is queried.

Let's look at a diagram. Suppose you have a system that is itself deployed as a cluster, backed by a cache cluster. Whether that cache cluster is Redis Cluster, memcached, or a company's self-developed cache cluster, it's all fine.



So what does the system do with the cache cluster?

Very simple: put data that rarely changes in the cache. Then, when users query large amounts of this rarely-changing data, can't they read it directly from the cache?

A cache cluster's concurrency capacity is very strong, and its read performance is very high. For example, suppose you have 20,000 requests per second, but 90% of them are reads: 18,000 requests per second are reading data that rarely changes, not writing data.

Now, if you keep all that data in the database and send all 20,000 read/write requests per second to the database, is that appropriate?

Of course not. If you want a database to carry 20,000 requests per second, then sorry, you will most likely have to go for sharding (splitting databases and tables) plus read/write separation.

For example, you shard into 3 primary databases carrying the 2,000 write requests per second, and hang 3 replicas off each primary, so that 9 replicas in total carry the 18,000 read requests per second.

In that case you may need 12 high-spec database servers in total. That costs a lot of money, the overhead is very high, and it is very inappropriate.

Look at the following diagram to understand this situation.


Therefore, we can put the rarely-changing data in a cache cluster instead, say 2 masters and 2 slaves, with the master nodes taking cache writes and the slave nodes taking cache reads.

Given a cache cluster's performance, 2 slave nodes can carry the large volume of 18,000 read requests per second, while 3 database primaries carry the 2,000 write requests per second plus a small number of other read requests. That's fine.

As a result, the machines you spend on instantly become 4 cache machines + 3 database machines = 7 machines. Isn't that a big reduction in resource overhead compared to the 12 machines before?

Yes. The cache is in fact a very important part of system architecture. In many cases, for data that rarely changes but is read with high concurrency, letting a cache cluster absorb the high-concurrency reads is very appropriate.
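To make this read path concrete, here is a minimal cache-aside sketch in Java, assuming a Redis-backed cache accessed through the Jedis client; the host name, key format, TTL, and loadFromDatabase() helper are illustrative assumptions, not from the original article.

```java
import redis.clients.jedis.Jedis;

public class ArticleCache {
    // Single connection for the sketch; production code would use a JedisPool.
    private final Jedis jedis = new Jedis("cache-host", 6379); // hypothetical host

    public String getArticle(String articleId) {
        String cacheKey = "article:" + articleId;
        String cached = jedis.get(cacheKey);
        if (cached != null) {
            return cached; // the 18,000 read requests/sec land here
        }
        // Cache miss: fall back to the database, then repopulate the cache.
        String fromDb = loadFromDatabase(articleId);
        jedis.setex(cacheKey, 3600, fromDb); // rarely-changing data, cache for an hour
        return fromDb;
    }

    private String loadFromDatabase(String articleId) {
        return "article-content-" + articleId; // placeholder for a real DB query
    }
}
```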

Look at the following diagram to understand this process.


Note that all the machine counts and request volumes here are just examples. The main purpose is to give students less familiar with caching technology some background, so they can understand what it means for a cache cluster to carry a system's read requests.


(2) 200,000 users simultaneously accessing a hot cache key

Well, now that the background has been explained clearly, we can get to the issue this article really wants to discuss: hot caches.

Let's make an assumption: there are now 10 cache nodes absorbing a large number of read requests. Normally, read requests should fall evenly across the 10 cache nodes, right?

Each of these 10 cache nodes then carries roughly 10,000 requests per second.

Let's further assume that 20,000 requests per second is the limit one node can carry, so you generally cap a normal node at 10,000 requests per second, leaving a bit of buffer.

So what does the so-called hot cache problem mean? Simply put: for some sudden, inexplicable reason, a huge number of users access the same piece of cached data.

For example, the sudden announcement that Lin Chi-ling got married: wouldn't that cause hundreds of thousands of users per second, within a short time, to view this piece of hot news?

Assume this news item is one piece of cached data, corresponding to one cache key that lives on one cache machine, and that 200,000 requests per second instantly rush at that one key on that one machine.

What happens then? Let's look at the following diagram and take in this feeling of despair.


Obviously, we just assumed that a single cache slave node can carry at most 20,000 requests per second; of course, a real standalone cache carrying 50,000 to 100,000 read requests per second is also possible. This is just an assumption.

What happens when 200,000 requests per second suddenly descend on this machine? Very simple: the cache machine in the diagram above that is hit by 200,000 requests will be overloaded and go down.

And if machines in the cache cluster start going down, what happens then?

At that point, read requests that fail to find the data will pull the original data from the database and put it onto the remaining cache machines. But the ensuing flood of 200,000 requests per second will overwhelm the other cache machines in turn.

And so on, eventually leading to the collapse of the entire cache cluster and bringing down the whole system.

Let's look at the following diagram and feel this terrifying scene.



(3) Automatic hot key discovery based on stream computing

In fact, the key point with hot caches is this: your system must be able to detect a suddenly forming hot cache the moment it appears, and instantly, within milliseconds, carry out automatic load balancing.

So first of all: how do you automatically discover a hot cache problem?

First, you should know that when a cache hot spot appears, your per-second concurrency is bound to be very high; there may be hundreds of thousands or even millions of requests per second coming in. That is entirely possible.

So at this point, you can use stream computing technology from the big-data field, such as Storm, Spark Streaming, or Flink, to keep real-time statistics on how often each piece of data is accessed.

Then, once the real-time statistics detect that, say, a piece of data suddenly exceeds 1,000 accesses within one second, that data is immediately judged to be hot data, and the discovered hot keys can be written into, for example, ZooKeeper.

Of course, how your system judges what counts as hotspot data can be set according to your own business and your own experience.
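As a concrete illustration of that judgment step, here is a minimal single-process sketch of the per-second threshold logic, writing discovered hot keys to ZooKeeper as suggested above. The /hot-keys znode layout, the threshold constant, and the once-per-second flushWindow() scheduling are illustrative assumptions; in a real deployment this logic would live inside the Storm/Flink job rather than a single process.

```java
import org.apache.zookeeper.*;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class HotKeyDetector {
    private static final int THRESHOLD_PER_SECOND = 1000; // the article's example threshold
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();
    private final ZooKeeper zk;

    public HotKeyDetector(ZooKeeper zk) {
        this.zk = zk;
    }

    // Called for every cache access the statistics job observes.
    public void record(String key) {
        counters.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    // Called once per second by a scheduler: flag keys over the threshold, then reset.
    public void flushWindow() throws KeeperException, InterruptedException {
        for (Map.Entry<String, LongAdder> e : counters.entrySet()) {
            if (e.getValue().sum() > THRESHOLD_PER_SECOND) {
                String path = "/hot-keys/" + e.getKey(); // hypothetical znode layout
                if (zk.exists(path, false) == null) {
                    zk.create(path, new byte[0],
                              ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                }
            }
        }
        counters.clear(); // start the next one-second window
    }
}
```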

Look at the diagram below to see how the whole process unfolds.


At this point someone will surely ask: when your stream computing system is counting data accesses, won't it also hit the problem of a single machine being requested hundreds of thousands of times per second?

The answer is: no.

Because stream computing technology, especially a system like Storm, can first disperse the counting of accesses to one piece of data across many machines for local computation, and finally aggregate the local results on one machine for the global summary.

So hundreds of thousands of requests can first be dispersed across, say, 100 machines, each machine counting a few thousand accesses to this data.

Then the 100 locally computed results are aggregated on one machine for the global total, so counting based on stream computing will not itself develop a hotspot problem.
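Here is a rough Storm sketch of that scatter-then-aggregate idea: events are shuffled evenly across many counting tasks, and one aggregator task sums the partial counts. The AccessLogSpout and the field names are hypothetical stand-ins, and counts are cumulative rather than per-second windowed, to keep the sketch short.

```java
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import java.util.HashMap;
import java.util.Map;

// Many parallel tasks each count the slice of access events they receive.
class LocalCountBolt extends BaseBasicBolt {
    private final Map<String, Long> partial = new HashMap<>();

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String key = input.getStringByField("key");
        long count = partial.merge(key, 1L, Long::sum);
        // Emit the running partial count; the aggregator keeps only the latest
        // value per upstream task, so re-emitting does not double-count.
        collector.emit(new Values(key, count));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("key", "partialCount"));
    }
}

// A single task sums the latest partial count from every upstream task.
class GlobalAggregateBolt extends BaseBasicBolt {
    private static final long HOT_THRESHOLD = 1000; // per the article's example
    private final Map<String, Map<Integer, Long>> partials = new HashMap<>();

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String key = input.getStringByField("key");
        long partialCount = input.getLongByField("partialCount");
        Map<Integer, Long> byTask = partials.computeIfAbsent(key, k -> new HashMap<>());
        byTask.put(input.getSourceTask(), partialCount); // latest value per counting task
        long total = byTask.values().stream().mapToLong(Long::longValue).sum();
        if (total > HOT_THRESHOLD) {
            System.out.println("hot key detected: " + key); // write to ZooKeeper in practice
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) { /* terminal bolt */ }
}

public class HotKeyTopology {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        // AccessLogSpout is a hypothetical spout emitting one "key" field per access.
        builder.setSpout("access-log", new AccessLogSpout(), 4);
        // shuffleGrouping spreads the flood of events evenly over 100 counting tasks.
        builder.setBolt("local-count", new LocalCountBolt(), 100)
               .shuffleGrouping("access-log");
        // globalGrouping funnels all partial counts into one aggregating task.
        builder.setBolt("global-aggregate", new GlobalAggregateBolt(), 1)
               .globalGrouping("local-count");
        // Submit with StormSubmitter or LocalCluster in practice.
    }
}
```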



(4) Automatically loading hot data into the JVM local cache

Our own system can put a listener on the znode in ZooKeeper where the hot cache keys are written, so that the moment it changes, the system perceives it immediately.

At that point, the system layer can immediately load the relevant cached data from the database and put it directly into a local cache inside the system itself.

For this local cache, whether you use ehcache, a HashMap, or anything else depends on your own business needs. The main point here is to turn a centralized cache in the cache cluster into a local cache held directly inside each system instance; each local cache won't hold much data.

Because an ordinary single-instance deployment is roughly a 4-core 8GB machine, the space left for a local cache is very small, so using it to hold only the hot data is exactly right.
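Putting the listen-and-load step into code, here is a sketch using Apache Curator to watch a hypothetical /hot-keys znode and warm a JVM-local map. The connect string, znode path, and loadFromDatabase() helper are illustrative assumptions; a plain ConcurrentHashMap stands in for ehcache or whatever local cache your business picks.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent;
import org.apache.curator.retry.ExponentialBackoffRetry;
import java.util.concurrent.ConcurrentHashMap;

public class HotKeyLocalCacheLoader {
    // The JVM-local cache holding only the hot data.
    public static final ConcurrentHashMap<String, String> LOCAL_CACHE =
            new ConcurrentHashMap<>();

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk-host:2181", new ExponentialBackoffRetry(1000, 3)); // hypothetical ZK address
        client.start();

        // Watch the znode the stream computing job writes hot keys into.
        PathChildrenCache watcher = new PathChildrenCache(client, "/hot-keys", true);
        watcher.getListenable().addListener((c, event) -> {
            if (event.getType() == PathChildrenCacheEvent.Type.CHILD_ADDED) {
                String hotKey = event.getData().getPath().substring("/hot-keys/".length());
                // Load the hot data from the database straight into the local cache.
                LOCAL_CACHE.put(hotKey, loadFromDatabase(hotKey));
            }
        });
        watcher.start();
    }

    private static String loadFromDatabase(String key) {
        return "value-for-" + key; // placeholder for a real DB query
    }
}
```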

Suppose your system layer is deployed as a cluster of 100 machines. Then all 100 machines will instantly have a copy of the hot data in their local cache.

From then on, every read of the hot data is served straight out of each system's local cache and returned, without touching the cache cluster.

That way, it is impossible for 200,000 read requests per second to land on one cache machine to read one hot key. Instead, each of the 100 machines carries a couple of thousand requests, and each returns the data directly from its own local cache. No problem at all.
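The resulting read path might look like the following sketch, reusing the LOCAL_CACHE from the listener above; getFromCacheCluster() is a stand-in for a normal Redis/memcached read.

```java
public class HotAwareReader {
    public String get(String key) {
        // Hot keys were pre-loaded by the ZooKeeper listener, so the 200,000 req/s
        // hotspot is answered locally on each of the 100 machines.
        String local = HotKeyLocalCacheLoader.LOCAL_CACHE.get(key);
        if (local != null) {
            return local;
        }
        return getFromCacheCluster(key); // the normal, evenly-distributed path
    }

    private String getFromCacheCluster(String key) {
        return "value-from-cluster"; // placeholder for a real cache-cluster read
    }
}
```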

Let's draw a diagram and take a look at this process:



(5) Rate limiting and circuit breaker protection

In addition, within each system we should also add dedicated rate limiting and circuit breaker protection for hot data access.

Within each system instance you can add a circuit breaker mechanism. Suppose the cache cluster can carry at most 40,000 read requests per second, and you have 100 system instances in total.

Then you should set the limit accordingly: each system instance may issue at most 400 reads per second against the cache cluster. Beyond that, trip the breaker: don't send the request to the cache cluster, just return blank information, and the user can refresh the page again a bit later.

With rate limiting and circuit breaker protection applied directly at the system layer, you protect the cache cluster and database cluster behind it from being killed.
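A minimal sketch of that per-instance limit, using Guava's RateLimiter with the 400 requests/second budget from the arithmetic above (40,000 / 100 instances); the fail-fast fallback of returning blank data mirrors the suggestion in the article.

```java
import com.google.common.util.concurrent.RateLimiter;

public class CacheClusterGuard {
    // 40,000 reads/s the cache cluster can take, divided by 100 instances.
    private static final RateLimiter LIMITER = RateLimiter.create(400.0);

    public String readThroughCluster(String key) {
        if (!LIMITER.tryAcquire()) {
            // Over budget: fail fast with blank data instead of hammering the
            // cache cluster; the user can simply refresh a moment later.
            return "";
        }
        return getFromCacheCluster(key);
    }

    private String getFromCacheCluster(String key) {
        return "value-from-cluster"; // placeholder for a real cache read
    }
}
```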

One more diagram, take a look:



(6) Summary

So, concretely, should you implement this sophisticated hot-cache optimization in your system architecture? It depends on whether your own system actually has such scenarios.

If your system does have a hot cache problem, then implement a sophisticated hot-cache supporting architecture like the one here. But if not, don't over-design: your system may simply not need such a complex architecture.

If you are in the latter camp, just treat this article as a chance to learn the corresponding architectural ideas.

Finally, on the eve of the Dragon Boat Festival, to the many engineer brothers still fighting on the front line: hats off, you are the most beautiful people ^_^

END



Origin: juejin.im/post/5cf92664f265da1b9612f56c