Redis: how to solve the hot key problem

 

Introduction

After several days of articles in the database series, you must be tired of them, and the series is not even finished yet... (a million words omitted here).
So today, for a change of taste, let's write about Redis and talk about how the hot key problem is solved.
The hot key problem is simple to describe: hundreds of thousands of requests suddenly hit one fixed key on Redis, and the cache service is overwhelmed.
Real life offers plenty of examples. Say star XX gets married: requests for the key holding XX's data surge instantly, and you have a hot data problem.
PS: hot key and big key problems are both things you must understand.
This article is divided into the following parts:

  • The hot key problem
  • How to find hot keys
  • Industry solutions

Main text

The hot key problem

As described above, the hot key problem means that hundreds of thousands of requests suddenly access one particular key on Redis. The traffic becomes so concentrated that it hits the limit of the physical network card, and that Redis server goes down.
Subsequent requests for that key then go straight to your database, and your service becomes unavailable.

How to find hot keys

Method one: estimate which keys will be hot from business experience
This method is actually quite workable. For example, if a product is in a flash sale, you can conclude that the key for that product is a hot key. The shortcoming is obvious: not every business can predict which keys will be hot.
Method two: collect on the client
That is, add a line of statistics code before each Redis operation. There are many ways to collect the statistics; one is to send a notification to an external system. The drawback is that it intrudes into the client code.
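As a rough illustration of method two, here is a minimal sketch of a client wrapper that counts reads per key before passing them on. The names CountingClient, FakeRedis and drain are made up for this example, and FakeRedis is only an in-memory stand-in so the code runs without a server; a real setup would wrap your actual Redis client and ship the counts to whatever external system you use.

```python
from collections import Counter
import threading


class CountingClient:
    """Wraps a Redis-like client and counts reads per key (hypothetical helper)."""

    def __init__(self, client):
        self._client = client          # anything with a get(key) method
        self._counts = Counter()
        self._lock = threading.Lock()

    def get(self, key):
        # The "one line of statistics" added before the real Redis call.
        with self._lock:
            self._counts[key] += 1
        return self._client.get(key)

    def drain(self):
        """Return the counts collected so far and reset them, e.g. for periodic reporting."""
        with self._lock:
            snapshot, self._counts = self._counts, Counter()
        return snapshot


class FakeRedis:
    """In-memory stand-in so the sketch runs without a server (assumption)."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        return self._data.get(key)


client = CountingClient(FakeRedis({"item:42": "phone"}))
for _ in range(3):
    client.get("item:42")
client.get("item:7")
report = client.drain()
```

Calling drain() on a timer and sending the result elsewhere keeps the hot path down to one counter increment, which is about as small as the intrusion into client code can get.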
Method three: collect at the proxy layer
Some cluster architectures put a proxy, such as Twemproxy, in front of Redis as the single entry point. Collection and reporting can then be done at the proxy layer. The drawback is just as obvious: not every Redis cluster architecture has a proxy.

Method four: use Redis's own commands
(1) The monitor command captures, in real time, the commands the Redis server receives; you then write code to count which keys are hot. There are also ready-made analysis tools, such as redis-faina. Under high concurrency, however, this command risks blowing up memory usage and also degrades Redis performance.
(2) The hotkeys option: since Redis 4.0.3, redis-cli has a hot key discovery feature; just run redis-cli with the --hotkeys option. It requires maxmemory-policy to be set to one of the LFU policies, and when there are many keys it runs rather slowly.
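To make option (1) a little more concrete, here is a small sketch that counts keys in captured monitor output. It assumes each line looks like `timestamp [db client] "command" "arg" ...` and, as a simplification, treats the first argument of every command as the key, which is not true for every Redis command. The function name count_keys and the sample lines are made up for this example.

```python
import re
from collections import Counter

# One quoted argument, allowing backslash escapes inside the quotes.
_ARG = re.compile(r'"((?:[^"\\]|\\.)*)"')


def count_keys(monitor_lines):
    """Count how often each key appears as the first argument of a command."""
    counts = Counter()
    for line in monitor_lines:
        # Drop the "timestamp [db client]" prefix, keep the quoted arguments.
        _, sep, rest = line.partition("] ")
        if not sep:
            continue                  # not a command line, skip it
        args = _ARG.findall(rest)
        if len(args) >= 2:            # args[0] is the command, args[1] the key
            counts[args[1]] += 1
    return counts


sample = [
    '1339518083.107412 [0 127.0.0.1:60866] "get" "item:42"',
    '1339518087.877697 [0 127.0.0.1:60866] "get" "item:42"',
    '1339518090.420270 [0 127.0.0.1:60866] "set" "item:7" "x"',
    '1339518096.506257 [0 127.0.0.1:60866] "ping"',
]
hot = count_keys(sample)
```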
Method five: capture and analyze packets yourself
Redis clients talk to the server over TCP using the RESP protocol. You can write your own program that listens on the port, parses the captured data according to the RESP rules, and analyzes it. The drawbacks are high development cost, difficult maintenance, and possible packet loss.
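For a feel of what parsing RESP involves, here is a minimal sketch that decodes one client command, which RESP encodes as an array of bulk strings, from a captured byte buffer. It handles only that case; the capture itself, TCP stream reassembly and the other RESP types are left out. The function name parse_resp_command is made up for this example.

```python
def parse_resp_command(buf):
    """Parse one RESP array of bulk strings; return (args, bytes_consumed).

    Returns (None, 0) when buf does not yet hold a complete command.
    """
    if not buf:
        return None, 0
    if not buf.startswith(b"*"):
        raise ValueError("expected a RESP array")
    end = buf.find(b"\r\n")
    if end == -1:
        return None, 0
    count = int(buf[1:end])           # number of arguments in the command
    pos = end + 2
    args = []
    for _ in range(count):
        end = buf.find(b"\r\n", pos)
        if end == -1:
            return None, 0
        if buf[pos:pos + 1] != b"$":
            raise ValueError("expected a bulk string")
        length = int(buf[pos + 1:end])
        start = end + 2
        if len(buf) < start + length + 2:
            return None, 0            # payload not fully captured yet
        args.append(buf[start:start + length])
        pos = start + length + 2      # skip payload and trailing CRLF
    return args, pos


captured = b"*2\r\n$3\r\nGET\r\n$7\r\nitem:42\r\n*1\r\n$4\r\nPING\r\n"
args, used = parse_resp_command(captured)
```

Here args is the first command and used is how many bytes it took, so calling the function again on captured[used:] yields the next command in the buffer.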

Each of these five approaches has its strengths and weaknesses; choose according to your own business scenario. So once a hot key has been found, how do you solve the problem?

How to solve

The industry currently has two approaches.
(1) Use a second-level cache
For example, use ehcache, or even a plain HashMap. Once you have found a hot key, load it into the JVM of the application.
Requests for that hot key are then served directly from the JVM and never reach the Redis layer.
Suppose 100,000 requests arrive for the same key. Without a local cache, all 100,000 of them hit the same Redis instance.
Now suppose your application layer has 50 machines, each with a JVM cache. Spread evenly, the 100,000 requests come to 2,000 per machine, and each machine reads the value from its own JVM and returns it. This avoids having 100,000 requests hit the same Redis instance.
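The idea of the second-level cache can be sketched in a few lines. The article's setting is a JVM with ehcache or a HashMap; the sketch below uses Python only to stay short, and every name in it (LocalHotKeyCache, CountingBackend, mark_hot, the 60-second TTL) is an assumption for illustration. CountingBackend is an in-memory stand-in for Redis that records how many reads actually reach it.

```python
import time


class LocalHotKeyCache:
    """Tiny in-process cache with a time-to-live, consulted before Redis."""

    def __init__(self, backend, ttl_seconds=1.0, clock=time.monotonic):
        self._backend = backend        # anything with a get(key) method
        self._ttl = ttl_seconds
        self._clock = clock
        self._hot = {}                 # key -> (value, expires_at)
        self._hot_keys = set()

    def mark_hot(self, key):
        self._hot_keys.add(key)

    def get(self, key):
        if key not in self._hot_keys:
            return self._backend.get(key)      # normal keys always go to Redis
        entry = self._hot.get(key)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]                    # served locally, Redis untouched
        value = self._backend.get(key)
        self._hot[key] = (value, now + self._ttl)
        return value


class CountingBackend:
    """In-memory stand-in for Redis that records how many reads reach it."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._data.get(key)


backend = CountingBackend({"star:wedding": "headline"})
cache = LocalHotKeyCache(backend, ttl_seconds=60)
cache.mark_hot("star:wedding")
values = [cache.get("star:wedding") for _ in range(2000)]
```

Of the 2,000 reads on this one machine, only the first reaches the backend. The short TTL is a design choice: the local copy can be stale for at most that long, which is the price paid for keeping the traffic off Redis.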
(2) Back up the hot key
This approach is also simple: don't let every request for the key land on the same Redis instance. Store a copy of the key on several Redis instances. When a request for the hot key arrives, pick one of the backups at random, read the value, and return the data.
Assume the Redis cluster has N nodes and the hot key is stored as M = 2N copies.

Note: the number of copies need not be 2N; you can use 3N or 4N depending on the request volume.
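Pseudocode

const M = N * 2
// generate a random number in [0, M)
random = GenRandom(0, M)
// build the key of the backup copy
bakHotKey = hotKey + "_" + random
data = redis.GET(bakHotKey)
if data == NULL {
    data = GetFromDB()
    // add random jitter to the expiry so the copies do not all expire together
    redis.SET(bakHotKey, data, expireTime + GenRandom(0, 5))
}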
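If you want to play with the pseudocode above, here is one possible runnable rendering. FakeRedis is an in-memory stand-in that records the expiry without enforcing it, and read_hot_key, load_from_db, base_ttl and the key name are all made up for this example. In a real cluster the suffix is what makes the copies hash to different nodes; the stand-in only shows that reads spread over at most M keys and that the database is hit once per copy.

```python
import random


class FakeRedis:
    """In-memory stand-in; the expiry is recorded but not enforced (assumption)."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        entry = self.store.get(key)
        return entry[0] if entry else None

    def set(self, key, value, ex):
        self.store[key] = (value, ex)


def read_hot_key(redis, hot_key, n_nodes, load_from_db, base_ttl=60, rng=random):
    copies = n_nodes * 2                       # M = 2N, as in the pseudocode
    bak_key = f"{hot_key}_{rng.randrange(copies)}"
    data = redis.get(bak_key)
    if data is None:
        data = load_from_db()
        # Jitter the TTL so the copies do not all expire at the same moment.
        redis.set(bak_key, data, ex=base_ttl + rng.randint(0, 5))
    return data


redis = FakeRedis()
db_loads = []


def load_from_db():
    db_loads.append(1)                         # record each trip to the database
    return "headline"


rng = random.Random(0)
results = [read_hot_key(redis, "star:wedding", 3, load_from_db, rng=rng)
           for _ in range(1000)]
```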

Industry solutions

OK, after reading the above, you may have a question.

Brother Smoke, is there a way to discover hot keys automatically while the project is running, and have the program handle them automatically?

Good question. Let's talk about how the industry does it. It really takes only two steps:
(1) Monitor hot keys
(2) Notify the system to handle them
As it happens, Youzan published an article a few days ago titled "Youzan Transparent Multilevel Cache Solution (TMC)", which covers the hot key problem, so let's use it as the example.
(1) Monitoring hot keys
For hot key monitoring, Youzan uses method two: collect on the client.
The article "Youzan Transparent Multilevel Cache Solution (TMC)" puts it this way:

TMC modifies the JedisPool and Jedis classes of the native jedis package, and integrates into the JedisPool initialization process the initialization logic of the Hermes-SDK package, which provides TMC's "hotspot discovery" and "local cache" features.

In other words, they rewrote the native jedis jar and added the Hermes-SDK package to it.
So what is the Hermes-SDK package for?
It does hotspot discovery and local caching.
From the monitoring perspective, for every key access request made through Jedis-Client, Hermes-SDK asynchronously reports the key access event to the Hermes server cluster through its communication module, so that the cluster can do "hotspot detection" based on the reported data.

Of course, that is only one way. Some companies monitor with method five: capture and analyze packets themselves.
Concretely, first build a stream computing system with Flink. Then write a packet-capture program that listens on the Redis port and pushes the captured data to Kafka.
The stream computing system then consumes the data from Kafka and computes the statistics, which also achieves the goal of monitoring hot keys.
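The statistics step can be as simple as counting accesses per key over a sliding time window. The sketch below shows that counting logic on its own, without Flink or Kafka; the class name HotKeyDetector, the threshold and the window length are assumptions for illustration, and timestamps are assumed to arrive in non-decreasing order for each key.

```python
from collections import defaultdict, deque


class HotKeyDetector:
    """Flags a key as hot when it is seen more than `threshold` times within `window` seconds."""

    def __init__(self, threshold, window):
        self._threshold = threshold
        self._window = window
        self._events = defaultdict(deque)   # key -> timestamps inside the window

    def observe(self, key, timestamp):
        """Record one access; return True if the key is hot at this moment."""
        events = self._events[key]
        events.append(timestamp)
        # Evict accesses that have fallen out of the window.
        while events and events[0] <= timestamp - self._window:
            events.popleft()
        return len(events) > self._threshold


detector = HotKeyDetector(threshold=3, window=1.0)
stream = [("item:42", 0.0), ("item:7", 0.1), ("item:42", 0.2),
          ("item:42", 0.3), ("item:42", 0.4), ("item:7", 5.0)]
flags = [detector.observe(key, ts) for key, ts in stream]
```

In this sample only the fourth access to item:42 inside the one-second window crosses the threshold; in production the threshold would be in the thousands and the events would come from the Kafka topic.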

(2) Notifying the system to handle them
Here Youzan uses the first solution described above: a second-level cache.
Once the Hermes server cluster detects a hot key, it notifies the Hermes-SDK in each business system through various channels: "Buddy, this key is a hot key, remember to cache it locally."
Hermes-SDK then caches the key locally. On subsequent requests, Hermes-SDK sees that the key is hot, reads it straight from the local cache, and does not access the cluster.

There are other ways to notify besides this one. For example, your stream computing system can write each hot key it detects to a ZooKeeper node. Your business systems watch that node; when its data changes, they know a hot key has been found and write it to their local cache.
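The shape of that watch-and-react flow can be sketched without a ZooKeeper cluster. WatchedNode below is only an in-memory stand-in for a ZooKeeper node with watchers, not a real client API, and BusinessSystem and should_cache_locally are likewise names made up for this example.

```python
class WatchedNode:
    """In-memory stand-in for a ZooKeeper node that notifies watchers on change."""

    def __init__(self):
        self._data = frozenset()
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)
        callback(self._data)               # deliver the current state on registration

    def write(self, hot_keys):
        self._data = frozenset(hot_keys)
        for callback in self._watchers:
            callback(self._data)


class BusinessSystem:
    """Keeps a local copy of the hot-key set, updated by the watch callback."""

    def __init__(self, node):
        self.hot_keys = frozenset()
        node.watch(self._on_change)

    def _on_change(self, hot_keys):
        self.hot_keys = hot_keys

    def should_cache_locally(self, key):
        return key in self.hot_keys


node = WatchedNode()
apps = [BusinessSystem(node) for _ in range(3)]
node.write({"star:wedding"})               # the monitor publishes a hot key
```

After the write, every business system answers True for should_cache_locally("star:wedding") and can start serving that key from its local cache; writing an empty set later cools the key down again.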

How you notify is entirely up to you. This article only offers one line of thinking.

Summary

We hope this article helps you understand how to deal with the hot key problems you run into in production.

Origin www.cnblogs.com/leeego-123/p/11588429.html