Hardcore material! Redis performance optimization, worth bookmarking!

In some network service systems, the performance of Redis can be an even more important topic than the performance of disk-based databases such as MySQL. At Weibo, for example, hot posts[1] and the latest user relationships are stored in Redis, so a large number of queries hit Redis rather than MySQL.

So what performance optimizations can we apply to a Redis service? Put another way, what performance waste should we avoid?

The fundamentals of Redis performance

Before discussing optimization, we need to know that the Redis service itself has certain characteristics, such as single-threaded operation. Unless we modify the Redis source code, these characteristics are the foundation on which we reason about performance optimization.

So, what are the basic features of Redis that we need to consider? Redis's project introduction summarizes its characteristics:

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kind of values are supported.

First, Redis stores data in the virtual memory provided by the operating system, and that operating system generally means a Unix-like system. Redis can also run on Windows, but it requires special handling. If your operating system uses swap space, then some Redis data may actually end up on the hard disk.

Secondly, Redis supports persistence and can save data on the hard disk. In many cases we do need persistence for backup, data recovery and other requirements. But persistence does not happen for free; it also consumes resources.

Third, Redis reads and writes through a key-value interface, and values can be of many different types; furthermore, each data type can be stored in different underlying structures. The storage structure determines the complexity and performance overhead of creating, reading, updating and deleting data.

Finally, something the introduction above does not mention: Redis is mostly single-threaded, that is, it occupies only one CPU at a time and runs only one instruction at a time, so parallel reads and writes do not exist. The answer to the latency behind many operations can be found here.

Regarding this last characteristic: why does Redis perform well despite being single-threaded? (By Amdahl's Law, it makes more sense to optimize the most time-consuming parts.) In two sentences: Redis uses an I/O multiplexing mechanism[3], so handling client connections does not block the main thread; and Redis executes most instructions in less than 1 microsecond[4], so a single CPU core can process 1 million instructions per second (corresponding to hundreds of thousands of requests), making multi-threading unnecessary (the network is the bottleneck[5]).

Optimize network latency

The official Redis blog says in several places that the performance bottleneck is more likely to be the network[6]. So how do we optimize network latency?

First of all, if you use a stand-alone deployment (the application and Redis on the same machine), requesting the Redis service over a Unix domain socket is faster than going through the TCP loopback interface (localhost). The official documentation[7] says so, and it also makes sense in theory.

However, the business scale of many companies cannot be supported by a stand-alone deployment, so TCP is still needed.

Communication between a Redis client and server generally uses a long-lived TCP connection. If the client sends a request, waits for Redis to return the result, and only then sends the next instruction, multiple client requests form the following pattern:

(figure: two sequential request/response round trips between client and Redis)

(Note: unless the key you send is extremely long, a single TCP packet can hold an entire Redis command, so only one data packet is drawn per request)

Across these two requests, the client spends a full period of network transmission time each round.

But if possible, use multi-key commands to merge requests. For example, two GETs can be merged into a single MGET key1 key2. The number of requests in actual communication drops, and latency naturally improves.
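As a rough illustration of why fewer round trips help, here is a toy latency model (the RTT figure is an assumption for the sketch, not a benchmark; server-side execution is treated as negligible):

```ruby
# Toy latency model: charge each request/response exchange
# one network round trip; ignore server execution time.
RTT_MS = 1.0  # assumed round-trip time in milliseconds

def latency_ms(round_trips, rtt_ms = RTT_MS)
  round_trips * rtt_ms
end

latency_ms(2)  # => 2.0  (GET key1, then GET key2: two round trips)
latency_ms(1)  # => 1.0  (MGET key1 key2: one round trip)
```

Halving the round trips halves the network share of the latency, which is usually the dominant share.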

What if no multi-key command can merge the operations, for example a SET followed by a GET? What then?

There are at least two methods in Redis to combine multiple instructions into one request: one is MULTI/EXEC, the other is scripting. The former was originally a way to build Redis transactions, but it can indeed merge multiple instructions into a single request; the communication process is shown below. As for scripts, it is best to invoke a cached script by its SHA1 hash, so the traffic is smaller.

(figure: multiple commands wrapped in MULTI/EXEC travelling in a single request)

This really can reduce network transmission time, right? It can, but the keys involved in such a transaction/script must live on the same node, so weigh this as appropriate.

If, after considering the methods above, we still cannot merge multiple requests, we can also consider merging multiple responses. For example, merging two reply messages:

(figure: two replies merged into a single response transmission)

In this way, the network transmission time of one reply can theoretically be saved. This is what pipelining does. Here is an example of the Ruby client using a pipeline:

require 'redis'

@redis = Redis.new
@redis.pipelined do
  @redis.get 'key1'                # queued locally, not sent yet
  @redis.set 'key2', 'some value'  # queued; both fly out together
end
# => e.g. ["value of key1", "OK"]

It is said that some language clients, such as node_redis, even use pipelining by default to optimize latency.

In addition, it is not the case that any number of reply messages fits into a single TCP packet. If there are many requests, or a reply is very long (for example, getting a long string), TCP will still split the transmission into multiple packets; but pipelining can still reduce the number of transmissions.

Pipelining differs from the other methods above in that it is not atomic. Therefore, on a cluster, pipelines are easier to implement than those atomic methods.

To summarize:

  1. Use Unix domain sockets, if you have a stand-alone deployment

  2. Use multi-key commands to combine multiple commands and reduce the number of requests, if possible

  3. Use transactions or scripts to merge requests and responses

  4. Use pipelining to merge responses

Be wary of long-running operations

With large amounts of data, some operations take a long time to execute, such as KEYS *, LRANGE mylist 0 -1, and other instructions with O(n) algorithmic complexity. Because Redis uses only one thread for data queries, if these instructions take a long time, they block Redis and cause significant latency.

Although the official documentation says KEYS * is very fast (scanning 1 million keys takes only about 40 milliseconds on an ordinary laptop; see: https://redis.io/commands/keys), tens of milliseconds is not short for a system with high performance requirements, let alone when there are hundreds of millions of keys (a single machine may well store that many: at 100 bytes per key, 100 million keys occupy only about 10 GB), which takes correspondingly longer.
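A back-of-envelope extrapolation from the documented figure makes the point concrete (the linear scaling is the only assumption, since KEYS is O(n)):

```ruby
# KEYS * scans ~1,000,000 keys in ~40 ms on an ordinary laptop
# (per the official docs); the scan is O(n), so time grows
# linearly with the key count.
MS_PER_MILLION_KEYS = 40.0

def keys_scan_ms(key_count)
  (key_count / 1_000_000.0) * MS_PER_MILLION_KEYS
end

keys_scan_ms(1_000_000)    # => 40.0 ms
keys_scan_ms(100_000_000)  # => 4000.0 ms, i.e. 4 full seconds of blocking
```

Four seconds of a blocked main thread is an outage, not a slow query.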

Therefore, try not to use these slow instructions in production code, as the Redis author also mentioned on the blog[8]. Likewise, operations colleagues should avoid them when querying Redis by hand. Furthermore, the book Redis Essentials suggests using rename-command KEYS '' to disable this time-consuming command.
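The Redis Essentials suggestion is a one-line directive in redis.conf (renaming to the empty string disables the command entirely):

```conf
# redis.conf: disable KEYS in production
rename-command KEYS ""
```

After a restart, any client issuing KEYS gets an unknown-command error.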

In addition to these time-consuming instructions, transactions and scripts in Redis, because they combine multiple commands into one atomic execution, may also occupy Redis for a long time and deserve attention.

If you want to find the "slow instructions" used in production, you can use SLOWLOG GET count to view the most recent count instructions with long execution times. How long counts as "long" is defined by slowlog-log-slower-than in redis.conf.
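The relevant redis.conf settings look like this (the 10 ms threshold here is an illustrative choice, not a recommendation; the directives themselves are standard):

```conf
# redis.conf: record any command taking longer than 10,000 microseconds (10 ms)
slowlog-log-slower-than 10000
# keep the most recent 128 slow-log entries in memory
slowlog-max-len 128
```

Then SLOWLOG GET 10 returns the ten most recent offenders, with timestamps and durations.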

In addition, a possibly slow command that few places mention, but which the comments in the redis.conf file[9] do, is DEL. To make a long story short, when DEL targets a large object, reclaiming the corresponding memory may take a long time (even seconds). Therefore it is recommended to use the asynchronous version, UNLINK: it uses a background thread to delete the target key without blocking the main thread.

Furthermore, when a key expires, Redis generally needs to delete it synchronously. One deletion path runs 10 times per second and checks keys that have an expiration time set; these keys are stored in a global struct accessible as server.db->expires. The check works like this:

  1. Randomly take out 20 keys from it

  2. Delete the expired ones.

  3. If more than 25% of those 20 keys (that is, more than 5) were expired, Redis concludes that there are still many expired keys, so it repeats from step 1 until the exit condition is met: the proportion of expired keys in a sample is no longer that high.

The performance impact: if a great many keys really do expire at the same moment, Redis will loop through deletions, occupying the main thread.
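The three-step loop above can be sketched in a few lines (a simplification of my own for illustration, not the actual Redis source; real Redis also caps the total effort spent per cycle):

```ruby
# Active-expiry sampling loop, simplified.
SAMPLE = 20
STOP_THRESHOLD = SAMPLE / 4  # 25% of 20 = 5 expired keys

# expires: { key => expire_timestamp }; now: current timestamp.
# Returns the hash with expired keys removed.
def expire_cycle(expires, now)
  loop do
    sample  = expires.keys.sample(SAMPLE)          # step 1: random 20 keys
    expired = sample.select { |k| expires[k] <= now }
    expired.each { |k| expires.delete(k) }         # step 2: delete expired
    break if expired.size <= STOP_THRESHOLD        # step 3: repeat if > 25%
  end
  expires
end
```

If nearly every key is expired, each sample stays above the 25% threshold and the loop keeps running, which is exactly the main-thread occupation described above.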

In this regard, the Redis author's suggestion[10] is to be wary of the EXPIREAT command, because it makes simultaneous expiry more likely. I have also seen suggestions to add a random fluctuation to key expiration times. Finally, redis.conf also offers a way to make expired-key deletion asynchronous: set lazyfree-lazy-expire yes in redis.conf.
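The random-fluctuation idea can be sketched like this (the helper name and the base/jitter numbers are my own illustration, not from any library):

```ruby
# Spread out expirations so keys written together do not
# all expire in the same instant.
BASE_TTL = 3600  # one hour, in seconds
JITTER   = 300   # up to 5 extra minutes

def ttl_with_jitter(base = BASE_TTL, jitter = JITTER)
  base + rand(jitter + 1)  # uniform offset in 0..jitter seconds
end

# used e.g. as: redis.expire(key, ttl_with_jitter)
```

The expirations now land anywhere in a 5-minute window instead of a single tick, so no one expiry cycle has to delete them all.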

Optimize data structure and use correct algorithm

The efficiency of adding, deleting, modifying and checking a data type (such as string, list) is determined by its underlying storage structure.

When we use a data type, we can pay proper attention to its underlying storage structure and its algorithm, and avoid using too complicated methods. Give two examples:

  1. The time complexity of ZADD is O(log(N)), which is more complicated than adding a new element to other data types, so use it with care.

  2. If the number of fields in a Hash value is limited, Redis will likely use the ziplist structure for storage, and ziplist queries may not be as efficient as a hashtable with the same number of fields. If necessary, you can adjust Redis's storage thresholds.
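The thresholds mentioned in point 2 are set in redis.conf (the values shown are the long-standing defaults; a Hash exceeding either one is converted to a hashtable):

```conf
# redis.conf: a Hash stays in the compact ziplist encoding only while
# it has at most this many fields...
hash-max-ziplist-entries 128
# ...and every field/value is at most this many bytes
hash-max-ziplist-value 64
```

Raising these trades query speed for memory; lowering them does the opposite.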

In addition to time performance, sometimes we also need to save storage space. For example, the ziplist structure mentioned above saves more space than the hashtable structure (the author of Redis Essentials inserted 500 fields into Hashes using the hashtable and ziplist structures respectively, each field and value being a string of about 15 characters; the hashtable structure used 4 times the space of the ziplist). But space-saving data structures may have high algorithmic complexity. So we need to make trade-offs for the specific problem at hand.

How to make better trade-offs? I think I will have to dig deeper into Redis's storage structures to feel confident. We will cover that next time.

The three points above are programming-level considerations to keep in mind when writing code. The following points also affect Redis performance, but solving them requires not just code-level changes, but also architectural and operational considerations.

Consider whether the operating system and hardware affect performance

The external environment in which Redis runs, that is, the operating system and hardware, obviously also affects Redis performance. The official documentation gives some examples:

  1. CPU: Intel's various CPUs are better than AMD Opteron series

  2. Virtualization: physical machines beat virtual machines, mainly because on some VMs the disk is not a local disk, and monitoring software makes the fork call slow (fork is used for persistence), especially with Xen virtualization.

  3. Memory management: on Linux, in order to let the translation lookaside buffer (TLB) cover more memory (the TLB can only cache a limited number of pages), the operating system can make some memory pages larger, such as 2 MB or 1 GB instead of the usual 4096 bytes; these are called huge pages. To let programs use them conveniently, the OS implements a transparent huge pages (THP) mechanism that makes huge pages transparent to programmers, usable like ordinary pages. But databases do not want this mechanism, perhaps because THP makes memory compact and contiguous while, as the MongoDB documentation[11] clearly states, databases need sparse memory access. Redis is no exception, though the reason given on the official Redis blog is different: huge pages slow down fork during bgsave, and if those pages are modified in the parent process after the fork, they must be copied (copy-on-write), which wastes a lot of memory (they are huge pages, after all, so each copy is expensive). Therefore, please disable transparent huge pages in the operating system.
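Disabling THP is the standard one-liner below (it needs root and does not persist across reboots, so it is usually also added to an init script):

```shell
# Disable transparent huge pages at runtime (Linux, root required)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

Recent Redis versions print a startup warning pointing at exactly this setting when THP is enabled.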

  4. Swap space: when some memory pages are stored in the swap file and Redis needs that data, the operating system blocks the Redis process, then pages the desired data back from swap into memory. This blocks the entire process and can therefore cause latency problems. One solution is to forbid the use of swap space (as Redis Essentials suggests; if memory is insufficient, handle it by other means).
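One common way to discourage (rather than fully forbid) swapping of Redis memory is the kernel swappiness setting (shown here as an illustration; some operators prefer to remove swap entirely with swapoff):

```conf
# /etc/sysctl.conf: tell the kernel to avoid swapping except under
# real memory pressure (apply with `sysctl -p`)
vm.swappiness = 0
```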

Consider the cost of persistence

An important feature of Redis is persistence, that is, copying data to the hard disk. Persistence underpins Redis's data recovery and other capabilities.

But maintaining this persistent function also has performance overhead.

First of all, RDB: full-snapshot persistence.

This persistence method packs all the data in Redis into an rdb file stored on disk. To perform RDB persistence, the original process forks a child process, and the fork system call takes time. According to an experiment Redis Labs did six years ago[12], on a then-new AWS EC2 m1.small[13], forking a Redis process occupying 1 GB of memory took 700+ milliseconds, during which Redis could not process requests.

Although today's machines should fork faster than that, the overhead of fork should still be considered. For this reason, use a reasonable RDB persistence interval; do not snapshot too frequently.
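The interval is controlled by save lines in redis.conf, each meaning "snapshot if at least N changes happened in the last S seconds" (the values below are the historical defaults; loosen them if fork cost hurts):

```conf
# redis.conf: save <seconds> <changes>
save 900 1
save 300 10
save 60 10000
```

Any matching rule triggers a bgsave, so fewer and looser rules mean fewer forks.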

Next, let's look at another persistence method: AOF incremental persistence.

This persistence method saves the instructions sent to the Redis server in text form (the format follows the Redis protocol). Two system calls are involved: write(2), which completes synchronously, and fsync(2), which can be completed asynchronously.

Both of these may be the cause of the delay problem:

  1. write may block because the output buffer is full, or because the kernel is flushing the buffer's data to disk.

  2. fsync's job is to ensure that the data write put into the aof file actually lands on the disk. On a 7200 rpm disk this may mean a delay of about 20 milliseconds, which is quite expensive. More importantly, write may be blocked while an fsync is in progress.

Among them, write blocking seems unavoidable, since there is no better way to write data to a file. But for fsync, Redis allows three configurations; which one to choose depends on your balance between backup timeliness and performance:

  1. always: when appendfsync is set to always, fsync executes synchronously with the client's command, so it is the most likely to cause latency problems, but backup timeliness is the best.

  2. everysec: fsync executes asynchronously once per second. Redis performs better, but fsync may still block write; this is a compromise choice.

  3. no: Redis does not actively initiate fsync (not never fsync, that is unlikely); the kernel decides when to fsync.
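The compromise option from the list above is expressed in redis.conf as:

```conf
# redis.conf: enable AOF with the once-per-second fsync compromise
appendonly yes
appendfsync everysec
```

With everysec you risk losing at most about one second of writes on a crash, in exchange for near-"no"-level throughput.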

Use a distributed architecture: read-write separation and data sharding

Above, all our optimizations assumed a single-machine, single-process Redis service. Next, as the site grows, we consider using a distributed architecture to guarantee Redis performance.

First of all, under what circumstances must we (or had better) use a distributed architecture:

  1. The amount of data is so large that it cannot fit in the memory of a single server, for example 1 TB

  2. Need for high service availability

  3. The request pressure on a single server is too high

To solve these problems, you can use data sharding, master-slave separation, or both (that is, each shard node in the cluster also carries a master-slave structure).

Such an architecture can add new entry points for performance improvement:

  1. Send slow instructions to certain slave nodes for execution

  2. Run the persistence function on a rarely-used slave node

  3. Shard some huge lists

The first two exploit Redis's single-threaded nature, using other processes (or even machines) to add capacity.

Of course, using a distributed architecture may also have an impact on performance. For example, requests need to be forwarded and data needs to be continuously replicated and distributed. (To be checked)

To sum up

In fact, many other things also affect Redis performance, such as active rehashing (rehashing the main key table 10 times per second; turning it off can improve performance a little), but this post is already long. More importantly, the point is not to collect problems others have asked and memorize the solutions, but to master Redis's underlying principles and solve new problems from them.



Origin blog.51cto.com/14975073/2562996