1. Key-value design of Redis
1.1 Key structure
When defining custom Redis keys, it is best to follow three rules:
- Basic format: [business name]:[data name]:[id]
- Length does not exceed 44 bytes
- No special characters
For example, in a login business, the key that stores user information can be defined as login:user:10 ([business name]:[data name]:[id]).
Advantages:
- Readable
- Avoids key conflicts
- Easy to manage (visualization tools can display a clear hierarchical structure)
- Saves memory: keys are strings, whose underlying encodings include int, embstr, and raw. Values no longer than 44 bytes use embstr, which occupies a contiguous block of memory and has a smaller footprint.
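The naming rule above can be sketched as a small helper. This is a hypothetical illustration, not part of any library; the function name and checks are assumptions based on the three rules:

```python
def make_key(business: str, data: str, obj_id) -> str:
    """Build a key as [business name]:[data name]:[id] and enforce the guidelines."""
    key = f"{business}:{data}:{obj_id}"
    if len(key.encode("utf-8")) > 44:
        # Keys longer than 44 bytes lose the compact embstr encoding.
        raise ValueError(f"key exceeds 44 bytes: {key!r}")
    if any(ch.isspace() for ch in key):
        raise ValueError(f"key contains special characters: {key!r}")
    return key
```

For example, `make_key("login", "user", 10)` yields `login:user:10`.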
1.2 BigKey problem
Whether a key is a BigKey is usually judged from both the size of its value and the number of members it holds, for example:
- The key's value is too large: a string key whose value is 5 MB
- The key has too many members: a ZSET key with 10,000 members
- The members themselves are too large: a hash key may not have many members, but each member's value holds a large amount of data
Recommended limits:
- The value of a single key should be smaller than 10 KB
- For collection-type keys, keep the number of elements below 1,000
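The two recommended limits can be expressed as a minimal check. The thresholds are the document's suggested values, not hard Redis rules:

```python
SINGLE_VALUE_LIMIT = 10 * 1024   # 10 KB for a single key's value
COLLECTION_ELEM_LIMIT = 1000     # element count for collection-type keys

def looks_like_big_key(value_bytes: int = 0, num_elements: int = 0) -> bool:
    """Flag a key as a likely BigKey if either recommended limit is exceeded."""
    return value_bytes > SINGLE_VALUE_LIMIT or num_elements > COLLECTION_ELEM_LIMIT
```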
1. The dangers of BigKey
- Network congestion: even a small QPS of reads on a BigKey can saturate the bandwidth, slowing down the Redis instance or even the whole physical machine
- Data skew: the instance holding the BigKey uses far more memory than the others, so data cannot be balanced across the shards
- Redis blocking: operations on a hash, list, or zset with many elements take a long time and block the main thread
- CPU pressure: serializing and deserializing a BigKey drives CPU usage up, affecting the Redis instance and other applications on the same host
2. Find BigKey
- redis-cli -a <password> --bigkeys
  This built-in option of the Redis client traverses and analyzes all keys, returning overall statistics plus the Top 1 key of each data type. Note that the statistics are incomplete: the Top 1 of a rarely used data type may not actually be a BigKey, while a heavily used data type may have a Top 2 key that is also a BigKey.
- scan
  Write your own program: use SCAN to iterate over all keys, and use commands such as STRLEN and HLEN to judge each key's size. SCAN splits the key space into multiple batches, avoiding a one-shot scan over tens of millions of keys that would hurt the main thread's performance.
- Third-party tools
  Tools such as Redis-Rdb-Tools analyze RDB snapshot files and give a comprehensive view of memory usage.
- Network monitoring
  Custom tools monitor Redis network traffic and raise an alarm when a threshold is exceeded (cloud services usually provide such tooling, e.g. Alibaba Cloud).
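The SCAN-based approach can be sketched as follows. To keep the sketch self-contained, the keyspace is modeled as an in-memory dict; a real implementation would iterate a live instance in batches with SCAN and inspect keys with STRLEN/HLEN/LLEN/SCARD:

```python
def find_big_keys(dataset: dict, value_limit: int = 10 * 1024,
                  elem_limit: int = 1000) -> list:
    """Offline model of a BigKey search. str values model string keys;
    dicts, lists, and sets model collection-type keys. Thresholds follow
    the recommendations above."""
    big = []
    for key, value in dataset.items():          # a real version would SCAN in batches
        if isinstance(value, str):
            if len(value.encode("utf-8")) > value_limit:   # STRLEN
                big.append(key)
        elif isinstance(value, (dict, list, set)):
            if len(value) > elem_limit:                    # HLEN / LLEN / SCARD
                big.append(key)
    return big
```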
3. Delete BigKey
Use UNLINK for asynchronous deletion (Redis 4.0 and later). On earlier versions, delete a BigKey progressively: iterate its members with the SCAN-family commands (HSCAN, SSCAN, ZSCAN) and remove them in small batches before deleting the key itself.
4. Avoid BigKey
Example 1: when storing a User object, there are three common approaches: a single JSON string, one string key per field, or a hash.
Of the three, the hash structure is usually the best choice for objects like User: it occupies less space and allows any field of the object to be accessed individually (note that hash storage requires attention to data type conversion).
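The three approaches can be illustrated with example commands (field names and values are assumptions for illustration):

```
# 1) One JSON string per object:
SET user:10 '{"name":"jack","age":21}'

# 2) One string key per field:
MSET user:10:name jack user:10:age 21

# 3) One hash per object (usually preferred):
HSET user:10 name jack age 21
# Read a single field without fetching the whole object:
HGET user:10 name
```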
Example 2: a hash key has 1 million field/value pairs. What is wrong with this key, and how can it be optimized?
Problems:
- Once the number of entries in a hash exceeds 500, Redis switches from ZipList to a hash table encoding, which uses more memory
- The entry limit can be raised via hash-max-ziplist-entries, but a hash with that many entries becomes a BigKey in itself
Splitting the object into plain string keys is not a good alternative either: strings occupy more memory, batch retrieval is awkward, and the association the hash structure provides is lost.
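A common optimization, sketched here under assumed names, is to split the large hash into shards so each sub-hash stays within the entry limit; zlib.crc32 provides a deterministic field-to-shard mapping:

```python
import zlib

SHARD_SIZE = 500  # keep each sub-hash within the ziplist-friendly entry count

def shard_key(base_key: str, field: str, num_shards: int) -> str:
    """Map a field to one of num_shards sub-hashes, e.g. big:hash:3."""
    shard_id = zlib.crc32(field.encode("utf-8")) % num_shards
    return f"{base_key}:{shard_id}"

def plan_shards(base_key: str, fields: dict, shard_size: int = SHARD_SIZE) -> dict:
    """Group a big hash's fields into sub-hashes (each one an HSET target)."""
    num_shards = max(1, -(-len(fields) // shard_size))  # ceiling division
    shards: dict = {}
    for field, value in fields.items():
        shards.setdefault(shard_key(base_key, field, num_shards), {})[field] = value
    return shards
```

Reading a field back then means computing the same shard key and issuing HGET against that sub-hash.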
Note: set an expiration time on keys that are used briefly or infrequently, so that stale data does not accumulate into BigKeys.
2. Batch optimization
2.1 Pipeline
1. Advantages of batch processing
Execution process of a single command:
Response time of one command = 1 round-trip network transmission + 1 Redis command execution
Network transmission cost: millisecond (ms) level. Redis command execution cost: microsecond (us) level.
If N commands are executed one by one, the round-trip network time is multiplied by N while the command execution time stays negligible; the total latency grows with N and hurts Redis throughput.
If the N commands are instead sent in a single network round trip, the latency problem of batch operations is solved and efficiency improves.
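The claim above can be checked with back-of-the-envelope arithmetic. The timings are assumptions (1 ms per round trip, 10 us per command execution), chosen to match the ms-vs-us orders of magnitude stated earlier:

```python
ROUND_TRIP_US = 1000  # assumed network round trip: 1 ms, in microseconds
EXEC_US = 10          # assumed command execution: 10 us

def one_by_one(n: int) -> int:
    """Total latency in us when each command pays its own round trip."""
    return n * (ROUND_TRIP_US + EXEC_US)

def batched(n: int) -> int:
    """Total latency in us when all n commands share a single round trip."""
    return ROUND_TRIP_US + n * EXEC_US
```

For 1,000 commands this gives roughly 1.01 s one by one versus about 11 ms batched: the round trips, not the command execution, dominate.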
Redis provides many native batch commands that can insert data in bulk, for example mset and hmset.
Note: do not put too many commands into one batch, otherwise the single request occupies too much bandwidth and causes network congestion.
Since mset can only batch string-type data and hmset can only operate on hash-type data, Pipeline can be used when more complex or mixed data types are involved.
A Pipeline works like a conveyor: commands are queued into the pipeline and transmitted together.
The commands in a Pipeline are not atomic, and commands from other clients may be interleaved between them, so individual commands can still suffer queueing delay.
2.2 Batch processing under a cluster
Batch commands such as mset and Pipeline carry multiple keys in one request. If Redis runs as a cluster, all keys of a batch command must fall into the same slot, otherwise execution fails.
Solution:
hash_tag can force keys into the same slot, but it easily causes data skew, and skew in cluster mode can turn into a single-point problem. Therefore batch processing under a cluster is usually implemented with parallel per-slot requests instead. The stringRedisTemplate in the Redis package integrated by Spring encapsulates this usage; its underlying implementation is an asynchronous pipeline transmission mode.
For example: stringRedisTemplate.opsForValue().multiSet(map), where map holds the key/value pairs.
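The per-slot grouping behind this approach can be sketched as follows. crc16 implements the CCITT/XMODEM polynomial used by Redis Cluster (reference check value 0x31C3 for b"123456789"), and keyslot applies the hash_tag rule before hashing; group_by_slot is a hypothetical helper name:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM, the checksum Redis Cluster uses for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def keyslot(key: str) -> int:
    """Cluster slot of a key: hash only the {tag} part if a non-empty tag exists."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key.encode("utf-8")) % 16384

def group_by_slot(keys):
    """Group keys so each group can be sent as one batch (e.g. one MSET)."""
    groups: dict = {}
    for k in keys:
        groups.setdefault(keyslot(k), []).append(k)
    return groups
```

Keys sharing a hash tag, such as {user1}:name and {user1}:age, land in the same group and can be batched safely.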
3. Server optimization
3.1 Persistence configuration
Although Redis persistence protects data, it also brings considerable overhead, so the following recommendations generally apply:
- Redis instances used purely as caches should not enable persistence (run separate Redis instances for different businesses if needed)
- Prefer AOF persistence and turn off RDB persistence
- Run a script that periodically performs RDB on a slave node to create backups
- Set a reasonable rewrite threshold to avoid frequent bgrewriteaof
- Configure no-appendfsync-on-rewrite yes to suspend AOF fsync during rewrite and avoid the blocking it causes (at the risk of losing some data)
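The suggestions above can be collected into a redis.conf fragment; the rewrite thresholds are illustrative values, not prescriptions:

```
# Instance that needs persistence: disable RDB snapshots, use AOF
save ""
appendonly yes
appendfsync everysec

# Avoid overly frequent background rewrites (illustrative thresholds)
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# Do not fsync the AOF while a rewrite is running (may lose some data)
no-appendfsync-on-rewrite yes
```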
For more details, see the Redis persistence documentation.
Deployment suggestions:
- The physical machine hosting a Redis instance should reserve enough memory to handle fork and rewrite
- The memory limit of a single Redis instance should not be too large, e.g. 4 GB or 8 GB; this speeds up fork and reduces the pressure of master-slave synchronization and data migration
- Do not co-deploy with CPU-intensive applications (fork consumes CPU)
- Do not co-deploy with applications that have high disk load, such as databases or message queues
In short, Redis is best given a dedicated server.
3.2 Slow query
A command whose execution time exceeds a certain threshold is called a slow query. Because Redis executes commands on a single thread, each command is queued for execution; if a slow query takes a long time, the commands waiting behind it may time out.
Related configuration:
- slowlog-log-slower-than: slow query threshold, in microseconds. The default is 10000; 1000 is recommended.
Slow queries are written to the slow query log, whose length has an upper limit set by:
- slowlog-max-len: the length of the slow query log (essentially a queue). The default is 128; 1000 is recommended.
Inspecting the slow query log:
- slowlog len: query the length of the slow query log
- slowlog get [n]: read n slow query log entries
- slowlog reset: clear the slow query log
Desktop clients such as RESP can also be used to view it.
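The settings and commands above, as a sketch (values follow the recommendations here):

```
# redis.conf (or CONFIG SET at runtime)
slowlog-log-slower-than 1000   # log commands slower than 1000 us (1 ms)
slowlog-max-len 1000           # keep up to 1000 entries

# Inspecting from redis-cli:
SLOWLOG LEN       # current number of entries
SLOWLOG GET 10    # read the 10 most recent entries
SLOWLOG RESET     # clear the log
```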
3.3 Memory safety and configuration
When Redis runs short of memory, it may evict keys frequently, respond more slowly, and show unstable QPS. Once memory usage exceeds about 90%, be vigilant and locate the cause quickly.
Commands to inspect the current memory allocation of Redis:
- info memory
- memory xxx (the MEMORY family of subcommands, such as MEMORY USAGE)
Memory buffer configuration:
- Replication buffer: the master-slave replication backlog (repl_backlog_buf). If it is too small, frequent full resynchronizations may occur and hurt performance. It is set via repl-backlog-size; the default is 1mb.
- AOF buffer: the cache before AOF flushes to disk, and the buffer used while AOF rewrites. Its capacity cannot be capped.
- Client buffers: divided into input and output buffers. The input buffer is at most 1 GB and cannot be configured; the output buffer can be configured.
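The configurable buffers above map to these redis.conf directives; the output-buffer limits shown are the Redis defaults, while the 64mb backlog is an illustrative enlargement:

```
# Replication backlog; default 1mb, enlarge to avoid frequent full resyncs
repl-backlog-size 64mb

# Client output buffer limits: <class> <hard limit> <soft limit> <soft seconds>
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
```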
3.4 Cluster problems
1. Data integrity issues
With the default Redis configuration, if any slot becomes unavailable, the entire cluster stops serving external requests (controlled by the cluster-require-full-coverage option).
2. Bandwidth issues
Cluster nodes constantly ping each other to determine the status of the other nodes. Each ping carries at least:
- Slot information
- Cluster status information
The more nodes in the cluster, the larger the cluster status payload; the information for 10 nodes can reach about 1 KB, so the bandwidth consumed by inter-node communication becomes very high.
Solutions:
- Avoid large clusters. Keep the number of nodes small, preferably under 1,000; if the business is huge, build multiple clusters instead.
- Avoid running too many Redis instances on a single physical machine
- Configure an appropriate cluster-node-timeout value
3. Data skew problem
Using the same hash_tag for BigKeys or batch processing forces all keys into one slot, making some nodes hold far more data than others.
4. Client performance issues
A client connected to a Redis cluster must select nodes, compute slots, and distinguish reads from writes, which costs client-side performance.
5. Command compatibility issues
For example, if the keys of a batch command such as mset do not fall into the same slot in cluster mode, an error is reported.
6. Lua and transaction issues
The keys touched by the commands must all be in the same slot, otherwise an error is reported; unless hash tags are used to co-locate the keys, Lua scripts and transactions are effectively unusable in cluster mode.
A single (master-slave) Redis deployment already reaches tens of thousands of QPS and offers strong high availability. If master-slave replication meets the business needs, avoid building a Redis cluster.
Reference: Dark Horse (heima) Redis, from getting started to practice.