[Redis] Redis best practice/experience summary

1. Redis key-value design

1.1 Elegant key structure

Best practice conventions for keys in Redis:

  1. Follow the format [business name]:[data name]:[id], for example login:user:1 (see the key-builder sketch below)
  2. Keep the key no longer than 44 bytes
  3. Do not include special characters

The advantages of this design:

  1. Readable
  2. Avoid key conflicts
  3. Easy to manage
  4. More memory-efficient: keys are strings, whose underlying encodings are int, embstr, and raw. Strings shorter than 44 bytes use embstr, which is stored in one contiguous block of memory and has a smaller footprint
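
As a minimal sketch of this convention, key construction can be centralized in one helper so every module builds names the same way (the class and the business prefixes below are examples only, not part of any library):

// Hypothetical helper that builds keys in the [business]:[data]:[id] format
public final class RedisKeys {

    private RedisKeys() {
    }

    // login:user:{id}  -> e.g. login:user:1
    public static String loginUser(long userId) {
        return "login:user:" + userId;
    }

    // shop:cache:{id}  -> e.g. shop:cache:42
    public static String shopCache(long shopId) {
        return "shop:cache:" + shopId;
    }
}

Centralizing the prefixes makes conflicts easy to spot and keeps names short enough to stay within the embstr limit.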

1.2 Reject BigKey

BigKey usually refers to a key that occupies an unusually large amount of memory, such as a hash or zset with a huge number of elements, or a string key with a very large value. For example:

  • The value itself is too large: a String key whose value is 5 MB.
  • The key has too many members: a ZSET key with 10,000 members.
  • The members themselves are too large: a Hash key with only 1,000 members whose values add up to 100 MB.

Recommendations:

  • Keep the value of a single key below 10 KB
  • For collection types, keep the number of elements below 1,000

1.2.1 The danger of BigKey

  1. Network congestion
    • Reading a BigKey even at a modest QPS can saturate the bandwidth, slowing down the Redis instance and even the physical machine it runs on
  2. Data skew
    • The instance holding the BigKey uses far more memory than the other instances, so memory cannot be balanced across the data shards
  3. Redis blocking
    • Operations on a hash, list, or zset with many elements take a long time and block the main thread
  4. CPU pressure
    • Serializing and deserializing a BigKey drives CPU usage up, affecting the Redis instance and other applications on the same machine

1.2.2 How to discover BigKey

  1. redis-cli --bigkeys
    • The --bigkeys option of redis-cli traverses and analyzes all keys, returning overall key statistics and the single biggest key of each data type
  2. SCAN
    • Write your own program: walk all keys with SCAN and check their size with commands such as STRLEN and HLEN (MEMORY USAGE is not recommended here) — see the sketch after this list
  3. Third-party tools
    • For example the RESP client (a Navicat-like visualization tool), or analyzing RDB snapshot files to get a complete picture of memory usage
  4. Network monitoring
    • Custom tooling that monitors the traffic in and out of Redis and raises an alert when a threshold is exceeded
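
A minimal sketch of the SCAN-based approach from item 2, using Spring Data Redis (a StringRedisTemplate bean and the usual imports are assumed; the 10 KB and 1,000-element thresholds follow the recommendations above):

public List<String> findBigKeys(StringRedisTemplate redis) {
    // 1. Collect key names with SCAN instead of the blocking KEYS command
    List<String> allKeys = redis.execute((RedisCallback<List<String>>) connection -> {
        List<String> keys = new ArrayList<>();
        ScanOptions options = ScanOptions.scanOptions().match("*").count(1000).build();
        try (Cursor<byte[]> cursor = connection.scan(options)) {
            while (cursor.hasNext()) {
                keys.add(new String(cursor.next(), StandardCharsets.UTF_8));
            }
        }
        return keys;
    });

    // 2. Flag keys that exceed the recommended limits
    List<String> bigKeys = new ArrayList<>();
    for (String key : allKeys) {
        DataType type = redis.type(key);
        if (type == DataType.STRING) {
            Long bytes = redis.opsForValue().size(key);        // STRLEN
            if (bytes != null && bytes > 10 * 1024) bigKeys.add(key);
        } else if (type == DataType.HASH) {
            Long entries = redis.opsForHash().size(key);       // HLEN
            if (entries != null && entries > 1000) bigKeys.add(key);
        } else if (type == DataType.ZSET) {
            Long entries = redis.opsForZSet().size(key);       // ZCARD
            if (entries != null && entries > 1000) bigKeys.add(key);
        }
        // LIST and SET can be checked the same way with LLEN / SCARD
    }
    return bigKeys;
}

On a large keyspace this is best run against a replica or off-peak, since even a SCAN-based traversal generates load.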

1.2.3 How to delete BigKey

A BigKey takes up a lot of memory, and deleting such a key can take a long time, blocking Redis's main thread and triggering a cascade of other problems.

  1. Redis 3.0 and earlier
    • For collection types, iterate over the BigKey's elements, delete the child elements in batches first, and finally delete the key itself
  2. Redis 4.0 and later
    • Redis 4.0 introduced the asynchronous delete command UNLINK — see the sketch below
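
A rough sketch of both approaches with Spring Data Redis (StringRedisTemplate assumed; the batch size of 100 is arbitrary):

// Redis 4.0+: UNLINK removes the key immediately and frees the memory
// asynchronously in a background thread, so the main thread is not blocked
public void deleteBigKey(StringRedisTemplate redis, String key) {
    redis.unlink(key);
}

// Redis 3.0 and earlier: shrink a big hash step by step with HSCAN + HDEL,
// then delete the (now small) key itself
public void deleteBigHashLegacy(StringRedisTemplate redis, String key) {
    ScanOptions options = ScanOptions.scanOptions().count(100).build();
    try (Cursor<Map.Entry<Object, Object>> cursor = redis.opsForHash().scan(key, options)) {
        while (cursor.hasNext()) {
            redis.opsForHash().delete(key, cursor.next().getKey());
        }
    }
    redis.delete(key);
}

Lists, sets, and zsets can be shrunk the same way with commands such as LPOP, SPOP, or ZREMRANGEBYRANK before the final delete.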

1.3 Appropriate data types

1.3.1 Example 1

If we want to store a User object, there are three options:

  1. A JSON string
    • Advantages: simple to implement
    • Disadvantages: the fields are coupled together in one value, which is not flexible
  2. Flattened into separate fields, each stored as its own key
    • Advantages: any field of the object can be accessed on its own
    • Disadvantages: takes up more space, and there is no way to manage the object as a whole
  3. A hash (recommended) — see the sketch after this list
    • Advantages: uses ziplist encoding underneath, so the memory footprint is small, and any field of the object can still be accessed flexibly
    • Disadvantages: the code is slightly more complex
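
For the recommended hash layout, Spring Data Redis makes field-level reads and writes straightforward. A small sketch (the key user:1, the field names, and the values are examples only):

// Store a User object as a hash: one field per attribute
Map<String, String> user = new HashMap<>();
user.put("name", "Jack");
user.put("age", "21");
stringRedisTemplate.opsForHash().putAll("user:1", user);

// Read or update a single field without touching the rest of the object
Object name = stringRedisTemplate.opsForHash().get("user:1", "name");
stringRedisTemplate.opsForHash().put("user:1", "age", "22");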

1.3.2 Example 2

Suppose there is a hash key with 1 million field-value pairs, where the field is an auto-increment id. What problems does this key have, and how can it be optimized?

Solution 1: store all 1 million field-value pairs in a single hash


Problems:

  1. Once the number of entries in a hash exceeds the configured threshold (500 in this example; the Redis default for hash-max-ziplist-entries is 128), the encoding switches from ziplist to a hash table, which uses far more memory.
  2. The entry limit can be raised via hash-max-ziplist-entries, but keeping that many entries in one key creates a BigKey problem.

The memory usage of this storage method is as follows:

[figure: memory usage of solution 1]


Solution 2: split the data into individual string keys


Problems:

  1. The string structure has little memory optimization at this scale, so it uses a lot of memory.
  2. Fetching these values in batches is cumbersome.

The memory usage is as follows:

[figure: memory usage of solution 2]


Solution 3: split the data into small hashes, using id / 100 as the key and id % 100 as the field, so that each hash holds at most 100 elements


The memory usage is as follows:

[figure: memory usage of solution 3]

Summary: solution 3 has the smallest memory footprint and is the storage layout to use.
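
A sketch of the solution-3 layout (StringRedisTemplate assumed; the data: prefix is illustrative):

// Write one value into the bucketed-hash layout of solution 3
public void put(StringRedisTemplate redis, long id, String value) {
    String key = "data:" + (id / 100);           // bucket key, e.g. data:1234
    String field = String.valueOf(id % 100);     // field inside the bucket
    redis.opsForHash().put(key, field, value);
}

// Read a value back the same way
public String get(StringRedisTemplate redis, long id) {
    Object value = redis.opsForHash().get("data:" + (id / 100), String.valueOf(id % 100));
    return value == null ? null : value.toString();
}

With at most 100 entries per bucket, every hash stays within the ziplist encoding threshold, which is where the memory savings come from.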


2. Batch optimization

2.1 Pipeline

2.1.1 Execution process of a single command

Response time of a command = 1 round-trip network transmission time + 1 Redis command execution time



2.1.2 N commands are executed sequentially

Response time of N commands = N round-trip network transmission time + N Redis command execution time



2.1.3 Batch execution of N commands

Response time of N commands = 1 round-trip network transmission time + N Redis command execution time



2.1.4 MSET and Pipeline

Redis provides a number of MXXX commands that write data in batches, such as MSET and HMSET.

Code example:

// Define the key-value pairs to set
Map<String, String> keyValueMap = new HashMap<>();
keyValueMap.put("key1", "value1");
keyValueMap.put("key2", "value2");

// Write them all in a single round trip with MSET
redisTemplate.opsForValue().multiSet(keyValueMap);

Although the MXXX commands batch operations, they only cover certain data types. When you need to batch commands across more complex or mixed data types, the Pipeline feature is recommended:

List<Object> results = redisTemplate.executePipelined(new RedisCallback<Object>() {
    @Override
    public Object doInRedis(RedisConnection connection) throws DataAccessException {
        // Queue several commands; they are flushed to Redis in one batch
        connection.set("key1".getBytes(), "value1".getBytes());
        connection.get("key1".getBytes());
        connection.set("key2".getBytes(), "value2".getBytes());
        connection.get("key2".getBytes());
        // Must return null: the pipelined replies are collected by the template
        return null;
    }
});

// Replies come back in the order the commands were queued
for (Object result : results) {
    System.out.println(result);
}

Note:

  1. Do not pack too many commands into a single batch, otherwise the request becomes very large and occupies the network for too long
  2. The commands in a Pipeline are not executed atomically

2.2 Batch processing under the cluster

Batch operations such as MSET or Pipeline carry multiple commands in a single request. If Redis runs as a cluster, all keys in the batch must map to the same hash slot, otherwise the command fails.

There are four common approaches:

Serial commands
  • Idea: loop over the keys on the client and execute each command in turn
  • Time: N network round trips + N command executions
  • Advantage: simple to implement
  • Disadvantage: slow

Serial slot
  • Idea: compute the slot of every key on the client, group the keys by slot, pipeline each group, and execute the groups one after another
  • Time: M network round trips + N command executions (M = number of distinct slots)
  • Advantage: much less time
  • Disadvantage: slightly complex to implement; the more slots, the longer it takes

Parallel slot
  • Idea: the same grouping by slot, but the groups are pipelined in parallel
  • Time: 1 network round trip + N command executions
  • Advantage: very little time
  • Disadvantage: complex to implement

hash_tag
  • Idea: give all keys the same hash_tag so they are guaranteed to map to the same slot
  • Time: 1 network round trip + N command executions
  • Advantage: very little time and simple to implement
  • Disadvantage: prone to data skew

Note: in a Spring environment, the third approach (parallel slot) is used by default.
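
A rough sketch of the parallel-slot idea, grouping keys by slot on the client before batching each group. The slotOf() helper is hypothetical: it stands in for a real CRC16(key) % 16384 implementation that also honors {hash_tag} sections (for example, the slot calculator shipped with the Redis client in use):

// Group keys by hash slot, then run one batch per group
public void msetInCluster(StringRedisTemplate redis, Map<String, String> data) {
    // 1. Bucket the entries by slot
    Map<Integer, Map<String, String>> groups = new HashMap<>();
    data.forEach((key, value) ->
            groups.computeIfAbsent(slotOf(key), slot -> new HashMap<>()).put(key, value));

    // 2. Every group touches only one slot, so MSET (or a pipeline) is safe;
    //    the groups could also be submitted to a thread pool to run in parallel
    for (Map<String, String> group : groups.values()) {
        redis.opsForValue().multiSet(group);
    }
}

// Placeholder: plug in a real CRC16(key) % 16384 implementation here
private int slotOf(String key) {
    throw new UnsupportedOperationException("CRC16(key) % 16384 goes here");
}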


3. Server optimization

3.1 Persistent configuration

Although Redis persistence keeps data safe, it also adds considerable overhead, so follow these recommendations:

  1. Redis instances used purely as a cache should have persistence disabled whenever possible
  2. It is recommended to turn off RDB persistence and use AOF persistence instead
  3. Use a script to take RDB snapshots on a slave node periodically as a data backup
  4. Set a reasonable rewrite threshold to avoid frequent bgrewriteaof
  5. Set no-appendfsync-on-rewrite = yes so that AOF fsync is suspended during rewrite, avoiding blocking caused by AOF

Deployment-related suggestions:

  1. The physical machine hosting Redis should reserve enough memory to cope with fork and rewrite
  2. Keep the memory limit of a single Redis instance modest, for example 4 GB or 8 GB. This speeds up fork and reduces the pressure of master-slave synchronization and data migration
  3. Do not deploy Redis together with CPU-intensive applications (such as Elasticsearch)
  4. Do not deploy Redis together with disk-heavy applications (databases, message queues); deploy it on its own.

3.2 Slow query

Slow query: a command whose execution time in Redis exceeds a certain threshold is called a slow query. Slow reads and slow writes are both counted.


The slow query threshold can be specified by configuration:

  • slowlog-log-slower-than: the slow query threshold, in microseconds. The default is 10000; 1000 is recommended

Slow queries will be put into the slow query log. The length of the log has an upper limit, which can be specified by configuration:

  • slowlog-max-len: the length of the slow query log (essentially a queue). The default is 128; 1000 is recommended


Both settings can be changed at runtime with the config set command, for example: config set slowlog-log-slower-than 1000 and config set slowlog-max-len 1000.


View the list of slow query logs:

  • slowlog len: query the current length of the slow query log
  • slowlog get [n]: read the last n slow query entries
  • slowlog reset: clear the slow query log


3.3 Command and security configuration

By default Redis binds to 0.0.0.0:6379, which exposes the service to the public network, and if Redis has no authentication configured this is a serious security hole.


The core causes of the vulnerability:

  1. Redis has no password set
  2. The Redis configuration can be modified dynamically with the config set command
  3. Redis is started with root privileges

To avoid such vulnerabilities, here are some suggestions:

  1. Always set a password for Redis
  2. Never use commands such as keys, flushall, flushdb, and config set in production. They can be disabled by renaming them with rename-command
  3. bind: bind Redis to the internal network interface only; do not expose it on an external NIC
  4. Enable the firewall
  5. Do not start Redis as the root user
  6. Try not to use the default port (6379)

3.4 Memory configuration

When Redis runs short of memory, problems such as frequent key eviction, longer response times, and unstable QPS can appear. Once memory usage goes above 90%, be vigilant and locate the cause of the usage quickly.

Redis memory usage falls into three categories:

  • Data memory: the most important part, holding Redis's key-value data. The main problems here are BigKeys and memory fragmentation.
  • Process memory: memory used by the Redis process itself (code, constant pool, etc.). This is usually a few megabytes and, compared with the data, negligible in most production environments.
  • Buffer memory: client buffers, the AOF buffer, the replication buffer, and so on. Client buffers include input and output buffers. This part fluctuates a lot, and improper use (for example BigKeys) can cause memory to overflow.

Redis provides some commands to view the current memory allocation status of Redis:

  • info memory: overall statistics of Redis memory usage

  • the memory xxx subcommands, such as memory usage [key] and memory stats
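
These are plain server commands, so they can also be reached from Spring through the low-level connection. A small sketch (StringRedisTemplate assumed):

// Fetch the "memory" section of INFO and read a few well-known fields
Properties memoryInfo = stringRedisTemplate.execute(
        (RedisCallback<Properties>) connection -> connection.info("memory"));
if (memoryInfo != null) {
    System.out.println("used_memory_human = " + memoryInfo.getProperty("used_memory_human"));
    System.out.println("maxmemory_human   = " + memoryInfo.getProperty("maxmemory_human"));
}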


3.4.1 Memory buffer configuration

There are three common types of memory buffers:

  1. Replication buffer: the master-slave replication backlog repl_backlog_buf. If it is too small, frequent full resynchronization may occur and hurt performance. It is set via repl-backlog-size and defaults to 1mb
  2. AOF buffer: the buffer used before the AOF file is flushed to disk, plus the buffer used while an AOF rewrite runs. Its capacity cannot be capped
  3. Client buffers: divided into input and output buffers. The input buffer is capped at 1 GB and cannot be configured; the output buffer can be configured


The output buffer limits are controlled by the client-output-buffer-limit setting, which defines separate hard and soft limits for normal clients, replicas, and pub/sub clients.


4. Cluster Best Practices

Although a Redis cluster is highly available and can fail over automatically, improper use leads to several problems:

  1. Cluster integrity issues (slots)
  2. Cluster bandwidth problem (heartbeat mechanism)
  3. Data skew problem (BigKey)
  4. Client performance issues
  5. Cluster Compatibility Issues with Commands
  6. Lua and transaction issues

4.1 Cluster Integrity Issues

In the default configuration of Redis, if any slot is found to be unavailable, the entire cluster will stop external services:


To keep the cluster available in that situation, it is recommended to set cluster-require-full-coverage to false.


4.2 Cluster bandwidth issues

Cluster nodes constantly ping each other to determine the status of the other nodes in the cluster. Each ping carries at least:

  • slot information
  • Cluster status information

The more nodes in the cluster, the larger the cluster state information becomes; the information for 10 nodes can already reach about 1 KB, so the bandwidth consumed by inter-node traffic grows quickly. If multiple nodes are deployed on a single machine, the bandwidth pressure multiplies.

Solutions:

  1. Avoid very large clusters. The number of cluster nodes should not be too high, ideally below 1000. If the business is huge, build multiple clusters instead.
  2. Avoid running too many Redis instances on a single physical machine
  3. Configure cluster-node-timeout (the node heartbeat timeout) appropriately
  4. Increase bandwidth: provision more bandwidth or upgrade network equipment to handle high-concurrency scenarios

4.3 Data Skew Problem

In Redis cluster mode, data skew problems are often caused by the following reasons:

  1. Uneven allocation of hash slots : Redis maps all keys to hash slots, and then distributes the hash slots to each node. If too many hash slots are allocated on some nodes, it will cause some nodes to store much more data than others.
  2. Hot keys are concentrated on certain nodes : In Redis cluster mode, if some hot keys are accessed frequently, they are likely to be stored on the same node, resulting in a large load on the node.
  3. Unbalanced joining of new nodes : When a new node joins the Redis cluster, Redis will automatically allocate part of the hash slots to the new node. If the addition of new nodes is uneven, it will lead to the problem of data skew.
  4. Unbalanced node failure recovery : When a node fails, Redis will automatically redistribute the hash slots on the node to other nodes. If the load on the failed node is high, the redistributed hash slots will be concentrated on a few nodes, causing the problem of data skew.

To sum up, the data skew problem in Redis cluster mode is mainly caused by load imbalance between nodes, hotspot key concentration, new node joining and node failure recovery. In order to solve the problem of data skew, some measures need to be taken to optimize the configuration and management of Redis cluster.

Here are some ways to solve the problem of data skew in Redis cluster:

  1. Adjust hash slot allocation: Redis maps all keys to hash slots and distributes the slots across the nodes. If some nodes hold too many slots, the allocation can be adjusted manually to fix the skew. Hash slots can be migrated with the reshard command of redis-cli or with the third-party tool redis-trib.
  2. Increase the number of nodes: increasing the number of nodes can expand the Redis cluster to solve the problem of data skew. When adding nodes, Redis will automatically allocate some hash slots to new nodes.
  3. Use virtual nodes: Virtual nodes refer to dividing a physical node into multiple virtual nodes, and each virtual node is responsible for a part of hash slots. This can avoid the occurrence of excessive allocation of hash slots on a physical node.
  4. Optimize key design: data skew in a Redis cluster is often caused by a few hot keys. Their design can be optimized, for example by splitting a hot key into multiple keys (see the sketch after this list) or using a hash to store the data, thereby reducing the load on individual nodes.
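
One common way to implement the key-splitting idea from item 4 is to keep several identical copies of a read-hot value under different suffixes so reads spread across slots. A sketch (the copy count of 10 and the key naming are arbitrary, and every write must update all copies):

private static final int COPIES = 10;

// Write the same value under several suffixed keys, e.g. product:1001:0 .. :9
public void writeHotValue(StringRedisTemplate redis, String key, String value) {
    for (int i = 0; i < COPIES; i++) {
        redis.opsForValue().set(key + ":" + i, value);
    }
}

// Read a random copy so the load is spread over several slots/nodes
public String readHotValue(StringRedisTemplate redis, String key) {
    int i = ThreadLocalRandom.current().nextInt(COPIES);
    return redis.opsForValue().get(key + ":" + i);
}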

4.4 Client performance issues

Under the Redis cluster, the performance problems of the client usually include the following aspects:

  1. Cluster routing performance : Redis cluster uses a fragmentation strategy based on hash slots to ensure high availability and scalability of data. This means that each node is only responsible for part of the data, and the client needs to send the request to the correct node through the cluster routing mechanism. If the performance of the cluster routing mechanism is insufficient or there is a bottleneck, it will affect the throughput and latency of the entire system.
  2. Cluster node load balancing : In a Redis cluster, the number and configuration of nodes may be different, so the load between nodes may also be unbalanced. If the load of a certain node is too high, the performance of the node will be degraded, which in turn will affect the performance of the entire system.
  3. Read-write separation performance : In Redis clusters, read-write separation is usually used to optimize system performance. However, if the implementation of read-write separation is not good enough, problems such as read-write inconsistency and load imbalance may occur, which will affect the performance and stability of the system.
  4. Network communication performance : In a Redis cluster, frequent network communication is required between nodes in order to jointly maintain the status and data of the entire cluster. Therefore, network communication performance is also one of the important factors affecting the performance of the Redis cluster client.

In order to avoid these problems, the following measures can be taken:

  1. Choose the right client: Different Redis clients may have differences in processing cluster routing, read-write separation, etc., so you need to choose the right client based on the actual situation.
  2. Use a load balancer: Solve the problem of node load imbalance by using a load balancer.
  3. Distributed locks: Use distributed locks to ensure data consistency when different clients access the same resource concurrently.
  4. Preheating cache: Preheating the cache when the system starts to avoid performance problems caused by a large number of simultaneous influx of requests.
  5. Reasonably configure the Redis cluster: set the appropriate number of nodes, the number of hash slots and other configuration parameters to achieve the best performance and availability.

4.5 Cluster Compatibility Issues for Commands

In the Redis cluster, some commands are not supported or you need to pay attention to compatibility issues when using them. Here are some common commands and their cluster compatibility issues:

  1. KEYS command: KEYS performs a blocking traversal of the keyspace, which can drag down the performance of the whole cluster.
  2. MIGRATE command: MIGRATE is used with slot migration to move data to a target node, but before Redis 3.x slot migration was not dynamic, so use it with care.
  3. FLUSHDB and FLUSHALL commands: in a Redis cluster, FLUSHDB and FLUSHALL only clear the data of the node they are sent to, not the whole cluster.
  4. SORT command: SORT can only be used on a single node of a Redis cluster, because it must sort a complete collection rather than one spread across multiple nodes.
  5. PUBLISH command: a message published in a Redis cluster is forwarded to all nodes, so heavy pub/sub traffic adds cluster-wide bandwidth overhead and should be used with care.

In short, when using Redis cluster, you need to understand the cluster compatibility issues of each command, and take corresponding measures to ensure the stability and performance of the cluster.


4.6 lua and transaction issues

In the Redis cluster mode, since the data is scattered and stored in different nodes, the following points need to be noted when using Lua scripts or transactions for operations:

  1. Using Lua scripts: in Redis cluster mode a Lua script can be executed on any node, but all keys the script accesses through redis.call or redis.pcall must hash to the same slot. If the keys involved in the script are spread across different slots, Redis rejects the script with a cross-slot (MOVED/CROSSSLOT) error. A common workaround is to give the keys the same hash_tag — see the sketch at the end of this subsection.
  2. Using transactions: Redis transactions rely on atomicity and consistency on a single node. In cluster mode the data is spread across nodes, so when a transaction needs to touch multiple keys those keys may live on different nodes, and the atomicity and consistency of the transaction can no longer be guaranteed. Transactions are therefore not recommended in Redis cluster mode.

In general, when you need to use Lua scripts or transactions in Redis cluster mode, you need to pay special attention to the distribution of keys to ensure the correctness of operations.
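
As an illustration of keeping a multi-key script inside one slot, the keys below share the {order:1001} hash_tag, so they are guaranteed to land on the same node and the script runs without a cross-slot error (the script, key names, and values are examples only; StringRedisTemplate assumed):

// A two-key script: move "amount" from one counter to the other atomically
String lua =
        "local v = redis.call('DECRBY', KEYS[1], ARGV[1]) " +
        "redis.call('INCRBY', KEYS[2], ARGV[1]) " +
        "return v";

DefaultRedisScript<Long> script = new DefaultRedisScript<>(lua, Long.class);

// Both keys contain {order:1001}, so they hash to the same slot
Long remaining = stringRedisTemplate.execute(
        script,
        Arrays.asList("{order:1001}:from", "{order:1001}:to"),
        "10");
System.out.println("remaining = " + remaining);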


4.7 Cluster or master-slave

When considering deploying multiple Redis instances, an appropriate deployment solution should be selected based on specific business needs and architecture.

If you need high availability and require each node in the Redis cluster to hold a complete data set and be able to handle client requests, you can choose Redis cluster deployment. Redis cluster deployment uses sharding technology to disperse data across multiple nodes and provide seamless access interfaces for clients. Redis clusters can also automatically perform failover and load balancing, so they can provide higher availability and scalability.

On the other hand, if a master node plus a few slave nodes is enough to handle the read and write load and extreme availability is not required, a master-slave deployment is the simpler choice. The master handles all writes and synchronizes the data to the slaves, which serve reads only. This setup is relatively simple and covers most basic needs; however, if the master fails, a slave has to be promoted manually, which may cause some downtime.

To sum up, when choosing a deployment solution, you should consider factors such as business requirements, data volume, availability requirements, and system scalability.

Note: a single (master-slave) Redis deployment can already reach QPS on the order of tens of thousands and offers solid high availability. If master-slave replication meets the business needs, try not to build a Redis cluster.


Origin blog.csdn.net/Decade_Faiz/article/details/131346119