Redis basic principles and common mistakes

1. Redis storage

Redis keeps its data in memory, so if persistence is not configured, all data is lost when Redis restarts. To avoid this, enable persistence so that the data is saved to disk and can be recovered from disk after a restart. Redis provides two persistence methods: RDB persistence, which periodically dumps the in-memory database to disk, and AOF persistence, which appends Redis's operation log to a file.

1.1 RDB

RDB persistence writes a snapshot of the in-memory data set to disk at specified time intervals. In practice, Redis forks a child process that first writes the data set to a temporary file and, once the write succeeds, replaces the previous file with it. The file is stored in a compressed binary format.
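The write-to-a-temporary-file-then-replace step can be sketched in Python as follows. This is only an illustration of the idea, not Redis's actual implementation: the function names save_snapshot and load_snapshot are made up for this example, pickle plus zlib stands in for the real RDB format, and the fork is left out.

```python
import os
import pickle
import tempfile
import zlib


def save_snapshot(data: dict, path: str) -> None:
    """Write a compressed snapshot to a temp file, then swap it in atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(zlib.compress(pickle.dumps(data)))
            tmp.flush()
            os.fsync(tmp.fileno())
        # The previous snapshot stays intact until this rename succeeds.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path: str) -> dict:
    """Read a snapshot written by save_snapshot."""
    with open(path, "rb") as f:
        return pickle.loads(zlib.decompress(f.read()))
```

Because the old file is only replaced after the new one is completely written, a crash in the middle of a save leaves the previous snapshot usable.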

1.1.1 Advantages

1). With this method, the entire Redis database is contained in a single file, which is ideal for backups. For example, you might archive a snapshot every hour for the last 24 hours and a snapshot every day for the last 30 days. With such a backup strategy, recovery after a catastrophic failure is straightforward.

2). RDB is a very good choice for disaster recovery, because a single file is easy to compress and transfer to other storage media.

3). It maximizes performance. When persistence starts, the only thing the Redis server process has to do is fork a child process; the child then does all the persistence work, so the server process itself largely avoids disk I/O.

4). Compared with AOF, RDB starts up faster when the data set is large.

1.1.2 Disadvantages

1). If you need to minimize data loss, RDB is not a good choice: if the system goes down before the next scheduled snapshot, all data written since the last snapshot is lost.

2). Because RDB relies on a forked child process for persistence, a large data set can cause the whole server to stop serving requests for hundreds of milliseconds, or even a full second.

1.2 AOF

AOF persistence logs every write and delete operation processed by the server. Query operations are not recorded. The log is stored as text, so you can open the file and read the detailed operation history.

1.2.1 Advantages

1). This mechanism provides better data safety, that is, durability. Redis offers three sync policies for the appendfsync setting: every second (everysec), on every modification (always), and never (no). The every-second sync is performed asynchronously and is very efficient; the trade-off is that if the system goes down, modifications from the last second can be lost. Syncing on every modification can be regarded as synchronous persistence: every data change is written to disk immediately, which makes it the least efficient option. With no sync, Redis leaves it to the operating system to decide when the data is flushed to disk.

2). Because the log file is written in append mode, a crash during a write does not damage the content already in the log. If the system crashes when only half of an entry has been written, the redis-check-aof tool can repair the file before Redis is started again.

3). If the log grows too large, Redis can rewrite it automatically. During a rewrite, Redis keeps appending modifications to the old file while building a new file that also records the modification commands executed in the meantime. Data safety is therefore preserved when switching from the old file to the new one.

4). AOF keeps a clear, easy-to-understand log of all modification operations, and the data can be rebuilt from this file.
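The points above (append-only writes, the three sync policies, and rebuilding data from the log) can be illustrated with a toy Python sketch. Everything here is an assumption made for the example: the class AppendOnlyLog, the function replay, and the one-command-per-line text format are hypothetical, whereas the real AOF uses the Redis protocol format and performs the every-second fsync in a background thread.

```python
import os
import time


class AppendOnlyLog:
    """Toy append-only log supporting the 'always', 'everysec' and 'no' policies."""

    def __init__(self, path: str, policy: str = "everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")  # append mode never rewrites old content
        self.last_sync = time.monotonic()
        self.sync_count = 0

    def append(self, command: str) -> None:
        self.file.write(command.encode("utf-8") + b"\n")
        self.file.flush()  # hand the bytes to the operating system
        if self.policy == "always":
            self._sync()
        elif self.policy == "everysec" and time.monotonic() - self.last_sync >= 1.0:
            self._sync()
        # policy "no": the operating system decides when to write to disk

    def _sync(self) -> None:
        os.fsync(self.file.fileno())
        self.last_sync = time.monotonic()
        self.sync_count += 1

    def close(self) -> None:
        self.file.close()


def replay(path: str) -> dict:
    """Rebuild the data set by replaying 'SET key value' and 'DEL key' lines."""
    data = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) == 3 and parts[0] == "SET":
                data[parts[1]] = parts[2]
            elif len(parts) == 2 and parts[0] == "DEL":
                data.pop(parts[1], None)
            else:
                # Incomplete or unknown entry, such as a half-written tail: stop.
                break
    return data
```

With the "always" policy every append costs one fsync, which is why it is the safest and slowest of the three.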

1.2.2 Disadvantages

1). For the same data set, the AOF file is usually larger than the RDB file, and restoring a large data set is faster from RDB than from AOF.

2). Depending on the sync policy, AOF tends to run slower than RDB. In short, the every-second policy is fairly efficient, and with syncing disabled AOF is about as efficient as RDB.

1.3 Selection strategy

The choice depends on what the system is willing to trade. AOF sacrifices some performance in exchange for stronger data consistency. RDB favors performance: automatic backups can be left disabled while writes are frequent, and a backup is made later by running save manually. RDB is closer to eventual consistency. In production, however, the two are usually combined.

2. Redis error handling

The exception "Could not get a resource from the pool" occurs frequently when using Redis:

redis.clients.jedis.exceptions.JedisExhaustedPoolException: Could not get a resource since the pool is exhausted] with root cause;java.util.NoSuchElementException: Unable to validate object

redis.clients.jedis.exceptions.JedisException: Could not get a resource from the pool

  • The main causes are the connection pool configuration and connection release: when connections are not released in time, no connection is left available in the pool. In this case, however, redisDao already calls jedis.close().

  • For now, it is solved by restarting Redis and setting jedis=null after jedis.close().

  • Connection pool configuration parameters

    • Maximum number of connections: MAX_ACTIVE, the maximum number of connections that can be in use at the same time. Setting it too high may waste system resources.

    • Maximum time to wait for an available connection: MAX_WAIT

    • To make sure requests get a quick response, a minimum number of idle connections (setMinIdle) is sometimes maintained. When the pool is saturated, there are at most (MAX_ACTIVE-MinIdle) connections.

    • In Redis 2.4 the maximum number of client connections was hard-coded; since version 2.6 it is configurable through maxclients, whose default value is 10000, so Redis allows up to 10000 connections by default. The practical limit also depends on the hardware (CPU and memory); beyond that, the only option is to distribute the load across a cluster.

    • Refer to this link to modify the parameters: https://blog.csdn.net/jiguquan3839/article/details/90739263
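To show why releasing connections matters, here is a minimal pool sketch in Python. It is not the Jedis pool: SimplePool and PoolExhausted are names invented for this example, and max_active and max_wait only mirror the MAX_ACTIVE and MAX_WAIT parameters described above.

```python
import queue
from contextlib import contextmanager


class PoolExhausted(Exception):
    """Raised when no connection becomes available within max_wait seconds."""


class SimplePool:
    def __init__(self, factory, max_active: int, max_wait: float):
        self.max_wait = max_wait
        self._idle = queue.Queue(maxsize=max_active)
        for _ in range(max_active):
            self._idle.put(factory())

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get(timeout=self.max_wait)
        except queue.Empty:
            raise PoolExhausted("Could not get a resource from the pool") from None
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

A caller writes `with pool.connection() as conn:` and the connection returns to the pool on every code path. In Java, the comparable pattern would be a try-with-resources block around the Jedis instance, so that close() also runs when an exception is thrown.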

LOADING Redis is loading the dataset in memory

  • Increase maxmemory in redis.conf and set an eviction policy at the same time
maxmemory 30GB
maxmemory-policy allkeys-lru
# Disable AOF mode (disabled by default)
appendonly no
  • When the Redis instance starts, the persistence file (the AOF file, or the RDB dump when AOF is disabled) is loaded automatically. The available memory must be larger than the dump file, otherwise the data cannot be fully loaded.
  • If the file is larger than 6G, loading takes some time. At startup, Redis listens on the port first and then loads the AOF data, so that clients can follow the loading progress. While loading, the ping command returns LOADING, and Redis can be neither read nor written. If the node is a master, it reports loading=1 and its clients see master_link_status=down; if it is a slave, it reports loading=1, master_link_status=err, flags=slaves. Redis drivers generally treat a node as unavailable only when its flags contain s_down, o_down, or disconnected. So during loading both Redis and Sentinel consider the node able to serve normally, even though it can be neither read nor written.

redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snap

  • Appears when importing data into Redis with Python
  • Cause of the error: the Redis snapshot was forcibly stopped, so the data could not be persisted.
  • Reason: the disk was full (see the Redis log below).
  • Solution (not tried; the current approach is to restart Redis directly):
1. After connecting to the server with redis-cli, run the following command:
config set stop-writes-on-bgsave-error no

2. Edit redis.conf: open the redis.conf file used by redis-server with vi, find the string stop-writes-on-bgsave-error, and change the yes after it to no.

Error when importing data with Python: redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error.

Redis log output: # Write error saving DB on disk: No space left on device

  • The disk has run out of space; expand the disk.

3. Redis cluster and sharding

  • Horizontal: multiple hosts provide the service together, that is, multiple Redis instances run in a distributed, cooperative way.

  • Vertical: multiple Redis instances run on one machine at the same time (multi-process cooperation).

3.1 Redis Sharding based on client sharding

Redis Sharding was the multi-instance clustering method most commonly used in the industry before Redis Cluster appeared. The main idea is to hash the key of each piece of Redis data, so that the hash function maps a given key to a given Redis node. This way the client knows which Redis node to read from or write to. The sharding architecture is shown in the figure:

[figure: Redis Sharding architecture]

Fortunately, jedis, the Java Redis client, already supports Redis Sharding through ShardedJedis, and through ShardedJedisPool when combined with a connection pool.
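The key-to-node mapping behind client-side sharding can be sketched in a few lines of Python. This is a simplified stand-in that uses plain CRC32 modulo the node count; it is not how ShardedJedis is implemented, and the function name pick_node and the node addresses are hypothetical.

```python
import zlib


def pick_node(key: str, nodes: list) -> str:
    """Map a key to one node using a stable hash modulo the node count."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


# Hypothetical node addresses, for illustration only.
NODES = ["10.0.0.1:6379", "10.0.0.2:6379", "10.0.0.3:6379"]
target = pick_node("user:42", NODES)
```

Every client that uses the same hash function and the same node list sends a given key to the same node. One known weakness of plain modulo hashing is that changing the number of nodes remaps most keys.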

3.2 Redis Cluster based on server sharding

Redis Cluster is a server-side sharding technology, officially available since version 3.0. Redis Cluster shards data using the concept of slots: there are 16384 slots in total, which is somewhat similar to the idea of pre-sharding. Every key-value pair written to Redis is hashed by its key and assigned to one of the 16384 slots. The hash algorithm is fairly simple: CRC16 of the key, modulo 16384. Each node in the cluster is responsible for a share of the 16384 slots, so every slot maps to one node that handles it. When nodes are added or removed, the slots must be redistributed and the keys in them migrated. In the current implementation this process is still semi-automatic and requires manual intervention.
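The slot calculation described above (CRC16 of the key, modulo 16384) can be written out in Python as a sketch. The function names are made up for this example, the CRC16 variant assumed is XMODEM (polynomial 0x1021, initial value 0), and hash tags are deliberately left out.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the slot (0..16383) that a key hashes to."""
    return crc16_xmodem(key.encode("utf-8")) % 16384
```

For instance, key_slot("foo") evaluates to 12182. Production clients use a table-driven CRC16 for speed, but the result is the same.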

For a Redis cluster to work, the nodes behind all 16384 slots must be healthy. If a node fails, the slots it is responsible for become unavailable and the whole cluster stops working. To improve availability, the officially recommended approach is to set up each node as a master-slave group: one master node and n slave nodes. If the master fails, Redis Cluster elects one of its slaves as the new master, and the cluster keeps serving requests. This is very similar to a Redis Sharding setup whose server nodes are turned into master-slave groups with Sentinel monitoring, except that Redis Cluster provides failover and fault tolerance itself.

Redis Cluster discovers new nodes, detects failures, and performs failover by having every node communicate with the other nodes over what is called the cluster bus. The bus uses a dedicated port: the client-facing port number plus 10000. For example, a node serving clients on port 6379 talks to other nodes on port 16379. Nodes communicate using a special binary protocol.

To the client, the cluster appears as a single whole. The client can connect to any node and operate on it just like a single Redis instance. When the key being operated on is not assigned to that node, Redis returns a redirection pointing to the correct node, a bit like a 302 redirect in a browser.
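As an illustration of how a client might react to such a redirection, the following Python sketch parses a reply of the form "MOVED <slot> <host>:<port>". The helper name parse_moved is hypothetical, and a real cluster client would also cache the slot-to-node mapping it learns this way.

```python
def parse_moved(error_message: str):
    """Return (slot, host, port) from a 'MOVED <slot> <host>:<port>' reply, else None."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


redirect = parse_moved("MOVED 3999 127.0.0.1:6381")
```

After parsing, the client reconnects to the returned host and port and sends the same command again.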

Redis Cluster was only released with Redis 3.0, which is relatively late. So far there are few proven success cases in large-scale production environments, and it still needs the test of time.

3.3 Proxy middleware realizes large-scale Redis cluster

Proxy middleware such as twemproxy sits between the client and the server: it processes each client request (for example, applies sharding) and forwards it to the real Redis server on the backend. In other words, the client does not access the Redis server directly, but indirectly through the twemproxy proxy middleware.

3.4 Current deployment

In the current deployment, data is shared by connecting to remote Redis instances. Distributed storage is achieved by splitting the data manually and importing each part separately.

The client only reads redis data.

4. Redis high availability

In stand-alone mode, if the single server goes down, the service can no longer be provided, and the applications that depend on it may degrade or become unavailable. Redis therefore provides several high-availability solutions:

  • Redis master-slave replication
  • Redis persistence
  • Sentinel cluster
  • ……

Currently deployed (107):

nc -vz IP PORT

A crontab job checks the status of the IP and port every minute.
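Since Python is used for daily processing anyway, the same check can be sketched with the standard socket module. This is a rough equivalent of nc -vz, not the script actually deployed; the function name port_is_open and the timeout value are assumptions.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Note that an open port only shows that something is listening. As described in the LOADING case above, Redis can accept connections while still being unable to serve reads or writes.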

5. Redis compressed storage

5.1 Intuitive Strategy

  • Simplify key names and key values
  • Use Integer/Long Integer
  • Shared with
  • Use byte[] type storage

5.2 Internal coding optimization

  • Save more space according to Redis internal coding rules

Redis provides two internal encodings for each data type. Take the hash type as an example. It is normally implemented as a hash table, which gives O(1) lookups and assignments. But when a key holds only a few elements, O(1) operations offer no significant advantage over O(n), so Redis switches to a more compact internal encoding that trades some speed for space (retrieving an element takes O(n)).

The conversion is automatic and the encoding cannot be chosen directly, although the thresholds that trigger it can be tuned in redis.conf (for example hash-max-ziplist-entries and hash-max-ziplist-value).
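The trade-off between the two encodings can be mimicked with a toy Python class. It only illustrates the idea and says nothing about Redis's real memory layout: the class SmallHash, its method names, and the parameter max_compact_entries are all invented for this sketch.

```python
class SmallHash:
    """Keeps fields in a flat list while small, then converts to a dict."""

    def __init__(self, max_compact_entries: int = 4):
        self.max_compact_entries = max_compact_entries
        self.encoding = "compact"
        self._pairs = []    # flat [(field, value), ...], scanned linearly: O(n)
        self._table = None  # dict used after conversion: O(1)

    def hset(self, field, value) -> None:
        if self.encoding == "hashtable":
            self._table[field] = value
            return
        for i, (existing, _) in enumerate(self._pairs):
            if existing == field:
                self._pairs[i] = (field, value)
                return
        self._pairs.append((field, value))
        if len(self._pairs) > self.max_compact_entries:
            # One-way automatic conversion once the threshold is exceeded.
            self._table = dict(self._pairs)
            self._pairs = []
            self.encoding = "hashtable"

    def hget(self, field):
        if self.encoding == "hashtable":
            return self._table.get(field)
        for existing, value in self._pairs:
            if existing == field:
                return value
        return None
```

While the hash is small the linear scan touches only a handful of entries, so the O(n) lookup costs little and the flat layout carries no per-entry table overhead.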

6. Redis high concurrency (Redis advantage)

1. Redis is based on memory, and the read and write speed of memory is very fast;

2. Redis executes commands on a single thread, which avoids the cost of thread context switches;

3. Redis uses I/O multiplexing to handle concurrent connections. Its non-blocking I/O is built on epoll plus a simple event framework of its own. Reads, writes, closes, and new connections are all turned into events, and epoll's multiplexing ensures that no time is wasted waiting on I/O.
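The multiplexing idea in point 3 can be tried out with Python's selectors module, which picks epoll on Linux when it is available. This is a sketch of the technique, not of Redis's event loop; the function handle_ready and the uppercase reply are made up for the example.

```python
import selectors
import socket


def handle_ready(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Run one multiplexing step: serve every socket that is ready to be read."""
    handled = 0
    for key, _ in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())  # reply with the uppercased payload
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
        handled += 1
    return handled


# One thread watches two connections at once.
sel = selectors.DefaultSelector()
client_a, server_a = socket.socketpair()
client_b, server_b = socket.socketpair()
for s in (server_a, server_b):
    s.setblocking(False)
    sel.register(s, selectors.EVENT_READ)
```

A single thread calls handle_ready in a loop and only touches sockets that already have data, so it never blocks on one idle connection while another is waiting.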

7. Miscellaneous

Although the Java language is used in the back-end framework, I still prefer to use python in daily processing. I feel this is a bad habit...

Origin blog.csdn.net/MaoziYa/article/details/114269574