Introduction to 4 Redis Cluster Schemes + Comparison of Advantages and Disadvantages

In service development, a single machine has a single-point-of-failure problem: the service is deployed on one server, and once that server goes down, the service becomes unavailable. To make the service highly available, distributed deployment is used: the same service is deployed on multiple machines, so that even if several servers go down, the service remains available as long as one server is still up.

The same is true for Redis. To solve the single-machine failure problem, the master-slave mode was introduced, but it has a drawback: after the master node fails, the service must be manually switched from a slave node over to master before it is restored. To solve this problem, Redis introduced the sentinel mode, which automatically promotes a slave node to master after the master fails, restoring the service without manual intervention.

However, neither the master-slave mode nor the sentinel mode achieves true sharded data storage: each Redis instance stores the full data set. Redis Cluster was therefore created to realize true sharded storage. But since Redis Cluster was released relatively late (the official version only arrived in 2015), major companies could not wait and successively developed their own Redis sharding cluster solutions, such as Twemproxy and Codis.

master-slave mode

Although a single Redis node can persist data to disk through the RDB and AOF persistence mechanisms, the data still lives on one server. If that server has a problem such as a hard disk failure, the data becomes unavailable for both reads and writes. Moreover, without read-write separation, all reads and writes hit the same server, and I/O bottlenecks appear when the request volume is large.

To avoid a single point of failure and the lack of read-write separation, Redis provides the replication function: after data in the master database is updated, the updates are automatically synchronized to the slave databases.

The Redis master-slave structure has these characteristics: a master can have multiple slave nodes, and a slave node can itself have slave nodes, forming a cascade structure.

Advantages and disadvantages of master-slave mode

  • Advantages: the master-slave structure provides read-write separation (improving efficiency), data backup, and multiple copies of the data.

  • Disadvantages: the biggest drawback is that the master-slave mode has no automatic fault tolerance or recovery. If the master node fails, the cluster cannot work and availability is low; manual intervention is required to promote a slave node to master.

In the ordinary master-slave mode, when the master database crashes, you must manually switch a slave database over to become the master (a code sketch follows these steps):

  • Use the SLAVEOF NO ONE command on the slave database to promote it to master so that it continues to serve.

  • Bring the previously crashed master database back up, then use the SLAVEOF command to make it a slave of the new master so that it synchronizes data.
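
The same two steps can be scripted with a client library. Below is a minimal sketch using the Jedis client; the host addresses and ports are illustrative assumptions.

import redis.clients.jedis.Jedis;

public class ManualFailover {
    public static void main(String[] args) {
        // Step 1: promote the chosen slave to master (equivalent to SLAVEOF NO ONE).
        try (Jedis slave = new Jedis("10.0.0.2", 6379)) {
            slave.slaveofNoOne();
        }
        // Step 2: once the old master is back up, demote it to a slave of the
        // new master so that it resynchronizes (equivalent to SLAVEOF <host> <port>).
        try (Jedis oldMaster = new Jedis("10.0.0.1", 6379)) {
            oldMaster.slaveof("10.0.0.2", 6379);
        }
    }
}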

sentinel mode

In the master-slave replication mode above, when the master server goes down, a slave server must be manually switched over to master. This requires manual intervention, is laborious, and leaves the service unavailable for a period of time. That is where the sentinel mode comes in.

The sentinel mode has been available since Redis version 2.6, but the implementation in that version was unstable; the sentinel mode did not stabilize until Redis version 2.8.

The core of the sentinel mode is still master-slave replication. Compared with the plain master-slave mode, when the master node goes down and can no longer accept writes, there is an additional election mechanism: a new master is elected from all the slave nodes. The election is carried out by sentinel processes running in the system.

The sentinel itself also has a single-point-of-failure problem, so in a one-master, multi-slave Redis system, multiple sentinels can be used for monitoring. The sentinels monitor not only the master and slave databases but also each other. Each sentinel is an independent process that runs on its own.

(1) The role of sentinel mode:

Monitoring: the sentinel sends commands that return the running status of the monitored servers, covering both the master server and the slave servers; the sentinels also monitor each other.

Failover: when the sentinels detect that the master is down, they automatically switch a slave over to be the new master and then, through the publish-subscribe mechanism, notify the other slave servers to modify their configuration files and follow the new master. The problematic old master also becomes a slave of the new master; in other words, even after the old master recovers, it does not regain its original master status but joins as a slave of the new master.

(2) Sentinel implementation principle

When the sentinel process starts, it reads the configuration file and locates the master database to monitor through the following setting:

sentinel monitor <master-name> <ip> <port> <quorum>
# master-name is the name of the master database
# ip and port are the address and port of the current master database
# quorum is the number of sentinel nodes that must agree before a failover is performed

Only the master node needs to be specified here because the sentinel obtains the slave node information from the master's info command output and uses it to establish connections with the slaves; newly added slave nodes are discovered the same way, through the master's info output.

A single sentinel node can monitor multiple master nodes, but this is not recommended, because if that sentinel crashes, failover for multiple clusters breaks at the same time. After starting, the sentinel establishes two connections to the master database:

  • Subscribe to the master database's __sentinel__:hello channel to learn about other sentinels that monitor the same database.

  • Periodically send the info command to the master database to obtain information about the master database itself.

After these connections to the master database are established, the sentinel performs the following three operations periodically:

  • (1) Every 10s, send the info command to the master and slaves to obtain up-to-date database information. For example, when a new slave node is found, a connection is established and it is added to the monitoring list; when the master-slave roles change, the recorded information is updated.

  • (2) Every 2s, publish its own information to the __sentinel__:hello channel of the master and slave databases, sharing its monitoring data with the other sentinels. Each sentinel subscribes to the databases' __sentinel__:hello channel; when a sentinel receives a message, it checks whether the sender is a new sentinel and, if so, adds it to the sentinel list and establishes a connection (see the sketch after this list).

  • (3) Every 1s, send the ping command to all master-slave nodes and all sentinel nodes to check whether they are alive.
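
To see the hello channel in action, any Redis client can subscribe to __sentinel__:hello on the monitored master and print the messages the sentinels publish every 2s. A small illustrative sketch using Jedis (assuming a recent Jedis version; the host and port are assumptions):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class SentinelHelloWatcher {
    public static void main(String[] args) {
        try (Jedis master = new Jedis("10.0.0.1", 6379)) {
            // subscribe() blocks and invokes onMessage for each hello packet;
            // a packet carries the announcing sentinel's address and run ID
            // plus its view of the monitored master.
            master.subscribe(new JedisPubSub() {
                @Override
                public void onMessage(String channel, String message) {
                    System.out.println(channel + " -> " + message);
                }
            }, "__sentinel__:hello");
        }
    }
}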

(3) Subjective offline and objective offline

When a sentinel sends a ping command and the node fails to reply within a configured period (down-after-milliseconds), the sentinel considers the node subjectively offline: this sentinel alone believes the node is down. If that node is the master database, the sentinel goes further to decide whether a failover is needed: it sends the SENTINEL is-master-down-by-addr command to ask the other sentinels whether they also consider the master subjectively offline. Once the number of agreeing sentinels reaches the configured quorum, the master is considered objectively offline.

When the master node objectively goes offline, a master-slave switchover is required. The steps of the master-slave switchover are:

  • Elect a leader sentinel.

  • The leader sentinel selects, from all slaves of the failed master, the slave database with the highest priority; the priority can be set via the slave-priority option.

  • If the priorities are the same, the slave with the larger replication offset (that is, the one that has copied more data from the master and therefore has newer data) has higher priority.

  • If the above conditions are the same, select the slave database with the smaller run ID.

After a slave database is selected, the sentinel sends it the SLAVEOF NO ONE command to promote it to master, and sends the SLAVEOF command to the other slave nodes to point them at the new master database.
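
On the application side, a client should not hard-code the master's address, since a failover moves it; instead it asks the sentinels for the current master. A minimal sketch using Jedis's JedisSentinelPool, where the sentinel addresses and the master name mymaster are assumptions:

import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class SentinelClient {
    public static void main(String[] args) {
        Set<String> sentinels = new HashSet<>();
        sentinels.add("10.0.0.11:26379");
        sentinels.add("10.0.0.12:26379");
        sentinels.add("10.0.0.13:26379");

        // The pool asks the sentinels for the address of "mymaster" and
        // reconnects to the newly elected master after a failover.
        try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels);
             Jedis jedis = pool.getResource()) {
            jedis.set("greeting", "hello");
            System.out.println(jedis.get("greeting"));
        }
    }
}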

(4) Advantages and disadvantages of sentinel mode

1. Advantages

  • The sentinel mode is based on the master-slave mode and solves the problem that, in the plain master-slave mode, a master failure cannot be failed over automatically.

2. Disadvantages

  • It is a centralized cluster scheme: there is always exactly one Redis host receiving and processing write requests, so writes are limited by the bottleneck of a single machine.

  • All nodes in the cluster store the full data set, which wastes memory and does not truly realize distributed storage. When the data volume is too large, master-slave synchronization seriously affects the master's performance.

  • After the Redis host goes down and while voting is still in progress, no one knows which node is the master and which are the slaves; during this window Redis enables a protection mechanism and prohibits write operations until a new Redis host is elected.

In both the master-slave mode and the sentinel mode, each node stores the full data set. When the data volume grows too large, the data has to be partitioned and stored across multiple Redis instances; this is where Redis sharding technology comes in.

Redis cluster solutions from major companies

Before version 3.0, Redis supported only the single-instance mode. Although Redis developer Antirez had long since proposed on his blog to add a cluster feature in Redis 3.0, that version was not released until 2015. Major companies could not wait any longer: before 3.0 came out, in order to solve Redis's storage bottleneck, they launched their own Redis cluster solutions one after another. The core idea of these solutions is to shard data across multiple Redis instances, each shard being one Redis instance.

(1) Client-side sharding

Client-side sharding puts the sharding logic in the Redis client (for example, Jedis already supports Redis sharding through its ShardedJedis class): routing rules predefined in the client (using consistent hashing) forward the access for each key to the appropriate Redis instance, and the client collects the returned results when querying data.
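
As an illustration, here is a minimal client-side sharding sketch with Jedis's ShardedJedis; the hosts and ports are assumptions:

import java.util.Arrays;
import java.util.List;

import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.ShardedJedis;

public class ClientShardingExample {
    public static void main(String[] args) {
        // Each JedisShardInfo describes one independent Redis instance.
        List<JedisShardInfo> shards = Arrays.asList(
                new JedisShardInfo("10.0.0.1", 6379),
                new JedisShardInfo("10.0.0.2", 6379));

        // ShardedJedis places the shards on a consistent-hash ring
        // (MURMUR_HASH by default) and routes each key accordingly.
        try (ShardedJedis jedis = new ShardedJedis(shards)) {
            jedis.set("user:1001", "alice");   // lands on one shard
            jedis.set("user:2002", "bob");     // possibly a different shard
            System.out.println(jedis.get("user:1001"));
        }
    }
}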

Pros and cons of client-side sharding:

Advantages: with client-side sharding based on consistent hashing, all the logic is controllable and does not depend on any third-party distributed middleware. The server-side Redis instances are independent and unrelated to each other, each running like a standalone server, so linear scaling is easy and the system is very flexible. Developers know exactly how sharding and routing are implemented, so they do not have to worry about hidden traps.

1. Consistent hash algorithm:

Consistent hashing is a commonly used algorithm in distributed systems. Consider a distributed storage system that must place data on specific nodes: if an ordinary hash such as mod(key, d) is used to map data to nodes, where key is the data's key and d is the number of machine nodes, then whenever a machine joins or leaves the cluster, virtually all of the data mappings become invalid.

The consistent hashing algorithm solves the poor scalability of the ordinary modulo-based hash, ensuring that when a server goes online or offline, as many requests as possible still hit the server they were originally routed to.

2. Implementation: consistent hashing algorithms, such as the MURMUR_HASH algorithm and the Ketama hash algorithm.

For example, the Redis sharding implementation in Jedis uses consistent hashing to hash both the key and the node name and then match them on the ring; the algorithm used is MURMUR_HASH.

The main reason for using consistent hashing instead of simple modulo-style mapping is that when nodes are added or removed, there is no wholesale rehashing caused by re-matching: consistent hashing affects only the key distribution near the adjacent nodes, so the impact is small.
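
To make the ring idea concrete, here is a minimal self-contained sketch (not Jedis's actual implementation): each node is hashed to many virtual points on a ring kept in a sorted map, and a key belongs to the first node clockwise from its hash, so adding or removing a node only remaps the keys between it and its neighbor. CRC32 stands in for MURMUR_HASH/Ketama here.

import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

public class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    private long hash(String s) {
        // Any uniform hash works; CRC32 is used here for simplicity.
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public void addNode(String node) {
        // Place many virtual points per node to spread keys evenly.
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    public String nodeFor(String key) {
        // First virtual point clockwise from the key's hash; wrap at the end.
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHashRing r = new ConsistentHashRing(100);
        r.addNode("redis-a");
        r.addNode("redis-b");
        System.out.println(r.nodeFor("user:1001"));
        r.addNode("redis-c"); // only keys near redis-c's points move
        System.out.println(r.nodeFor("user:1001"));
    }
}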

Disadvantages:

  • It is a static sharding scheme: when the number of Redis instances is increased or decreased, the sharding program has to be adjusted by hand.

  • Operation and maintenance costs are relatively high. Any problem with the cluster's data requires operations staff and developers to work together, which slows down problem solving and adds cross-department communication costs.

  • Maintaining the same routing and sharding logic across different client programs is very costly. For example, if a Java project and a PHP project share one Redis cluster, the routing logic has to be written twice and maintained as two separate implementations.

One of the biggest problems with client-side sharding is that when the topology of the server-side Redis instances changes, every client must be updated and adjusted. If the client's sharding module is extracted into a separate module (middleware) that acts as a bridge between client and server, this problem is solved; that is proxy sharding.

(2) Proxy sharding

Twemproxy, an open-source Redis proxy from Twitter, is the most widely used Redis proxy sharding solution. Its basic principle: operating as middleware, Twemproxy receives requests from Redis clients, forwards each request to the correct Redis instance according to its routing rules, and finally aggregates the results back to the client.

Twemproxy introduces a proxy layer that manages multiple Redis instances in a unified way, so the Redis client only operates on Twemproxy and does not need to care how many Redis instances sit behind it, thereby realizing a Redis cluster.
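
Because Twemproxy speaks the Redis protocol, the client code stays exactly the same as for a single instance; it simply points at the proxy. A sketch, where the proxy host and the listen port 22121 (a port commonly used in Twemproxy's examples) are assumptions:

import redis.clients.jedis.Jedis;

public class TwemproxyClient {
    public static void main(String[] args) {
        // Same client code as for a single Redis instance; Twemproxy
        // routes each key to the right backend Redis behind the scenes.
        try (Jedis jedis = new Jedis("twemproxy.internal", 22121)) {
            jedis.set("user:1001", "alice");
            System.out.println(jedis.get("user:1001"));
        }
    }
}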

Advantages of Twemproxy:

  • The client connects to Twemproxy like a Redis instance, without changing any code logic.

  • Automatic ejection of failed Redis instances is supported.

  • Twemproxy maintains a connection with the Redis instance, reducing the number of connections between the client and the Redis instance.

Disadvantages of Twemproxy:

  • Since each request from the Redis client goes through the Twemproxy proxy to reach the Redis server, there will be a performance loss in this process.

  • There is no friendly monitoring and management background interface, which is not conducive to operation and maintenance monitoring.

  • The biggest pain point of Twemproxy is that it cannot expand/shrink smoothly. For operation and maintenance personnel, the workload is very heavy when adding Redis instances due to business needs.

As the most proven and stable Redis proxy, Twemproxy is widely used in the industry.

(3) Codis

Twemproxy's inability to smoothly add Redis instances caused great inconvenience, so Wandoujia (Pea Pod) independently developed Codis, a Redis proxy that supports smoothly adding Redis instances. It was developed in Go and C and was open-sourced on GitHub in November 2014.

Codis introduces the Redis Server Group: each group has one designated master CodisRedis and one or more slave CodisRedis, which is how the Codis cluster achieves high availability. When a master CodisRedis goes down, Codis does not automatically promote a slave to master, because that raises data consistency issues (Redis itself synchronizes data with asynchronous master-slave replication, so when a write succeeds on the master CodisRedis there is no guarantee that it has reached the slaves); the administrator must manually promote a slave CodisRedis to master on the management interface.

Since manual handling is troublesome, Wandoujia also provides a tool, Codis-ha: when it detects that a master CodisRedis is down, it takes that node offline and promotes a slave CodisRedis to master.

Codis uses pre-sharding: when it starts, 1024 slots are created. A slot is like a box with a fixed number, and the slots are used to store keys. Which box a key goes into is determined by the algorithm crc32(key) % 1024, which yields a number in the range 0 to 1023; the key is placed into the slot with that number.

For example, if the number obtained for a key through crc32(key) % 1024 is 5, the key is put into the slot (box) numbered 5. A slot can be assigned to only one Redis Server Group; a single slot cannot span multiple Redis Server Groups. One Redis Server Group can hold at least 1 and at most 1024 slots, so Codis can contain at most 1024 Redis Server Groups.
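
The slot arithmetic from the text can be illustrated with Java's built-in CRC32 (in a real deployment Codis performs this mapping itself; this sketch only shows the calculation):

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CodisSlot {
    // Codis pre-creates 1024 slots; crc32(key) % 1024 picks the slot,
    // and each slot is owned by exactly one Redis Server Group.
    static int slotFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 1024);
    }

    public static void main(String[] args) {
        System.out.println(slotFor("user:1001")); // a number in 0..1023
    }
}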

The biggest advantage of Codis is that it supports smoothly increasing (or decreasing) the number of Redis Server Groups (Redis instances) and can migrate data safely and transparently. This is where Codis differs from static sharded Redis solutions such as Twemproxy. After a Redis Server Group is added, slot migration is involved.

For example, suppose the system has two Redis Server Groups with the 1024 slots divided between them, say Group 1 holding slots 0-511 and Group 2 holding slots 512-1023.

When a Redis Server Group is added, the slots are reassigned. Codis provides two ways to allocate slots:

The first method: redistribute manually with the Codis management tool codis-config, specifying the range of slots each Redis Server Group owns; for instance, after adding a third group, part of each existing group's slot range can be assigned to it.

The second method: use the rebalance function of codis-config, which automatically migrates slots according to the memory usage of each Redis Server Group until the data is balanced.

Redis Cluster

Although Redis's sentinel mode can achieve high availability and read-write separation, it has several deficiencies:

  • In sentinel mode, every Redis server stores the same data, which wastes memory; and when the data volume is too large, master-slave synchronization seriously affects the master's performance.

  • Sentinel mode is a centralized cluster scheme: every slave is tightly coupled to the master, and the service is unavailable from the moment the master goes down until a slave has been elected as the new master.

  • Sentinel mode still has only one Redis host receiving and processing write requests; writes remain limited by the single-machine bottleneck, so a truly distributed architecture is not achieved.

Redis added the Cluster mode in version 3.0 to realize distributed storage: different data is stored on different Redis nodes. To overcome the limited capacity of a single machine, the cluster mode distributes data across multiple machines according to certain rules, so memory and QPS are no longer bounded by a single machine, and the system gains the high scalability of a distributed cluster.

Redis Cluster is a server-side sharding technology (sharding and routing are implemented on the server side) that uses multiple masters and multiple slaves: each shard consists of one Redis master and several slaves, and the shards work in parallel, independent of one another. The Redis Cluster adopts a P2P model and is completely decentralized.

It is officially recommended to deploy at least 3 master nodes, and a six-node, 3-master/3-slave layout is best. A Redis Cluster has the following characteristics:

  • The cluster is completely decentralized and uses multiple masters and multiple slaves; all Redis nodes are interconnected (the PING-PONG mechanism) and use a binary protocol internally to optimize transmission speed and bandwidth.

  • The client connects directly to a Redis node without an intermediate proxy layer; it does not need to connect to all nodes in the cluster, only to any available node (see the sketch after this list).

  • Each shard consists of one Redis master and multiple slaves, and the shards are parallel to each other.

  • Each master node maintains part of the slots and the key-value data mapped to those slots; every node in the cluster holds the full slot map, so through the slots each node knows which node stores any given piece of data.
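
As noted in the list above, a cluster-aware client only needs one reachable node: it learns the slot map from the cluster and routes each command itself. A minimal sketch using Jedis's JedisCluster, where the node address is an assumption:

import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class ClusterClient {
    public static void main(String[] args) throws Exception {
        Set<HostAndPort> seeds = new HashSet<>();
        seeds.add(new HostAndPort("10.0.0.1", 7000)); // any reachable node is enough

        // JedisCluster fetches the slot-to-node map and sends each command
        // directly to the master that owns the key's slot.
        try (JedisCluster cluster = new JedisCluster(seeds)) {
            cluster.set("user:1001", "alice");
            System.out.println(cluster.get("user:1001"));
        }
    }
}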

Redis Cluster is mainly aimed at scenarios combining massive data, high concurrency, and high availability. If you have a very large amount of data, Redis Cluster is recommended; when the data volume is modest, sentinel mode is sufficient. Redis Cluster's performance and availability are better than sentinel mode's.

Redis Cluster adopts virtual hash slot partitioning instead of consistent hashing: 16384 slots are pre-allocated, and every key is mapped to one of these slots by a hash function (CRC16(key) mod 16384). The master node in each shard maintains part of the slots and the key-value data mapped to them.
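
For reference, a sketch of that slot calculation in Java: a bitwise CRC16-CCITT (XMODEM), the variant Redis Cluster uses, with key hash tags ignored for brevity.

import java.nio.charset.StandardCharsets;

public class ClusterSlot {
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0,
    // the variant used by Redis Cluster's key-to-slot mapping.
    static int crc16(byte[] bytes) {
        int crc = 0x0000;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slotFor(String key) {
        // Real clients first check for a {hash tag} inside the key; omitted here.
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("user:1001")); // a number in 0..16383
    }
}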


Origin blog.csdn.net/2301_77463738/article/details/131263827