Features of the three Redis cluster modes


In service development, a single machine is a single point of failure: if the service is deployed on only one server and that server goes down, the service becomes unavailable. To make services highly available, distributed deployment is used, where the same service runs on multiple machines; even if several servers go down, the service remains available as long as one of them is still up.

The same is true for Redis. To deal with single-machine failure, the master-slave mode was introduced, but it has a problem: after the master node fails, a slave node must be manually switched to master before service is restored. To solve this, Redis introduced sentinel mode, which automatically promotes a slave to master when the master fails, so service is restored without manual intervention. However, neither master-slave mode nor sentinel mode provides real sharded data storage: every Redis instance stores the full data set. Redis Cluster was therefore created to provide true data sharding. Since Redis Cluster arrived relatively late (the official release only came in 2015), major companies could not wait and built their own Redis sharding clusters in the meantime, such as Twemproxy and Codis.

1. Master-slave mode

Although a single Redis node can persist data to disk through the RDB and AOF mechanisms, the data still lives on one server: if that server has a problem such as a hard disk failure, the data becomes unavailable. There is also no read-write separation; reads and writes all hit the same server, so I/O bottlenecks appear when the request volume is large.

To avoid a single point of failure and the lack of read-write separation, Redis provides the replication feature: after data in the master database is updated, the updates are automatically synchronized to the slave databases.

[Figure: Redis master-slave replication topology]

The master-slave structure above has the following characteristics: a master can have multiple slave nodes, and a slave node can itself have slave nodes, forming a cascading structure.
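As a minimal sketch of read-write separation on top of this structure (using the Jedis client that the article mentions later; the master on port 6379 and the slave on port 6380 are placeholder assumptions):

import redis.clients.jedis.Jedis;

public class MasterSlaveDemo {
    public static void main(String[] args) {
        try (Jedis master = new Jedis("127.0.0.1", 6379);   // master: handles writes
             Jedis slave  = new Jedis("127.0.0.1", 6380)) { // slave: handles reads
            master.set("user:1:name", "alice");
            // Replication is asynchronous, so the slave may briefly lag behind the master.
            System.out.println(slave.get("user:1:name"));
        }
    }
}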

1.1 Advantages and disadvantages of master-slave mode

Advantages:

The master-slave structure provides read-write separation (which improves efficiency) and data backup through multiple replicas.

Disadvantages:

The biggest shortcoming is that the master-slave mode has no automatic fault tolerance or recovery: when the master node fails, the cluster cannot serve writes, availability is low, and promoting a slave to master requires manual intervention.

# In plain master-slave mode, when the master database crashes, a slave database must be switched to master manually:
1. On the slave database, run SLAVEOF NO ONE to promote it to master and keep serving.
2. Restart the master database that crashed, then run SLAVEOF <new-master-ip> <new-master-port> on it to make it a slave of the new master and synchronize the data.
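The same manual switchover can also be scripted from a client. A minimal sketch with Jedis 3.x (slaveofNoOne() and slaveof() simply wrap the commands above; all addresses are placeholder assumptions):

import redis.clients.jedis.Jedis;

public class ManualFailoverDemo {
    public static void main(String[] args) {
        // Promote the surviving slave to master (equivalent to SLAVEOF NO ONE).
        try (Jedis newMaster = new Jedis("127.0.0.1", 6380)) {
            newMaster.slaveofNoOne();
        }
        // Once the crashed old master is back, make it a slave of the new master
        // (equivalent to SLAVEOF 127.0.0.1 6380) so it resynchronizes the data.
        try (Jedis oldMaster = new Jedis("127.0.0.1", 6379)) {
            oldMaster.slaveof("127.0.0.1", 6380);
        }
    }
}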

2. Sentinel mode

In the master-slave replication mode described above, when the master server goes down, a slave has to be manually switched to master. This needs human intervention, is laborious, and leaves the service unavailable for a period of time. That is where sentinel mode comes in. Sentinel mode has been available since Redis 2.6, but that version was unstable; it only became stable in Redis 2.8.

The core of sentinel mode is still master-slave replication, but compared with the plain master-slave mode there is an extra election mechanism: when the master node is down and can no longer accept writes, a new master is elected from the slave nodes. The election mechanism is implemented by sentinel processes running in the system.
[Figure: a single sentinel monitoring a master-slave deployment]

As shown in the figure above, a single sentinel is itself a single point of failure, so in a one-master, multi-slave Redis system several sentinels are usually deployed for monitoring. The sentinels monitor not only the master and slave databases but also each other. Each sentinel is an independent process and runs independently.
[Figure: multiple sentinels monitoring the master, the slaves, and each other]

2.1 The role of sentinel mode

(1) Monitoring: by sending commands and checking the replies, the sentinels verify that all servers are running normally. Besides monitoring the master and slave servers, the sentinels also monitor one another.
(2) Failover: when a sentinel detects that the master is down, it automatically switches a slave to master and then notifies the other slaves via publish/subscribe so that they update their configuration and follow the new master. The failed old master is also reconfigured as a slave of the new master: even if it recovers, it does not get its old master role back, but rejoins as a slave of the new master.

2.2 Implementation Principle of Sentinel

When the sentinel process starts, it reads the configuration file and finds the master database to monitor through the following directive:

sentinel monitor <master-name> <ip> <port> <quorum>
# master-name is the name of the master database
# ip and port are the address and port of the master database
# quorum is how many sentinel nodes must agree before a failover is performed

Only the master needs to be configured here because the sentinel uses the master's INFO command output to obtain the slave information and establish connections to the slaves; newly added slaves are discovered in the same way through the master's INFO output.

A sentinel node can monitor multiple master nodes, but this is not recommended: if that sentinel crashes, failover fails for several clusters at the same time. After starting, the sentinel establishes two connections to the master database:

1. It subscribes to the master's __sentinel__:hello channel to learn about other sentinels monitoring the same database.
2. It periodically sends the INFO command to the master to obtain information about the master itself.

After establishing the connections to the master database, the sentinel performs the following three operations periodically:

(1) Every 10 seconds it sends the INFO command to the master and the slaves. This keeps the database information current: newly discovered slaves are connected and added to the monitoring list, and role changes between master and slaves are picked up.

(2) Every 2 seconds it publishes its own information to the __sentinel__:hello channel of the master and slave databases, sharing its monitoring data with the other sentinels. Each sentinel subscribes to this channel; when it receives a message from a sentinel it has not seen before, it adds that sentinel to its list and establishes a connection to it (a small sketch of watching this channel follows after this list).

(3) Every 1 second it sends the PING command to all master/slave nodes and all other sentinels to check whether they are still alive.
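Operation (2) can be observed directly: any client subscribed to the __sentinel__:hello channel of the monitored master sees the sentinels' periodic announcements. A small sketch with Jedis 3.x (the master address is a placeholder assumption):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class SentinelHelloWatcher {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // Blocks and prints each announcement; every message carries the
            // announcing sentinel's address and run ID plus the monitored master.
            jedis.subscribe(new JedisPubSub() {
                @Override
                public void onMessage(String channel, String message) {
                    System.out.println(channel + " -> " + message);
                }
            }, "__sentinel__:hello");
        }
    }
}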

2.3 Subjective offline and objective offline

When a node does not reply to the sentinel's PING within a certain time (down-after-milliseconds), the sentinel considers it subjectively down, meaning that this sentinel alone believes the node has failed. If that node is the master database, the sentinel then decides whether a failover is needed: it sends SENTINEL is-master-down-by-addr to ask the other sentinel nodes whether they also consider the master subjectively down. When the number of agreeing sentinels reaches the configured quorum, the master is considered objectively down.

When the master node is objectively down, a master-slave switchover is performed. The steps are:

(1) Elect a leader sentinel.
(2) The leader sentinel selects, from all the slaves, the slave database with the highest priority. The priority is set with the slave-priority option.
(3) If the priorities are equal, the slave with the larger replication offset (i.e. the one that has replicated more data and therefore has newer data) is preferred.
(4) If all of the above are equal, the slave with the smaller run ID is chosen.

After a slave database has been selected, the leader sentinel sends it the SLAVEOF NO ONE command to promote it to master, and sends SLAVEOF commands to the other slave nodes so that they replicate from the new master.
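From the application's point of view, the benefit of this automatic switchover is that clients which connect through the sentinels always end up on the current master. A minimal sketch with the Jedis 3.x client (the sentinel addresses and the master name "mymaster" are placeholder assumptions):

import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class SentinelClientDemo {
    public static void main(String[] args) {
        // Sentinel addresses are placeholders for illustration.
        Set<String> sentinels = new HashSet<>();
        sentinels.add("127.0.0.1:26379");
        sentinels.add("127.0.0.1:26380");
        sentinels.add("127.0.0.1:26381");
        // "mymaster" must match the master-name used in the sentinel monitor directive.
        try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels)) {
            System.out.println("current master: " + pool.getCurrentHostMaster());
            try (Jedis jedis = pool.getResource()) {
                jedis.set("greeting", "hello"); // always executed on the current master
                System.out.println(jedis.get("greeting"));
            }
        }
    }
}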

2.4 Advantages and disadvantages of sentinel mode

1. Advantages

Sentinel mode is built on top of the master-slave mode and solves its problem that a failed master cannot be failed over automatically.

2. Disadvantages

(1) It is still a centralized scheme: there is always only one Redis master receiving and processing write requests, so write throughput is limited by a single machine.
(2) Every node in the cluster stores the full data set, which wastes memory; there is no real distributed storage, and when the data volume is large, master-slave synchronization seriously affects the master's performance.
(3) After the Redis master goes down, the service is unavailable while the sentinels are still voting: until the election finishes, nobody knows which node is master and which are slaves, and Redis enables a protection mechanism that forbids writes until a new master has been elected.
In both master-slave and sentinel mode, every node stores the full data set. When the data volume is too large, the data has to be sharded across multiple Redis instances; this is where Redis sharding comes in.

3. Redis cluster solutions from major companies

Before version 3.0, Redis only supported single-instance mode. Although Redis developer Antirez had proposed on his blog to add a cluster feature in Redis 3.0, the official 3.0 release only arrived in 2015. Major companies could not wait that long: before 3.0 was released, they launched their own Redis cluster solutions to work around the storage bottleneck of a single instance. The core idea of all these solutions is to shard the data across multiple Redis instances, with each shard being a Redis instance.

3.1 Client-side sharding

Client-side sharding puts the sharding logic in the Redis client (for example, Jedis already supports Redis sharding through ShardedJedis). Routing rules pre-defined in the client (usually consistent hashing) forward the access to each key to the appropriate Redis instance, and the client collects the returned results when querying data. The architecture of this scheme is shown in the figure.
[Figure: client-side sharding architecture]

Pros and cons of client-side sharding:

Advantages:
The advantage of client-side sharding with consistent hashing is that all the logic is controllable and does not depend on any third-party distributed middleware. The Redis instances on the server side are independent of one another and unaware of each other; each runs like a standalone server, so the system scales linearly with ease and is very flexible. Because developers know exactly how sharding and routing are implemented, there are few hidden pitfalls.

(1) Consistent hashing: a common algorithm in distributed systems.
For example, a distributed storage system has to map data onto concrete nodes. With ordinary hashing such as mod(key, d), where key is the data's key and d is the number of machine nodes, all existing mappings become invalid as soon as one machine joins or leaves the cluster.
Consistent hashing solves the poor scalability of plain modulo hashing: when servers are brought online or taken offline, as many requests as possible still hit the server they were originally routed to.

(2) Implementation: consistent hashing can be built on hash functions such as MURMUR_HASH or the Ketama hash. Jedis's Redis sharding (ShardedJedis), for example, uses consistent hashing: both the key and the node name are hashed and then matched against each other, and the hash function used is MURMUR_HASH.
The main reason for using consistent hashing rather than a simple hash-modulo mapping is that adding or removing nodes does not trigger a full rehash: consistent hashing only affects the keys assigned to neighboring nodes, so the impact is small.
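A minimal sketch of this client-side approach, assuming an older Jedis release (2.x/3.x) that still ships ShardedJedis (it was removed in Jedis 4); hosts and ports are placeholder assumptions:

import java.util.Arrays;
import java.util.List;
import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.ShardedJedis;

public class ShardedJedisDemo {
    public static void main(String[] args) {
        // Two shard definitions; the client, not the server, decides where each key lives.
        List<JedisShardInfo> shards = Arrays.asList(
                new JedisShardInfo("127.0.0.1", 6379),
                new JedisShardInfo("127.0.0.1", 6380));
        // ShardedJedis routes each key to a shard via consistent hashing
        // (MURMUR_HASH by default), entirely on the client side.
        try (ShardedJedis sharded = new ShardedJedis(shards)) {
            sharded.set("user:1", "alice"); // lands on whichever shard the hash ring selects
            sharded.set("user:2", "bob");
            System.out.println(sharded.get("user:1"));
        }
    }
}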

Disadvantages:

(1) It is a static sharding scheme: increasing or decreasing the number of Redis instances requires manually adjusting the sharding code.

(2) The operations cost is high: any problem with the cluster's data requires operations and development staff to work together, which slows down problem resolution and adds cross-team communication overhead.

(3) Maintaining the same routing and sharding logic in different client programs is very costly. For example, if a Java project and a PHP project share one Redis cluster, the routing logic has to be written twice, once in each language, and maintained as two copies afterwards.

One of the biggest problems with client-side sharding is that whenever the topology of the server-side Redis instances changes, every client has to be updated and adjusted. If the sharding module is taken out of the client and turned into a separate middleware that bridges clients and servers, this problem disappears; that is proxy-based sharding.

3.2 Proxy-based sharding

Twemproxy, an open-source Redis proxy from Twitter, is the most widely used proxy-based sharding solution. Its basic principle: acting as middleware, it receives requests from Redis clients, forwards each request to the correct Redis instance according to the routing rules, and finally aggregates the results and returns them to the client.

Twemproxy introduces a proxy layer that manages multiple Redis instances in a unified way: the Redis client only talks to Twemproxy and does not need to care how many Redis instances sit behind it, which effectively turns them into a Redis cluster.

[Figure: Twemproxy proxy-based sharding architecture]

Advantages of Twemproxy:

(1) Clients connect to Twemproxy exactly as they would connect to a Redis instance, without changing any code.

(2) It supports automatic removal of unreachable Redis instances.

(3) Twemproxy maintains the connections to the Redis instances, reducing the number of connections between clients and Redis.

Disadvantages of Twemproxy:

(1) Every client request has to pass through the Twemproxy proxy before it reaches the Redis server, which costs some performance.

(2) There is no friendly monitoring or management console, which makes operations and monitoring harder.

(3) Twemproxy's biggest pain point is that it cannot scale out or in smoothly. For the operations team, adding Redis instances when the business requires it means a lot of work.

As the most battle-tested and stable Redis proxy, Twemproxy is widely used in the industry.

3.3 Codis

Twemproxy's inability to add Redis instances smoothly caused great inconvenience, so Wandoujia (Pea Pod) developed Codis, a Redis proxy that supports adding Redis instances smoothly. It is written in Go and C and was open-sourced on GitHub in November 2014.

[Figure: Codis architecture]

In the Codis architecture, Codis introduces the Redis Server Group: each group consists of one master CodisRedis and one or more slave CodisRedis, which gives the Redis cluster high availability. When a master CodisRedis goes down, Codis does not automatically promote a slave to master, because that involves data consistency issues (Redis itself synchronizes data by asynchronous master-slave replication, so when a write succeeds on the master there is no guarantee that the slaves have received it); the administrator has to promote a slave CodisRedis to master manually in the management console.

If manual handling is too troublesome, Wandoujia also provides a tool called Codis-ha: when it detects that a master CodisRedis is down, it takes that node offline and promotes one of its slave CodisRedis to master.

Codis uses pre-sharding: when it starts, it creates 1024 slots. A slot is like a box with a fixed number in the range 0 to 1023, and each key is stored in exactly one of these boxes. Which box a key goes into is determined by crc32(key) % 1024: the result is always a number between 0 and 1023, and the key is placed in the slot with that number. For example, if crc32(key) % 1024 for some key is 5, the key is placed into slot (box) number 5. Each slot can only be assigned to one Redis Server Group (a slot cannot be split across multiple groups), and a Redis Server Group can hold anywhere from 1 to 1024 slots, so Codis can manage at most 1024 Redis Server Groups.
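A minimal sketch of the slot calculation described above, using Java's built-in CRC32 (the exact byte handling inside Codis may differ, so treat this as illustrative only):

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CodisSlotDemo {
    // Map a key to one of Codis's 1024 slots: crc32(key) % 1024.
    static long slotFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % 1024;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("user:1")); // always a value in [0, 1023]
    }
}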

Codis's biggest advantage is that it supports smoothly adding (or removing) Redis Server Groups (Redis instances) and can migrate data safely and transparently; this is also what distinguishes Codis from static sharded Redis solutions such as Twemproxy. Adding a Redis Server Group involves migrating slots. For example, suppose the system has two Redis Server Groups with the following mapping between groups and slots.

Redis Server Group    Slots
1                     0~499
2                     500~1023

When a Redis Server Group is added, the slots are reassigned. Codis offers two ways to allocate slots:

(1) Manually: redistribute the slots with the Codis management tool codis-config, specifying the slot range of each Redis Server Group. For example, the new mapping between Redis Server Groups and slots could be specified as follows.

Redis Server Group    Slots
1                     0~499
2                     500~699
3                     700~1023

(2) Automatically: use the rebalance function of the Codis management tool codis-config, which migrates slots automatically according to the memory usage of each Redis Server Group in order to balance the data.

4. Redis Cluster

Although Redis sentinel mode achieves high availability and read-write separation, it still has several shortcomings:

(1) In sentinel mode every Redis server stores the same data, which wastes memory; when the data volume is too large, master-slave synchronization seriously affects the master's performance.

(2) Sentinel mode is a centralized scheme in which the slaves are tightly coupled to the master, and the service is unavailable from the moment the master goes down until a slave has been elected as the new master.

(3) There is always only one Redis master receiving and processing write requests, so writes are still limited by a single machine; it is not a truly distributed architecture.

Redis 3.0 added Cluster mode, which gives Redis distributed storage: each node stores different data. To overcome the limited capacity of a single machine, cluster mode distributes the data across multiple machines according to certain rules, so memory and QPS are no longer bounded by one machine and the system gains the high scalability of a distributed cluster. Redis Cluster is a server-side sharding technology (sharding and routing are implemented on the server side) with multiple masters and multiple slaves: each partition consists of one Redis master and several slaves, and the partitions run in parallel with one another. Redis Cluster uses a P2P model and is completely decentralized.

[Figure: Redis Cluster deployment with three masters and three slaves]

As shown in the figure above, the official recommendation is that a cluster needs at least 3 master nodes, and the best practice is a 3-master, 3-slave deployment with six nodes. Redis Cluster has the following characteristics:

(1) The cluster is completely decentralized and uses multiple masters and multiple slaves; all Redis nodes are interconnected (PING-PONG mechanism) and use a binary protocol internally to optimize transfer speed and bandwidth.

(2) Clients connect directly to the Redis nodes, with no intermediate proxy layer. A client does not need to connect to all the nodes in the cluster; connecting to any available node is enough (see the sketch after this list).

(3) Each partition consists of one Redis master and several slaves, and the partitions are parallel and independent of one another.

(4) Each master node is responsible for a subset of the slots and the key-value data mapped to them; every node in the cluster knows the full slot map, so through the slots every node knows which node a given piece of data is stored on.
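A minimal sketch of point (2) with the Jedis client (seed node addresses are placeholder assumptions; the client discovers the rest of the cluster and the slot map from them):

import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class RedisClusterDemo {
    public static void main(String[] args) {
        // One or a few seed nodes are enough to bootstrap the cluster client.
        Set<HostAndPort> seeds = new HashSet<>();
        seeds.add(new HostAndPort("127.0.0.1", 7000));
        seeds.add(new HostAndPort("127.0.0.1", 7001));
        try (JedisCluster cluster = new JedisCluster(seeds)) {
            // The key is hashed to one of the 16384 slots and the request is
            // routed to the master that owns that slot.
            cluster.set("user:1", "alice");
            System.out.println(cluster.get("user:1"));
        }
    }
}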

Redis Cluster mainly targets scenarios with massive data, high concurrency, and high availability requirements. If you have a large amount of data, Redis Cluster is recommended; when the data volume is not large, sentinel mode is usually enough. In performance and availability, Redis Cluster is better than sentinel mode.

Redis Cluster uses virtual hash slot partitioning instead of consistent hashing: 16384 slots are pre-allocated, every key is mapped to one of these slots by a hash function (CRC16(key) mod 16384), and the master node of each partition is responsible for a subset of the slots and the key-value data mapped to them. For the detailed implementation of Redis Cluster, see: Redis Cluster data fragmentation implementation principle.
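A rough sketch of that mapping (Redis uses the XMODEM variant of CRC16 and 16384 slots; hash tags, which force related keys into the same slot, are omitted here, so treat this as illustrative):

import java.nio.charset.StandardCharsets;

public class ClusterSlotDemo {
    // CRC16 (XMODEM: polynomial 0x1021, initial value 0), as used by Redis Cluster.
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slotFor(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("user:1")); // a value in [0, 16383]
    }
}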
