A Few Things About Redis | High Concurrency and High Availability

If you use Redis as a cache, you have to think about two things: how to add machines so that Redis can sustain high concurrency, and how to make sure Redis does not simply die when something fails, that is, how to make it highly available.

 

Redis high concurrency: use a master-slave architecture with one master and multiple slaves. For many projects this is sufficient: a single master takes the writes, at tens of thousands of QPS on one machine, and the slaves serve the reads; several slave instances together can provide around 100,000 QPS.

 

Redis high concurrency together with a large data volume: with one master and multiple slaves, every instance holds the complete data set, so if the master has 10 GB of memory you can hold at most 10 GB of data. If the cache has to hold tens of GB, hundreds of GB, or even several TB, you need Redis Cluster, which can also serve hundreds of thousands of concurrent reads and writes per second.

 

Redis high availability: if you deploy a master-slave architecture, adding Sentinel is enough to get automatic master-slave switchover when any instance goes down.

Details follow.

 

One. How does Redis use read-write separation to carry read traffic above 100,000 QPS?

 

1. How Redis high concurrency relates to high concurrency of the whole system

To handle high concurrency with Redis, the underlying cache has to be done well so that fewer requests go directly to the database. High concurrency on the database is cumbersome to achieve, and some operations also involve transactions and the like, so it is hard to push the database to very high concurrency.

Doing Redis concurrency well is not enough for the whole system, but Redis sits at the center of a large cache architecture, and it is a very important part of any architecture that supports high concurrency.

To build a high-concurrency system, start with the cache middleware: the cache system itself must be able to carry high concurrency. Then, with a well-designed overall cache architecture (multi-level caches, hot-spot caches), the system can truly support high concurrency.

 

2. The bottleneck that keeps Redis from supporting high concurrency

The main bottleneck is the single-machine problem: with only one Redis instance there is an upper limit, no matter how good the machine is.

 

3. How to support higher concurrency

A single Redis instance cannot support very high concurrency, so to go higher, use read-write separation. A cache usually has to serve highly concurrent reads while write requests are relatively few, so reads and writes can be separated on top of a master-slave architecture.

 

Configure one master machine to take writes and several slave machines to serve reads. After the master receives data it synchronizes it to the slaves, so adding more slave machines raises the overall read concurrency.
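A minimal sketch of read-write separation on the client side, assuming a hypothetical ReadWriteRouter class and placeholder node addresses. It only decides which node a command should go to (writes to the master, reads round-robin over the slaves) and does not open any real connection; the list of write commands is an illustrative subset.

```python
import itertools


class ReadWriteRouter:
    """Send write commands to the master and spread reads over the slaves."""

    # Illustrative subset of write commands, not a complete list.
    WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH", "HSET"}

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def node_for(self, command):
        """Return the address that should serve this command."""
        if command.upper() in self.WRITE_COMMANDS or self._slaves is None:
            return self.master
        return next(self._slaves)


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    router = ReadWriteRouter("master:6379", ["slave1:6379", "slave2:6379"])
    for cmd in ("SET", "GET", "GET", "GET"):
        print(cmd, "->", router.node_for(cmd))
```

Because replication is asynchronous, a read routed to a slave may briefly return older data than the master holds; a router like this trades that staleness for read throughput.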

 

Two. Redis replication, and what master persistence means for the safety of a master-slave architecture

 

1. How Redis replication works

Several slave nodes hang off one master node. A write goes to the master first; once the master has applied it, the master asynchronously synchronizes the data to all slave nodes so that the data on all nodes stays consistent.

 

2. Core mechanics of Redis replication

(1) Redis replicates data to slave nodes asynchronously, but since Redis 2.8 each slave node periodically acknowledges how much data it has replicated.

(2) A master node can have multiple slave nodes.

(3) A slave node can also connect to other slave nodes.

(4) While a slave node is replicating, it does not block the normal operation of the master node.

(5) While a slave node is replicating, it does not block its own queries either; it serves requests from its old data set. When the copy completes, however, it has to delete the old data set and load the new one, and during that window it suspends service.

(6) Slave nodes are mainly used for horizontal scaling and read-write separation; adding slave nodes increases read throughput.

 

3. What master persistence means for the safety of a master-slave architecture

With a master-slave architecture, persistence must be enabled on the master node. Relying on the slave nodes as the master's hot backup instead is not recommended: without persistence, if the master goes down and restarts, its data set is empty, and when the slaves then replicate from it they copy the empty data set, so the data on all nodes is lost.

Also keep cold backups of the backup files, in case the whole machine breaks and the RDB backup is lost along with it.

 

Three. Redis master-slave replication: principle, resumable replication, diskless replication, expired-key handling

 

1. How master-slave replication works

① When a slave node starts, it sends a PSYNC command to the master node.

② If the slave node is reconnecting to the master node, the master copies only the data the slave is missing; if it is connecting to the master for the first time, a full resynchronization is triggered.

③ When a full resynchronization starts, the master forks a background process to generate an RDB snapshot file, and meanwhile buffers in memory all new write commands it receives from clients.

④ Once the RDB file is generated, the master sends it to the slave. The slave first writes it to local disk and then loads it from disk into memory. The master then sends the write commands buffered in memory to the slave, and the slave applies them as well.

⑤ If a slave node is disconnected from the master because of a network failure, it reconnects automatically.

⑥ If the master finds several slave nodes reconnecting at once, it starts only one RDB save operation and serves all of the slaves with that one copy of the data.

 

2. Resumable master-slave replication

Resumable replication is supported since Redis 2.8. If the network drops in the middle of master-slave replication, replication can continue from where it last left off instead of starting over from scratch.

Principle:

The master node keeps a backlog in memory. Both master and slave store a replica offset and a master id, and the offsets refer to positions in the backlog. If the network connection between master and slave drops, the slave asks the master to continue replication from its last replica offset. If that offset can no longer be found in the backlog, a full resynchronization is performed.
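The decision between continuing from an offset and falling back to a full resynchronization can be sketched as follows. This is a simplified model under stated assumptions, not Redis's implementation: ReplicationBacklog, feed, and psync are hypothetical names, the backlog is a plain bytes buffer, and the size is assumed to be greater than zero.

```python
class ReplicationBacklog:
    """Fixed-size buffer holding the most recent bytes of the replication stream."""

    def __init__(self, size):
        self.size = size          # assumed > 0
        self.buffer = b""
        self.master_offset = 0    # total bytes ever written to the stream

    def feed(self, data):
        """Append replicated bytes, keeping only the newest `size` bytes."""
        self.master_offset += len(data)
        self.buffer = (self.buffer + data)[-self.size:]

    def psync(self, master_id, slave_master_id, slave_offset):
        """Return ("CONTINUE", missing_bytes) or ("FULLRESYNC", None)."""
        first_available = self.master_offset - len(self.buffer)
        if slave_master_id != master_id:
            # The slave was following a different master: start over.
            return "FULLRESYNC", None
        if not first_available <= slave_offset <= self.master_offset:
            # The requested offset has already been overwritten in the backlog.
            return "FULLRESYNC", None
        return "CONTINUE", self.buffer[slave_offset - first_available:]


if __name__ == "__main__":
    backlog = ReplicationBacklog(size=10)
    backlog.feed(b"abcdef")
    backlog.feed(b"ghijkl")                  # the oldest two bytes fall out
    print(backlog.psync("m1", "m1", 6))      # slave only misses the tail
    print(backlog.psync("m1", "m1", 1))      # offset no longer in the backlog
```

The sketch makes the trade-off visible: the larger the backlog, the longer a slave can stay disconnected and still resume without a full copy.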

 

3. Diskless replication

With diskless replication, the master creates the RDB directly in memory and sends it to the slave, without saving it to its own local disk.

How to configure it

Configure the repl-diskless-sync and repl-diskless-sync-delay parameters.

repl-diskless-sync: enables diskless replication.

repl-diskless-sync-delay: how long to wait before starting the copy, so that more slave nodes have time to connect and share the same transfer.

 

4. Expired-key handling

A slave does not expire keys itself; it waits for the master to expire them.

When the master expires a key, or evicts one, it synthesizes a DEL command and sends it to the slave; the slave deletes the key when it receives the command.

 

Four. The complete Redis replication flow and its principles in depth

 

1. The complete replication process

① When a slave node starts, it only stores the master node's information, namely the master's host and port; replication has not begun yet. The master's host and port are set by the slaveof directive in redis.conf.

② The slave node has an internal scheduled task that checks every second whether there is a new master to connect to and replicate from; if there is, it opens a socket connection to that master.

③ The slave node sends a PING command to the master node.

④ If the master has requirepass set, the slave must send the password configured in masterauth to authenticate.

⑤ The first time, the master node performs a full replication and sends all of its data to the slave node.

⑥ From then on, the master keeps asynchronously replicating every write command it receives to the slave node.

 

2. Data synchronization mechanisms

This refers to the full replication performed when a slave connects to the master for the first time.

① Master and slave each maintain an offset

The master keeps accumulating its own offset, and each slave keeps accumulating its own. Every second the slave reports its offset to the master, and the master stores the offset of each slave.

This is not specific to full replication: the point is that master and slave must each know their own data offset in order to detect inconsistencies between them.

② backlog

The master node keeps a backlog in memory; the default size is 1 MB.

When the master replicates data to a slave, it also writes a copy of that data into the backlog.

The backlog is mainly used for incremental replication after a full replication is interrupted.

③ master run id

You can view the master run id with info server.

Purpose: the slave uses it to identify its master uniquely.

Why not host + port: locating the master by host + port is unreliable. If the master node restarts or its data changes, the slave has to tell the masters apart by run id, and a different run id means a full replication is needed.

If you need to restart Redis without changing the run id, use the redis-cli debug reload command.

 

3. Full replication: process and mechanisms

① The master executes bgsave and generates an RDB file locally.

② The master sends the RDB snapshot file to the slave. If copying the RDB file takes longer than 60 seconds (repl-timeout), the slave treats the replication as failed; adjust this parameter as needed.

③ On a machine with a gigabit network card, throughput is generally about 100 MB per second, so transferring a 6 GB file is likely to exceed 60 seconds.

④ While generating the RDB file, the master buffers all new write commands in memory; after the slave has saved the RDB file, the master replicates those write commands to it.

⑤ Check the client-output-buffer-limit slave parameter, for example [client-output-buffer-limit slave 256MB 64MB 60]: if during replication the in-memory buffer stays above 64 MB for 60 seconds continuously, or exceeds 256 MB at any one time, replication stops and the copy fails.

⑥ After receiving the RDB file, the slave clears its old data and reloads the RDB into memory, serving requests from the old data during the process.

⑦ If the slave has AOF enabled, it immediately runs BGREWRITEAOF to rewrite the AOF. RDB generation, copying the RDB over the network, clearing the slave's old data, and the slave's AOF rewrite are all time-consuming: if the data set is between 4 GB and 6 GB, a full replication is likely to take one and a half to two minutes.
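The 60-second repl-timeout and the client-output-buffer-limit thresholds in the steps above lend themselves to a quick sanity check. The sketch below is plain arithmetic under stated assumptions: the function names are hypothetical, the network is assumed to deliver a steady 100 MB per second, and the buffer is sampled once per second, which only approximates how Redis tracks the soft limit.

```python
MB = 1024 * 1024
GB = 1024 * MB


def transfer_seconds(rdb_bytes, bytes_per_second):
    """Time needed to ship an RDB file at a steady transfer rate."""
    return rdb_bytes / bytes_per_second


def transfer_times_out(rdb_bytes, bytes_per_second, repl_timeout=60):
    """True if the RDB transfer alone would exceed repl-timeout (seconds)."""
    return transfer_seconds(rdb_bytes, bytes_per_second) > repl_timeout


def buffer_limit_exceeded(samples, hard_limit, soft_limit, soft_seconds):
    """Check per-second buffer sizes against hard and soft limits.

    samples: output buffer size in bytes, one sample per second (assumption).
    """
    seconds_over_soft = 0
    for size in samples:
        if size > hard_limit:
            return True  # hard limit: fail immediately
        seconds_over_soft = seconds_over_soft + 1 if size > soft_limit else 0
        if seconds_over_soft >= soft_seconds:
            return True  # soft limit held for the whole window
    return False


if __name__ == "__main__":
    print(transfer_seconds(6 * GB, 100 * MB))    # about 61.4 seconds
    print(transfer_times_out(6 * GB, 100 * MB))  # exceeds the 60 s default
    print(buffer_limit_exceeded([65 * MB] * 60, 256 * MB, 64 * MB, 60))
```

With these assumptions a 6 GB file needs roughly 61 seconds, which is why the text warns that the default timeout is easy to exceed for large data sets.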

 

4. Incremental replication: process and mechanisms

① If the network connection between master and slave breaks during full replication, the slave triggers incremental replication when it reconnects to the master.

② The master takes the missing portion of data directly from its own backlog and sends it to the slave.

③ The master locates the data in the backlog using the offset carried in the slave's psync command.

 

5. Heartbeat

Master and slave send heartbeat messages to each other.

By default the master sends one every 10 seconds, and the slave sends one every second.

 

6. Asynchronous replication

Each time the master receives a write command, it first writes the data internally and then sends it to the slave nodes asynchronously.

 

Five. How does a Redis master-slave architecture achieve 99.99% high availability?

 

1. What is 99.99% availability?

High availability (HA) is a term for a system's ability to perform its function without interruption; it expresses the degree to which the system is available and is one of the criteria in system design. A highly available system can run for longer than the individual components that make it up.

High availability is usually achieved by improving the system's fault tolerance. What it takes for a system to count as highly available often has to be analyzed case by case.

It is measured by comparing the time the system is unusable because of damage, plus the time needed to restore it from an inoperable state to an operable one, against the system's total operating time. The formula is:

$$A = \frac{MTBF}{MTBF + MDT}$$

where A is availability, MTBF is the mean time between failures, and MDT is the mean down time (mean time to repair).

Online systems and mission-critical systems typically require availability of five nines (99.999%).

 

Availability    Annual downtime
99.9999%        32 seconds
99.999%         5 minutes 15 seconds
99.99%          52 minutes 34 seconds
99.9%           8 hours 46 minutes
99%             3 days 15 hours 36 minutes
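The downtime figures in the table follow directly from the availability percentage. A small sketch, assuming a 365-day year and hypothetical helper names, reproduces them:

```python
def annual_downtime_seconds(availability_percent, days_per_year=365):
    """Seconds per year the system may be down at the given availability."""
    return (1 - availability_percent / 100) * days_per_year * 24 * 3600


def format_duration(seconds):
    """Split a duration into (days, hours, minutes, seconds), rounded to 1 s."""
    seconds = round(seconds)
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs


if __name__ == "__main__":
    for percent in (99.9999, 99.999, 99.99, 99.9, 99):
        print(percent, format_duration(annual_downtime_seconds(percent)))
```

For 99.99% this gives about 52 minutes 34 seconds per year, which is the budget a Redis deployment has to stay within to meet the target in this section's title.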


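2. Redis unavailability

Redis unavailability covers both single-instance unavailability and unavailability of a master-slave architecture.

Cases of unavailability:

① The master node of a master-slave architecture goes down. Cache data can then no longer be written, and the data on the slaves can no longer expire, so the cache becomes unavailable.

② With a single instance, the Redis process may die for some other reason, or the machine Redis is deployed on may break.

Consequences of unavailability: once the cache is unavailable, requests go straight to the database. If the flood of requests exceeds what the database can carry, the database goes down. If the cache problem is not fixed in time, the request volume brings the database down again soon after every restart, and the whole system becomes unavailable.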

 

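3. How to achieve high availability

① Make sure every Redis instance has a backup.

② Make sure that when the current Redis instance fails, you can switch quickly to the backup Redis.

To solve this, the Sentinel mechanism below is introduced.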

 

Six. Basics of the Redis Sentinel architecture

 

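1. What is Sentinel?

Sentinel is a very important component of the Redis cluster architecture. Its main functions are:

Cluster monitoring: monitors whether the Redis master and slave processes are working properly.

Notification: if a Redis instance fails, Sentinel sends a message to alert the administrator.

Failover: if the master goes down, it is automatically failed over to a slave.

Configuration center: when a failover happens, it notifies clients to connect to the new master.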

 

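2. Core facts about Sentinel

① Sentinel itself is distributed: it runs as a cluster, with multiple sentinels working together.

② During failover, deciding that a master is down requires agreement from a majority of sentinels.

③ Even if some sentinels go down, the Sentinel cluster keeps working normally.

④ Sentinel needs at least 3 instances to guarantee its own robustness.

⑤ The Sentinel + Redis master-slave structure cannot guarantee zero data loss; it only guarantees high availability of the Redis cluster.

⑥ With a Sentinel + Redis master-slave architecture, test and rehearse repeatedly before putting it into use.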

 

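3. Why can't a Sentinel cluster with only 2 nodes work properly?

A Sentinel cluster must be deployed with more than 2 nodes. If the cluster has only 2 sentinel instances, then quorum = 1 (the number of sentinels that must agree for a failover to be carried out).

If master1 goes down now, failover can start as soon as either sentinel 1 or sentinel 2 considers master1 down, and sentinel 1 and sentinel 2 elect one of themselves to carry out the failover.

But a majority is also required (more than half of all sentinels in the cluster). With 2 sentinels the majority is 2, which means at least 2 sentinels must still be running before a failover can be carried out.

If the machine hosting the master and sentinel 1 goes down as a whole, only one sentinel is left and there is no majority to authorize the failover. The other machine still has a sentinel, but 1 is not greater than 1, so more than half cannot be guaranteed and the failover is not carried out.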

 

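4. The classic 3-node Sentinel cluster

Configuration: quorum = 2, majority = 2

If the machine hosting M1 goes down, 2 of the 3 sentinels remain. S2 and S3 can agree that the master is down and elect one of themselves to carry out the failover.

The majority for 3 sentinels is 2, so with the remaining 2 sentinels running, the failover is allowed to proceed.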
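The difference between the 2-node and 3-node deployments comes down to two comparisons. The sketch below uses hypothetical helper names and assumes every surviving sentinel agrees that the master is down:

```python
def majority(total_sentinels):
    """More than half of all sentinels in the cluster."""
    return total_sentinels // 2 + 1


def can_failover(total_sentinels, alive_sentinels, quorum):
    """True if the surviving sentinels can both reach quorum and form a majority.

    Assumes all surviving sentinels see the master as down.
    """
    return (alive_sentinels >= quorum
            and alive_sentinels >= majority(total_sentinels))


if __name__ == "__main__":
    # 2 sentinels, quorum = 1: the machine with the master and one sentinel dies.
    print(can_failover(total_sentinels=2, alive_sentinels=1, quorum=1))
    # 3 sentinels, quorum = 2: the machine with the master and one sentinel dies.
    print(can_failover(total_sentinels=3, alive_sentinels=2, quorum=2))
```

The first call returns False and the second True, matching the two scenarios described above.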

 

Seven. Data loss during Redis Sentinel master-slave switchover: asynchronous replication and split brain

 

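1. Two data loss scenarios

① Data loss caused by asynchronous replication

Replication from master to slave is asynchronous, so some data may not yet have been copied to the slave when the master goes down; that data is lost.

② Data loss caused by split brain

What split brain is: the machine hosting a master suddenly drops off the normal network and can no longer reach the slave machines, but the master is in fact still running.

The sentinels may then consider the master down, start an election, and switch one of the slaves to master.

At this point the cluster has two masters, which is the so-called split brain.

Although a slave has been switched to master, clients may not yet have switched to the new master, and the data they keep writing to the old master may be lost.

When the old master recovers, it is attached to the new master as a slave; its own data is cleared and it replicates again from the new master.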

 

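2. Reducing data loss caused by asynchronous replication and split brain

To address this problem, two parameters need to be configured:

min-slaves-to-write 1 and min-slaves-max-lag 10:

This requires at least one slave whose replication and synchronization lag does not exceed 10 seconds.

Once the replication and synchronization lag of all slaves exceeds 10 seconds, the master no longer accepts any write requests.

① Reducing data loss from asynchronous replication

With the min-slaves-max-lag setting, once slave replication and acks lag too far behind, the master assumes too much data might be lost if it went down and rejects write requests. This keeps the data lost when the master goes down, because part of the data was not yet synchronized to a slave, within a controllable range.

② Reducing data loss from split brain

If a master ends up in a split brain and loses its connection to the slaves, the two settings above ensure that if it cannot keep sending data to the specified number of slaves and no slave has sent it an ack for more than 10 seconds, it rejects client write requests outright.

The old master after a split brain therefore does not accept new data from clients, which avoids the data loss.

In other words, the configuration above ensures that if the master loses its connection to every slave and finds after 10 seconds that no slave has acked, it rejects new write requests.

So in a split-brain scenario, at most 10 seconds of data is lost.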
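The write guard that min-slaves-to-write and min-slaves-max-lag describe can be sketched as a single check. This is an illustrative model with a hypothetical function name, where each slave is represented only by the number of seconds since its last ack:

```python
def master_accepts_writes(slave_lags, min_slaves_to_write=1, min_slaves_max_lag=10):
    """Decide whether the master should accept a write.

    slave_lags: seconds since each connected slave last acknowledged
    (a disconnected slave is simply absent from the list).
    """
    healthy = sum(1 for lag in slave_lags if lag <= min_slaves_max_lag)
    return healthy >= min_slaves_to_write


if __name__ == "__main__":
    print(master_accepts_writes([1, 15]))   # one slave is within 10 s
    print(master_accepts_writes([11, 15]))  # every slave lags too far
    print(master_accepts_writes([]))        # split brain: no slave reachable
```

In the split-brain case the isolated master has an empty list of healthy slaves, so it starts refusing writes once the lag window has passed.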

 

Eight. A closer look at several core mechanisms of Redis Sentinel (including the slave election algorithm)

 

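1. The sdown and odown states

sdown is subjectively down: if a single sentinel by itself thinks a master is down, that is subjectively down.

odown is objectively down: if a quorum of sentinels think a master is down, that is objectively down.

The condition for sdown is simple: if a sentinel pings a master and gets no valid reply for longer than the number of milliseconds specified by down-after-milliseconds, it subjectively considers the master down.

The condition for going from sdown to odown is also simple: if a sentinel learns within the specified time that the quorum-specified number of other sentinels also consider that master sdown, it treats the master as odown, objectively down.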
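The two states can be expressed as two small predicates. This is a simplified sketch with hypothetical function names; it ignores the time window within which the other sentinels' opinions must arrive:

```python
def is_sdown(ms_since_last_valid_reply, down_after_milliseconds):
    """Subjectively down: this sentinel has waited too long for a valid reply."""
    return ms_since_last_valid_reply > down_after_milliseconds


def is_odown(sdown_votes, quorum):
    """Objectively down: at least `quorum` sentinels report the master as sdown.

    sdown_votes: one boolean per sentinel.
    """
    return sum(sdown_votes) >= quorum


if __name__ == "__main__":
    # Hypothetical numbers: 30 s threshold, three sentinels, quorum of 2.
    votes = [is_sdown(ms, 30000) for ms in (45000, 31000, 2000)]
    print(votes, is_odown(votes, quorum=2))
```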

 

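2. Automatic discovery in the Sentinel cluster

① Sentinels discover each other through Redis's pub/sub system. Every sentinel publishes a message to the __sentinel__:hello channel; the other sentinels consume it and so learn of that sentinel's existence.

② Every two seconds, each sentinel publishes a message to the __sentinel__:hello channel of each master+slave set it monitors, containing its own host, IP, and run id plus its monitoring configuration for that master.

③ Each sentinel also subscribes to the __sentinel__:hello channel of every master+slave set it monitors, and thereby learns of the other sentinels that are monitoring the same master+slave set.

④ Each sentinel also exchanges its monitoring configuration for the master with the other sentinels, so that they keep their monitoring configuration in sync.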

 

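3. Self-correction of slave configuration

Sentinel automatically corrects some slave configuration. For example, if a slave is to be a potential master candidate, Sentinel makes sure it is replicating the current master's data; and if slaves are connected to the wrong master, for example after a failover, Sentinel makes sure they connect to the correct master.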

 

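4. Election algorithm

If a master is considered odown and a majority of sentinels have authorized the switchover, one sentinel carries out the master-slave switchover, and it first has to elect a slave. The election takes the following into account:

① How long the slave has been disconnected from the master

② The slave's priority

③ The slave's replication offset

④ The slave's run id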

 

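First, if a slave has been disconnected from the master for more than 10 times down-after-milliseconds, plus the length of time the master has been down, that slave is considered unfit to be elected master.

That is: disconnection time > (down-after-milliseconds * 10 + milliseconds_since_master_is_in_SDOWN_state).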

 

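The remaining slaves are sorted by the following rules:

① First, sort by slave priority: the lower the slave priority value, the higher the precedence.

② If the priorities are equal, compare the replica offset: the slave that has replicated more data, that is, has the larger offset, takes precedence.

③ If both are equal, choose the slave with the smallest run id.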
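The filter and the three ordering rules above can be sketched as one function. The Slave dataclass and elect_slave are hypothetical names for illustration, and run ids are compared as plain strings:

```python
from dataclasses import dataclass


@dataclass
class Slave:
    run_id: str
    priority: int         # lower value means higher precedence
    repl_offset: int      # larger value means more data replicated
    disconnected_ms: int  # time since the link to the master was lost


def elect_slave(slaves, down_after_ms, master_sdown_ms):
    """Pick the slave to promote, or None if no slave is eligible."""
    # Drop slaves that have been disconnected from the master for too long.
    limit = down_after_ms * 10 + master_sdown_ms
    candidates = [s for s in slaves if s.disconnected_ms <= limit]
    if not candidates:
        return None
    # Lowest priority value, then largest offset, then smallest run id.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))


if __name__ == "__main__":
    slaves = [
        Slave("b", 100, 500, 0),
        Slave("a", 100, 500, 0),
        Slave("c", 100, 900, 0),
        Slave("d", 50, 10, 10**9),  # best priority, but disconnected too long
    ]
    print(elect_slave(slaves, down_after_ms=30000, master_sdown_ms=5000))
```

In this example slave d is excluded by the disconnection check, and slave c wins because it has the largest replication offset among slaves with equal priority.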

 

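5. quorum and majority

Each time a sentinel performs a master-slave switchover, a quorum of sentinels must first consider the master odown; then one sentinel is elected to do the switchover, and that sentinel must be authorized by a majority of sentinels before it can actually carry it out.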

 

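If quorum < majority, for example 5 sentinels where the majority is 3 (more than half) and quorum is set to 2, then authorization from 3 sentinels is enough to carry out the switchover.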

 

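If quorum >= majority, then a quorum of sentinels must all authorize the switchover. For example, with 5 sentinels and quorum set to 5, all 5 sentinels must authorize before the switchover can be carried out.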
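Both cases reduce to taking the larger of the two numbers. A one-function sketch, with a hypothetical name:

```python
def required_authorizations(total_sentinels, quorum):
    """Number of sentinels that must authorize a switchover."""
    majority = total_sentinels // 2 + 1
    return max(quorum, majority)


if __name__ == "__main__":
    print(required_authorizations(5, quorum=2))  # majority dominates
    print(required_authorizations(5, quorum=5))  # quorum dominates
```

Setting quorum above the majority therefore makes switchover stricter, while setting it below only makes it easier to declare odown, not easier to authorize the switchover.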

 

6. Configuration epoch

Sentinel monitors a set of Redis master+slave instances and keeps a corresponding monitoring configuration.

 

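The sentinel that carries out the switchover obtains a configuration epoch from the new master it is switching to (slave -> master). This is a version number, and the version number of every switchover must be unique.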

 

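If the first elected sentinel fails to complete the switchover, the other sentinels wait for failover-timeout and then take over and continue the switchover; at that point a new configuration epoch is obtained and used as the new version number.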

 

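7. Configuration propagation

After a sentinel completes the switchover, it updates its local copy with the newest master configuration and synchronizes it to the other sentinels through the pub/sub messaging mechanism described earlier.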

 

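The version number matters here: because all messages are published to and read from a single channel, when a sentinel completes a new switchover, the new master configuration carries the new version number.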

 

The other sentinels update their own master configuration according to which version number is larger.
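The "larger version number wins" rule can be sketched as follows. SentinelView and on_hello are hypothetical names, and the master is represented by a plain address string:

```python
class SentinelView:
    """One sentinel's local view of which node is the master."""

    def __init__(self):
        self.master = None
        self.config_epoch = 0

    def on_hello(self, master, config_epoch):
        """Adopt the announced master only if its configuration epoch is newer."""
        if config_epoch > self.config_epoch:
            self.master, self.config_epoch = master, config_epoch
            return True
        return False


if __name__ == "__main__":
    view = SentinelView()
    print(view.on_hello("10.0.0.2:6379", config_epoch=2), view.master)
    # A stale announcement with an older epoch is ignored.
    print(view.on_hello("10.0.0.1:6379", config_epoch=1), view.master)
```

Comparing epochs rather than arrival order means a delayed message from an earlier switchover cannot overwrite the result of a later one.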

Origin www.cnblogs.com/feixiangmanon/p/11229620.html