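7. Redis Cluster

1. Why a cluster is needed
(1) Throughput (OPS)
A single Redis instance can handle roughly 100,000 operations per second. What if the workload needs 1,000,000 per second?
(2) Data volume
A single machine typically has 16-256 GB of memory. What if the workload needs 500 GB?
(3) Solution
Go distributed: put simply, add more machines and spread the requests and the data across them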
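2. Data distribution
There are two common ways to distribute data
Approach 1: sequential (range) distribution
Example: with three nodes, contiguous key ranges are assigned so that each node holds an equal share
Characteristics: data placement skews easily, key placement depends on the business data, sequential access is possible, batch operations over a key range are supported
Typical products: BigTable, HBase
Approach 2: hash distribution

Characteristics: data is spread very evenly, key placement is independent of the business data, sequential access is not possible, batch operations are supported (in Redis Cluster only for keys that fall in the same slot)
Typical products: Memcache with consistent hashing, Redis Cluster, other cache products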
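Three ways to do hash distribution:
Method 1: modulo partitioning by node count
Client-side sharding: hash + modulo
Example: with three nodes, hash each key and take the result modulo the node count (3) to pick the node

Example: with four nodes, hash each key and take the result modulo 4

Problem: going from 3 nodes to 4 (or removing one) changes the key-to-node mapping for most keys, so data has to migrate. For 3 -> 4 nodes about 75% of the keys move (often quoted as roughly 80%)
Fix: when adding nodes, scale by doubling. Going from 3 nodes to 6 brings the migrated share down to about 50%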
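The migration percentages for modulo partitioning can be checked with a few lines of shell. This is a minimal sketch, not part of the original notes: it assumes the integers 0..1199 stand in for evenly distributed hash values, and `moved_percent` is a hypothetical helper name.

```shell
# moved_percent OLD NEW [KEYS]
# Counts how many of the integer keys 0..KEYS-1 land on a different node when
# the node count changes from OLD to NEW under "key % node_count" placement.
# The integers stand in for hash values (an assumption for illustration).
moved_percent() {
    old=$1
    new=$2
    total=${3:-1200}
    moved=0
    i=0
    while [ "$i" -lt "$total" ]; do
        if [ $((i % old)) -ne $((i % new)) ]; then
            moved=$((moved + 1))
        fi
        i=$((i + 1))
    done
    echo $((moved * 100 / total))
}

moved_percent 3 4   # 3 -> 4 nodes, prints 75
moved_percent 3 6   # 3 -> 6 nodes (doubling), prints 50
```

A key stays put only when `key % 3` equals `key % 4`, which happens for 3 out of every 12 consecutive values. With doubling, it stays for 3 out of every 6.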

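Method 2: consistent hashing
Client-side sharding: hash + walk clockwise around the ring (an improvement on plain modulo)
Adding or removing a node affects only the neighbouring nodes on the ring, but some data still has to migrate
Doubling or halving the node count keeps the migrated data minimal and the load balanced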
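The clockwise lookup can be sketched in shell as follows. This is an illustration under stated assumptions, not how any particular client implements it: the POSIX `cksum` CRC stands in for the ring hash, there are no virtual nodes, and `ring_hash`, `owner` and the `node-a`... names are hypothetical.

```shell
# ring_hash STRING: position on the ring, taken from the POSIX cksum CRC
# (a stand-in hash chosen only for this sketch).
ring_hash() {
    printf '%s' "$1" | cksum | cut -d ' ' -f 1
}

# owner KEY NODE...: the first node at or after the key's position going
# clockwise, wrapping around to the node with the smallest position.
owner() {
    key_pos=$(ring_hash "$1")
    shift
    best=""; best_pos=""; lowest=""; lowest_pos=""
    for node in "$@"; do
        pos=$(ring_hash "$node")
        if [ -z "$lowest" ] || [ "$pos" -lt "$lowest_pos" ]; then
            lowest=$node; lowest_pos=$pos
        fi
        if [ "$pos" -ge "$key_pos" ]; then
            if [ -z "$best" ] || [ "$pos" -lt "$best_pos" ]; then
                best=$node; best_pos=$pos
            fi
        fi
    done
    if [ -n "$best" ]; then echo "$best"; else echo "$lowest"; fi
}

for key in user:1 user:2 user:3; do
    echo "$key -> $(owner "$key" node-a node-b node-c)"
done
```

When a fourth node is added to the list, a key either keeps its previous owner or moves to the new node. That is the "only neighbouring nodes are affected" property.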


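Method 3: virtual slot partitioning
Slot: think of it as a number in the range 0-16383
Predefined virtual slots: each slot maps to a subset of the data, and there are normally far more slots than nodes
A good hash function is needed, for example CRC16
Example: with 5 nodes, the 16384 slots are divided into 5 roughly equal ranges. Each key is hashed: Redis Cluster computes CRC16(key) modulo 16384 (equivalently, CRC16(key) & 16383) to get the slot. The command can be sent to any node in the cluster. Every node knows which slots it owns. If it owns the key's slot, it executes the command and returns the result. If it does not, it still knows which node does, because nodes share slot ownership through their cluster messages, so it replies with the owner and the client fetches the key from there
Features: master-replica replication, high availability, and sharding across multiple masters that all accept reads and writes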
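The key-to-slot calculation can be reproduced in plain shell. This is a minimal sketch: it assumes the CRC16 variant is XMODEM (polynomial 0x1021, initial value 0) and applies the hash tag rule, where only the text inside the first non-empty `{...}` is hashed. `crc16` and `key_slot` are hypothetical helper names, not Redis commands.

```shell
# crc16 STRING: CRC16 (XMODEM variant, polynomial 0x1021, initial value 0),
# computed bit by bit over the bytes of STRING.
crc16() {
    crc=0
    for byte in $(printf '%s' "$1" | od -An -v -tu1); do
        crc=$((crc ^ (byte << 8)))
        bit=0
        while [ "$bit" -lt 8 ]; do
            if [ $((crc & 0x8000)) -ne 0 ]; then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            bit=$((bit + 1))
        done
    done
    echo "$crc"
}

# key_slot KEY: slot number 0..16383. If the key has a non-empty section
# between its first "{" and the next "}", only that section is hashed.
key_slot() {
    open='{'
    close='}'
    hashed=$1
    case $1 in
        *"$open"*"$close"*)
            rest=${1#*"$open"}         # text after the first "{"
            tag=${rest%%"$close"*}     # up to the first "}" that follows
            if [ -n "$tag" ]; then hashed=$tag; fi
            ;;
    esac
    echo $(( $(crc16 "$hashed") % 16384 ))
}

key_slot foo                      # prints 12182
key_slot '{user1000}.following'   # same slot as the next key
key_slot '{user1000}.followers'
```

On a live cluster the result can be compared with `cluster keyslot <key>`.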

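3. Building a cluster
(1) Basic architecture
Nodes: a Redis Cluster is made up of many nodes, and each master node serves reads and writes
Meet: nodes have to communicate with each other, and the meet handshake is what sets that communication up
Slot assignment: a node can serve reads and writes only once slots have been assigned to it
Replication: for high availability, every master has a replica
(2) Two ways to install
Option 1: native commands
<1> Configure and start the nodes
Configure six nodes (the file for port 7000 is shown; the others use their own port number in the same places):
# port
port 7000
# run as a daemon
daemonize yes
# working directory
dir "/home/redis/data"
# RDB file
dbfilename "dump-7000.rdb"
# log file
logfile "7000.log"
# run this node in cluster mode
cluster-enabled yes
# node timeout used for failure detection and failover: 15 seconds
cluster-node-timeout 15000
# cluster state file maintained by the node itself, recording what it knows about every node
cluster-config-file nodes-7000.conf
# whether all 16384 slots must be covered for the cluster to serve requests
cluster-require-full-coverage no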
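Writing six near-identical files by hand is error-prone, so the configuration above can be generated in a loop. This is a sketch only: `gen_conf` is a hypothetical helper, and the output directory `./cluster-conf` (overridable through `CONF_DIR`) is an assumption, not the path used in these notes.

```shell
# gen_conf DIR PORT: writes DIR/redis-PORT.conf with the settings shown above.
gen_conf() {
    cat > "$1/redis-$2.conf" <<EOF
port $2
daemonize yes
dir "/home/redis/data"
dbfilename "dump-$2.rdb"
logfile "$2.log"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file nodes-$2.conf
cluster-require-full-coverage no
EOF
}

conf_dir=${CONF_DIR:-./cluster-conf}   # hypothetical output directory
mkdir -p "$conf_dir"
for port in 7000 7001 7002 7003 7004 7005; do
    gen_conf "$conf_dir" "$port"
done
```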
Start the six nodes:
./redis-server ../data/redis-7000.conf
./redis-server ../data/redis-7001.conf
./redis-server ../data/redis-7002.conf
./redis-server ../data/redis-7003.conf
./redis-server ../data/redis-7004.conf
./redis-server ../data/redis-7005.conf
Check that they started:
ps aux |grep redis
root     21474  0.0  0.0 136992  7580 ?        Ssl  12:01   0:00 ./redis-server *:7000 [cluster]
root     21478  0.0  0.0 136992  7580 ?        Ssl  12:01   0:00 ./redis-server *:7001 [cluster]
root     21482  0.0  0.0 136992  7584 ?        Ssl  12:01   0:00 ./redis-server *:7002 [cluster]
root     21486  0.0  0.0 136992  7580 ?        Ssl  12:01   0:00 ./redis-server *:7003 [cluster]
root     21490  0.0  0.0 136992  7580 ?        Ssl  12:01   0:00 ./redis-server *:7004 [cluster]
root     21494  0.0  0.0 136992  7584 ?        Ssl  12:01   0:00 ./redis-server *:7005 [cluster]

<2> Meet
Command: cluster meet ip port
Introduce the six nodes to each other:
./redis-cli -p 7000 cluster meet 127.0.0.1 7001
./redis-cli -p 7000 cluster meet 127.0.0.1 7002
./redis-cli -p 7000 cluster meet 127.0.0.1 7003
./redis-cli -p 7000 cluster meet 127.0.0.1 7004
./redis-cli -p 7000 cluster meet 127.0.0.1 7005
Check that the six nodes can see each other:
./redis-cli -p 7000 cluster nodes
2aff05faa516da21635ab6774a981e2efe7538a0 127.0.0.1:7004 master - 0 1562040532277 4 connected
1862242fd6be9f7503e4d5d43e0134660f40753f 127.0.0.1:7005 master - 0 1562040529272 5 connected
0121bd2bafa4d98833d75c2bc1f3e10841c7ac3c 127.0.0.1:7002 master - 0 1562040533281 2 connected
9c0a0dc0709b9e8bb64c2fab459ca4aeeca24142 127.0.0.1:7003 master - 0 1562040530273 3 connected
6b3f6fba363498589633b581015bbb9071f1f8ce 127.0.0.1:7001 master - 0 1562040531275 1 connected
fe3b483f2836110c67faa354b8f9328664fffb2c 127.0.0.1:7000 myself,master - 0 0 0 connected

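<3> Assign slots
Command: cluster addslots slot [slot...]
With six nodes (three masters, three replicas), split the 16384 slots evenly across the three masters. In bash, brace expansion can list the slots:
./redis-cli -p 7000 cluster addslots {0..5461}
./redis-cli -p 7001 cluster addslots {5462..10922}
./redis-cli -p 7002 cluster addslots {10923..16383}
Or write a script: cat addslots.sh
# usage: sh addslots.sh <start-slot> <end-slot> <port>
start=$1
end=$2
port=$3
# assign the slots one at a time to the node listening on ${port}
for slot in $(seq "${start}" "${end}")
do
    echo "slot:${slot}"
    ./redis-cli -p "${port}" cluster addslots "${slot}"
done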
Run the script:
Assign slots 0-5461 to 7000
sh addslots.sh 0 5461 7000
Assign slots 5462-10922 to 7001
sh addslots.sh 5462 10922 7001
Assign slots 10923-16383 to 7002
sh addslots.sh 10923 16383 7002
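The three ranges used above come from dividing 16384 by the number of masters. A small sketch of that arithmetic follows, with `slot_ranges` as a hypothetical helper. It gives the first `16384 % N` masters one extra slot, which is one reasonable convention and not necessarily the one other tools use.

```shell
# slot_ranges N: prints "first last" for each of N masters. The first
# (16384 % N) masters get one extra slot so the ranges cover 0..16383 exactly.
slot_ranges() {
    n=$1
    base=$((16384 / n))
    extra=$((16384 % n))
    first=0
    i=0
    while [ "$i" -lt "$n" ]; do
        size=$base
        if [ "$i" -lt "$extra" ]; then size=$((base + 1)); fi
        last=$((first + size - 1))
        echo "$first $last"
        first=$((last + 1))
        i=$((i + 1))
    done
}

slot_ranges 3
# prints:
# 0 5461
# 5462 10922
# 10923 16383
```

Each output line gives the start and end arguments for one run of addslots.sh.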
Check:
./redis-cli -p 7000 cluster info
cluster_state:ok (the cluster is healthy)
cluster_slots_assigned:16384 (all 16384 slots have been assigned)
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6 (this node knows 6 nodes)
cluster_size:3 (3 nodes hold slots)
cluster_current_epoch:5
cluster_my_epoch:0
cluster_stats_messages_sent:2576
cluster_stats_messages_received:2576
<4> Set up master-replica relationships
Command: cluster replicate node-id (node-id is the cluster node ID, generated when the node first starts)
Make 7003 a replica of 7000:
./redis-cli -p 7003 cluster replicate fe3b483f2836110c67faa354b8f9328664fffb2c
Make 7004 a replica of 7001:
./redis-cli -p 7004 cluster replicate 6b3f6fba363498589633b581015bbb9071f1f8ce
Make 7005 a replica of 7002:
./redis-cli -p 7005 cluster replicate 0121bd2bafa4d98833d75c2bc1f3e10841c7ac3c
Check that it worked:
./redis-cli -p 7000 cluster nodes
2aff05faa516da21635ab6774a981e2efe7538a0 127.0.0.1:7004 slave 6b3f6fba363498589633b581015bbb9071f1f8ce 0 1562041938444 4 connected
1862242fd6be9f7503e4d5d43e0134660f40753f 127.0.0.1:7005 slave 0121bd2bafa4d98833d75c2bc1f3e10841c7ac3c 0 1562041937441 5 connected
0121bd2bafa4d98833d75c2bc1f3e10841c7ac3c 127.0.0.1:7002 master - 0 1562041936439 2 connected 10923-16383
9c0a0dc0709b9e8bb64c2fab459ca4aeeca24142 127.0.0.1:7003 slave fe3b483f2836110c67faa354b8f9328664fffb2c 0 1562041934435 3 connected
6b3f6fba363498589633b581015bbb9071f1f8ce 127.0.0.1:7001 master - 0 1562041939447 1 connected 5462-10922
fe3b483f2836110c67faa354b8f9328664fffb2c 127.0.0.1:7000 myself,master - 0 0 0 connected 0-5461
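Reading master-replica pairs out of the raw listing by eye is tedious, because replicas refer to their master by node ID. The sketch below summarises the listing with awk. `replica_map` is a hypothetical helper, and it assumes the column layout shown above (ID, address, flags, master ID, ...). The sample input is two lines copied from the listing.

```shell
# replica_map: reads "cluster nodes" output on stdin and prints one
# "replica-address -> master-address" line per replica.
replica_map() {
    awk '
        { addr[$1] = $2 }                                   # node id -> ip:port
        $3 ~ /slave/ { master[$2] = $4; order[++n] = $2 }   # replica -> master id
        END {
            for (i = 1; i <= n; i++)
                print order[i] " -> " addr[master[order[i]]]
        }
    '
}

replica_map <<'EOF'
9c0a0dc0709b9e8bb64c2fab459ca4aeeca24142 127.0.0.1:7003 slave fe3b483f2836110c67faa354b8f9328664fffb2c 0 1562041934435 3 connected
fe3b483f2836110c67faa354b8f9328664fffb2c 127.0.0.1:7000 myself,master - 0 0 0 connected 0-5461
EOF
# prints: 127.0.0.1:7003 -> 127.0.0.1:7000
```

Against a running cluster the same function can be fed with `./redis-cli -p 7000 cluster nodes | replica_map`.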

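Option 2: the official Ruby tool
(1) Prepare the Ruby environment
<1> Install ruby-2.5.5
./configure --prefix=/home/ruby
make
make install
<2> Install redis-3.3.0.gem:
gem install -l redis-3.3.0.gem
gem list    # check that the redis gem is installed
<3> Copy redis-trib.rb into the directory that holds the Redis executables
cp /home/redis-3.2.1/src/redis-trib.rb /home/redis/bin/
<4> Configure and start the nodes
Configure six nodes:
# port
port 7000
# run as a daemon
daemonize yes
# working directory
dir "/home/redis/data"
# RDB file
dbfilename "dump-7000.rdb"
# log file
logfile "7000.log"
# run this node in cluster mode
cluster-enabled yes
# node timeout used for failure detection and failover: 15 seconds
cluster-node-timeout 15000
# cluster state file maintained by the node itself, recording what it knows about every node
cluster-config-file nodes-7000.conf
# whether all 16384 slots must be covered for the cluster to serve requests
cluster-require-full-coverage no
Start the six nodes: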
./redis-server ../data/redis-7000.conf
./redis-server ../data/redis-7001.conf
./redis-server ../data/redis-7002.conf
./redis-server ../data/redis-7003.conf
./redis-server ../data/redis-7004.conf
./redis-server ../data/redis-7005.conf
Check that they started:
ps aux |grep redis
root     30832  0.0  0.0 136992  7580 ?        Ssl  15:55   0:00 ./redis-server *:7000 [cluster]
root     30836  0.0  0.0 136992  7584 ?        Ssl  15:55   0:00 ./redis-server *:7001 [cluster]
root     30840  0.0  0.0 136992  7584 ?        Ssl  15:55   0:00 ./redis-server *:7002 [cluster]
root     30844  0.0  0.0 136992  7580 ?        Ssl  15:55   0:00 ./redis-server *:7003 [cluster]
root     30848  0.0  0.0 136992  7584 ?        Ssl  15:55   0:00 ./redis-server *:7004 [cluster]
root     30852  0.0  0.0 136992  7580 ?        Ssl  15:55   0:00 ./redis-server *:7005 [cluster]
<5> Create the cluster with one replica per master
./redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
127.0.0.1:7000
127.0.0.1:7001
127.0.0.1:7002
Adding replica 127.0.0.1:7003 to 127.0.0.1:7000
Adding replica 127.0.0.1:7004 to 127.0.0.1:7001
Adding replica 127.0.0.1:7005 to 127.0.0.1:7002
M: ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
M: a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
M: d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
S: 6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003
   replicates ae394a4143dc16ba0b08d94bf78aa5872cf9180b
S: e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004
   replicates a75da4e3b4f89648b53a9fdc66f09fe973db756a
S: 25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005
   replicates d7125372a04236517b4b51e996bd3e3d3575a91c
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.......
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
M: a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
M: d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
M: 6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003
   slots: (0 slots) master
   replicates ae394a4143dc16ba0b08d94bf78aa5872cf9180b
M: e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004
   slots: (0 slots) master
   replicates a75da4e3b4f89648b53a9fdc66f09fe973db756a
M: 25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005
   slots: (0 slots) master
   replicates d7125372a04236517b4b51e996bd3e3d3575a91c
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered. (every slot has been assigned)
<6> Check the cluster state:
./redis-cli -p 7000 cluster info
cluster_state:ok (success)
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6 (6 nodes)
cluster_size:3 (3 masters)
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_sent:205
cluster_stats_messages_received:205
<7> Check the node list:
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562054528621 3 connected 10923-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562054529623 6 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 1 connected 0-5460
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562054530624 2 connected 5461-10922
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562054531627 4 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562054526616 5 connected

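4. Scaling the cluster
(1) How scaling works: slots and their data move between nodes
Scaling in: remove nodes
Scaling out: add nodes

(2) Scaling out
<1> Prepare the new node
Run it in cluster mode
Keep its configuration consistent with the other nodes
Right after start-up it is an orphan node that belongs to no cluster
<2> Join the cluster
Purpose: the new node either receives migrated slots and data (scale-out) or becomes a replica that takes part in failover
<3> Migrate slots and data
1) Slot migration plan:
Average slot count: with three masters each one holds about 5461 slots. After a fourth master is added, each of the four should hold 4096
Direct migration: each of the three existing masters hands part of its slots to the fourth until all four hold the same number
2) Data migration process
Step 1: send cluster setslot {slot} importing {sourceNodeId} to the target node so that it prepares to import the slot's data
Step 2: send cluster setslot {slot} migrating {targetNodeId} to the source node so that it prepares to migrate the slot's data out
Step 3: on the source node, repeatedly run cluster getkeysinslot {slot} {count} to fetch up to count keys that belong to the slot
Step 4: on the source node, run migrate {targetIp} {targetPort} key 0 {timeout} to move each of those keys
Step 5: repeat steps 3-4 until every key in the slot has moved to the target node
Step 6: send cluster setslot {slot} node {targetNodeId} to every master in the cluster to announce that the slot now belongs to the target node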
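The six steps can be strung together in a short script for a single slot. This is a hedged sketch rather than a replacement for redis-trib.rb: `migrate_slot` is a hypothetical helper, and it assumes all nodes run on 127.0.0.1 as in these notes, that `REDIS_CLI` points at redis-cli, that keys contain no whitespace or glob characters, and a 5000 ms migrate timeout. It has no error handling.

```shell
REDIS_CLI=${REDIS_CLI:-./redis-cli}   # path to redis-cli (assumed)

# migrate_slot SLOT SRC_PORT SRC_ID DST_PORT DST_ID MASTER_PORT...
# Moves one slot from the source node to the target node. MASTER_PORT...
# lists the ports of every master that must learn the new owner.
migrate_slot() {
    slot=$1; src_port=$2; src_id=$3; dst_port=$4; dst_id=$5
    shift 5
    # Step 1: the target prepares to import the slot.
    "$REDIS_CLI" -p "$dst_port" cluster setslot "$slot" importing "$src_id"
    # Step 2: the source prepares to migrate the slot out.
    "$REDIS_CLI" -p "$src_port" cluster setslot "$slot" migrating "$dst_id"
    # Steps 3-5: fetch keys in batches of 100 and move them until none remain.
    while :; do
        keys=$("$REDIS_CLI" -p "$src_port" cluster getkeysinslot "$slot" 100)
        [ -n "$keys" ] || break
        for key in $keys; do
            "$REDIS_CLI" -p "$src_port" migrate 127.0.0.1 "$dst_port" "$key" 0 5000
        done
    done
    # Step 6: announce the new owner to every master.
    for port in "$@"; do
        "$REDIS_CLI" -p "$port" cluster setslot "$slot" node "$dst_id"
    done
}

# Example call (node IDs are placeholders):
# migrate_slot 0 7000 <source-node-id> 7006 <target-node-id> 7000 7001 7002 7006
```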

3) Hands-on: adding a master and a replica
The current layout, three masters and three replicas:
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562054528621 3 connected 10923-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562054529623 6 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 1 connected 0-5460
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562054530624 2 connected 5461-10922
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562054531627 4 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562054526616 5 connected
Add one more master and one more replica: start 7006 and 7007
./redis-server ../data/redis-7006.conf
./redis-server ../data/redis-7007.conf
Join 7006 and 7007 to the cluster
./redis-cli -p 7000 cluster meet 127.0.0.1 7006
./redis-cli -p 7000 cluster meet 127.0.0.1 7007
Current cluster state:
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562126756992 3 connected 10923-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562126752986 6 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 1 connected 0-5460
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562126756992 2 connected 5461-10922
0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 127.0.0.1:7006 master - 0 1562126753987 7 connected
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562126755991 4 connected
43c1f9023e609c0af1228d87a65e4009076cd28a 127.0.0.1:7007 master - 0 1562126754989 0 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562126751483 5 connected
Make 7007 a replica of 7006
./redis-cli -p 7007 cluster replicate 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55
Current cluster state:
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562126831660 3 connected 10923-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562126837171 6 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 1 connected 0-5460
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562126835167 2 connected 5461-10922
0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 127.0.0.1:7006 master - 0 1562126838172 7 connected
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562126836169 4 connected
43c1f9023e609c0af1228d87a65e4009076cd28a 127.0.0.1:7007 slave 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 0 1562126833164 7 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562126832662 5 connected
Migrate slots:
Run ./redis-trib.rb reshard 127.0.0.1:7000
# how many slots to move
How many slots do you want to move (from 1 to 16384)?4096
# ID of the node that receives the slots
What is the receiving node ID?0c10d5bc1ad6e17f338c0cdf4370fddb81daea55
# entering all takes slots from every other master
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:all
# confirm the plan
Do you want to proceed with the proposed reshard plan (yes/no)?yes
Check the migration result
./redis-cli -p 7000 cluster nodes | grep master
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562127237014 3 connected 12288-16383
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 1 connected 1365-5460
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562127238017 2 connected 6827-10922
0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 127.0.0.1:7006 master - 0 1562127236012 7 connected 0-1364 5461-6826 10923-12287
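To confirm that the reshard left every master with the same number of slots, the ranges in the listing can be summed. This is a sketch only: `slot_counts` is a hypothetical helper and it assumes the slot ranges start at column 9 of the `cluster nodes` output, as in the listings in these notes. The sample input is one line copied from the result above.

```shell
# slot_counts: reads "cluster nodes" output on stdin and prints
# "address slot-count" for every node that owns slots.
slot_counts() {
    awk '
        {
            total = 0
            for (i = 9; i <= NF; i++) {        # slot fields start at column 9
                if ($i ~ /^\[/) continue       # skip entries for slots in migration
                n = split($i, r, "-")
                total += (n == 2) ? r[2] - r[1] + 1 : 1
            }
            if (total > 0) print $2, total
        }
    '
}

slot_counts <<'EOF'
0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 127.0.0.1:7006 master - 0 1562127236012 7 connected 0-1364 5461-6826 10923-12287
EOF
# prints: 127.0.0.1:7006 4096
```

The three ranges held by 7006 contain 1365, 1366 and 1365 slots, which add up to the 4096 requested in the reshard.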

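(3) Scaling in: to take a node offline, first check whether it holds slots. If it does, migrate them to other nodes. Once it holds none, tell the other nodes to forget it, then shut it down
<1> Migrate slots off the departing node
With four masters each holding 4096 slots, taking one offline means spreading its slots evenly over the other three
<2> Forget the node
cluster forget {downNodeId}
<3> Shut down the node


Hands-on: scaling in
Check the current state: four masters and four replicas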
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562128014673 3 connected 12288-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562128013170 6 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 1 connected 1365-5460
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562128013671 2 connected 6827-10922
0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 127.0.0.1:7006 master - 0 1562128016177 7 connected 0-1364 5461-6826 10923-12287
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562128019683 4 connected
43c1f9023e609c0af1228d87a65e4009076cd28a 127.0.0.1:7007 slave 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 0 1562128018680 7 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562128017679 5 connected

Move 1366 slots from 7006 to 7000. The tool moves the source's lowest-numbered slots first, so 7000 receives 0-1364 plus 5461
./redis-trib.rb reshard --from 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 --to ae394a4143dc16ba0b08d94bf78aa5872cf9180b --slots 1366 127.0.0.1:7006
Move the next 1365 slots (5462-6826) from 7006 to 7001
./redis-trib.rb reshard --from 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 --to a75da4e3b4f89648b53a9fdc66f09fe973db756a --slots 1365 127.0.0.1:7006
Move 1364 slots (10923-12286) from 7006 to 7002
./redis-trib.rb reshard --from 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 --to d7125372a04236517b4b51e996bd3e3d3575a91c --slots 1364 127.0.0.1:7006
Check the current nodes
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562128677014 10 connected 10923-12286 12288-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562128681022 10 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 8 connected 0-5461
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562128683026 9 connected 5462-10922
0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 127.0.0.1:7006 master - 0 1562128933545 7 connected
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562128680018 8 connected
43c1f9023e609c0af1228d87a65e4009076cd28a 127.0.0.1:7007 slave 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 0 1562128682525 7 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562128679017 9 connected
Make the cluster forget 7006 and 7007
Take the replica 7007 offline first
./redis-trib.rb del-node 127.0.0.1:7000 43c1f9023e609c0af1228d87a65e4009076cd28a
>>> Removing node 43c1f9023e609c0af1228d87a65e4009076cd28a from cluster 127.0.0.1:7000
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
Then take the master 7006 offline
./redis-trib.rb del-node 127.0.0.1:7000 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55
>>> Removing node 0c10d5bc1ad6e17f338c0cdf4370fddb81daea55 from cluster 127.0.0.1:7000
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
Check the result of scaling in:
./redis-cli -p 7000 cluster nodes
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562129022228 10 connected 10923-16383
25b08ac647b83eff1f1a3e0539ac302fd8fd6503 127.0.0.1:7005 slave d7125372a04236517b4b51e996bd3e3d3575a91c 0 1562129024734 10 connected
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 8 connected 0-5461
a75da4e3b4f89648b53a9fdc66f09fe973db756a 127.0.0.1:7001 master - 0 1562129022730 9 connected 5462-10922
6235626a07aa61ac2142db67a2a5b482d268fa52 127.0.0.1:7003 slave ae394a4143dc16ba0b08d94bf78aa5872cf9180b 0 1562129023733 8 connected
e3fc4f3dabb9b2a2275ea4ae1ad4a27d2dbb64bc 127.0.0.1:7004 slave a75da4e3b4f89648b53a9fdc66f09fe973db756a 0 1562129021729 9 connected
5. Client routing
(1) MOVED redirection

(2) ASK redirection

(3) Smart clients


6. How the cluster works


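7. Common development and operations issues
(1) Cluster completeness
The question: must every slot be served for the cluster to accept requests?
cluster-require-full-coverage defaults to yes
Point 1: with yes, all 16384 slots must be available, which guarantees cluster completeness
Point 2: with yes, while a node is down or a failover is in progress the whole cluster rejects commands with a CLUSTERDOWN error
Conclusion: most workloads cannot tolerate that, so setting it to no is recommended

(2) Bandwidth consumption
Official recommendation: at most 1000 nodes
Problem: nodes constantly exchange PING/PONG messages
Problem: the bandwidth these messages consume cannot be ignored
Three factors:
Message frequency: a node sends a ping directly whenever its last communication with another node is older than cluster-node-timeout/2
Message size: each message carries the slot array (2 KB) plus state data for 1/10 of the cluster's nodes (state for 10 nodes is about 1 KB)
Machine footprint: the more machines the cluster is spread over, and the more evenly nodes are divided among them, the higher the total available bandwidth
Optimizations:
Avoid "big" clusters: do not share one cluster among many applications, and split a large application across several clusters
cluster-node-timeout: balance bandwidth against failover speed
Spread nodes as evenly as possible over many machines: this protects both availability and bandwidth
(3) Pub/Sub broadcast
Problem: publish is broadcast to every node in the cluster, which adds to the bandwidth load
Solution: run Pub/Sub on a separate Redis Sentinel deployment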
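(4) Data skew
<1> Data skew: uneven memory usage
Cause 1: slots are unevenly assigned to nodes
Command to view the distribution of nodes, slots and keys: redis-trib.rb info ip:port
Print the master node information:
./redis-trib.rb info 127.0.0.1:7000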
127.0.0.1:7000 (ae394a41...) -> 0 keys | 5462 slots | 1 slaves.
127.0.0.1:7002 (d7125372...) -> 0 keys | 5461 slots | 1 slaves.
127.0.0.1:7001 (a75da4e3...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
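Cause 2: the number of keys differs widely between slots
CRC16 normally spreads keys fairly evenly
hash_tag may be in use, which forces related keys into one slot
Command to count the keys in a slot: cluster countkeysinslot {slot}
Cause 3: bigkeys
For example very large strings, or hashes and sets with millions of elements
Find them by running redis-cli --bigkeys against a replica
Optimization: redesign the data structure
Cause 4: inconsistent memory-related configuration
hash-max-ziplist-value, set-max-intset-entries and so on
Optimization: check configuration consistency regularly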
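The regular consistency check for cause 4 can be scripted. This is a minimal sketch under stated assumptions: `config_consistent` is a hypothetical helper, all nodes are reachable on the local host by port, and `REDIS_CLI` points at redis-cli.

```shell
REDIS_CLI=${REDIS_CLI:-./redis-cli}   # path to redis-cli (assumed)

# config_consistent NAME PORT...: prints each node's value for the setting
# NAME and returns non-zero if the values are not all the same.
config_consistent() {
    name=$1
    shift
    seen=0
    status=0
    for port in "$@"; do
        # "config get" prints the setting name and then its value; keep the value.
        value=$("$REDIS_CLI" -p "$port" config get "$name" | tail -n 1)
        echo "$port $name=$value"
        if [ "$seen" -eq 0 ]; then
            reference=$value
            seen=1
        elif [ "$value" != "$reference" ]; then
            status=1
        fi
    done
    return "$status"
}

# Example call:
# config_consistent hash-max-ziplist-value 7000 7001 7002 || echo "settings differ"
```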
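<2> Request skew: hot spots
Hot keys: important keys or bigkeys
Optimizations:
First: avoid bigkeys
Second: do not use hash_tag for hot keys
Third: when consistency requirements are loose, use a local cache plus a message queue
(5) Read/write splitting in a cluster
<1> Read-only connections: by default a replica in cluster mode serves neither reads nor writes
-It redirects the request to the master that owns the slot
-After the readonly command it can serve reads. readonly is a connection-level command
Implementation:
Write hello world on master 7000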
./redis-cli -c -p 7000
127.0.0.1:7000> set hello world
OK
Read it on replica 7003: the request is redirected to master 7000
./redis-cli -c -p 7003
127.0.0.1:7003> get hello
-> Redirected to slot [866] located at 127.0.0.1:7000
"world"
Reconnect to replica 7003 and run readonly. The key can then be read directly from 7003
./redis-cli -c -p 7003
127.0.0.1:7003> readonly
OK
127.0.0.1:7003> get hello
"world"
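<2> Read/write splitting: more complicated
-Same problems as outside a cluster: replication lag, reading stale data, replica failures
-The client must be modified: use cluster slaves {nodeId} to find a master's replicas
(6) Data migration: offline and online
<1> Official migration tool: redis-trib.rb import
-Can only migrate from a standalone instance to a cluster
-No online migration: the source must stop accepting writes
-No resume after an interruption
-Single-threaded migration, which limits speed
<2> Online migration:
-Vipshop: redis-migrate-tool
-Wandoujia: redis-port
(7) Cluster vs. standalone
<1> Cluster limitations:
-Limited support for multi-key batch operations: for example, the keys of mget and mset must be in the same slot
-Limited support for transactions and Lua: the keys involved must be on the same node
-The key is the smallest unit of partitioning: a bigkey cannot be split across nodes
-No multiple databases: cluster mode has only db 0
-Replication is single-level: tree-shaped replication is not supported
<2> Distributed Redis is not always better
First: Redis Cluster provides scalable capacity and performance, which many workloads do not need
-In most cases client performance actually drops
-Commands cannot span nodes: mget, keys, scan, flush, sinter and so on
-Lua and transactions cannot span nodes
-Clients are harder to maintain: the SDK and the application itself cost more (for example, more connection pools)
Second: in many scenarios Redis Sentinel is already good enough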

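8. Cluster summary:
(1) Redis Cluster partitions data with virtual slots (16384 of them). Each node owns a subset of the slots and their data, which balances both data and requests across nodes
(2) Building a cluster takes four steps: prepare the nodes, node handshake, assign slots, set up replication. The redis-trib.rb tool builds a cluster quickly.
(3) Scaling moves slots and their data between nodes
-When scaling out, slots move from the source nodes to the new node according to the slot migration plan
-When scaling in, any slots the departing node owns must first be migrated to other nodes. cluster forget then makes every node in the cluster forget the departing node.
(4) A smart client gets the most efficient communication with the cluster: it computes and maintains the key -> slot -> node mapping internally so it can locate the target node quickly.
(5) Automatic failover consists of failure detection and node recovery. A node goes offline in two stages, subjectively down and objectively down: once more than half of the masters consider a node subjectively down, it is marked objectively down. A replica of the objectively down master then triggers recovery, which keeps the cluster available
(6) Common development and operations issues include bandwidth consumption in very large clusters, Pub/Sub broadcast, cluster skew, and the trade-offs between standalone and cluster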
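The slot -> node half of the mapping in point (4) can be illustrated with the `cluster nodes` listing as the lookup table. This is a sketch of the idea only, not how a real smart client is written: `node_for_slot` is a hypothetical helper and it assumes the slot ranges start at column 9, as in the listings in these notes. The sample input is two lines copied from the final cluster state.

```shell
# node_for_slot SLOT: reads "cluster nodes" output on stdin and prints the
# address of the node whose slot ranges contain SLOT.
node_for_slot() {
    awk -v slot="$1" '
        {
            for (i = 9; i <= NF; i++) {        # slot fields start at column 9
                if ($i ~ /^\[/) continue       # skip entries for slots in migration
                n = split($i, r, "-")
                lo = r[1] + 0
                hi = (n == 2) ? r[2] + 0 : lo
                if (slot + 0 >= lo && slot + 0 <= hi) { print $2; exit }
            }
        }
    '
}

node_for_slot 866 <<'EOF'
d7125372a04236517b4b51e996bd3e3d3575a91c 127.0.0.1:7002 master - 0 1562129022228 10 connected 10923-16383
ae394a4143dc16ba0b08d94bf78aa5872cf9180b 127.0.0.1:7000 myself,master - 0 0 8 connected 0-5461
EOF
# prints: 127.0.0.1:7000
```

Slot 866 is the slot of the key hello from the read/write splitting example, and the lookup agrees with the redirect shown there. A real client caches this table and refreshes it when it receives a MOVED reply.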


Reposted from www.cnblogs.com/xixi18/p/11137564.html