Redis Cluster deployment process under CentOS 6

Reprinted from: http://www.cnblogs.com/kevingrace/p/7846324.html

Generally speaking, Redis master-slave replication serves a similar purpose to MySQL master-slave replication, but the Redis configuration is much simpler: you only specify the master's IP and port in the slave node's configuration file, for example slaveof 192.168.10.10 6379, then start master and slave and replication is up. However, if the Redis master fails there is no automatic switchover; Redis Sentinel (sentinel mode) or keepalived is needed to achieve master failover.

Today we will introduce the redis cluster mode:
redis cluster is a decentralized, distributed redis storage architecture that shares data across multiple nodes and addresses redis high availability and scalability. A redis cluster provides two main benefits:
1) Data is automatically split across multiple nodes
2) When a node in the cluster fails, redis can continue to process client requests

A Redis cluster contains 16384 hash slots, and every key in the database belongs to one of them. The cluster uses the formula CRC16(key) % 16384 to determine which slot a key belongs to, and each node in the cluster is responsible for a portion of the hash slots.
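As a quick illustration (using one of the cluster nodes that will be deployed later in this article), any node can be asked which slot a key hashes to with the CLUSTER KEYSLOT command; the key "name" below is the same key used in the verification step later:

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -c -h 172.16.51.175 -p 7000 cluster keyslot name
(integer) 5798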
Master-slave replication in the cluster
Each node group in the cluster has 1 to N replicas; one of them is the master and the rest are slaves. If the master goes offline, the cluster promotes one of its slaves to be the new master and keeps working, so the cluster does not stop working just because a master node goes offline.

==========The Load Balance function has been supported since Redis 3.x============

The Redis Cluster feature has been available for some time. In stand-alone Redis, the masters do not communicate with each other, so pre-sharding is usually done in the client (e.g. Jedis) or in a proxy such as Codis. In CAP terms, stand-alone Redis guarantees CP (Consistency and Partition tolerance) at the expense of A (Availability): all users see the same data (consistency, because Redis does not automatically replicate data), and when network problems occur a temporarily isolated subsystem can keep running (partition tolerance, because the masters are independent and need no communication), but there is no guarantee that every request can be answered when some nodes fail (availability: if a master goes down, the shard of data on it becomes inaccessible).

With the Cluster feature, Redis has evolved from a simple in-memory NoSQL database into a distributed NoSQL database, and its CAP model has shifted from CP to AP. In other words, with automatic sharding and data redundancy Redis gains real distributed capability: if a node fails, the data is backed up on other nodes, which can keep serving it, so availability is preserved. Precisely because of this, however, Redis can no longer guarantee the strong consistency it once had; as CAP theory dictates, only two of the three properties can be chosen.

Redis Cluster is the clustered implementation of Redis, with a built-in automatic data-sharding mechanism. All keys are mapped to 16384 slots within the cluster, and each Redis instance in the cluster is responsible for reading and writing a portion of those slots. A cluster client can send commands to any Redis instance in the cluster; when an instance receives a request for a slot it is not responsible for, it returns to the client the address of the instance that owns the slot of that key, and the client automatically resends the original request to that address, transparently to the caller. Which slot a key belongs to is determined by crc16(key) % 16384. Load balancing and HA are fully supported in Redis Cluster.

Load balancing: data can be migrated between Redis instances in the cluster in units of slots, but this is not automatic and has to be triggered by external commands.
Cluster membership: cluster nodes (Redis instances) regularly exchange and update information about the nodes in the cluster. From the perspective of the sending node, this information includes: which nodes are in the cluster, their IPs and ports, their names, their state (OK, PFAIL, FAIL, explained in detail later), their role (master or slave), and so on.
Regarding availability, the cluster consists of N groups of master-slave Redis instances.

A master can have no slaves, but without slaves the read/write service for the slots that master owns becomes unavailable if the master goes down.

A master can have multiple slaves. When a master goes down, one of its slaves is promoted to master; the promotion protocol is similar to Raft. How is a master outage detected? Redis Cluster uses a quorum plus heartbeat mechanism. Each node periodically sends PINGs to all other nodes; if no reply is received from a peer within cluster-node-timeout (configurable, on the order of seconds), the node unilaterally considers that peer down and marks it with the PFAIL state. Through the exchange of information between nodes, once a quorum of nodes considers a node PFAIL, it is marked as FAIL and this is broadcast to all other nodes, which upon receipt immediately treat the node as down. It follows that after a master goes down, the read/write service for the slots it owns is unavailable for at least the cluster-node-timeout period.
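From the command line, the PFAIL/FAIL state of nodes can be inspected with the cluster info and cluster nodes commands; a minimal check against one of the nodes deployed later in this article (an empty grep result means no node is currently flagged):

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.175 -p 7000 cluster info | grep cluster_state
cluster_state:ok
[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.175 -p 7000 cluster nodes | grep -i fail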

The characteristics of Redis Cluster are as follows:

  • Node automatic discovery
  • slave->master election, cluster fault tolerance
  • Hot resharding: online sharding
  • Cluster management: CLUSTER xxx commands
  • Cluster management based on configuration (nodes-port.conf)
  • ASK redirection / MOVED redirection mechanism
  • Deployment does not require specifying a master
  • Can support clusters of more than 1,000 nodes

======Redis-Cluster adopts a centerless structure: each node stores data and the full cluster state, and every node is connected to all the other nodes.======
The redis-cluster architecture diagram is as follows:

 

Its structural features:

  • All redis nodes are interconnected with each other (PING-PONG mechanism), and a binary protocol is used internally to optimize transmission speed and bandwidth.
  • A node is considered failed (FAIL) only when more than half of the nodes in the cluster detect the failure.
  • The client is directly connected to the redis node, without the need for an intermediate proxy layer. The client does not need to connect to all nodes in the cluster, just connect to any available node in the cluster.
  • redis-cluster maps all physical nodes onto slots [0-16383] (not necessarily evenly distributed); the cluster is responsible for maintaining the node <-> slot <-> value mapping.
  • The Redis cluster is pre-divided into 16384 buckets. When a key-value needs to be placed in the Redis cluster, the bucket in which a key is placed is determined based on the value of CRC16(key) mod 16384.

The redis cluster is designed to relieve the pressure on a single node or on a simple master-slave setup. There is no data synchronization between master nodes; synchronization only happens between a master and its slaves. The 16384 hash slots are distributed roughly evenly across however many master nodes there are. When data is written to redis, the hash slot is calculated from the key, which determines which master node the data goes to, and that master's slave nodes then replicate it automatically. The client only needs to connect to any one master node; redirections between master nodes happen internally. When data is read, each node automatically redirects the request to the master node that holds the data.

1) Redis cluster node allocation
Assume that there are three master nodes: A, B, C. They can be three ports on one machine, or they can be three different servers.
Then, if the 16384 slots are allocated using the hash slot method, the slot ranges covered by the three nodes might be:
Node A covers 0-5460;
Node B covers 5461-10922;
Node C covers 10923-16383.

Getting data:
If a value is stored, the redis cluster hash slot algorithm is applied: CRC16('key') % 16384 = 6782, so this key is stored on node B. Similarly, when I connect to any node (A, B or C) and want to get the key 'key', the same algorithm is used and the request is internally redirected to node B to fetch the data.

Adding a new master node:
When a new node D is added, the redis cluster approach is to take part of the slot range from the front of each existing node and move it to D (I will try this in a later practice). It roughly becomes:
Node A covers 1365-5460
Node B covers 6827-10922
Node C covers 12288-16383
Node D covers 0-1364, 5461-6826, 10923-12287

Deleting a node works the other way around: move its slots to the remaining nodes first, and delete the node once the migration is complete. A sketch with the redis-trib.rb commands follows.
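A rough sketch of what adding and removing a node looks like with the redis-trib.rb tool used later in this article (the new node address 172.16.51.179:7009 and the node ID are placeholders; reshard prompts interactively for the number of slots and the source/target nodes):

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-trib.rb add-node 172.16.51.179:7009 172.16.51.175:7000      # add node D, empty at first
[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-trib.rb reshard 172.16.51.175:7000                          # move some slots onto D
[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-trib.rb del-node 172.16.51.175:7000 <node-id>               # remove a node after its slots have been moved away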

2) Redis Cluster master-slave mode
In order to ensure high availability of the data, redis cluster adds a master-slave mode: one master node corresponds to one or more slave nodes. The master serves the data, and the slaves pull the data from the master as a backup;
when the master goes down, one of its slaves is elected to act as the master, so that the cluster does not go down.

In the example above, the cluster has three master nodes A, B and C. If these nodes have no slaves attached and B dies, the whole cluster becomes unusable: the slots on A and C can no longer be accessed either.
Therefore, when the cluster is created, a slave node must be added for each master. For example, if the cluster contains masters A, B, C and slaves A1, B1, C1, then even if B goes down the system
can keep working correctly: node B1 takes over from node B, i.e. the Redis cluster elects B1 as the new master and continues to provide service. When B is restarted, it becomes a slave of B1.

However, note that if nodes B and B1 go down at the same time, the Redis cluster will not be able to continue providing service.
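One way to exercise this failover by hand, rather than killing a process, is the CLUSTER FAILOVER command: issued against a slave, it promotes that slave and demotes its master (a sketch; the address should be one of the slave nodes of your cluster, here the 7004 instance created later in this article):

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.176 -p 7004 cluster failover
OK
# a moment later, "cluster nodes" shows 7004 as a master and its former master as a slave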

=========== Without further ado, let’s record how to build a redis cluster ==========
Since the smallest redis cluster requires 3 master nodes (i.e. a Redis Cluster needs at least 3 masters, which means at least 6 nodes are needed to build a cluster with replicas), and one machine can run multiple redis instances (commonly two machines are used, each starting 3 redis instances, giving three masters and three slaves), many people simply open 6 ports on a single server. The operations are similar and the configuration is simpler, but using multiple servers is closer to a production environment. [When the cluster is first created, record the master-slave relationship of each node (or specify the master-slave relationships explicitly at creation time). If one of the machines reboots, its instances need to be added back into the redis cluster: first turn the former slave of each node on that machine back into a master (run slaveof no one on the client), then add each instance on that machine back into the cluster against the new masters, and finally turn them back into slaves.]
redis cluster node information in this case:
redis01
172.16.51.175:7000
172.16.51.175:7001
172.16.51.175:7002
redis02
172.16.51.176:7003
172.16.51.176:7004
172.16.51.176:7005
redis03
172.16.51.178:7006
172.16.51.178:7007
172.16.51.178:7008

Let’s first talk about the deployment process of the redis01 node (the deployment process of the other two nodes is the same)


As a personal ops habit, I create a dedicated app account used for deploying applications. In this case all applications are deployed under /data, so the ownership of /data is set to the app user

[root@bl-redis01 ~]# useradd app

[root@bl-redis01 ~]# passwd app

[root@bl-redis01 ~]# chown -R app.app /data

  

Prerequisites

1) Install the GCC tool chain and related build dependencies, otherwise the build will fail

[root@bl-redis01 ~]# yum install -y gcc g++ make gcc-c++ kernel-devel automake autoconf libtool make wget tcl vim ruby rubygems unzip git

  

2) Update all packages to avoid incompatibility problems caused by outdated versions

[root@bl-redis01 ~]# yum -y update

  

3) Disable the firewall. The nodes need the relevant ports opened between them; the firewall is disabled here only for convenience - do not disable it in production (open the specific ports instead)

[root@bl-redis01 ~]# /etc/init.d/iptables stop

[root@bl-redis01 ~]# setenforce 0

[root@bl-redis01 ~]# vim /etc/sysconfig/selinux

......

SELINUX=disabled

......

  

Redis cluster deployment

4) Download, compile and install redis

[root@bl-redis01 ~]# su - app

[app@bl-redis01 ~]$ mkdir /data/software/

[app@bl-redis01 software]$ wget http://download.redis.io/releases/redis-4.0.1.tar.gz

[app@bl-redis01 software]$ tar -zvxf redis-4.0.1.tar.gz

[app@bl-redis01 software]$ mv redis-4.0.1 /data/

[app@bl-redis01 software]$ cd /data/redis-4.0.1/

[app@bl-redis01 redis-4.0.1]$ make

--------------------------------------------------------------------------------------

If a previous build failed and left residual files behind, clean up as follows:

[app@bl-redis01 redis-4.0.1]$ make distclean

--------------------------------------------------------------------------------------

  

5) Create the nodes

First, on the 172.16.51.175 machine (redis01), create a redis-cluster directory under /data/redis-4.0.1

[app@bl-redis01 redis-4.0.1]$ mkdir /data/redis-4.0.1/redis-cluster

  

Then, under the redis-cluster directory, create directories named 7000, 7001 and 7002

[app@bl-redis01 redis-cluster]$ mkdir 7000

[app@bl-redis01 redis-cluster]$ mkdir 7001

[app@bl-redis01 redis-cluster]$ mkdir 7002

  

Edit the redis.conf in each of these three directories

[app@bl-redis01 redis-4.0.1]$ cd redis-cluster/

[app@bl-redis01 redis-cluster]$ ll

total 12

drwxrwxr-x 2 app app 4096 Nov 16 17:38 7000

drwxrwxr-x 2 app app 4096 Nov 16 17:39 7001

drwxrwxr-x 2 app app 4096 Nov 16 17:39 7002

[app@bl-redis01 redis-cluster]$ cat 7000/redis.conf

port 7000

bind 172.16.51.175

daemonize yes

pidfile /var/run/redis_7000.pid

cluster-enabled yes

cluster-config-file nodes_7000.conf

cluster-node-timeout 10100

appendonly yes

[app@bl-redis01 redis-cluster]$ cat 7001/redis.conf

port 7001

bind 172.16.51.175

daemonize yes

pidfile /var/run/redis_7001.pid

cluster-enabled yes

cluster-config-file nodes_7001.conf

cluster-node-timeout 10100

appendonly yes

[app@bl-redis01 redis-cluster]$ cat 7002/redis.conf

port 7002

bind 172.16.51.175

daemonize yes

pidfile /var/run/redis_7002.pid

cluster-enabled yes

cluster-config-file nodes_7002.conf

cluster-node-timeout 10100

appendonly yes

  

----------------------------------------------------------------------------------------------------

Explanation of the redis.conf settings:

# port: 7000, 7001 and 7002 respectively

port 7000

  

# The default bind IP is 127.0.0.1; change it to an IP reachable by the other node machines, otherwise the corresponding ports cannot be reached and the cluster cannot be created

bind 172.16.51.175

  

# run redis in the background (daemonized)

daemonize yes

  

# pidfile, one per instance: 7000, 7001, 7002

pidfile /var/run/redis_7000.pid

  

# enable cluster mode (remove the leading # to uncomment this line)

cluster-enabled yes

  

# cluster configuration file, generated automatically on first startup; one per instance: 7000, 7001, 7002

cluster-config-file nodes_7000.conf

  

# request timeout, 15 seconds by default, adjustable

cluster-node-timeout 10100   

          

# enable the AOF log if needed; it records an entry for every write operation

appendonly yes

----------------------------------------------------------------------------------------------------

  

Then repeat the above three steps on the other two machines (172.16.51.176 and 172.16.51.178), only changing the directories to 7003, 7004, 7005 and 7006, 7007, 7008, and adjusting the corresponding configuration files by the same rule (i.e. change the port, pidfile, cluster-config-file and bind address in each redis.conf). A generation sketch follows.
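One way to generate those configuration files without editing each one by hand is to copy a template and substitute the port and IP (a sketch, assuming the 7000/redis.conf from redis01 has been copied to the machine as template.conf; adjust the IP substitution for each machine):

[app@bl-redis02 redis-cluster]$ for port in 7003 7004 7005; do
    mkdir -p $port
    sed -e "s/7000/$port/g" -e "s/172.16.51.175/172.16.51.176/g" template.conf > $port/redis.conf
done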

  

6) Start the cluster (start the 7000-7008 instances in order)

# run on the first node machine (starts 3 instances)

[app@bl-redis01 redis-cluster]$ for((i=0;i<=2;i++)); do /data/redis-4.0.1/src/redis-server /data/redis-4.0.1/redis-cluster/700$i/redis.conf; done

  

# run on the second node machine (starts 3 instances)

[app@bl-redis02 redis-cluster]$ for((i=3;i<=5;i++)); do /data/redis-4.0.1/src/redis-server /data/redis-4.0.1/redis-cluster/700$i/redis.conf; done

  

# run on the third node machine (starts 3 instances)

[app@bl-redis03 redis-cluster]$ for((i=6;i<=8;i++)); do /data/redis-4.0.1/src/redis-server /data/redis-4.0.1/redis-cluster/700$i/redis.conf; done

  

7) Check the services

Check that each Redis instance on every node has started

[app@bl-redis01 redis-cluster]$ ps -ef | grep redis 

app       2564  2405  0 20:13 pts/0    00:00:00 grep redis

app      15197     1  0 17:57 ?        00:00:05 /data/redis-4.0.1/src/redis-server 172.16.51.175:7000 [cluster]                  

app      15199     1  0 17:57 ?        00:00:05 /data/redis-4.0.1/src/redis-server 172.16.51.175:7001 [cluster]                  

app      15201     1  0 17:57 ?        00:00:05 /data/redis-4.0.1/src/redis-server 172.16.51.175:7002 [cluster]                  

[app@bl-redis01 redis-cluster]$ ps -ef | grep redis 

app       2566  2405  0 20:13 pts/0    00:00:00 grep redis

app      15197     1  0 17:57 ?        00:00:05 /data/redis-4.0.1/src/redis-server 172.16.51.175:7000 [cluster]                  

app      15199     1  0 17:57 ?        00:00:05 /data/redis-4.0.1/src/redis-server 172.16.51.175:7001 [cluster]                  

app      15201     1  0 17:57 ?        00:00:05 /data/redis-4.0.1/src/redis-server 172.16.51.175:7002 [cluster]

  

8) Install Ruby (switch to the root account for this step; the app account does not have sufficient permissions)

[root@bl-redis01 ~]# yum -y install ruby ruby-devel rubygems rpm-build

[root@bl-redis01 ~]# gem install redis

-----------------------------------------------------------------------------------------------------

Note: running the "gem install redis" step above on CentOS 6.x may fail - there are many pitfalls!

The ruby version installed by yum is 1.8.7 by default, which is too old; it must be upgraded to ruby 2.2 or newer, otherwise the install above will fail.

  

First install rvm (or download the certificate directly from https://pan.baidu.com/s/1slTyJ7n, access code: 7uan; after downloading and unpacking it, simply run "curl -L get.rvm.io | bash -s stable")

[root@bl-redis01 ~]# curl -L get.rvm.io | bash -s stable          // this may fail; if so, follow the prompt and run the next step

[root@bl-redis01 ~]# curl -sSL https://rvm.io/mpapis.asc | gpg2 --import -      // then run "curl -L get.rvm.io | bash -s stable" again

[root@bl-redis01 ~]# find / -name rvm.sh

/etc/profile.d/rvm.sh

[root@bl-redis01 ~]# source /etc/profile.d/rvm.sh

[root@bl-redis01 ~]# rvm requirements

  

Then upgrade ruby to 2.3

[root@bl-redis01 ~]# rvm install ruby 2.3.1

[root@bl-redis01 ~]# ruby -v

ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]

  

List all installed ruby versions

[root@bl-redis01 ~]# rvm list

  

Set the default version

[root@bl-redis01 ~]# rvm --default use 2.3.1

  

Update the gem download source

[root@bl-redis01 ~]# gem sources --add https://gems.ruby-china.org/ --remove https://rubygems.org

https://gems.ruby-china.org/ added to sources

source https://rubygems.org not present in cache

  

[root@bl-redis01 ~]# gem sources

*** CURRENT SOURCES ***

  

https://rubygems.org/

https://gems.ruby-china.org/

  

Now the install goes through without problems

[root@bl-redis01 src]# gem install redis

Successfully installed redis-4.0.1

Parsing documentation for redis-4.0.1

Done installing documentation for redis after 1 seconds

1 gem installed

-----------------------------------------------------------------------------------------------------

  

9) Create the cluster

Important: run this on any one machine only - do not run it on every machine; once is enough!

Redis officially provides the redis-trib.rb tool for this; it is located in the src directory of the unpacked source tree

[root@bl-redis01 ~]# su - app

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-trib.rb create --replicas 1 172.16.51.175:7000 172.16.51.175:7001 172.16.51.175:7002 172.16.51.176:7003 172.16.51.176:7004 172.16.51.176:7005 172.16.51.178:7006 172.16.51.178:7007 172.16.51.178:7008

  

The following output appears. It shows that in this case the 9 instances started on the three servers are configured as 4 masters and 5 slaves: one master has two slaves, and the other 3 masters each have one slave.

>>> Creating cluster

>>> Performing hash slots allocation on 9 nodes...

Using 4 masters:

172.16.51.175:7000

172.16.51.176:7003

172.16.51.178:7006

172.16.51.175:7001

Adding replica 172.16.51.176:7004 to 172.16.51.175:7000

Adding replica 172.16.51.178:7007 to 172.16.51.176:7003

Adding replica 172.16.51.175:7002 to 172.16.51.178:7006

Adding replica 172.16.51.176:7005 to 172.16.51.175:7001

Adding replica 172.16.51.178:7008 to 172.16.51.175:7000

M: 7c622ac191edd40dd61d9b79b27f6f69d02a5bbf 172.16.51.175:7000

   slots:0-4095 (4096 slots) master

M: 44c81c15b01d992cb9ede4ad35477ec853d70723 172.16.51.175:7001

   slots:12288-16383 (4096 slots) master

S: 38f03c27af39723e1828eb62d1775c4b6e2c3638 172.16.51.175:7002

   replicates f1abb62a8c9b448ea14db421bdfe3f1d8075189c

M: 987965baf505a9aa43e50e46c76189c51a8f17ec 172.16.51.176:7003

   slots:4096-8191 (4096 slots) master

S: 6555292fed9c5d52fcf5b983c441aff6f96923d5 172.16.51.176:7004

   replicates 7c622ac191edd40dd61d9b79b27f6f69d02a5bbf

S: 2b5ba254a0405d4efde4c459867b15176f79244a 172.16.51.176:7005

   replicates 44c81c15b01d992cb9ede4ad35477ec853d70723

M: f1abb62a8c9b448ea14db421bdfe3f1d8075189c 172.16.51.178:7006

   slots:8192-12287 (4096 slots) master

S: eb4067373d36d8a8df07951f92794e67a6aac022 172.16.51.178:7007

   replicates 987965baf505a9aa43e50e46c76189c51a8f17ec

S: 2919e041dd3d1daf176d6800dcd262f4e727f366 172.16.51.178:7008

   replicates 7c622ac191edd40dd61d9b79b27f6f69d02a5bbf

Can I set the above configuration? (type 'yes' to accept): yes

  

Type yes

>>> Nodes configuration updated

>>> Assign a different config epoch to each node

>>> Sending CLUSTER MEET messages to join the cluster

Waiting for the cluster to join.........

>>> Performing Cluster Check (using node 172.16.51.175:7000)

M: 7c622ac191edd40dd61d9b79b27f6f69d02a5bbf 172.16.51.175:7000

   slots:0-4095 (4096 slots) master

   2 additional replica(s)

S: 6555292fed9c5d52fcf5b983c441aff6f96923d5 172.16.51.176:7004

   slots: (0 slots) slave

   replicates 7c622ac191edd40dd61d9b79b27f6f69d02a5bbf

M: 44c81c15b01d992cb9ede4ad35477ec853d70723 172.16.51.175:7001

   slots:12288-16383 (4096 slots) master

   1 additional replica(s)

S: 2919e041dd3d1daf176d6800dcd262f4e727f366 172.16.51.178:7008

   slots: (0 slots) slave

   replicates 7c622ac191edd40dd61d9b79b27f6f69d02a5bbf

M: f1abb62a8c9b448ea14db421bdfe3f1d8075189c 172.16.51.178:7006

   slots:8192-12287 (4096 slots) master

   1 additional replica(s)

S: eb4067373d36d8a8df07951f92794e67a6aac022 172.16.51.178:7007

   slots: (0 slots) slave

   replicates 987965baf505a9aa43e50e46c76189c51a8f17ec

S: 38f03c27af39723e1828eb62d1775c4b6e2c3638 172.16.51.175:7002

   slots: (0 slots) slave

   replicates f1abb62a8c9b448ea14db421bdfe3f1d8075189c

S: 2b5ba254a0405d4efde4c459867b15176f79244a 172.16.51.176:7005

   slots: (0 slots) slave

   replicates 44c81c15b01d992cb9ede4ad35477ec853d70723

M: 987965baf505a9aa43e50e46c76189c51a8f17ec 172.16.51.176:7003

   slots:4096-8191 (4096 slots) master

   1 additional replica(s)

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

  

10) Shut down the cluster

Recommended approach:

[app@bl-redis01 ~]$ pkill redis

[app@bl-redis02 ~]$ pkill redis

[app@bl-redis03 ~]$ pkill redis

  

Or loop over the instances and shut them down one by one

[app@bl-redis01 ~]$ for((i=0;i<=2;i++)); do /data/redis-4.0.1/src/redis-cli -c -h 172.16.51.175 -p 700$i shutdown; done

[app@bl-redis02 ~]$ for((i=3;i<=5;i++)); do /data/redis-4.0.1/src/redis-cli -c -h 172.16.51.176 -p 700$i shutdown; done

[app@bl-redis03 ~]$ for((i=6;i<=8;i++)); do /data/redis-4.0.1/src/redis-cli -c -h 172.16.51.178 -p 700$i shutdown; done

  

11) Verify the cluster

Connect to the cluster for a test

The -c option connects in cluster mode; because bind in redis.conf was changed to the machine's IP address, the -h option cannot be omitted, and -p is the port number

First, set a key against the 7000 node on the 172.16.51.175 machine

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.175 -c -p 7000

172.16.51.175:7000> set name www.ymq.io

-> Redirected to slot [5798] located at 172.16.51.176:7003

OK

172.16.51.176:7003> get name

"www.ymq.io"

172.16.51.176:7003>

  

From the output above we can see that after the set name command redis redirected to the 7003 node on 172.16.51.176

  

Then get the key from the 7008 node on the 172.16.51.178 machine

[app@bl-redis03 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.178 -c -p 7008

172.16.51.178:7008> get name

-> Redirected to slot [5798] located at 172.16.51.176:7003

"www.ymq.io"

172.16.51.176:7003>

  

The get name command was redirected to the 7003 node on 172.16.51.176.

  

If you see this behaviour, the redis cluster is up and usable!

  

12) Check the cluster state (the command below shows that this case ends up with 4 masters and 5 slaves; by default the 4 master nodes are spread across the three machines so that every machine has a master. Note: masters and slaves can also be specified explicitly when creating the cluster; here the defaults were used)

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.175 -c -p 7000

172.16.51.175:7000>

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-trib.rb check 172.16.51.175:7000

>>> Performing Cluster Check (using node 172.16.51.175:7000)

M: 5a43e668f53ff64da68be31afe6dc6ea1f3c14c5 172.16.51.175:7000

   slots:0-4095 (4096 slots) master

   2 additional replica(s)

M: c64b0839e0199f73c5c192cc8c90f12c999f79b2 172.16.51.175:7001

   slots:12288-16383 (4096 slots) master

   1 additional replica(s)

S: 81347f01cf38d8f0faef1ad02676ebb4cffbec9e 172.16.51.176:7005

   slots: (0 slots) slave

   replicates c64b0839e0199f73c5c192cc8c90f12c999f79b2

M: da5dde3f2f02c232784bf3163f5f584b8cf046f2 172.16.51.178:7006

   slots:8192-12287 (4096 slots) master

   1 additional replica(s)

M: b217ab2a6c05497af3b2a859c1bb6b3fae5e0d92 172.16.51.176:7003

   slots:4096-8191 (4096 slots) master

   1 additional replica(s)

S: 0420c49fbc9f1fe16066d189265cca2f5e71c86e 172.16.51.178:7007

   slots: (0 slots) slave

   replicates b217ab2a6c05497af3b2a859c1bb6b3fae5e0d92

S: 5ad89453fb36e50ecc4560de6b4acce1dbbb78b3 172.16.51.176:7004

   slots: (0 slots) slave

   replicates 5a43e668f53ff64da68be31afe6dc6ea1f3c14c5

S: bbd1f279b99b95cf00ecbfab22b6b8dd5eb05989 172.16.51.178:7008

   slots: (0 slots) slave

   replicates 5a43e668f53ff64da68be31afe6dc6ea1f3c14c5

S: e95407b83bfeb30e3cc537161eadc372d6aa1fa2 172.16.51.175:7002

   slots: (0 slots) slave

   replicates da5dde3f2f02c232784bf3163f5f584b8cf046f2

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

  

13) List the cluster nodes

List all nodes currently known to the cluster, together with their related information

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.175 -c -p 7000

172.16.51.175:7000> cluster nodes

5a43e668f53ff64da68be31afe6dc6ea1f3c14c5 172.16.51.175:7000@17000 myself,master - 0 1510836027000 1 connected 0-4095

c64b0839e0199f73c5c192cc8c90f12c999f79b2 172.16.51.175:7001@17001 master - 0 1510836030068 2 connected 12288-16383

81347f01cf38d8f0faef1ad02676ebb4cffbec9e 172.16.51.176:7005@17005 slave c64b0839e0199f73c5c192cc8c90f12c999f79b2 0 1510836031000 6 connected

da5dde3f2f02c232784bf3163f5f584b8cf046f2 172.16.51.178:7006@17006 master - 0 1510836031000 7 connected 8192-12287

b217ab2a6c05497af3b2a859c1bb6b3fae5e0d92 172.16.51.176:7003@17003 master - 0 1510836030000 4 connected 4096-8191

0420c49fbc9f1fe16066d189265cca2f5e71c86e 172.16.51.178:7007@17007 slave b217ab2a6c05497af3b2a859c1bb6b3fae5e0d92 0 1510836029067 8 connected

5ad89453fb36e50ecc4560de6b4acce1dbbb78b3 172.16.51.176:7004@17004 slave 5a43e668f53ff64da68be31afe6dc6ea1f3c14c5 0 1510836032672 5 connected

bbd1f279b99b95cf00ecbfab22b6b8dd5eb05989 172.16.51.178:7008@17008 slave 5a43e668f53ff64da68be31afe6dc6ea1f3c14c5 0 1510836031000 9 connected

e95407b83bfeb30e3cc537161eadc372d6aa1fa2 172.16.51.175:7002@17002 slave da5dde3f2f02c232784bf3163f5f584b8cf046f2 0 1510836031672 7 connected

  

14) Print the cluster info

[app@bl-redis01 ~]$ /data/redis-4.0.1/src/redis-cli -h 172.16.51.175 -c -p 7000

172.16.51.175:7000> cluster info

cluster_state:ok

cluster_slots_assigned:16384

cluster_slots_ok:16384

cluster_slots_pfail:0

cluster_slots_fail:0

cluster_known_nodes:9

cluster_size:4

cluster_current_epoch:9

cluster_my_epoch:1

cluster_stats_messages_ping_sent:8627

cluster_stats_messages_pong_sent:8581

cluster_stats_messages_sent:17208

cluster_stats_messages_ping_received:8573

cluster_stats_messages_pong_received:8626

cluster_stats_messages_meet_received:8

cluster_stats_messages_received:17207

 

------------------------------------------------------------------------------------------------

[root@bl-redis01 src]# pwd

/data/redis-4.0.1/src

[root@bl-redis01 src]# ./redis-trib.rb help

Usage: redis-trib <command> <options> <arguments ...>

 

  create          host1:port1 ... hostN:portN

                  --replicas <arg>

  check           host:port

  info            host:port

  fix             host:port

                  --timeout <arg>

  reshard         host:port

                  --from <arg>

                  --to <arg>

                  --slots <arg>

                  --yes

                  --timeout <arg>

                  --pipeline <arg>

  rebalance       host:port

                  --weight <arg>

                  --auto-weights

                  --use-empty-masters

                  --timeout <arg>

                  --simulate

                  --pipeline <arg>

                  --threshold <arg>

  add-node        new_host:new_port existing_host:existing_port

                  --slave

                  --master-id <arg>

  del-node        host:port node_id

  set-timeout     host:port milliseconds

  call            host:port command arg arg .. arg

  import          host:port

                  --from <arg>

                  --copy

                  --replace

  help            (show this help)

 

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.

 

The term slot has already appeared several times above; a brief explanation:

redis-cluster divides the whole cluster's storage space into 16384 slots. When the 9 instances are split into 3 masters and 6 slaves, the cluster effectively contains 3 HA groups of nodes, and the 3 masters share all the slots evenly. Whenever a key in the cluster is operated on (for example a cache read or write), redis runs CRC16 over the key, takes the result modulo 16384, and the remainder determines which slot - and therefore which master node - the entry lives on. When the cluster is scaled out or nodes are removed, only the slots need to be reassigned (i.e. some slots are moved from some nodes to other nodes).

 

---------------------------------------------------------------------------------------------------------

To connect to the above redis cluster from application code, configure the node list as follows:

spring.redis.cluster.nodes = 172.16.51.175:7000,172.16.51.175:7001,172.16.51.175:7002,172.16.51.176:7003,172.16.51.176:7004,172.16.51.176:7005,172.16.51.178:7006,172.16.51.178:7007,172.16.51.178:7008
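Before pointing an application at this node list, a quick sanity check is to confirm that every configured node answers PING (a minimal sketch using the nodes from this case):

[app@bl-redis01 ~]$ for node in 172.16.51.175:7000 172.16.51.175:7001 172.16.51.175:7002 172.16.51.176:7003 172.16.51.176:7004 172.16.51.176:7005 172.16.51.178:7006 172.16.51.178:7007 172.16.51.178:7008; do
    /data/redis-4.0.1/src/redis-cli -h ${node%:*} -p ${node#*:} ping
done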

==========================Cluster mode configuration======================


The following form of the command apparently does not let you assign masters and slaves the way you intend:

redis-trib.rb create --replicas 1 192.168.1.101:6381 192.168.1.102:6382  192.168.1.103:6383 192.168.1.102:6381 192.168.1.103:6382   192.168.1.101:6383

  

The idea instead is to add the masters first and then add the slaves

Add the masters

redis-trib.rb create  192.168.1.101:6381 192.168.1.102:6382  192.168.1.103:6383

  

Add the slaves

Add 6381 on 192.168.1.102 as a slave of 6381 on 192.168.1.101

redis-trib.rb add-node --slave 192.168.1.102:6381   192.168.1.101:6381

  

redis-trib.rb add-node --slave 192.168.1.103:6382   192.168.1.102:6382

redis-trib.rb add-node --slave 192.168.1.101:6383   192.168.1.103:6383

  

Verify

redis-trib.rb check 192.168.1.101:6381

redis-trib.rb check 192.168.1.102:6382

redis-trib.rb check 192.168.1.103:6383

==========================Several common redis cluster problems======================


1) Problem one

A redis cluster node went down (or its redis service was restarted), causing part of the slot shards to be lost; the following error appears when checking the cluster state with check:

[root@slave2 redis]# redis-trib.rb check 192.168.1.100:7000

........

[ERR] Not all 16384 slots are covered by nodes.

     

Cause analysis:

This is usually because a master node was removed without first migrating away the slots it owned, so the total number of covered slots no longer reaches 16384 - in other words the slot distribution is incorrect.

So when deleting a node, always check first whether it is a master node.

     

Solution:

The official recommendation is to repair the cluster with redis-trib.rb fix. Here cluster nodes shows that the 7001 node was removed; it can be repaired as follows:

[root@slave2 redis]# redis-trib.rb fix 192.168.1.100:7000

     

After the fix completes, run check again to confirm everything is correct (checking against another node)

[root@slave2 redis]# redis-trib.rb check 192.168.1.101:7002

     

Any node in the cluster can be passed in; all related nodes are checked automatically.

Inspect the output to confirm that every master now owns slots.

     

If the slots are unevenly distributed, they can be reassigned as follows:

[root@slave2 redis]# redis-trib.rb reshard 192.168.1.100:7000

  

Special note:

While restarted nodes are still rejoining the cluster, checking the cluster state may report "[ERR] Not all 16384 slots are covered by nodes.".

Just wait a little while; once the restarted nodes have fully rejoined the cluster, the error disappears.

======================================================

Problem two:

When adding a node to a redis cluster, the following error complains that the new node is not empty:

[root@slave2 redis]# redis-trib.rb add-node --slave 192.168.1.103:7004  192.168.1.102:7002

.......

[ERR] Node 192.168.1.103:7004 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or

contains some key in database 0.

     

Solution (a command sketch follows this list):

1) Delete all local aof, rdb and other data files under the redis directory on the 192.168.1.103 node

2) Also delete the new node's cluster configuration file, i.e. the file specified by cluster-config-file in redis.conf

3) Log in with "redis-cli -c -h 192.168.1.103 -p 7004" and run "flushdb" to clear the instance

4) Restart the redis service

5) Finally, run the add-node operation again
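A sketch of those steps run on the 192.168.1.103 node (the working directory, file names and prompt are assumptions; they depend on where that instance actually keeps its dump.rdb, appendonly.aof and the file named by cluster-config-file):

[root@node103 ~]# cd /path/to/7004/                                              # placeholder: the 7004 instance's data directory
[root@node103 7004]# rm -f appendonly.aof dump.rdb                               # step 1: remove the local data files
[root@node103 7004]# rm -f nodes_7004.conf                                       # step 2: remove the cluster-config-file
[root@node103 7004]# redis-cli -c -h 192.168.1.103 -p 7004 flushdb               # step 3: clear the running instance
[root@node103 7004]# redis-cli -h 192.168.1.103 -p 7004 shutdown nosave          # step 4: stop it ...
[root@node103 7004]# redis-server /path/to/7004/redis.conf                       # ... and start it again
[root@node103 7004]# redis-trib.rb add-node --slave 192.168.1.103:7004 192.168.1.102:7002   # step 5: add the node again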

     

======================================================

Tips:

-  If a whole master-slave group in the cluster goes down at the same time, the cluster service goes down as well; after the redis instances of that group recover, the data of the slots held by that group is lost.

-  If half or more of the master nodes go down, the cluster service also goes down; once those master services are restarted they automatically rejoin the cluster, and after waiting a while the cluster recovers and no data is lost.

-  If a master node is shut down, wait a short while and its slave node becomes the new master; the cluster keeps serving, and the data follows the slots to the new master node.

-  After a master node goes down, restart its redis service (be sure to start it in the directory containing the original aof and nodes*.conf files); it will automatically rejoin the cluster and become a slave node.

-  A newly added master node owns 0 slots by default; a master without slots is never selected for reads or writes! So a reshard must be run for the newly added master to assign it slots (see the sketch below).

-  Slave nodes hold 0 slots and never store data themselves; data is only stored on the master nodes, which are the only nodes with slot assignments.
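A sketch of assigning slots to such a newly added, still empty master with redis-trib.rb, using the non-interactive reshard options listed in its help output earlier (the addresses, slot count and node ID below are placeholders):

[root@redis-node01 ~]# /data/redis-4.0.6/src/redis-trib.rb reshard --from all --to <new-master-node-id> --slots 1024 --yes 192.168.1.100:7000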

 

======================================================

Note: the master and slave of the same group must not go down at the same time (or within a short interval of each other), otherwise the data in that group's slots is lost.

For example, with one master and one slave: when the master goes down, the data is preserved on the slave, and after a short while the slave is elected as the new master.

When the old master is restarted it rejoins the cluster, automatically becomes a slave of its former slave (now the new master), and synchronizes the data from it.

If this new master then goes down, the data is again preserved on the new slave, which after a while is elected master again, and so on...

If a master and its slave go down at the same time, or the second one goes down before the first has rejoined the cluster, then the data in that group's slots is definitely gone.

So in general, restart one node at a time and wait for it to rejoin the cluster before restarting its corresponding slave or master.

When redis is used purely as a cache, losing data is usually transparent to the business: it does not affect it, and the lost entries are simply written again. But when redis is used as a storage service (i.e. as a primary data store), data loss has a big impact on the business.

In typical business scenarios, though, the primary store is mysql, oracle or mongodb.

 

======================================================

To restore the cluster state after the redis services on the cluster nodes have been restarted, the correct approach is:

1) Restart each redis service in the directory where its original appendonly.aof, dump.rdb and nodes_*.conf files live. This ensures redis picks up its previous data files on startup.

(You can use the find command to locate these files and then start the redis service from that path; a sketch follows.)

2) Once the redis service on every node is up, simply check the redis cluster status to confirm the cluster has recovered.
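A sketch of locating the original data directory and restarting from it (the file name follows this article's configuration; use whatever directory find actually reports):

[root@redis-node01 ~]# find / -name "nodes_7000.conf" 2>/dev/null       # locate the directory holding the old data files
[root@redis-node01 ~]# cd <directory reported by find>
[root@redis-node01 ~]# /data/redis-4.0.1/src/redis-server /data/redis-4.0.1/redis-cluster/7000/redis.conf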

       

Note:

Be sure to start redis in the directory containing the original data files. If it is started from the wrong path, the data files it reads are not the previous ones, and the cluster becomes very difficult to recover; in that case the old data files have to be deleted and the cluster recreated.

    

After the redis services on the cluster nodes are restarted, check the cluster state; if the following warnings appear, handle them as follows:

[root@redis-node01 redis-cluster]# /data/redis-4.0.6/src/redis-trib.rb check 192.168.1.100:7000

...........

...........

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

[WARNING] Node 192.168.1.100:7000 has slots in importing state (5798,11479).

[WARNING] Node 192.168.1.100:7001 has slots in importing state (1734,5798).

[WARNING] Node 192.168.1.101:7002 has slots in importing state (11479).

[WARNING] The following slots are open: 5798,11479,1734

>>> Check slots coverage...

[OK] All 16384 slots covered.

    

Solution:

Make sure to log in to exactly the node and port mentioned in each warning before running the commands.

Run "cluster setslot <slot> stable", which cancels the pending import or migrate state of that slot.

After running it, the data that was in transit for those slots is gone.

[root@redis-node01 redis-cluster]# /data/redis-4.0.6/src/redis-cli -h 192.168.1.100 -c -p 7000

192.168.1.100:7000> cluster setslot 5798 stable

OK

192.168.1.100:7000> cluster setslot 11479 stable

OK

    

[root@redis-node01 redis-cluster]# /data/redis-4.0.6/src/redis-cli -h 192.168.1.100 -c -p 7001

192.168.1.100:7001> cluster setslot 1734 stable

OK

192.168.1.100:7001> cluster setslot 5798 stable

OK

    

[root@redis-node01 redis-cluster]# /data/redis-4.0.6/src/redis-cli -h 192.168.1.101 -c -p 7002

192.168.1.101:7002> cluster setslot 11479 stable

OK

    

Check the redis cluster state again and everything is back to normal!

[root@redis-node01 redis-cluster]# /data/redis-4.0.6/src/redis-trib.rb check 192.168.1.100:7000

>>> Performing Cluster Check (using node 192.168.1.100:7000)

M: 39737de1c48fdbaec304f0d11294286593553365 192.168.1.100:7000

   slots:0-5460 (5461 slots) master

   1 additional replica(s)

S: 61a0cc84069ced156b6e1459bb71cab225182385 192.168.1.101:7003

   slots: (0 slots) slave

   replicates 39737de1c48fdbaec304f0d11294286593553365

S: 75de8c46eda03aee1afdd39de3ffd39cc42a5eec 172.16.60.209:7005

   slots: (0 slots) slave

   replicates 70a24c750995e2f316ee15320acb73441254a7aa

M: 70a24c750995e2f316ee15320acb73441254a7aa 192.168.1.101:7002

   slots:5461-10922 (5462 slots) master

   1 additional replica(s)

S: 5272bd14768e3e32e165284c272525a7da47b47e 192.168.1.100:7001

   slots: (0 slots) slave

   replicates c1b71d52b0d804f499c9166c0c1f4e3c35077ee9

M: c1b71d52b0d804f499c9166c0c1f4e3c35077ee9 172.16.60.209:7004

   slots:10923-16383 (5461 slots) master

   1 additional replica(s)

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.
