Redis Distributed Solutions: Replication, Sentinel and Cluster

1 Why Redis cluster

1.1 Why do we need a cluster?

1.1.1 Performance

Redis on its own already delivers very high QPS, but under extremely high concurrency the performance of a single instance can still suffer. In that case we want more Redis servers to share the work.

1.1.2 Scalability

The second reason is storage. Because Redis keeps all data in memory, a large data set quickly runs into hardware limits, and upgrading the hardware has a poor cost-benefit ratio. So we need a way to scale out horizontally.

1.1.3 Availability

The third reason is availability and security. If there is only one Redis service and it goes down, none of the clients can access it, which severely affects the business. Worse, if the hardware fails and the data of that single instance cannot be recovered, the impact is disastrous.

Availability, data security and performance can all be improved by running more Redis services: one master node (master) and possibly several slave nodes (slave). Master and slaves synchronize data so that they hold identical copies. If the master node fails, a slave node is promoted to master and clients access the new master.

2 Redis Master-Slave Replication

2.1 Master-slave replication configuration

Take one master with multiple slaves as an example, with 192.168.8.203 as the master node. Add the following line to the redis.conf of each slave node:

slaveof 192.168.8.203 6379

When a master switch happens, this configuration is rewritten as:

# Generated by CONFIG REWRITE
replicaof 192.168.8.203 6379

Alternatively, specify the master node as a parameter when starting the service:

./redis-server --slaveof 192.168.8.203 6379

Or execute slaveof xx xx directly in the client to turn that Redis instance into a slave node.
After starting, check the replication status:

redis> info replication

A slave node cannot be written to (it is read-only); it can only synchronize data from the master. get succeeds, set fails:

127.0.0.1:6379> set sunda 666
(error) READONLY You can't write against a read only replica.

After a write to the master node, the slave automatically synchronizes the data from the master.
To disconnect replication:

redis> slaveof no one

After that, the slave becomes its own master and no longer replicates data.
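
To see replication from application code, here is a minimal Jedis sketch (not part of the original article; the 192.168.8.203/204 addresses follow the layout used in this article, adjust them to your environment). A write to the master shows up on the slave, and a write to the slave is rejected with a READONLY error.

import redis.clients.jedis.Jedis;

public class ReplicationCheck {
    public static void main(String[] args) throws InterruptedException {
        Jedis master = new Jedis("192.168.8.203", 6379);   // master node
        Jedis slave = new Jedis("192.168.8.204", 6379);    // read-only slave node

        master.set("sunda", "666");                 // write goes to the master
        Thread.sleep(100);                          // give asynchronous replication a moment
        System.out.println(slave.get("sunda"));     // read from the slave: 666

        try {
            slave.set("sunda", "777");              // writing to a read-only replica fails
        } catch (Exception e) {
            System.out.println(e.getMessage());     // READONLY You can't write against a read only replica.
        }

        master.close();
        slave.close();
    }
}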

2.2 How Master-Slave Replication Works

2.2.1 Connection phase

1. When a slave node starts (or executes the slaveof command), it saves the master node's information (host and port) locally.
2. The slave runs a scheduled task, replicationCron (see replication.c in the source), which checks every second whether there is a new master to connect to and replicate from. If there is, it establishes a socket connection to that master. Once the connection succeeds, a dedicated file-event handler is registered on the socket to handle all subsequent replication work, such as receiving the RDB file and receiving the write commands propagated by the master. Having become a client of the master, the slave then sends a ping request to it.

2.2.2 Data synchronization phase

3. For the first synchronization the master performs a full copy: it generates a local RDB snapshot with the bgsave command and sends the RDB file to the slave (if this times out, the slave reconnects; you can increase repl-timeout). The slave first clears its old data and then loads the data from the RDB file.

Question: while the RDB file is being generated, how does the master handle new write commands?

From the moment RDB generation starts, the master caches all new write commands in memory. After the slave has loaded the RDB file, the master sends these buffered write commands to the slave.

2.2.3 Command propagation stage

4. The master node then continuously propagates subsequent write commands to the slave nodes asynchronously.

Replication lag is unavoidable; it can only be mitigated by optimizing the network.

repl-disable-tcp-nodelay no

When set to yes, TCP packets are merged to save bandwidth, but the send frequency drops, latency on the slave increases, and consistency gets worse; the exact send frequency depends on the Linux kernel configuration and is 40ms by default. When set to no, TCP sends data to the slave immediately: bandwidth usage goes up, but the delay is smaller.

In general, set it to yes only when the application tolerates inconsistent Redis data well and the network between master and slave is poor; in most cases keep the default no.

Question: if a slave is disconnected from the master for a while, does it have to perform a full resynchronization afterwards? If incremental replication is possible, how does it know where the last replication stopped?

By recording the replication offset, master_repl_offset:

redis> info replication

(screenshot: info replication output showing master_repl_offset)
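
As a hedged illustration, the same offsets can be read from application code: Jedis exposes INFO through jedis.info("replication"). The sketch below simply prints the offset-related fields, assuming the master from this article at 192.168.8.203:6379.

import redis.clients.jedis.Jedis;

public class ReplOffsetCheck {
    public static void main(String[] args) {
        try (Jedis master = new Jedis("192.168.8.203", 6379)) {
            String info = master.info("replication");          // same output as: redis> info replication
            for (String line : info.split("\r\n")) {
                // print the master offset and the per-slave offsets used for incremental replication
                if (line.startsWith("master_repl_offset") || line.startsWith("slave")) {
                    System.out.println(line);
                }
            }
        }
    }
}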

2.3 Shortcomings of master-slave replication

Master-slave replication solves data backup and improves performance (through read-write splitting), but it still has shortcomings:

1. When the RDB file is very large, synchronization is very time-consuming.
2. With one master and one or more slaves, if the master goes down the service cannot accept writes; the single point of failure is not solved. And if we switch a slave over to master manually each time, it takes time and the service is unavailable for a while.

3 Ensuring Availability with Sentinel

3.1 Sentinel Principle

How can the master-slave switch happen automatically? A first idea:
Run a monitoring server that watches the state of all Redis nodes. If the master does not send a heartbeat to the monitoring server within a certain time, mark the master as offline and promote one of the slaves to master. Applications then ask the monitoring server for the current master's address every time.

The question is: what if the monitoring server itself fails? Then we cannot obtain the master's address and the application cannot reach it.
So we add another monitoring server to monitor the monitoring server... and another one to monitor that, which looks like an endless loop. How do we break it? Put that question aside for a moment.
Redis Sentinel follows exactly this idea: it guarantees the availability of the service by running monitoring servers.

Official website:
https://redis.io/topics/sentinel

Since Redis 2.8, a stable version of Sentinel has been provided to solve the high-availability problem. A Sentinel is a Redis instance running in a special state.
We start one or more Sentinel services (via src/redis-sentinel); each is essentially just a Redis server running in a special mode. Sentinel monitors the Redis nodes and obtains information about the master, the slaves and so on through the info command.
(figure: Sentinel monitoring the Redis master and slave nodes)

To keep the monitoring servers themselves available, the Sentinels are deployed as a cluster. The Sentinels monitor all the Redis nodes, and they also monitor each other.
Note: there is no master-slave distinction among Sentinels; only the Redis service nodes are divided into master and slave.
Concepts: master, slave (a Redis group), sentinel, sentinel set.

3.1.1 Service offline

By default, Sentinel sends a PING command to every Redis service node once per second. If a node does not return a valid reply within down-after-milliseconds, Sentinel marks it as offline (subjective down).

# sentinel.conf
sentinel down-after-milliseconds <master-name> <milliseconds>

The Sentinel node then keeps asking the other Sentinel nodes to confirm that the node is offline. If a majority of Sentinels consider the master offline, the master is confirmed as down (objective down), and a new master has to be elected.

3.1.2 Failover

If the master is marked as objectively down, the failover process starts.
Since there are several Sentinel nodes, which one performs the failover?
The first step of failover is to choose a Leader among the Sentinels in the cluster; the Leader then carries out the failover. Sentinel elects its Leader using the Raft algorithm.

3.1.2.1 Raft algorithm

In a distributed storage system, availability is usually improved by keeping multiple copies of the data, which immediately raises the problem of consistency between the nodes. Raft's goal is to bring all nodes to agreement through replication. But with so many nodes, whose data should be taken as authoritative? A Leader has to be elected first.

There are roughly two steps: leader election and data (log) replication.
Raft is a consensus algorithm. Consensus algorithms are needed, for example, by cryptocurrencies such as Bitcoin; Consul, one of the Spring Cloud registry options, also uses the Raft protocol.
Raft's core ideas: first come, first served, and the majority wins.
Raft algorithm demo:

http://thesecretlivesofdata.com/raft/
Summary:
Sentinel's use of Raft differs slightly from the Raft paper:
1. The election is triggered by the master being objectively down, not by an election timeout expiring.
2. After becoming Leader, a Sentinel does not broadcast a message announcing itself as Leader to the other Sentinels. The other Sentinels simply wait for the Leader to pick a new master from the slaves; once they detect the new master working, they clear the objective-down flag and therefore never need to enter the failover process themselves.

3.1.2.2 Failover

Question: how does one of the original slave nodes become the master node?
1. After the Sentinel Leader has been elected, it sends slaveof no one to one of the slave nodes, turning it into an independent (master) node.
2. It then sends slaveof xxxx xxxx (the address of the new master) to the other nodes so that they become slaves of the new master; failover is then complete.
Question: with so many slave nodes, which one is chosen to become the master?
Roughly four factors influence the election: how long the slave has been disconnected, its priority, how much data it has replicated, and its run id.
If a slave has been disconnected from the Sentinel for longer than a certain threshold, it loses the right to be elected outright. Among those still eligible, the one with the highest priority wins; the priority is set in the configuration file (replica-priority 100), and a smaller value means a higher priority.
If the priorities are equal, the slave that has replicated the most data from the master (the largest replication offset) is chosen; if the offsets are also equal, the one with the smallest run id is selected (an illustrative sketch follows below).
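
The following Java sketch only illustrates the selection rules just described; it is not Sentinel's actual source code, and the class and field names are invented for the example. Slaves disconnected for too long are filtered out first, then the rest are ranked by priority, replication offset and run id.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SlaveElectionSketch {
    static class Slave {
        final String runId;          // smaller run id wins the final tie-break
        final int priority;          // replica-priority, smaller value = higher priority
        final long replOffset;       // amount of master data replicated, larger wins
        final long disconnectedMs;   // how long the slave has been disconnected

        Slave(String runId, int priority, long replOffset, long disconnectedMs) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
            this.disconnectedMs = disconnectedMs;
        }
    }

    static Slave pickNewMaster(List<Slave> slaves, long maxDisconnectedMs) {
        List<Slave> eligible = new ArrayList<>();
        for (Slave s : slaves) {
            if (s.disconnectedMs <= maxDisconnectedMs) {   // disconnected too long: loses the right to be elected
                eligible.add(s);
            }
        }
        return eligible.stream()
                .min(Comparator.comparingInt((Slave s) -> s.priority)                                   // 1. highest priority (smallest value)
                        .thenComparing(Comparator.comparingLong((Slave s) -> s.replOffset).reversed())  // 2. largest replication offset
                        .thenComparing(Comparator.comparing((Slave s) -> s.runId)))                     // 3. smallest run id
                .orElse(null);
    }

    public static void main(String[] args) {
        List<Slave> slaves = Arrays.asList(
                new Slave("a1", 100, 2000, 0),
                new Slave("b2", 100, 3000, 0),
                new Slave("c3", 100, 3000, 999_999));        // disconnected too long, filtered out
        System.out.println(pickNewMaster(slaves, 10_000).runId);   // prints b2
    }
}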

3.2 Sentinel functionality summary

  • Monitoring. Sentinel constantly checks if your master and slave instances are working as expected.
  • Notification. Sentinel can notify the system administrator, or other computer programs, via an API, that something is wrong with one of the monitored Redis instances.
  • Automatic failover. If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the Redis server are informed about the new address to use when connecting.
  • Configuration provider. Sentinel acts as a source of authority for clients service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.

Monitoring: Sentinel continuously checks whether the master and slave servers are running normally.
Notification: if a monitored instance has a problem, Sentinel can send a notification through an API.
Automatic failover: if the master fails, Sentinel can start the failover process, promote a slave to master, and notify the clients.
Configuration provider: clients connect to Sentinel to obtain the address of the current Redis master.

3.3 Sentinel in practice

3.3.1 Sentinel Configuration

To make Sentinel itself highly available, Sentinel must also be deployed as a cluster, with at least three Sentinel instances (an odd number is recommended, to avoid split brain).

hostname    IP address       node role & port
master      192.168.8.203    Master:6379 / Sentinel:26379
slave1      192.168.8.204    Slave:6379  / Sentinel:26379
slave2      192.168.8.205    Slave:6379  / Sentinel:26379

The Redis installation path /usr/local/soft/redis-5.0.5/ is used as an example.
Add the following line to the redis.conf of 204 and 205:

slaveof 192.168.8.203 6379

Create the Sentinel configuration on 203, 204 and 205 (after installation there is already a sentinel.conf in the installation root by default):

cd /usr/local/soft/redis-5.0.5
mkdir logs
mkdir rdbs
mkdir sentinel-tmp
vim sentinel.conf

The content is the same on all three servers:

daemonize yes
port 26379
protected-mode no
dir "/usr/local/soft/redis-5.0.5/sentinel-tmp"
sentinel monitor redis-master 192.168.8.203 6379 2
sentinel down-after-milliseconds redis-master 30000
sentinel failover-timeout redis-master 180000
sentinel parallel-syncs redis-master 1

The name 'redis-master' appears four times above; it must be kept consistent, and clients (such as Jedis) have to use the same name when connecting.

Configuration item                        Meaning
protected-mode                            whether to allow access from other hosts
dir                                       the Sentinel working directory
sentinel monitor                          the Redis master node that Sentinel monitors
sentinel down-after-milliseconds (ms)     how long the master must be unreachable before this Sentinel considers it subjectively down
sentinel failover-timeout (ms)            1. the interval the same Sentinel must wait between two failovers of the same master;
                                          2. the time counted from when a slave starts syncing from a wrong master until it is corrected to sync from the right master;
                                          3. the time allowed for cancelling an ongoing failover;
                                          4. the maximum time for all slaves to be reconfigured to point at the new master during a failover.
sentinel parallel-syncs                   how many slaves may synchronize with the new master at the same time during a failover. The smaller the number, the longer the failover takes; the larger the number, the more slaves are unavailable because they are busy replicating. Setting it to 1 guarantees that at most one slave at a time is unable to serve commands.

3.3.2 Sentinel verification

Start the Redis services and the Sentinels:

cd /usr/local/soft/redis-5.0.5/src
# start the Redis node
./redis-server ../redis.conf
# start the Sentinel node
./redis-sentinel ../sentinel.conf
# or
./redis-server ../sentinel.conf --sentinel

View cluster status:

redis> info replication

On 203:

(screenshot: info replication output on 203)

On 204 and 205:

(screenshot: info replication output on 204 and 205)

To simulate a master failure, execute on 203:

redis> shutdown

205 is elected as the new master, and it now has only one slave node.

(screenshot: info replication output on the new master 205)

Note that redis-master inside sentinel.conf has been rewritten!
To simulate recovery of the original master, start redis-server on 203 again. It comes back as a slave, and the master then has two slaves.

Slave failure and recovery are omitted here.

3.3.3 Sentinel connection

Connecting to Sentinel with Jedis:

package sentinel;

import redis.clients.jedis.JedisSentinelPool;

import java.util.HashSet;
import java.util.Properties;
import java.util.Set;


public class JedisSentinelTest {
    private static JedisSentinelPool pool;

    private static JedisSentinelPool createJedisPool() {
        // the master name must match the name configured in sentinel.conf
        String masterName = "redis-master";
        Set<String> sentinels = new HashSet<String>();
        sentinels.add("192.168.8.203:26379");
        sentinels.add("192.168.8.204:26379");
        sentinels.add("192.168.8.205:26379");
        pool = new JedisSentinelPool(masterName, sentinels);
        return pool;
    }

    public static void main(String[] args) {
        JedisSentinelPool pool = createJedisPool();
        pool.getResource().set("qingshan", "qq"+System.currentTimeMillis());
        System.out.println(pool.getResource().get("qingshan"));
    }
}

The master name comes from the sentinel.conf configuration.


Connecting to Sentinel from Spring Boot:


import com.gupaoedu.util.RedisUtil;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@SpringBootTest
public class RedisAppTests {

    @Autowired
    RedisUtil util;

    @Test
    public void contextLoads() {
        util.set("boot", "2673--" +System.currentTimeMillis());
        System.out.println(util.get("boot"));
    }

}
Configuration (for example in application.properties):

spring.redis.sentinel.master=redis-master
spring.redis.sentinel.nodes=192.168.8.203:26379,192.168.8.204:26379,192.168.8.205:26379

Whether you use Jedis or Spring Boot (2.x defaults to Lettuce), you only need to configure the addresses of all the Sentinels; the Sentinels return the address of the current master node.

3.4 Shortcomings of the Sentinel mechanism

Data can be lost during a master-slave switch, because there is only one master.
Writes go through a single point; horizontal scaling of writes is not solved.
If the data volume is very large, we need several master-slave groups and must spread the data across those groups.
That raises new questions: how is the data sharded, and how are requests routed after sharding?

4 Redis Distributed Solutions

To shard Redis data there are three approaches. The first is to implement the logic in the client, for example sharding keys with modulo or consistent hashing; every query and update first works out the key's route.
The second is to extract the sharding logic into a standalone proxy service; clients connect to this proxy, and the proxy forwards the requests.
The third is to implement sharding on the server side.

4.1 Client-side Sharding

(figure: client-side sharding architecture)

The Jedis client provides a Redis sharding solution (ShardedJedis) and supports connection pooling.

4.1.1 ShardedJedis

import java.util.Arrays;
import java.util.List;

import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.ShardedJedis;
import redis.clients.jedis.ShardedJedisPool;

public class ShardingTest {
    public static void main(String[] args) {
        JedisPoolConfig poolConfig = new JedisPoolConfig();

        // Redis servers
        JedisShardInfo shardInfo1 = new JedisShardInfo("127.0.0.1", 6379);
        JedisShardInfo shardInfo2 = new JedisShardInfo("192.168.8.205", 6379);

        // connection pool over both shards
        List<JedisShardInfo> infoList = Arrays.asList(shardInfo1, shardInfo2);
        ShardedJedisPool jedisPool = new ShardedJedisPool(poolConfig, infoList);
        ShardedJedis jedis = null;
        try {
            jedis = jedisPool.getResource();
            for (int i = 0; i < 100; i++) {
                jedis.set("k" + i, "" + i);           // keys are distributed across the two shards
            }
            for (int i = 0; i < 100; i++) {
                System.out.println(jedis.get("k" + i));
            }
        } finally {
            if (jedis != null) {
                jedis.close();
            }
        }
    }
}

The advantage of client-side sharding code such as ShardedJedis is that configuration is simple, it does not depend on any other middleware, and the partitioning logic can be customized, so it is quite flexible. However, a purely client-side solution cannot add or remove servers dynamically, and every client has to maintain its own sharding strategy, which leads to duplicated code.
The second idea is to extract the sharding code into a shared service; all clients connect to this proxy layer, and the proxy layer handles the routing and forwarding of requests.

4.2 Proxy Sharding

(figure: proxy-based sharding architecture)

Typical proxy-based partitioning solutions are Twemproxy, open-sourced by Twitter, and Codis, open-sourced by Wandoujia in China.

4.2.1 Twemproxy

Twemproxy (pronounced "two-em-proxy"):
https://github.com/twitter/twemproxy

(figure: Twemproxy architecture)

Advantages of Twemproxy: relatively stable, high availability.

Shortcomings:
1. No automatic failover when a node fails; the architecture is complex and needs extra components (LVS/HAProxy + Keepalived) to achieve HA.
2. Scaling out or in requires configuration changes and cannot be done smoothly (the data has to be redistributed).

4.2.2 Codis

https://github.com/CodisLabs/codis
Codis is a proxy middleware written in Go.
Functionally, connecting a client to Codis is no different from connecting it to Redis directly. A comparison:

                                        Codis        Twemproxy    Redis Cluster
Resharding without restart              Yes          No           Yes
Pipeline                                Yes          Yes          No
Hash tags {} for multi-key operations   Yes          Yes          Yes
Multi-key operations while resharding   Yes          -            No
Client support                          any client   any client   clients that support the cluster protocol

(figure: Codis architecture)

Sharding principle: Codis divides all keys into N slots (for example 1024). Each slot belongs to a group, and a group corresponds to one Redis instance or a set of Redis instances. Codis computes the CRC32 of the key, giving a 32-bit number, and takes it modulo N (the number of slots); the remainder is the slot of the key, and the slot maps to a Redis instance. For example, with 4 slots:

(figure: keys mapped to 4 slots and their Redis instances)
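
As a rough illustration of the routing rule above (this is not Codis source code), the slot of a key can be computed with Java's built-in CRC32; SLOT_COUNT plays the role of N, 1024 by default in Codis.

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CodisSlotSketch {
    static final int SLOT_COUNT = 1024;   // number of slots, N

    static long slotOf(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % SLOT_COUNT;   // slot id in [0, 1023]
    }

    public static void main(String[] args) {
        // the proxy forwards the command to the Redis group that owns this slot
        System.out.println(slotOf("user:2673"));
    }
}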

The slot mapping is stored inside the Codis proxy. To avoid a single point of failure, the Codis proxies are also deployed as a cluster. How do multiple Codis nodes keep the slot-to-instance mapping in sync? They rely on ZooKeeper (or etcd, or a local file).

When a new node is added, specific slots can be assigned to it; Codis also provides an automatic rebalancing strategy. Codis does not support transactions, and some other commands are not supported either.

Unsupported commands:
https://github.com/CodisLabs/codis/blob/release3.2/doc/unsupported_cmds.md
How reads such as mget work: the matching keys are fetched from the individual Redis instances and the results are aggregated in Codis.
Codis is a third-party distributed solution; before the official cluster feature became stable, Codis was widely used.

4.3 Redis Cluster

https://redis.io/topics/cluster-tutorial/
Redis Cluster was officially released in Redis 3.0 to address distributed requirements while also providing high availability. Unlike Codis, it is decentralized: a client can connect to any available node.
Data sharding has several key problems to solve:
1. How to distribute the data relatively evenly.
2. How a client reaches the node that holds the data it needs.
3. How to keep serving requests during resharding.

4.3.1 Architecture

Redis Cluster can be seen as a data collection made up of multiple Redis instances. The client does not need to care which node stores which subset of the data; it only deals with the collection as a whole.
Take 3 masters and 3 slaves as an example: the nodes communicate with each other pairwise, sharing information such as slot assignments and node state.

(figure: Redis Cluster with 3 masters and 3 slaves exchanging state)

4.3.2 Setup

This setup builds on the single-instance installation: Redis is assumed to be already installed on your machine.

To save machines, we install all 6 Redis instances on the same host (3 masters and 3 slaves), distinguished only by port number.
Machine IP: 192.168.8.207

cd /usr/local/soft/redis-5.0.5
mkdir redis-cluster
cd redis-cluster
mkdir 7291 7292 7293 7294 7295 7296

Copy the Redis configuration file into the 7291 directory:

cp /usr/local/soft/redis-5.0.5/redis.conf /usr/local/soft/redis-5.0.5/redis-cluster/7291

Modify the configuration file of 7291:

port 7291
dir /usr/local/soft/redis-5.0.5/redis-cluster/7291/
cluster-enabled yes
cluster-config-file nodes-7291.conf
cluster-node-timeout 5000
appendonly yes
pidfile /var/run/redis_7291.pid

Copy the redis.conf under 7291 to the other 5 directories:

cd /usr/local/soft/redis-5.0.5/redis-cluster/7291
cp redis.conf ../7292
cp redis.conf ../7293
cp redis.conf ../7294
cp redis.conf ../7295
cp redis.conf ../7296

Batch-replace the port number in the copies:

cd /usr/local/soft/redis-5.0.5/redis-cluster
sed -i 's/7291/7292/g' 7292/redis.conf
sed -i 's/7291/7293/g' 7293/redis.conf
sed -i 's/7291/7294/g' 7294/redis.conf
sed -i 's/7291/7295/g' 7295/redis.conf
sed -i 's/7291/7296/g' 7296/redis.conf

Install the ruby, rubygems and redis gem dependencies:

yum install ruby -y
yum install rubygems -y
gem install redis -v 3.0.7

Start the 6 Redis nodes:

cd /usr/local/soft/redis-5.0.5/
./src/redis-server redis-cluster/7291/redis.conf
./src/redis-server redis-cluster/7292/redis.conf
./src/redis-server redis-cluster/7293/redis.conf
./src/redis-server redis-cluster/7294/redis.conf
./src/redis-server redis-cluster/7295/redis.conf
./src/redis-server redis-cluster/7296/redis.conf

Check that 6 processes are running:

ps -ef|grep redis

Create the cluster.
The redis-trib.rb script from older versions is deprecated; use the --cluster option of redis-cli instead.
Note: use real IP addresses, not 127.0.0.1.

cd /usr/local/soft/redis-5.0.5/src/
redis-cli --cluster create 192.168.8.207:7291 192.168.8.207:7292 192.168.8.207:7293 192.168.8.207:7294 192.168.8.207:7295 192.168.8.207:7296 --cluster-replicas 1

Redis proposes an allocation plan that splits the 6 nodes into 3 masters and 3 slaves. If it looks fine, type yes to confirm:

>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 127.0.0.1:7295 to 127.0.0.1:7291
Adding replica 127.0.0.1:7296 to 127.0.0.1:7292
Adding replica 127.0.0.1:7294 to 127.0.0.1:7293
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: dfdc9c0589219f727e4fd0ad8dafaf7e0cfb4f1c 127.0.0.1:7291
   slots:[0-5460] (5461 slots) master
M: 8c878b45905bba3d7366c89ec51bd0cd7ce959f8 127.0.0.1:7292
   slots:[5461-10922] (5462 slots) master
M: aeeb7d7076d9b25a7805ac6f508497b43887e599 127.0.0.1:7293
   slots:[10923-16383] (5461 slots) master
S: ebc479e609ff8f6ca9283947530919c559a08f80 127.0.0.1:7294
   replicates aeeb7d7076d9b25a7805ac6f508497b43887e599
S: 49385ed6e58469ef900ec48e5912e5f7b7505f6e 127.0.0.1:7295
   replicates dfdc9c0589219f727e4fd0ad8dafaf7e0cfb4f1c
S: 8d6227aefc4830065624ff6c1dd795d2d5ad094a 127.0.0.1:7296
   replicates 8c878b45905bba3d7366c89ec51bd0cd7ce959f8
Can I set the above configuration? (type 'yes' to accept): 

Note the slot distribution:

7291  [0-5460]      (5461 slots)
7292  [5461-10922]  (5462 slots)
7293  [10923-16383] (5461 slots)

The cluster has been created:

>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 127.0.0.1:7291)
M: dfdc9c0589219f727e4fd0ad8dafaf7e0cfb4f1c 127.0.0.1:7291
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 8c878b45905bba3d7366c89ec51bd0cd7ce959f8 127.0.0.1:7292
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: aeeb7d7076d9b25a7805ac6f508497b43887e599 127.0.0.1:7293
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 8d6227aefc4830065624ff6c1dd795d2d5ad094a 127.0.0.1:7296
   slots: (0 slots) slave
   replicates aeeb7d7076d9b25a7805ac6f508497b43887e599
S: ebc479e609ff8f6ca9283947530919c559a08f80 127.0.0.1:7294
   slots: (0 slots) slave
   replicates dfdc9c0589219f727e4fd0ad8dafaf7e0cfb4f1c
S: 49385ed6e58469ef900ec48e5912e5f7b7505f6e 127.0.0.1:7295
   slots: (0 slots) slave
   replicates 8c878b45905bba3d7366c89ec51bd0cd7ce959f8
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered

To reset the cluster, run cluster reset on every node and then create the cluster again.

Connect with a client:

redis-cli -p 7291
redis-cli -p 7292
redis-cli -p 7293

Write values in batch:

cd /usr/local/soft/redis-5.0.5/redis-cluster/
vim setkey.sh

Script content:

#!/bin/bash
for ((i=0;i<20000;i++))
do
echo -en "helloworld" | redis-cli -h 192.168.8.207 -p 7291 -c -x set name$i >>redis.log
done

Make the script executable and run it:

chmod +x setkey.sh
./setkey.sh

Data distribution on each node:

127.0.0.1:7292> dbsize
(integer) 6683
127.0.0.1:7293> dbsize
(integer) 6665
127.0.0.1:7291> dbsize
(integer) 6652

Other commands, such as adding a node, removing a node, or redistributing slots:

redis-cli --cluster help
Cluster Manager Commands:
  create         host1:port1 ... hostN:portN
                 --cluster-replicas <arg>
  check          host:port
                 --cluster-search-multiple-owners
  info           host:port
  fix            host:port
                 --cluster-search-multiple-owners
  reshard        host:port
                 --cluster-from <arg>
                 --cluster-to <arg>
                 --cluster-slots <arg>
                 --cluster-yes
                 --cluster-timeout <arg>
                 --cluster-pipeline <arg>
                 --cluster-replace
  rebalance      host:port
                 --cluster-weight <node1=w1...nodeN=wN>
                 --cluster-use-empty-masters
                 --cluster-timeout <arg>
                 --cluster-simulate
                 --cluster-pipeline <arg>
                 --cluster-threshold <arg>
                 --cluster-replace
  add-node       new_host:new_port existing_host:existing_port
                 --cluster-slave
                 --cluster-master-id <arg>
  del-node       host:port node_id
  call           host:port command arg arg .. arg
  set-timeout    host:port milliseconds
  import         host:port
                 --cluster-from <arg>
                 --cluster-copy
                 --cluster-replace
  help           

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.

Appendix

Cluster commands

cluster info: print information about the cluster.
cluster nodes: list all nodes currently known to the cluster, together with their details.
cluster meet <ip> <port>: add the node at ip:port to the cluster, making it part of the cluster.
cluster forget <node_id>: remove the node identified by node_id from the cluster (the node must hold no slots).
cluster replicate <node_id>: turn the current node into a slave of the node identified by node_id.
cluster saveconfig: save the node's cluster configuration to disk.

Slot commands

cluster addslots <slot> [slot ...]: assign one or more slots to the current node.
cluster delslots <slot> [slot ...]: remove the assignment of one or more slots from the current node.
cluster flushslots: remove all slots assigned to the current node, leaving it with none.
cluster setslot <slot> node <node_id>: assign the slot to the node identified by node_id; if the slot is already assigned to another node, that node removes it first.
cluster setslot <slot> migrating <node_id>: migrate the slot from the current node to the node identified by node_id.
cluster setslot <slot> importing <node_id>: import the slot from the node identified by node_id into the current node.
cluster setslot <slot> stable: cancel the import or migration of the slot.

Key commands

cluster keyslot <key>: compute which slot the key belongs to.
cluster countkeysinslot <slot>: return the number of key-value pairs currently stored in the slot.
cluster getkeysinslot <slot> <count>: return up to count keys from the slot.

4.3.3 Data distribution

If we want the data to be spread fairly evenly, the first idea is to hash the key and take the modulo.

4.3.3.1 Hash and modulo

For example, hash(key) % N decides, by the remainder, which node a key maps to. This is simple and amounts to a static sharding rule, but as soon as the number of nodes changes (a node is added or removed), N changes and the data has to be redistributed, as the sketch below illustrates.
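
The toy example below (an illustration only, using String.hashCode() as the hash function) shows the cost of changing N: when the node count grows from 3 to 4, roughly three quarters of the keys map to a different node and would have to be moved.

public class ModShardingSketch {
    static int nodeOf(String key, int nodeCount) {
        return Math.abs(key.hashCode()) % nodeCount;   // static sharding rule: hash(key) % N
    }

    public static void main(String[] args) {
        int moved = 0;
        for (int i = 0; i < 10000; i++) {
            String key = "key" + i;
            if (nodeOf(key, 3) != nodeOf(key, 4)) {    // grow the cluster from 3 nodes to 4
                moved++;
            }
        }
        System.out.println(moved + " of 10000 keys change node");   // roughly three quarters of them
    }
}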
To solve this problem, we turn to consistent hashing.

4.3.3.2 Consistent hashing

The principle of consistent hashing:
Organize the whole hash value space into a virtual ring (the hash ring), ordered clockwise. Because the space is a ring, 0 and 2^32-1 coincide.
Suppose four machines are to be mapped onto the hash ring. We first hash each machine's name or IP and place the result on the ring (the red circles in the figure).

(figure: four nodes placed on the hash ring)

Now take 4 pieces of data (or 4 requests). Hashing each key gives its position on the ring (the green circles). Walking clockwise from that position, the first node reached is the one that stores the data.

(figure: keys routed clockwise to the next node on the ring)

In this situation a new node Node5 is added without affecting the distribution of the data shown.

(figure: adding Node5 to the ring)

Removing node Node4 only affects the one neighbouring node (Node4's keys move to the next node clockwise).

(figure: removing Node4 from the ring)

Google's MurmurHash is a hash function often used to implement consistent hashing; the technique is applied in load balancing, database and table sharding and many other distributed scenarios.

Consistent hashing solves the problem that adding or removing nodes forces all data to be redistributed: only the neighbouring node is affected, the others are untouched.
But this basic form of consistent hashing has a drawback: the nodes are not necessarily spread evenly around the ring, especially when there are few of them, so the data does not end up evenly distributed. The remedy is to introduce virtual nodes (Virtual Node).
For example, with 2 nodes and 5 pieces of data, only 1 lands on Node2 while 4 land on Node1 — very uneven.

(figure: uneven distribution with only 2 physical nodes)

Node1 is given two virtual nodes and Node2 is given two virtual nodes (the dashed circles).
The data is now spread more evenly: in the figure, 3 pieces of data land on Node1 and 1 lands on Node2. A sketch of such a ring with virtual nodes follows after the figure.

(figure: distribution after adding virtual nodes)
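
Below is a minimal sketch of a consistent-hash ring with virtual nodes, in the spirit of the figures above. CRC32 stands in for the hash function here; real clients often use MurmurHash, MD5 or similar.

import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

public class ConsistentHashSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();   // hash ring: position -> physical node

    void addNode(String node, int virtualNodes) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#VN" + i), node);   // place several virtual points for each physical node
        }
    }

    String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        Long pos = tail.isEmpty() ? ring.firstKey() : tail.firstKey();   // walk clockwise, wrap around at the end
        return ring.get(pos);
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        ConsistentHashSketch ch = new ConsistentHashSketch();
        ch.addNode("Node1", 2);   // two virtual nodes each, as in the example above
        ch.addNode("Node2", 2);
        for (int i = 1; i <= 5; i++) {
            System.out.println("data" + i + " -> " + ch.nodeFor("data" + i));
        }
    }
}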

Redis virtual slot partitioning

Redis Cluster uses neither hash-modulo nor consistent hashing; it partitions data with virtual slots.
Redis Cluster defines 16384 slots, and each node is responsible for a range of them. For example, Node1 handles slots 0-5460, Node2 handles 5461-10922, and Node3 handles 10923-16383.

(figure: 16384 slots divided among three master nodes)

Every Redis master maintains a 16384-bit sequence (2048 bytes = 2 KB). If bit 0 of the sequence is 1, the node is responsible for the first slot; if bit 1 is 0, the second slot is not its responsibility.

When an object is stored, the key is hashed with CRC16 and the result is taken modulo 16384; that gives the slot, and the data lands on the Redis node responsible for that slot.

To check which slot a key belongs to:

redis> cluster keyslot sunda

Note: the mapping from key to slot never changes; only the mapping from slot to Redis node can change. A sketch of the slot calculation follows below.
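
A hedged sketch of the slot rule just described: CRC16 of the key, modulo 16384. The CRC16 variant Redis uses is CCITT/XModem (polynomial 0x1021, initial value 0); in a real project you would normally rely on the client library instead (Jedis, for example, ships a JedisClusterCRC16 helper).

public class ClusterSlotSketch {
    static int crc16(byte[] bytes) {
        int crc = 0;                                   // CRC16-CCITT (XModem), initial value 0
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slotOf(String key) {
        return crc16(key.getBytes()) % 16384;          // same rule as the server's slot calculation
    }

    public static void main(String[] args) {
        System.out.println(slotOf("sunda"));           // compare with: redis> cluster keyslot sunda
    }
}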

Question: how do we make related data land on the same node?

For example, some multi-key operations cannot span nodes. If certain data must live on one node, say user 2673's basic profile and financial information, what do we do?

Add a {hash tag} to the key. When computing the slot, Redis only hashes the string between { and }; since the two different keys above contain the same string inside {}, they are assigned the same slot.

user{2673}base=…
user{2673}fin=…
127.0.0.1:7293> set a{qs}a 1
OK
127.0.0.1:7293> set a{qs}b 1
OK
127.0.0.1:7293> set a{qs}c 1
OK
127.0.0.1:7293> set a{qs}d 1
OK
127.0.0.1:7293> set a{qs}e 1
OK
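
Below is a small sketch of the hash tag rule (illustration only, reusing the crc16() helper from the previous sketch): if the key contains a non-empty {...} section, only the content between the first { and the following } is hashed, so all the keys above land in the same slot.

public class HashTagSketch {
    static int slotOf(String key) {
        int start = key.indexOf('{');
        if (start >= 0) {
            int end = key.indexOf('}', start + 1);
            if (end > start + 1) {                       // only a non-empty tag counts
                key = key.substring(start + 1, end);     // hash only the tag content
            }
        }
        return ClusterSlotSketch.crc16(key.getBytes()) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slotOf("a{qs}a") == slotOf("a{qs}b"));                    // true
        System.out.println(slotOf("user{2673}base") == slotOf("user{2673}fin"));     // true
    }
}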

Question: which server does the client connect to? And what happens if the data it wants is not on the node it connected to?

4.3.4 Client redirection

For example, in the redis-cli client connected to the Redis on port 7291:

127.0.0.1:7291> set qs 1
(error) MOVED 13724 127.0.0.1:7293

The server returns MOVED: the slot computed from the key is not managed by the node on port 7291 but by the one on port 7293, so the server tells the client to go to port 7293.
You then have to switch, e.g. run redis-cli -p 7293, for the command to return OK. Alternatively, start the client with ./redis-cli -c -p <port> (-c stands for cluster) and it follows the redirect automatically; either way the client ends up talking to the server twice.
Clients such as Jedis maintain a local copy of the slot-to-node mapping, so most commands need no redirection at all; this is the so-called smart client (it requires client support). A minimal sketch follows below.
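
For completeness, here is a minimal smart-client sketch using JedisCluster (Jedis 3.x; the 192.168.8.207:729x addresses are the cluster built earlier in this article). The client fetches the slot-to-node mapping once and routes each command locally, following MOVED redirects only when the mapping changes.

import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class JedisClusterTest {
    public static void main(String[] args) throws Exception {
        Set<HostAndPort> nodes = new HashSet<>();      // any subset of the cluster nodes is enough
        nodes.add(new HostAndPort("192.168.8.207", 7291));
        nodes.add(new HostAndPort("192.168.8.207", 7292));
        nodes.add(new HostAndPort("192.168.8.207", 7293));

        try (JedisCluster cluster = new JedisCluster(nodes)) {
            cluster.set("qs", "1");                    // routed directly to the node owning the key's slot
            System.out.println(cluster.get("qs"));
        }
    }
}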
Question: when a master node is added or taken offline, how is the data migrated (redistributed)?

4.3.5 Data migration

Because the key-to-slot mapping never changes, when a node is added, some of the existing slots must be assigned to the new node and the data in those slots must be migrated over.

Add a new node (a new instance on port 7297):

redis-cli --cluster add-node 127.0.0.1:7291 127.0.0.1:7297

The new node has no hash slots yet, so it cannot hold any data. Run the following on any of the existing nodes:

redis-cli --cluster reshard 127.0.0.1:7291

Then enter the number of hash slots to move (for example 500) and the source of those slots (enter all, or specific node ids).

Question: only master nodes can accept writes. When a master goes down, how does a slave become the master?

4.3.6 High availability and master-slave failover

When a slave discovers that its master has entered the FAIL state, it attempts a failover in order to become the new master. Since the failed master may have several slaves, multiple slaves may compete to become the master. The process is:
1. The slave discovers that its master is in FAIL state.
2. It increments its recorded cluster currentEpoch by 1 and broadcasts a FAILOVER_AUTH_REQUEST message.
3. The other nodes receive the message; only masters respond. They check the legitimacy of the requester and send FAILOVER_AUTH_ACK; for each epoch an ack is sent only once.
4. The slave attempting the failover collects the FAILOVER_AUTH_ACKs it receives.
5. Once it has received acks from more than half of the masters, it becomes the new master.
6. It broadcasts a Pong message to inform the other cluster nodes.

Redis Cluster therefore handles both master-slave role assignment and master-slave switching; it effectively combines the functionality of replication and Sentinel.

4.3.7 Summary

Advantages
1. Decentralized architecture.
2. Data is distributed across multiple nodes by slot; the nodes share the slot mapping, and the distribution can be adjusted dynamically.
3. Scalability: the cluster scales linearly up to about 1000 nodes (the official recommendation is not to exceed 1000), and nodes can be added or removed dynamically.
4. High availability: the cluster stays usable when some nodes fail. Adding slaves as standby replicas enables automatic failover; nodes exchange state through the gossip protocol, and a slave is promoted to master by vote.
5. Lower operational cost, better scalability and availability.

Disadvantages
1. The client is complex: the driver has to implement a smart client, cache the slot mapping and keep it up to date, which makes development harder; an immature client can affect the stability of the business.
2. A node that is blocked for longer than cluster-node-timeout (for whatever reason) is judged to be down, so failovers may be triggered unnecessarily.
3. Data is replicated asynchronously, so strong consistency is not guaranteed.
4. When several services share one cluster, hot and cold data cannot be told apart by statistics, resource isolation is poor, and the services easily affect each other.


Source: www.cnblogs.com/sundaboke/p/11723893.html