What is the relationship between Redis Sharding cluster and consistent hashing?

Table of contents

1. Introduction

2. Redis Sharding cluster

    2.1 Concept, advantages, and disadvantages

    2.2 The data skew problem

    2.3 The data loss problem

    2.4 Application

3. Afterword


1. Introduction

Recently I ran into some Redis-related problems on a few of the systems I am responsible for. While chatting about Cluster and Sharding with friends, I realized that some points were still rather vague to me. Since I had already written up the Sentinel cluster mode earlier, I took the opportunity to sort out some notes on Sharding as well.

Cluster mode will be covered in a follow-up post when I find the time.

 

2. Redis Sharding cluster

 

2.1 Concept, advantages, and disadvantages

 
Redis Sharding is a client-side sharding technique: the client itself calculates which Redis instance each key should be placed on. The basic principle is to hash the key with some hash algorithm and map the result to a specific Redis instance. (Spoiler: the simplest scheme is to hash the key and take the remainder by the number of instances, but this approach has a fatal weakness, which is discussed later.)
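To make the remainder approach concrete, here is a minimal, self-contained sketch; the instance addresses and keys are made up for illustration, and a real client such as Jedis does this kind of routing internally.

ModuloShardingDemo.java (illustrative sketch)

import java.util.Arrays;
import java.util.List;

public class ModuloShardingDemo {

    // Hypothetical instance list; in practice this comes from configuration.
    private static final List<String> INSTANCES =
            Arrays.asList("192.168.32.101:6379", "192.168.32.102:6380");

    // Pick an instance for a key: hash the key, then take the remainder of the instance count.
    static String route(String key) {
        int hash = key.hashCode() & 0x7fffffff;   // keep the hash non-negative
        return INSTANCES.get(hash % INSTANCES.size());
    }

    public static void main(String[] args) {
        for (String key : Arrays.asList("user:1001", "order:42", "session:abc")) {
            System.out.println(key + " -> " + route(key));
        }
        // The fatal weakness: if the instance count changes from 2 to 3, the remainder changes
        // for almost every key, so almost every key is re-routed, which is exactly the expansion
        // problem discussed below.
    }
}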
 
The benefits of this approach are that 1) it keeps the server-side cluster simple, and 2) each instance is independent of the others. The drawbacks are 1) capacity expansion: there is currently no client-side implementation that supports dynamically adding or removing instances; and 2) single point of failure: if a shard goes down, the data on that shard becomes unavailable, and the high availability of each instance has to be handled separately.
 
For these two problems, the common solutions are as follows:
 
Single point of failure: the usual approach is to run each shard as a one-master, N-slave group and let Sentinel handle automatic failover. For the principle and setup details, see my other article: Redis Sentinel mode.
 
Capacity expansion: in general, capacity can only be expanded by restarting, which makes key-value data migration harder on the operations side and requires configuration changes in the application layer to pick up the new instances. Another option is the PreSharding approach recommended by the author of Redis; I will not introduce it here and may add it later.
 
 

2.2 The data skew problem

 
Jedis usage is covered later (Jedis is one of the client-side sharding implementations); here is a brief introduction to the problems that consistent hashing solves. Before starting, accept one premise: a consistent hash algorithm provides the following properties.
  • Balance: after hashing, the hash values of different objects are distributed around the hash ring as uniformly as possible, so that keys fall evenly onto different nodes;
  • Monotonicity: for key-value pairs already assigned to specific nodes, even if a new node joins, each of those pairs is guaranteed to map either to its original node or to the new node (that is, it will not be remapped to a different old node);
  • Spread: I have not fully understood this one yet, so I will not attempt to translate it;
  • Load: I have not fully understood this one yet, so I will not attempt to translate it.
 
The English definitions of the four properties above can be found on page 6 of the original paper linked below and are not reproduced here.
 
Other references:
1. Readers interested in how this algorithm is applied to cache systems can refer to this paper hosted at Columbia University: http://www.cs.columbia.edu/~asherman/papers/cachePaper.pdf
2. Readers interested in the original paper on the algorithm can refer to this copy hosted at Princeton University: https://www.cs.princeton.edu/courses/archive/fall09/cos518/papers/chash.pdf
 
If Redis only holds relatively static data such as user sessions and parameters, none of this is a big problem. But suppose we store tens of millions or billions of transaction records (with frequent reads and writes) under the Redis Sharding scheme, and, because of resource costs, we cannot afford dozens or hundreds of instances and configure only two. What problems do two instances bring? With so few physical nodes (referred to here as Nodes), data skew appears. What can be done? A consistent hashing algorithm (I have drawn a conceptual diagram below to make this easier to follow) virtualizes multiple virtual nodes (Virtual Nodes, or VNodes) for each physical node, so that many interleaved virtual nodes cover the whole hash ring and the data is spread more evenly, avoiding concentrated read and write pressure on any single server. In general, the more virtual nodes there are, the more even the distribution (according to load test data, once the number of virtual nodes reaches the level of 1,000, the amount of data stored on each node is basically close to the average).
 
See the figure below. Without virtual nodes, almost all of the data (K1/K2/K3/K4/K5) lands on Node1 and only K6 is on Node2; after applying consistent hashing (with virtual nodes), K1/K3/K5 fall on virtual nodes belonging to Node1 while K2/K4/K6 fall on Node2, so the data is evenly distributed across the two physical nodes.
 
 
 
In fact, you can look at the Jedis source code: when the nodes are initialized, Jedis automatically creates 160 virtual nodes for each physical node. In that case, the two instances in the example above actually correspond to 2 * 160 = 320 virtual nodes, and, as can be inferred from the figure above, these 320 virtual nodes are interleaved around the ring rather than laid out sequentially. If you are curious about how Jedis picks the corresponding virtual node when storing a concrete key-value pair (K, V), take a look at the relevant source code; it is not shown here.
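To make the virtual node idea concrete, here is a small, self-contained sketch of a hash ring. It is not Jedis's actual implementation (Jedis has its own Sharded class and uses MurmurHash); CRC32 and the node names are used here only to keep the example free of dependencies, and 160 virtual nodes per physical node mirrors the Jedis default mentioned above.

ConsistentHashRingSketch.java (illustrative sketch)

import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

public class ConsistentHashRingSketch {

    private static final int VNODES_PER_NODE = 160;   // same count as the Jedis default
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public ConsistentHashRingSketch(List<String> physicalNodes) {
        // Place 160 virtual nodes per physical node on the ring.
        for (String node : physicalNodes) {
            for (int i = 0; i < VNODES_PER_NODE; i++) {
                ring.put(hash(node + "#VN" + i), node);
            }
        }
    }

    // Walk clockwise from the key's position and return the physical node that owns it.
    public String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        ConsistentHashRingSketch demoRing =
                new ConsistentHashRingSketch(Arrays.asList("Node1", "Node2"));
        for (String key : Arrays.asList("K1", "K2", "K3", "K4", "K5", "K6")) {
            System.out.println(key + " -> " + demoRing.nodeFor(key));
        }
        // With 2 * 160 = 320 interleaved virtual nodes, the keys land on both physical
        // nodes far more evenly than with plain remainder routing.
    }
}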
 
 

2.3 The data loss problem

 
Another advantage of consistent hashing is fault tolerance: if dynamic expansion or a sudden outage happens before there is time to migrate data, the amount of data that becomes unreachable is limited, which in turn limits the traffic that falls through to the backing database and goes a long way toward avoiding an avalanche.
 
Take the hash ring in the figure below as an example. Before Node3 is inserted, if the client reads the key-value pairs K1/K2/K3, the algorithm identifies them as stored on Node1 and they can be fetched. But suppose a new server is added on the fly while pressure on the existing servers is spiking (the subtext being that no data migration is done, because migration takes considerable time). When the client then asks for K1/K2/K3, the algorithm decides they should live on Node3, yet Node3 certainly does not have them (no migration has happened), so from the client's point of view 50% of the data has vanished into thin air (the part circled in yellow; consider whether this is consistent with the monotonicity property above and look at it again carefully). That would be tragic. However, a conscientious programmer will have built fault tolerance into the design (did you think of it while reading?): typically the application layer lets the request fall through to the backend, fetches the value from the database, and synchronously writes it to the new Node3, so the next request for the same key can be served directly from Node3 without touching the database.
 
In response to the problem above:
  • First, with consistent hashing, because of the many virtual nodes described earlier, even if this situation occurs the affected range of data is relatively small; at the very least you do not have to re-fetch the data of an entire physical node from the database;
  • Second, if the application layer does the corresponding fallback (that is, the request falls through to the database and the result is written back to the Redis node), the traffic pressure on the database is not that great, precisely because the affected range of data is small. A minimal sketch of such a fallback follows this list.
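The following sketch shows one way such a read-through fallback could look; loadFromDatabase() is a hypothetical DAO call standing in for your real data store, and the one-hour expiry is an arbitrary choice.

ReadThroughSketch.java (illustrative sketch)

import redis.clients.jedis.ShardedJedis;

public class ReadThroughSketch {

    public String getWithFallback(ShardedJedis shardedJedis, String key) {
        String value = shardedJedis.get(key);          // routed to whichever shard the ring selects
        if (value == null) {
            value = loadFromDatabase(key);             // cache miss: fall through to the database
            if (value != null) {
                shardedJedis.setex(key, 3600, value);  // warm the shard so the next read is a cache hit
            }
        }
        return value;
    }

    private String loadFromDatabase(String key) {
        // hypothetical placeholder: query the backing database here
        return "value-of-" + key;
    }
}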
 
 

2.4 Application

 
At present the Jedis client supports Redis Sharding through ShardedJedis, together with ShardedJedisPool for connection pooling. Moreover, Jedis's Redis Sharding implementation uses a consistent hash algorithm (see section 2.2 above for details). The client usage is shown below (the whole project is a Spring Boot project).
 
pom.xml
<!-- Jedis client dependency -->
<dependency>
	<groupId>redis.clients</groupId>
	<artifactId>jedis</artifactId>
	<version>2.8.0</version>
</dependency>

application.properties

#redis sharding instance config
redis_client_timeout=500
redis_one_host=192.168.32.101
redis_one_port=6379
redis_one_password=123
redis_two_host=192.168.32.102
redis_two_port=6380
redis_two_password=123

RedisConfiguration.java

@Configuration
public class RedisConfiguration {

    //redis one host
    @Value("${redis_one_host}")
    private String redisOneHost;

    //redis one port
    @Value("${redis_one_port}")
    private int redisOnePort;

    //redis one password
    @Value("${redis_one_password}")
    private String redisOnePassword;

    //redis two host
    @Value("${redis_two_host}")
    private String redisTwoHost;

    //redis two port
    @Value("${redis_two_port}")
    private int redisTwoPort;

    //redis two password
    @Value("${redis_two_password}")
    private String redisTwoPassword;

    //redis client timeout
    @Value("${redis_client_timeout}")
    private int redisClientTimeout;

    @Bean(name="redisPool")
    public ShardedJedisPool createRedisPool() throws Exception {

        //configure the connection pool
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        poolConfig.setMaxTotal(5);
        poolConfig.setMaxIdle(2);
        poolConfig.setMaxWaitMillis(5000);
        poolConfig.setTestOnBorrow(false);
        poolConfig.setTestOnReturn(false);

        //configure the shard info for each Redis instance
        JedisShardInfo shardInfo1 = new JedisShardInfo(redisOneHost,redisOnePort, redisClientTimeout);
        shardInfo1.setPassword(redisOnePassword);
        JedisShardInfo shardInfo2 = new JedisShardInfo(redisTwoHost, redisTwoPort, redisClientTimeout);
        shardInfo2.setPassword(redisTwoPassword);

        //initialize the ShardedJedisPool with both shards
        List<JedisShardInfo> infoList = Arrays.asList(shardInfo1, shardInfo2);
        ShardedJedisPool jedisPool = new ShardedJedisPool(poolConfig, infoList);

        return jedisPool;

    }

    public static void main(String[] args){

        ShardedJedis shardedJedis = null;

        try{
            // Note: when this main() is run directly, the @Value fields are not injected by
            // Spring, so the host/port/password values must be supplied by other means.
            RedisConfiguration redisConfiguration = new RedisConfiguration();
            ShardedJedisPool shardedJedisPool = redisConfiguration.createRedisPool();
            shardedJedis = shardedJedisPool.getResource();

            shardedJedis.set("CSDN", "56");
            shardedJedis.set("InfoQ","44");
            shardedJedis.set("CNBlog","13");
            shardedJedis.set("SegmentFault","22");

            Client client1 = shardedJedis.getShard("CSDN").getClient();
            Client client2 = shardedJedis.getShard("InfoQ").getClient();
            Client client3 = shardedJedis.getShard("CNBlog").getClient();
            Client client4 = shardedJedis.getShard("SegmentFault").getClient();

            System.out.println("CSDN 位于实例:" + client1.getHost() + "|" + client1.getPort());
            System.out.println("InfoQ 位于实例:" + client1.getHost() + "|" + client1.getPort());
            System.out.println("CNBlog 位于实例:" + client1.getHost() + "|" + client1.getPort());
            System.out.println("SegmentFault 位于实例:" + client1.getHost() + "|" + client1.getPort());

        }catch(Exception e){
            e.printStackTrace();
        }finally {
            if (shardedJedis != null) {
                shardedJedis.close();
            }
        }


    }


}

According to the log printed after running, these values are stored in different Redis instances; how the keys are distributed to the shards is decided by the consistent hash algorithm implemented in Jedis. By default it uses the 64-bit MURMUR_HASH algorithm, and it also supports an MD5-based hash; a short snippet after the log output below shows how the algorithm could be selected explicitly.

"C:\Program Files\Java\jdk1.8.0_102\bin\java"...

CSDN is located at instance: 192.168.32.101|6379

InfoQ is located at instance: 192.168.32.102|6380

CNBlog is located at instance: 192.168.32.101|6379

SegmentFault is located at instance: 192.168.32.102|6380
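For reference, the hash algorithm can be chosen when the pool is built. To the best of my knowledge, the Jedis 2.x ShardedJedisPool has a constructor overload that takes a redis.clients.util.Hashing argument, so the pool in createRedisPool() above could be created with MD5 instead of the default 64-bit MurmurHash roughly like this:

import redis.clients.util.Hashing;
...
// same poolConfig and infoList as in createRedisPool() above
ShardedJedisPool md5Pool = new ShardedJedisPool(poolConfig, infoList, Hashing.MD5);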

3. Afterword

Unlike the lightweight Redis Sharding approach, Redis Cluster is the server-side sharding solution officially introduced with Redis 3.0. It solves the coordination problem across multiple Redis instances, where coordination covers automatic data sharding, automatic failover of hash slots, expansion with new nodes, and so on. In other words, data sharding used to be the client's own problem, and automatic failover used to require an additional Sentinel mechanism; now the official project offers an integrated solution that keeps the client lightweight so it can focus on business logic.
 
My understanding is that Redis Cluster is a decentralized cluster solution in which every node in the cluster is equal (each node knows the information of the other nodes in the cluster, such as IP, port, and status, and the nodes keep in touch with one another over long-lived connections). This is the essential difference from Sentinel: one is a decentralized model, the other a centralized master-slave model.
 
The details of how the cluster performs automatic data sharding, automatic failover, and node scaling will be sorted out in a separate blog post later; let's stop here for today.
 
 

