How can Redis Cluster support Pipeline batch mode?

The earlier article "Redis pipeline batch processing to improve performance" showed that Redis pipeline mode gives a large performance boost for batch data processing. To review the principle: a Redis client and server normally communicate in request-response mode, as follows:

Client: command1 
Server: response1 
Client: command2 
Server: response2 
…

In this mode, completing 10 commands takes 20 network transfers (10 requests and 10 responses). So even though Redis itself processes commands very quickly, network latency still limits throughput. In pipeline mode, multiple requests can instead be sent like this:

Client: command1,command2… 
Server: response1,response2…

In this case, completing all the commands takes only two transfers: one batched request and one batched response. Network communication becomes far more efficient and, combined with Redis's own processing speed, batch data processing gets a large performance boost. The problem we actually ran into, however, is that our project uses Redis Cluster, so the client class it initializes is JedisCluster rather than Jedis. The JedisCluster documentation offers no pipelined() method like the one Jedis provides for obtaining a Pipeline object.

Why can't JedisCluster use a pipeline?

The key space of a Redis cluster is divided into 16384 slots, so a cluster can have at most 16384 master nodes. Each master node is responsible for a subset of the 16384 hash slots. When a Redis command is executed, the client computes the slot from the key and then runs the command on the node that owns that slot. For example:

master1(slave1): 0~5460
master2(slave2): 5461~10922
master3(slave3): 10923~16383

This cluster has three master nodes: master1 is assigned slots 0 to 5460, master2 slots 5461 to 10922, and master3 slots 10923 to 16383.
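As a minimal sketch of this slot-to-node lookup, the three ranges above can be held in a sorted map keyed by the first slot of each range. The class name SlotTable and the node names as plain strings are assumptions made for illustration only; this is not how Jedis stores its slot cache.

```java
import java.util.TreeMap;

// Hypothetical illustration of the slot layout shown above; not part of Jedis.
class SlotTable {
    // Maps the first slot of each range to the master that owns the range.
    private final TreeMap<Integer, String> rangeStartToNode = new TreeMap<>();

    SlotTable() {
        rangeStartToNode.put(0, "master1");     // slots 0 ~ 5460
        rangeStartToNode.put(5461, "master2");  // slots 5461 ~ 10922
        rangeStartToNode.put(10923, "master3"); // slots 10923 ~ 16383
    }

    // Returns the master owning the slot: the range with the greatest start <= slot.
    String nodeForSlot(int slot) {
        if (slot < 0 || slot > 16383) {
            throw new IllegalArgumentException("slot out of range: " + slot);
        }
        return rangeStartToNode.floorEntry(slot).getValue();
    }

    public static void main(String[] args) {
        SlotTable table = new SlotTable();
        System.out.println(table.nodeForSlot(5460));  // last slot of master1
        System.out.println(table.nodeForSlot(5461));  // first slot of master2
        System.out.println(table.nodeForSlot(16383)); // last slot of master3
    }
}
```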

A pipeline executes a batch of commands, and each command's slot has to be computed from its key (JedisClusterCRC16.getSlot(key)) so that the command can run on the node that owns that slot. In other words, a single pipeline operation may need connections to several Redis nodes, which JedisCluster does not currently support.
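For readers curious what the slot computation involves, here is a self-contained sketch based on the Redis Cluster specification: the slot is CRC16 (the XMODEM variant) of the key modulo 16384, and if the key contains a non-empty hash tag such as {user1000}, only the tag is hashed. The class name KeySlot is hypothetical; in real code, JedisClusterCRC16.getSlot(key) should be used instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for JedisClusterCRC16.getSlot, written from the Redis Cluster spec.
class KeySlot {
    // Bitwise CRC16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep only 16 bits
            }
        }
        return crc;
    }

    static int slot(String key) {
        // Hash tag rule: hash only the text between the first '{' and the next '}',
        // provided that text is not empty.
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        // 12739: the spec's CRC16 reference value for "123456789" is 0x31C3.
        System.out.println(slot("123456789"));
        // Keys sharing a hash tag land in the same slot.
        System.out.println(slot("{user1000}.following") == slot("{user1000}.followers"));
    }
}
```

Because keys with the same hash tag always share a slot, they also always end up in the same per-node batch.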

How can JedisCluster be extended to support pipelines?

Design ideas

1. First, for each key the pipeline will use, determine the connection to the node that owns it (that is, a Jedis object; each node normally has its own pool).
2. For keys whose slots belong to the same node, use the same Jedis pipeline to execute the commands.
3. Merge the responses returned by all the pipelines.
4. Release the connections back to their pools.

In other words, one JedisCluster pipeline operation is split into a separate Jedis pipeline operation on each node, and the responses are combined at the end. Concretely, JedisClusterCRC16.getSlot(key) computes the slot of each key, and the slot distribution across nodes tells us which node each key lives on. Once we have that node's JedisPool, we can read and write through a pipeline.
There are many ways to implement this process; this article presents one that probably needs the least code.
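Before looking at the Jedis-specific code, the grouping and merging steps can be sketched without any Redis dependency. Everything in this sketch is hypothetical: PerNodeBatcher, the nodeOf function (standing in for the slot and pool lookup), and the runBatch function (standing in for one Jedis pipeline per node, including releasing its connection). It records each key's original position so that the merged responses come back in the same order as the keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch of design steps 1 to 3; no Redis connection is involved.
class PerNodeBatcher {
    // Steps 1-2: bucket the positions of the keys by the node that owns each key.
    static <N> Map<N, List<Integer>> groupIndexesByNode(List<String> keys, Function<String, N> nodeOf) {
        Map<N, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            groups.computeIfAbsent(nodeOf.apply(keys.get(i)), n -> new ArrayList<>()).add(i);
        }
        return groups;
    }

    // Step 3: run one batch per node, then put each response back at its key's original position.
    // runBatch is assumed to return exactly one response per key, in batch order.
    static <N> List<String> executeAndMerge(List<String> keys, Function<String, N> nodeOf,
                                            BiFunction<N, List<String>, List<String>> runBatch) {
        String[] merged = new String[keys.size()];
        for (Map.Entry<N, List<Integer>> group : groupIndexesByNode(keys, nodeOf).entrySet()) {
            List<Integer> positions = group.getValue();
            List<String> batch = new ArrayList<>();
            for (int position : positions) {
                batch.add(keys.get(position));
            }
            List<String> responses = runBatch.apply(group.getKey(), batch);
            for (int j = 0; j < positions.size(); j++) {
                merged[positions.get(j)] = responses.get(j);
            }
        }
        return Arrays.asList(merged);
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a1", "b2", "c1", "d2");
        // Fake routing: keys ending in "1" belong to node n1, all others to n2.
        Function<String, String> nodeOf = k -> k.endsWith("1") ? "n1" : "n2";
        // Fake batch execution: echo which node handled which key.
        BiFunction<String, List<String>, List<String>> runBatch = (node, batch) -> {
            List<String> responses = new ArrayList<>();
            for (String key : batch) {
                responses.add(node + ":" + key);
            }
            return responses;
        };
        System.out.println(executeAndMerge(keys, nodeOf, runBatch));
    }
}
```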

Solution

The process described above has in fact already been implemented for developers by the JedisClusterInfoCache object. However, that object is a protected field of JedisClusterConnectionHandler and is not exposed, and the JedisCluster API offers no way to obtain the JedisClusterConnectionHandler object. The following two classes therefore expose these objects, so that getJedisPoolFromSlot can return the JedisPool for each key's slot.

Maven dependency

<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>2.9.0</version>
</dependency>

JedisClusterPipeline

import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

import java.util.Set;

public class JedisClusterPipeline extends JedisCluster {
    public JedisClusterPipeline(Set<HostAndPort> jedisClusterNode, int connectionTimeout, int soTimeout, int maxAttempts, String password, final GenericObjectPoolConfig poolConfig) {
        super(jedisClusterNode,connectionTimeout, soTimeout, maxAttempts, password, poolConfig);
        super.connectionHandler = new JedisSlotAdvancedConnectionHandler(jedisClusterNode, poolConfig,
                connectionTimeout, soTimeout ,password);
    }

    public JedisSlotAdvancedConnectionHandler getConnectionHandler() {
        return (JedisSlotAdvancedConnectionHandler)this.connectionHandler;
    }

    /**
     * Refreshes the cluster information. Call this when the cluster topology changes.
     */
    public void refreshCluster() {
        connectionHandler.renewSlotCache();
    }
}

JedisSlotAdvancedConnectionHandler

import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisSlotBasedConnectionHandler;
import redis.clients.jedis.exceptions.JedisNoReachableClusterNodeException;

import java.util.Set;

public class JedisSlotAdvancedConnectionHandler extends JedisSlotBasedConnectionHandler {

    public JedisSlotAdvancedConnectionHandler(Set<HostAndPort> nodes, GenericObjectPoolConfig poolConfig, int connectionTimeout, int soTimeout,String password) {
        super(nodes, poolConfig, connectionTimeout, soTimeout, password);
    }

    public JedisPool getJedisPoolFromSlot(int slot) {
        JedisPool connectionPool = cache.getSlotPool(slot);
        if (connectionPool != null) {
            // A valid connection is not guaranteed, because slot assignments
            // can change between nodes
            return connectionPool;
        } else {
            renewSlotCache(); //It's abnormal situation for cluster mode, that we have just nothing for slot, try to rediscover state
            connectionPool = cache.getSlotPool(slot);
            if (connectionPool != null) {
                return connectionPool;
            } else {
                throw new JedisNoReachableClusterNodeException("No reachable node in cluster for slot " + slot);
            }
        }
    }
}

Next, write a test class that writes 10000 entries to the Redis cluster, once through plain JedisCluster calls and once through the JedisCluster pipeline implemented above, and compares their performance. The test class is as follows:

import redis.clients.jedis.*;
import redis.clients.util.JedisClusterCRC16;
import java.io.UnsupportedEncodingException;
import java.util.*;

public class PipelineTest {
    public static void main(String[] args) throws UnsupportedEncodingException {
        PipelineTest client = new PipelineTest();
        Set<HostAndPort> nodes = new HashSet<>();
        nodes.add(new HostAndPort("node1",20249));
        nodes.add(new HostAndPort("node2",20508));
        nodes.add(new HostAndPort("node3",20484));
        String redisPassword = "123456";
        // Run both tests
        client.jedisCluster(nodes,redisPassword);
        client.clusterPipeline(nodes,redisPassword);
    }
    // Batch-write test using plain JedisCluster
    public void jedisCluster(Set<HostAndPort> nodes,String redisPassword) throws UnsupportedEncodingException {
        JedisCluster jc = new JedisCluster(nodes, 2000, 2000,100,redisPassword, new JedisPoolConfig());
        List<String> setKyes = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            setKyes.add("single"+i);
        }
        long start = System.currentTimeMillis();
        for(int j = 0;j < setKyes.size();j++){
            jc.setex(setKyes.get(j),100,"value"+j);
        }
        System.out.println("JedisCluster total time:"+(System.currentTimeMillis() - start));
    }
    // Batch-write test using the JedisCluster pipeline
    public void clusterPipeline(Set<HostAndPort> nodes,String redisPassword) {
        JedisClusterPipeline jedisClusterPipeline = new JedisClusterPipeline(nodes, 2000, 2000,10,redisPassword, new JedisPoolConfig());
        JedisSlotAdvancedConnectionHandler jedisSlotAdvancedConnectionHandler = jedisClusterPipeline.getConnectionHandler();
        Map<JedisPool, List<String>> poolKeys = new HashMap<>();
        List<String> setKyes = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
                setKyes.add("pipeline"+i);
        }
        long start = System.currentTimeMillis();
        // Compute each key's slot, look up the JedisPool for that slot, and group the keys by JedisPool
        jedisClusterPipeline.refreshCluster();
        for(int j = 0;j < setKyes.size();j++){
            String key = setKyes.get(j);
            int slot = JedisClusterCRC16.getSlot(key);
            JedisPool jedisPool = jedisSlotAdvancedConnectionHandler.getJedisPoolFromSlot(slot);
            if (poolKeys.keySet().contains(jedisPool)){
                List<String> keys = poolKeys.get(jedisPool);
                keys.add(key);
            }else {
                List<String> keys = new ArrayList<>();
                keys.add(key);
                poolKeys.put(jedisPool, keys);
            }
        }
        // Batch-write to each node through its own Jedis pipeline
        for (JedisPool jedisPool : poolKeys.keySet()) {
            // try-with-resources returns the connection to the pool even if a command fails
            try (Jedis jedis = jedisPool.getResource()) {
                Pipeline pipeline = jedis.pipelined();
                List<String> keys = poolKeys.get(jedisPool);
                for(int i=0;i<keys.size();i++){
                    pipeline.setex(keys.get(i),100, "value" + i);
                }
                pipeline.sync(); // send the queued commands and wait for all responses
            }
        }
        System.out.println("JedisCluster Pipeline total time:"+(System.currentTimeMillis() - start));
    }
}

Test results are as follows:

JedisCluster total time:29147
JedisCluster Pipeline total time:190

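Conclusion: for batch operations, the JedisCluster pipeline gives a significant performance boost. In this test, 10000 writes took 29147 ms with plain JedisCluster and 190 ms with the pipeline, roughly 150 times faster.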

Summary

This article introduced a way to provide pipeline batch operations in Redis cluster mode. The basic idea is to use the Redis cluster hash algorithm (a hash of the key, taken modulo the number of slots) to compute the slot where each piece of data is stored, split the data into batches according to the node that owns each slot, and then process each batch through a single-node pipeline.
Note, however, that nodes can be added to or removed from a cluster dynamically, and the client cannot detect this in real time (it may only learn that the cluster has changed when a command is executed). This implementation therefore does not guarantee success, and it is recommended to call refreshCluster() before a batch operation to fetch the current cluster information. The application must make sure close() is called whether the operation succeeds or fails; otherwise connections may leak. If the application needs to retry failures itself, it should limit the number of commands in each batch so that a failure does not force too many commands to be retried.
Given the above, this approach is recommended for cluster environments that are fairly stable (nodes are not added or removed too often), where failures are acceptable or a corresponding retry strategy is in place.
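One simple way to limit how many commands a failure can force you to retry is to split each node's key list into fixed-size batches and run one pipeline per batch. The helper below is only a sketch of that idea; the class name BatchSplitter and the choice of batch size are assumptions, not part of Jedis or of the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches of at most maxBatchSize items.
class BatchSplitter {
    static <T> List<List<T>> split(List<T> items, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += maxBatchSize) {
            int to = Math.min(from + maxBatchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("k1", "k2", "k3", "k4", "k5");
        // With a batch size of 2, a failed batch means retrying at most 2 commands.
        for (List<String> batch : split(keys, 2)) {
            System.out.println(batch);
        }
    }
}
```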

Origin www.cnblogs.com/xiaodf/p/11002184.html