Troubleshooting a Deadlock Caused by Redis Cluster Pipeline

Author: Li Gang, vivo Internet Server Team


This article walks through the troubleshooting of a Dubbo thread pool exhaustion problem. By examining the Dubbo thread states, analyzing the source code Jedis uses to borrow connections from its pool, and checking the conditions for deadlock, we finally confirmed that the problem was a deadlock caused by using cluster pipeline mode without setting a timeout.


1. Background


Redis Pipeline is an efficient command batching mechanism that significantly reduces network round trips and improves read and write throughput. Redis Cluster Pipeline builds on Redis Cluster: it packages multiple operations into batches and sends them to the relevant nodes of the cluster in one go, reducing communication latency and improving overall read/write throughput. It suits scenarios where large numbers of Redis Cluster commands need to be processed efficiently.


In our case the pipeline is used to batch-query reserved game information from Redis Cluster. The project uses Redis Cluster Pipeline as shown below; JedisClusterPipeline is an internal utility class that provides pipeline capability in Redis Cluster mode:


JedisClusterPipeline usage

JedisClusterPipline jedisClusterPipline = redisService.clusterPipelined();
List<Object> response;
try {
    for (String key : keys) {
        jedisClusterPipline.hmget(key, VALUE1, VALUE2);
    }
    // get the results
    response = jedisClusterPipline.syncAndReturnAll();
} finally {
    jedisClusterPipline.close();
}


2. Fault scene


One day, I received an alert that the Dubbo thread pool was exhausted. Checking the logs, I found that only one machine was affected; it had not recovered on its own, and its count of completed tasks was no longer increasing.



The request-count monitoring showed that the number of requests had dropped to zero; the machine had clearly stopped serving traffic.



Using Arthas to inspect the Dubbo threads showed that all 400 of them were in the WAITING state.



3. Failure analysis


A Dubbo thread being in the WAITING state is not by itself a problem; an idle Dubbo thread also waits for tasks. But the full call stack reveals the issue. Of the two stacks below, the first is from the problem machine and the second from a normal machine: the problem machine's threads are clearly waiting for an available connection in the Redis connection pool.




After exporting a thread snapshot with jstack, we found that all Dubbo threads on the problem machine were waiting for available connections from the Redis connection pools.


At this point, two problems stand out:

  1. The threads keep waiting for a connection and are never interrupted.

  2. The threads cannot obtain a connection.


3.1 Why the threads keep waiting for a connection without being interrupted


The logic Jedis uses to obtain a connection lives in the org.apache.commons.pool2.impl.GenericObjectPool#borrowObject(long) method:

public T borrowObject(long borrowMaxWaitMillis) throws Exception {
    ...
    PooledObject<T> p = null;

    // read the blockWhenExhausted config item; it defaults to true
    boolean blockWhenExhausted = getBlockWhenExhausted();

    boolean create;
    long waitTime = System.currentTimeMillis();

    while (p == null) {
        create = false;
        if (blockWhenExhausted) {
            // poll an idle object from the deque; this call does not block and returns null if there is none
            p = idleObjects.pollFirst();
            // no idle object, so create one
            if (p == null) {
                p = create();
                if (p != null) {
                    create = true;
                }
            }
            if (p == null) {
                // borrowMaxWaitMillis defaults to -1
                if (borrowMaxWaitMillis < 0) {
                    // every Dubbo thread in the stack snapshot is stuck here; this call blocks
                    // and waits forever if no connection is returned to the deque
                    p = idleObjects.takeFirst();
                } else {
                    // wait for borrowMaxWaitMillis; return null if no connection is obtained in time
                    p = idleObjects.pollFirst(borrowMaxWaitMillis,
                            TimeUnit.MILLISECONDS);
                }
            }
            if (p == null) {
                throw new NoSuchElementException(
                        "Timeout waiting for idle object");
            }
            if (!p.allocate()) {
                p = null;
            }
        }

        ...

    }

    updateStatsBorrow(p, System.currentTimeMillis() - waitTime);

    return p.getObject();
}


Because the business code does not set borrowMaxWaitMillis, the threads wait indefinitely for an available connection. This value can be set via the maxWaitMillis attribute of the Jedis pool configuration, as in the sketch below.
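Below is a minimal sketch of such a configuration, assuming Jedis 2.9.x; the host, port, pool size and timeout values are illustrative only. With maxWaitMillis set, borrowObject() takes the timed pollFirst() branch and throws NoSuchElementException instead of blocking forever.

import java.util.Collections;
import java.util.Set;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPoolConfig;

public class JedisClusterPoolConfigSketch {
    public static JedisCluster build() {
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        poolConfig.setMaxTotal(20);          // per-node pool size
        poolConfig.setMaxWaitMillis(2000);   // give up after 2s instead of waiting for a connection forever

        Set<HostAndPort> nodes = Collections.singleton(new HostAndPort("127.0.0.1", 6379));
        // connectionTimeout = 2000 ms, soTimeout = 2000 ms, maxAttempts = 5 (all illustrative)
        return new JedisCluster(nodes, 2000, 2000, 5, poolConfig);
    }
}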


So we now know why the threads keep waiting; next we need to work out why they can never obtain a connection.


3.2 Why the threads cannot obtain a connection


There are only two situations in which a connection cannot be obtained:

  1. Redis is unreachable, so no connection can be created.

  2. All connections in the pool are occupied, so none can be borrowed.


Conjecture 1: Is Redis unreachable?


After checking with operations, I learned that there had indeed been a bout of network jitter at the time of the incident, but it recovered quickly. During troubleshooting, the problem machine could connect to Redis normally. Could there be a flaw in the connection-creation path that prevents recovery after network jitter and leaves the threads stuck? The answer has to come from the source code.


Create connection

private PooledObject<T> create() throws Exception {
    int localMaxTotal = getMaxTotal();
    long newCreateCount = createCount.incrementAndGet();
    if (localMaxTotal > -1 && newCreateCount > localMaxTotal ||
            newCreateCount > Integer.MAX_VALUE) {
        createCount.decrementAndGet();
        return null;
    }

    final PooledObject<T> p;
    try {
        // create the Redis connection; an exception is thrown on timeout
        // the default connectionTimeout and soTimeout are both 2 seconds
        p = factory.makeObject();
    } catch (Exception e) {
        createCount.decrementAndGet();
        // the exception is rethrown to the caller
        throw e;
    }

    AbandonedConfig ac = this.abandonedConfig;
    if (ac != null && ac.getLogAbandoned()) {
        p.setLogAbandoned(true);
    }

    createdCount.incrementAndGet();
    allObjects.put(new IdentityWrapper<T>(p.getObject()), p);
    return p;
}


As the code shows, an exception is thrown when the connection to Redis times out, and borrowObject(), which calls create(), does not catch it; the exception ultimately propagates up to the business layer. So if Redis is unreachable, the thread does not wait forever, and once the network recovers the next call simply invokes create() again and recreates the connection.


In summary, the first situation can be ruled out, and we move on to the second. Connections being occupied is normal; connections never being released is not.


Conjecture 2: Does the business code fail to return Redis connections?


If connections are not being released, the first suspect is that somewhere the business code forgot to return them. In pipeline mode, JedisClusterPipeline#close() must be called manually in a finally block to return the connections to the pool; in normal mode no manual release is needed (see redis.clients.jedis.JedisClusterCommand#runWithRetries, which releases the connection automatically after each command). A global search of all business code that uses the cluster pipeline showed that JedisClusterPipeline#close() is called everywhere, so the business code is not the problem.


Conjecture 3: Is there a connection leak inside Jedis?


Since the business code is fine, could the connection-returning code itself be buggy and leak connections? A connection leak can indeed occur in Jedis 2.10.0; see this issue for details: https://github.com/redis/jedis/issues/1920 . However, our project uses version 2.9.0, so a connection leak is ruled out.


Conjecture 4: Is there a deadlock?


Having excluded the possibilities above, the only remaining explanation I could think of was deadlock. On further thought, using the cluster pipeline without setting a timeout can indeed deadlock, and the deadlock occurs while obtaining connections from the connection pool (a LinkedBlockingDeque).


Let's first look at how cluster pipeline mode differs from normal Redis usage. Jedis maintains one connection pool per Redis instance. In cluster pipeline mode, the keys to query are first mapped to the Redis instances that own them, then a connection is borrowed from each of the corresponding pools, and all of them are released together only after the whole batch completes. In normal mode, only one pooled connection is borrowed at a time and it is released immediately after use. This means that cluster pipeline mode satisfies the "hold and wait" condition for deadlock while acquiring connections, whereas normal mode does not.


JedisClusterPipeline usage

JedisClusterPipline jedisClusterPipline = redisService.clusterPipelined();
List<Object> response;
try {
    for (String key : keys) {
        // request a connection; internally JedisClusterPipeline.getClient(String key) is called first
        jedisClusterPipline.hmget(key, VALUE1, VALUE2);
        // the borrowed connection is cached in poolToJedisMap
    }
    // get the results
    response = jedisClusterPipline.syncAndReturnAll();
} finally {
    // return all connections
    jedisClusterPipline.close();
}


JedisClusterPipeline partial source code

public class JedisClusterPipline extends PipelineBase implements Closeable {

    private static final Logger log = LoggerFactory.getLogger(JedisClusterPipline.class);

    // records the execution order of the redis commands
    private final Queue<Client> orderedClients = new LinkedList<>();
    // cache of borrowed redis connections
    private final Map<JedisPool, Jedis> poolToJedisMap = new HashMap<>();

    private final JedisSlotBasedConnectionHandler connectionHandler;
    private final JedisClusterInfoCache clusterInfoCache;

    public JedisClusterPipline(JedisSlotBasedConnectionHandler connectionHandler, JedisClusterInfoCache clusterInfoCache) {
        this.connectionHandler = connectionHandler;
        this.clusterInfoCache = clusterInfoCache;
    }

    @Override
    protected Client getClient(String key) {
        return getClient(SafeEncoder.encode(key));
    }

    @Override
    protected Client getClient(byte[] key) {
        Client client;
        // compute the slot the key belongs to
        int slot = JedisClusterCRC16.getSlot(key);
        // get the connection pool for that slot
        JedisPool pool = clusterInfoCache.getSlotPool(slot);
        // try the connection cache first
        Jedis borrowedJedis = poolToJedisMap.get(pool);
        // if not cached, borrow from the pool and cache it
        if (null == borrowedJedis) {
            borrowedJedis = pool.getResource();
            poolToJedisMap.put(pool, borrowedJedis);
        }

        client = borrowedJedis.getClient();

        orderedClients.add(client);

        return client;
    }

    @Override
    public void close() {
        for (Jedis jedis : poolToJedisMap.values()) {
            // drain any leftover data in the connection so nothing stale is read after it is returned
            try {
                jedis.getClient().getAll();
            } catch (Throwable throwable) {
                log.warn("Exception while draining a jedis connection during close; draining clears leftover data so nothing stale is read after the connection is returned");
            }
            try {
                jedis.close();
            } catch (Throwable throwable) {
                log.warn("Exception while closing jedis");
            }
        }
        // return the connections
        clean();
        orderedClients.clear();
        poolToJedisMap.clear();
    }

    /**
     * go through all the responses and generate the right response type (warning :
     * usually it is a waste of time).
     *
     * @return A list of all the responses in the order
     */
    public List<Object> syncAndReturnAll() {
        List<Object> formatted = new ArrayList<>();
        List<Throwable> throwableList = new ArrayList<>();
        for (Client client : orderedClients) {
            try {
                Response response = generateResponse(client.getOne());
                if (response == null) {
                    continue;
                }
                formatted.add(response.get());
            } catch (Throwable e) {
                throwableList.add(e);
            }
        }
        slotCacheRefreshed(throwableList);
        return formatted;
    }
}



For example:

Suppose we have a cluster with two Redis master nodes (the minimum number of masters in cluster mode is three; two is used here only for illustration), labeled node 1 and node 2, and a Java program with four Dubbo threads, labeled threads 1 through 4. Each Redis instance has a connection pool of size 2.


Threads 1 and 2 first obtain a connection to Redis 1 and then a connection to Redis 2, while threads 3 and 4 first obtain a connection to Redis 2 and then a connection to Redis 1. If all four threads pause for a while after obtaining their first connection, a deadlock occurs when they try to obtain the second one (the longer the pause, the higher the probability of triggering it); see the sketch below.
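The following is a minimal model of that scenario, not the Jedis code itself: two semaphores with 2 permits each stand in for the two connection pools, and the four threads acquire them in opposite order with a short pause in between. Running it typically ends with all four threads blocked on their second acquire.

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class ClusterPipelineDeadlockDemo {

    private static final Semaphore redis1Pool = new Semaphore(2); // stands in for node 1's pool
    private static final Semaphore redis2Pool = new Semaphore(2); // stands in for node 2's pool

    public static void main(String[] args) {
        // threads 1 and 2 take a node-1 connection first, then a node-2 connection
        for (int i = 1; i <= 2; i++) {
            new Thread(() -> acquireBoth(redis1Pool, redis2Pool), "thread-" + i).start();
        }
        // threads 3 and 4 take a node-2 connection first, then a node-1 connection
        for (int i = 3; i <= 4; i++) {
            new Thread(() -> acquireBoth(redis2Pool, redis1Pool), "thread-" + i).start();
        }
    }

    private static void acquireBoth(Semaphore first, Semaphore second) {
        try {
            first.acquire();                  // first connection obtained and held
            TimeUnit.MILLISECONDS.sleep(100); // simulate the "wait a while" window
            second.acquire();                 // the other pool's permits are all held -> blocks forever
            System.out.println(Thread.currentThread().getName() + " got both connections");
            second.release();
            first.release();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}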



So the pipeline can indeed cause a deadlock. Fortunately the deadlock condition is easy to break: simply set a timeout for waiting on a connection. Increasing the connection pool size also helps; with sufficient resources the deadlock will not occur.


4. Proving the deadlock


The above is still conjecture. To prove that a deadlock actually occurred, we need to establish three things:

  1. Which connection pools' connections does each thread currently hold?

  2. Which connection pools is each thread currently waiting on?

  3. How many connections are left in each connection pool?


We already know that the Dubbo thread pool size on the problem machine is 400, the Redis cluster has 12 master nodes, and the Jedis connection pool size is configured to 20 per node.


4.1 Procedure 1: Find which connection pool each thread is waiting on for an idle connection


Step 1: Export the stack and the heap with jstack and jmap respectively.


Step 2: From the stack we can find the address of the lock each thread is waiting on. Dubbo thread 383 is waiting on the lock object 0x6a3305858. This lock belongs to some connection pool, and we need to find out which one.



Step 3: Analyze the heap with MAT (Eclipse Memory Analyzer Tool) and locate the corresponding connection pool from the lock address.



Use MAT's "with incoming references" feature to walk up the reference chain level by level.



Reference chain: ConditionObject -> LinkedBlockingDeque



Reference chain: LinkedBlockingDeque -> GenericObjectPool



Reference chain: GenericObjectPool -> JedisPool. Here 0x6a578ddc8 is the address of the connection pool that owns the lock.



So we now know that Dubbo thread 383 is currently waiting for a connection from the pool at 0x6a578ddc8.


Repeating this process tells us, for every Dubbo thread, which connection pool it is waiting on for an available connection.


4.2 Procedure 2: Find which connection pools' connections each thread currently holds


Step 1: Use MAT to find all JedisClusterPipeline instances in the heap (exactly 400, one per Dubbo thread), then inspect each instance's poolToJedisMap, which records the connections the JedisClusterPipeline currently holds and the pools they belong to.


In the figure below, the JedisClusterPipeline object at 0x6ac40c088 has three Node objects in its poolToJedisMap (0x6ac40dd40, 0x6ac40dd60, 0x6ac40dd80), meaning it holds connections from three connection pools; the JedisPool addresses can be read from the Node objects.



Step 2: Once Step 1 tells us which pools a JedisClusterPipeline holds connections from, we look up the Dubbo thread that owns that JedisClusterPipeline. This gives us, for each Dubbo thread, the pools whose connections it currently holds.



4.3 Deadlock analysis


Procedure 1 shows that although there are 12 Redis master nodes, every Dubbo thread is waiting on one of just the following 5 connection pools:

  • 0x6a578e0c8

  • 0x6a578e048

  • 0x6a578ddc8

  • 0x6a578e538

  • 0x6a578e838


Procedure 2 tells us which threads currently occupy the connections of these 5 pools:



Since each pool is configured with a size of 20, all connections of these 5 pools have already been taken by 100 Dubbo threads, while all 400 Dubbo threads are waiting for connections from these same 5 pools, and none of them is waiting on a connection it holds itself. From these facts we can conclude that a deadlock has occurred; the sketch below makes the reasoning explicit.
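To make the deadlock-detection reasoning concrete, here is a minimal sketch of a wait-for-graph cycle check; the thread names and edges in main() are hypothetical placeholders, not the real addresses from the heap dump. An edge goes from a waiting thread to a thread that holds a connection of the pool it is waiting on; a cycle in this graph means deadlock.

import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class WaitForGraphCheck {

    // edges: waiting thread -> threads that currently hold a connection of the pool it waits for
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    public void addEdge(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** Returns true if the wait-for graph contains a cycle, i.e. a deadlock. */
    public boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onStack = new HashSet<>();
        for (String node : waitsFor.keySet()) {
            if (dfs(node, visited, onStack)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(String node, Set<String> visited, Set<String> onStack) {
        if (onStack.contains(node)) {
            return true;      // back edge found: cycle
        }
        if (!visited.add(node)) {
            return false;     // already fully explored
        }
        onStack.add(node);
        for (String next : waitsFor.getOrDefault(node, Collections.emptySet())) {
            if (dfs(next, visited, onStack)) {
                return true;
            }
        }
        onStack.remove(node);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraphCheck check = new WaitForGraphCheck();
        // hypothetical edges: "thread X waits on a pool whose connections are held by thread Y"
        check.addEdge("dubbo-383", "dubbo-120");
        check.addEdge("dubbo-120", "dubbo-383");
        System.out.println("deadlock: " + check.hasCycle()); // prints: deadlock: true
    }
}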


5. Summary


This article walked through the analysis of a production failure. During troubleshooting, we used jmap and jstack to preserve the fault scene, used Arthas to analyze it, and then, by reading and analyzing the source code, reasoned step by step through several possible causes and concluded that the failure was probably a deadlock. To verify this, we used MAT, following the procedures above, to determine which pools each thread was waiting on and which pools' connections it held, and then applied deadlock-detection reasoning to confirm that the problem machine had indeed deadlocked.


Troubleshooting production problems is never easy. It requires a solid understanding of the business code as well as a systematic grasp of the underlying technology; only after forming hypotheses about the possible causes and skillfully applying the right tools can the root cause finally be confirmed.



