jstack应用实例,java方法调用阻塞实例,dubbo创建消费者阻塞。

一:问题描述

  • 在一段代码执行前,会写日志,在代码的finally快,也会写日志。现在遇到的问题是finally的代码没有执行。

二:问题查找

同事让我看下项目中为什么没有执行结束的日志(该日志可以在管理员界面看到)。我登录系统看到确实没有结束日志,只有开始日志。于是开始查找问题。

2.1 jstack

  • 登录系统使用ps -ef|grep java 找到项目的pid,然后jstack pid >> thread.log,然后把thread.log下载到本地。在执行栈中寻找我打印日志前后相关代码。部分日志如下

2.2 日志查询

"xxxxx_Worker-113" prio=10 tid=0x00007f12709e2000 nid=0x552 waiting on condition [0x00007f10e72f0000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000770db96b8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
    at com.alibaba.dubbo.registry.support.AbstractRegistryFactory.getRegistry(AbstractRegistryFactory.java:88)
    at com.alibaba.dubbo.registry.RegistryFactory$Adpative.getRegistry(RegistryFactory$Adpative.java)
    at com.alibaba.dubbo.registry.integration.RegistryProtocol.refer(RegistryProtocol.java:240)
    at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper.refer(ProtocolFilterWrapper.java:60)
    at com.alibaba.dubbo.rpc.protocol.ProtocolListenerWrapper.refer(ProtocolListenerWrapper.java:63)
    at com.alibaba.dubbo.rpc.Protocol$Adpative.refer(Protocol$Adpative.java)
    at com.alibaba.dubbo.config.ReferenceConfig.createProxy(ReferenceConfig.java:401)
    at com.alibaba.dubbo.config.ReferenceConfig.init(ReferenceConfig.java:309)
    at com.alibaba.dubbo.config.ReferenceConfig.get(ReferenceConfig.java:136)
    - locked <==0x00000007aa0ace80==> (a com.alibaba.dubbo.config.ReferenceConfig)
    at com.xxxx.scheduler.server.process.AbstractProcess.createDubboProxy(AbstractProcess.java:168)
    at com.xxxx.scheduler.server.process.DubboProcess.process(DubboProcess.java:82)
    at com.xxxx.scheduler.server.process.DubboProcess$$FastClassBySpringCGLIB$$be6a767d.invoke(<generated>)

如上日志显示出createDubboProxy在我的业务代码中创建代理类,如上有2个锁相关。locked <0x00000007aa0ace80> 和parking to wait for <0x0000000770db96b8> 对应的代码分别是。

    public synchronized T get() {
        if (destroyed){
            throw new IllegalStateException("Already destroyed!");
        }
        if (ref == null) {
            init();
        }
        return ref;
    }

 public Registry getRegistry(URL url) {
        url = url.setPath(RegistryService.class.getName())
                .addParameter(Constants.INTERFACE_KEY, RegistryService.class.getName())
                .removeParameters(Constants.EXPORT_KEY, Constants.REFER_KEY);
        String key = url.toServiceString();
        // 锁定注册中心获取过程,保证注册中心单一实例
        LOCK.lock();
        try {
            Registry registry = REGISTRIES.get(key);
            if (registry != null) {
                return registry;
            }
            registry = createRegistry(url);
            if (registry == null) {
                throw new IllegalStateException("Can not create registry " + url);
            }
            REGISTRIES.put(key, registry);
            return registry;
        } finally {
            // 释放锁
            LOCK.unlock();
        }
    }
//这里的lock定义是static的lock。
 private static final ReentrantLock LOCK = new ReentrantLock();

对于第一个锁方法 public synchronized T get() ,锁的是this对象,因为当前类在外面的调用中是new的,每次调用都是全新的this,所以不存在锁争用,搜索0x00000007aa0ace80也能看到每个调用栈lock都是不同的对象

对于第二个存在锁的方法public Registry getRegistry(URL url) 可以看到静态的锁,全局只能有一个线程通过。于是我需要在栈信息中搜索获取到锁的调用链,在线程信息中搜索相关代码,于是我搜关键字ZookeeperRegistryFactory.createRegistry ,:ctrl + F 搜索 “ZookeeperRegistryFactory.createRegistry”,大多数线程调用栈都和上面的日志一样,到AbstractRegistryFactory.getRegistry后就进入ReentrantLock类了,只有如下线程调用栈不同,日志如下:

"quartzScheduler_Worker-70" prio=10 tid=0x00007f127098a800 nid=0x527 waiting on condition [0x00007f12a1edc000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007a92154b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:267)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUntil(AbstractQueuedSynchronizer.java:2130)
    at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:651)
    at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:634)
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:894)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:92)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:76)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:72)
    at 

//调用栈中的关键点ZkclientZookeeperClient.java:25行。
    com.alibaba.dubbo.remoting.zookeeper.zkclient.ZkclientZookeeperClient.<init>(ZkclientZookeeperClient.java:25)
    at com.alibaba.dubbo.remoting.zookeeper.zkclient.ZkclientZookeeperTransporter.connect(ZkclientZookeeperTransporter.java:10)
    at com.alibaba.dubbo.remoting.zookeeper.ZookeeperTransporter$Adpative.connect(ZookeeperTransporter$Adpative.java)
    at com.alibaba.dubbo.registry.zookeeper.ZookeeperRegistry.<init>(ZookeeperRegistry.java:71)
    at com.alibaba.dubbo.registry.zookeeper.ZookeeperRegistryFactory.createRegistry(ZookeeperRegistryFactory.java:37)
    at com.alibaba.dubbo.registry.support.AbstractRegistryFactory.getRegistry(AbstractRegistryFactory.java:94)
    at com.alibaba.dubbo.registry.RegistryFactory$Adpative.getRegistry(RegistryFactory$Adpative.java)
    at com.alibaba.dubbo.registry.integration.RegistryProtocol.refer(RegistryProtocol.java:240)
    at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper.refer(ProtocolFilterWrapper.java:60)
    at com.alibaba.dubbo.rpc.protocol.ProtocolListenerWrapper.refer(ProtocolListenerWrapper.java:63)
    at com.alibaba.dubbo.rpc.Protocol$Adpative.refer(Protocol$Adpative.java)
    at com.alibaba.dubbo.config.ReferenceConfig.createProxy(ReferenceConfig.java:401)
    at com.alibaba.dubbo.config.ReferenceConfig.init(ReferenceConfig.java:309)
    at com.alibaba.dubbo.config.ReferenceConfig.get(ReferenceConfig.java:136)

根据这个调用找到代码如下:

public class ZkclientZookeeperClient extends AbstractZookeeperClient<IZkChildListener> {

    private final ZkClient client;

    private volatile KeeperState state = KeeperState.SyncConnected;

    public ZkclientZookeeperClient(URL url) {
        super(url);
        //这里初始化ZkClient使用的是只有url地址的构造函数。
        //内部实现是 this(serverstring, Integer.MAX_VALUE);
        //也就是超时设置的是Integer.MAX_VALUE毫秒
        client = new ZkClient(url.getBackupAddress());
        client.subscribeStateChanges(new IZkStateListener() {
            public void handleStateChanged(KeeperState state) throws Exception {
                ZkclientZookeeperClient.this.state = state;
                if (state == KeeperState.Disconnected) {
                    stateChanged(StateListener.DISCONNECTED);
                } else if (state == KeeperState.SyncConnected) {
                    stateChanged(StateListener.CONNECTED);
                }
            }
            public void handleNewSession() throws Exception {
                stateChanged(StateListener.RECONNECTED);
            }
        });
    }

三:解决办法。

升级dubbo是不可能的,于是我去看了dubbo的github代码,发现已经在 2017.10.31 修复了这个问题。

提交记录:

https://github.com/alibaba/dubbo/commits/master/dubbo-remoting/dubbo-remoting-zookeeper/src/main/java/com/alibaba/dubbo/remoting/zookeeper/zkclient/ZkclientZookeeperClient.java

具体提交记录:

https://github.com/alibaba/dubbo/commit/74ff1b65299aa7fa7c16d500cef1e3c0bf9a0e97#diff-e395f171e3a604f41c9703bc07191b78

到此确认这个确实是dubbo的bug。于是需要解决这个问题。升级dubbo风险太大,修改单个dubbo的jar,又害怕因为重新编译出问题。最后只能是把ZkclientZookeeperClient复制一份到我业务的src目录下,然后做了如下修改:

//dubbo中ZkclientZookeeperClient.java第25行中的

client = new ZkClient(url.getBackupAddress());

//修改为如下代码,加入超时时间。
//为了确保dubbo使用的是我业务的ZkclientZookeeperClient类,而不是dubbo.jar中的ZkclientZookeeperClient类,于是加了日志。测试发现没有问题。
 logger.info(" ignored this error, My_ZkclientZookeeperClient init | start new ZkClient,url is:{} ", url.getBackupAddress());
  try {
            client = new ZkClient(url.getBackupAddress(), 10000);
        } catch (Exception e) {
            logger.error(" new  ZkClient is fail ,url is ", url.getBackupAddress(), e);
            throw e;
        }
  • tomcat有自己的类加载器,会在lib和classes中优先加载classes的class文件。如果是java,则会是java -cp(等价于java -classpath)指定,平时我们用ide运行java程序,ide都会指定-cp。 *

到此,问题解决。

猜你喜欢

转载自blog.csdn.net/faicm/article/details/78841470