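The fix: https://issues.jboss.org/browse/JBAS-8598. In short, upgrading Quartz to 1.8.3 resolves the problem.
Where the problem came from:
Our project's code ran into a deadlock. The thread stacks from jstack: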
Found one Java-level deadlock:
=============================
"resin-port-9007-1558":
  waiting to lock monitor 0x00007f6e04001a38 (object 0x00000007890586c0, a java.lang.Object),
  which is held by "scheduler_Worker-2"
"scheduler_Worker-2":
  waiting to lock monitor 0x00007f6e34004a78 (object 0x0000000788fc1a68, a java.lang.Object),
  which is held by "scheduler_QuartzSchedulerThread"
"scheduler_QuartzSchedulerThread":
  waiting to lock monitor 0x00007f6e04001a38 (object 0x00000007890586c0, a java.lang.Object),
  which is held by "scheduler_Worker-2"

"scheduler_Worker-2":
    at org.quartz.core.QuartzSchedulerThread.signalSchedulingChange(QuartzSchedulerThread.java:204)
    - waiting to lock <0x0000000788fc1a68> (a java.lang.Object)
    at org.quartz.core.SchedulerSignalerImpl.signalSchedulingChange(SchedulerSignalerImpl.java:87)
    at org.quartz.simpl.RAMJobStore.triggeredJobComplete(RAMJobStore.java:1408)
    - locked <0x00000007890586c0> (a java.lang.Object)
    at org.quartz.core.QuartzScheduler.notifyJobStoreJobComplete(QuartzScheduler.java:1767)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:270)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
"scheduler_QuartzSchedulerThread":
    at org.quartz.simpl.RAMJobStore.releaseAcquiredTrigger(RAMJobStore.java:1282)
    - waiting to lock <0x00000007890586c0> (a java.lang.Object)
    at org.quartz.core.QuartzSchedulerThread.releaseIfScheduleChangedSignificantly(QuartzSchedulerThread.java:432)
    at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:288)
    - locked <0x0000000788fc1a68> (a java.lang.Object)
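Reading the code, the logic is as follows:
RAMJobStore and QuartzSchedulerThread each have their own lock object.
Path one: QuartzSchedulerThread -> RAMJobStore
QuartzSchedulerThread is itself a thread. Its run method takes the thread's own lock, checks whether the schedule has changed, and then calls a RAMJobStore method (releaseAcquiredTrigger in the stack above), which needs the RAMJobStore lock.
Path two: RAMJobStore -> QuartzSchedulerThread
When a scheduled job finishes executing, the worker thread calls RAMJobStore (triggeredJobComplete), which takes the RAMJobStore lock and then calls a QuartzSchedulerThread method (signalSchedulingChange) to report that the job has completed. That method needs the QuartzSchedulerThread lock.
Path one and path two take the two locks in opposite order, so a deadlock forms.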
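The opposite lock ordering described above can be reproduced in isolation. The following is a minimal sketch, not Quartz code: `sigLock` and `storeLock` are hypothetical stand-ins for the two private lock objects, and the class and thread names are made up for illustration. A latch forces each thread to hold its first lock before either tries the second, and the JVM's own deadlock detector (the same mechanism behind the jstack report) then counts the threads caught in the cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderDeadlockDemo {

    /**
     * Starts two daemon threads that nest two locks in opposite order and
     * returns how many threads the JVM reports as monitor-deadlocked.
     */
    static int countDeadlockedThreads() {
        // Hypothetical stand-ins for the two lock objects in the post.
        final Object sigLock = new Object();    // plays the scheduler thread's lock
        final Object storeLock = new Object();  // plays the job store's lock
        final CountDownLatch firstLocksHeld = new CountDownLatch(2);

        // Path one: scheduler lock first, then store lock.
        Thread pathOne = new Thread(() -> {
            synchronized (sigLock) {
                meet(firstLocksHeld);
                synchronized (storeLock) { /* never reached */ }
            }
        }, "demo-scheduler-thread");

        // Path two: store lock first, then scheduler lock.
        Thread pathTwo = new Thread(() -> {
            synchronized (storeLock) {
                meet(firstLocksHeld);
                synchronized (sigLock) { /* never reached */ }
            }
        }, "demo-worker");

        // Daemon threads, so the stuck threads do not keep the JVM alive.
        pathOne.setDaemon(true);
        pathTwo.setDaemon(true);
        pathOne.start();
        pathTwo.start();

        // Poll the JVM's deadlock detector for up to about five seconds.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = threads.findMonitorDeadlockedThreads();
                if (ids != null) {
                    return ids.length;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    // Signals that the caller holds its first lock, then waits for the other thread.
    private static void meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

In the real dump a third thread ("resin-port-9007-1558") is also blocked on the store lock, but it is only a victim; the cycle itself is between the worker and the scheduler thread, which is what the two demo threads model.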
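So how does version 1.8.3 resolve the deadlock?
It changes path one: when checking the state of the scheduled jobs, the scheduler thread no longer calls the RAMJobStore method to try to release the trigger:
In 1.8.0, the code at the first red mark was the same as the code at the second red mark. In 1.8.3 the code at the first red mark no longer calls into RAMJobStore's locked method, so the deadlock is gone.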
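The idea behind the fix can be sketched the same way. This is an illustration under stated assumptions, not the actual Quartz 1.8.3 source: `sigLock`, `storeLock`, and `scheduleChanged` are hypothetical names. Path one now touches only its own lock while checking for changes, so the only nested acquisition left is path two's (store lock, then scheduler lock), and no cycle can form.

```java
public class NoNestedStoreLockSketch {

    /**
     * Runs both paths concurrently and returns true if both threads finish,
     * which they always should, because only one nesting order remains.
     */
    static boolean runBothPaths(int iterations) {
        // Hypothetical stand-ins for the two lock objects in the post.
        final Object sigLock = new Object();
        final Object storeLock = new Object();
        final boolean[] scheduleChanged = {false};

        // Path one after the fix: check the flag under sigLock only,
        // without calling into anything that needs storeLock.
        Thread pathOne = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                synchronized (sigLock) {
                    if (scheduleChanged[0]) {
                        scheduleChanged[0] = false;
                    }
                }
            }
        }, "sketch-scheduler-thread");

        // Path two is unchanged: store lock first, then signal under sigLock.
        Thread pathTwo = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                synchronized (storeLock) {
                    synchronized (sigLock) {
                        scheduleChanged[0] = true;
                    }
                }
            }
        }, "sketch-worker");

        pathOne.setDaemon(true);
        pathTwo.setDaemon(true);
        pathOne.start();
        pathTwo.start();

        try {
            // Generous timeouts; a deadlock would leave a thread alive.
            pathOne.join(10_000);
            pathTwo.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !pathOne.isAlive() && !pathTwo.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both paths finished: " + runBothPaths(100_000));
    }
}
```

The general rule this relies on: a deadlock between two locks needs two threads nesting them in opposite orders, so removing one of the two nestings is enough, and the other path can stay as it is.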