A Java application crash case study

Recently our application kept crashing from time to time (on JDK 6u24). An excerpt from hs_err_pid.log follows:
Current thread (0x0000000040124800): GCTaskThread [stack: 0x0000000000000000,0x0000000000000000] [id=23997]
The thread running at the time of the crash was a GCTaskThread, so the failure happened during garbage collection, while executing code in libjvm.so.

R9 =0x00002b1fea3ea080
0x00002b1fea3ea080: <offset 0x9b9080> in /data/dynasty/jdk/jre/lib/amd64/server/libjvm.so at 0x00002b1fe9a31000
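The faulting code sits at offset 0x9b9080 in libjvm.so. Pay particular attention to this offset: the other memory addresses in the log cannot be compared between runs, but within the same build of the .so library an identical offset always refers to the same code. Every one of our crashes showed this same offset.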
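The offset can be checked by hand: it is the absolute address minus the address at which libjvm.so was loaded, both taken from the log lines above. A minimal sketch follows; the class and method names (`LibOffset`, `offset`) are made up for illustration and are not part of any JDK tool.

```java
// Hypothetical helper for illustration only; not part of the JDK.
final class LibOffset {

    /** Offset of an absolute address inside a library mapped at the given base address. */
    static long offset(long address, long base) {
        return address - base;
    }

    public static void main(String[] args) {
        long r9 = 0x00002b1fea3ea080L;         // value of R9 in the crash log
        long libjvmBase = 0x00002b1fe9a31000L; // "at 0x..." load address of libjvm.so
        // Prints the offset in hex so it can be compared across crash logs.
        System.out.println("0x" + Long.toHexString(offset(r9, libjvmBase))); // prints 0x9b9080
    }
}
```

Comparing this value across several hs_err files from the same JDK build is what shows that the crashes hit the same spot, even though the absolute addresses differ from run to run.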

Heap
par new generation total 943744K, used 438097K [0x00000006f4000000, 0x0000000734000000, 0x0000000734000000)
eden space 838912K, 50% used [0x00000006f4000000, 0x000000070ded3cf8, 0x0000000727340000)
from space 104832K, 12% used [0x000000072d9a0000, 0x000000072e6a09d0, 0x0000000734000000)
to space 104832K, 0% used [0x0000000727340000, 0x0000000727340000, 0x000000072d9a0000)
concurrent mark-sweep generation total 3145728K, used 1573716K [0x0000000734000000, 0x00000007f4000000, 0x00000007f4000000)
concurrent-mark-sweep perm gen total 196608K, used 86517K [0x00000007f4000000, 0x0000000800000000, 0x0000000800000000)
Memory usage looks normal, so the crash was not caused by running out of memory.
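As a quick sanity check of that reading, the occupancy of each generation can be computed from the "total" and "used" figures in the heap dump above. This is only a sketch; `HeapUsage` and `usedPercent` are hypothetical names, and the numbers are copied from the log.

```java
// Hypothetical helper for illustration only.
final class HeapUsage {

    /** Percentage of a generation that is in use, given sizes in KB. */
    static double usedPercent(long usedK, long totalK) {
        return 100.0 * usedK / totalK;
    }

    public static void main(String[] args) {
        // Figures copied from the Heap section of hs_err_pid.log.
        System.out.println("par new generation: " + usedPercent(438097, 943744) + "%");
        System.out.println("CMS old generation: " + usedPercent(1573716, 3145728) + "%");
        System.out.println("CMS perm gen:       " + usedPercent(86517, 196608) + "%");
    }
}
```

All three generations come out at roughly half full or less, which is consistent with the conclusion that the heap was not exhausted.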

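We could not find anything wrong in our own code either, until a web search turned up a JDK bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7002666 (as I understand it, under certain conditions, namely pointer compression during GC, a piece of logic in Java's native library references a wrong memory address). The bug was fixed in JDK 6u25.
Suspecting this was the cause, we upgraded the JDK to 6u25. After more than a month of running, the application has not crashed unexpectedly again. Problem solved!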

The bug report includes a piece of code that can be used to verify the bug (it requires a 64-bit Linux system):

    public class TestJdk024bug {
        static byte[] bb;

        public static void main(String[] args) {
            bb = new byte[1024 * 1024]; // this allocation is only here to encourage GC
            for (int i = 0; i < 25000; i++) {
                Object[] a = test(TestJdk024bug.class, new TestJdk024bug());
                if (a[0] != null) {
                    // The element should be null but if it's not then
                    // we've hit the bug.  This will most likely crash but
                    // at least throw an exception.
                    System.out.println("i = " + i);
                    System.err.println(a[0]);
                    throw new InternalError(a[0].toString());
                }
            }
            System.out.println("end........");
        }

        public static Object[] test(Class<?> c, Object o) {
            // allocate an array small enough to trigger the bug
            Object[] a = (Object[]) java.lang.reflect.Array.newInstance(c, 1);
            return a;
        }
    }

When this is run with multiple threads, a[0] != null can evaluate to true.
My startup parameters were -server -Xms30M -Xmx30M -XX:NewSize=4m
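The post does not show how the multi-threaded run was set up, so the following is only one possible sketch of it. It runs the same reflective allocation from several threads and counts how often a fresh array holds a non-null element. The class name `MultiThreadRepro`, the `run` method, and the thread and iteration counts are assumptions, not the original harness.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical multi-threaded harness; written without lambdas so it also compiles on JDK 6.
final class MultiThreadRepro {

    // Same reflective allocation as in the reproducer above.
    static Object[] test(Class<?> c) {
        return (Object[]) Array.newInstance(c, 1);
    }

    /** Returns how many freshly allocated arrays had a non-null first element. */
    static int run(int threads, final int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(new Callable<Integer>() {
                    public Integer call() {
                        int bad = 0;
                        for (int i = 0; i < iterations; i++) {
                            if (test(MultiThreadRepro.class)[0] != null) {
                                bad++; // a new array must be all null; anything else is the bug
                            }
                        }
                        return bad;
                    }
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("non-null elements seen: " + run(4, 25000));
    }
}
```

On a fixed JVM the count should stay at 0. On an affected build the JVM may crash outright before the count is printed, as the comment in the original reproducer warns.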
