A bug that has lived in Java for more than a decade ...


Today I want to share a surprising bug in the JDK. What makes it remarkable is that the test case reproducing it is extremely simple: you can work out the correct answer just by looking at the code, yet the bug has existed in the JDK for more than ten years. In our tests, every release from JDK 8 to JDK 14 has this problem.

You can try this code on your development platform:

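public class Hello {
    public void test() {
        int i = 8;
        // Empty loop: i goes 8, 5, 2, -1 and the loop exits at -1.
        while ((i -= 3) > 0);
        System.out.println("i = " + i);
    }

    public static void main(String[] args) {
        Hello hello = new Hello();
        // Call test() often enough for the JIT compiler to kick in.
        for (int i = 0; i < 50_000; i++) {
            hello.test();
        }
    }
}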

Then compile it and run the following command:
java Hello

Then, after a while, you will see wrong output such as:

i = /

Of course, at the beginning of the run, the program still prints the correct "i = -1".
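To see why "i = -1" is the expected answer, the loop can be traced by hand: i takes the values 8, 5, 2, -1, and the loop exits as soon as the value is no longer positive. A minimal sketch of that trace follows; the class and method names are made up for illustration and are not part of the original test case.

```java
// Hypothetical helper, not part of the original test case.
class LoopTrace {
    // Runs the same empty loop as Hello.test() and returns the final i.
    static int finalValue(int start) {
        int i = start;
        while ((i -= 3) > 0) {
            // empty body, exactly as in the original loop
        }
        return i;
    }

    public static void main(String[] args) {
        // 8 -> 5 -> 2 -> -1: the first non-positive value ends the loop.
        System.out.println("i = " + finalValue(8));
    }
}
```

Any other printed value for a start of 8 therefore points to a problem in how the code was executed, not in the code itself.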

The problem was eventually fixed by two colleagues on Huawei's JDK team, and the fix was merged back into the community code base. Here I will roughly walk through the analysis.

First, running the program in the interpreter (for example with -Xint) gives correct results, which indicates that this is essentially a JIT compiler problem. Next, turning off C1 compilation with -XX:-TieredCompilation still reproduces the problem, whereas with -XX:TieredStopAtLevel=3, which keeps JIT compilation at the C1 stage, the problem does not reproduce. This pins the problem on C2.

Next, a colleague immediately guessed that the "/" is actually ('0' - 1), that is, the ASCII code of the character '0' minus one. This is where knowing the ASCII table by heart pays off. The next step is to find the place in C2 that converts an int to characters. The key clue is the character '0'. With a sufficient understanding of C2, the conversion routine is quickly found (for the full code, please refer to the OpenJDK sources):

void PhaseStringOpts::int_getChars(GraphKit& kit, Node* arg, Node* char_array, Node* start, Node* end) {
  // ......
  // char sign = 0;

  Node* i = arg;
  Node* sign = __ intcon(0);

  // if (i < 0) {
  // sign = '-';
  // i = -i;
  // }
  {
    IfNode* iff = kit.create_and_map_if(kit.control(),
                                        __ Bool(__ CmpI(arg, __ intcon(0)), BoolTest::lt),
                                        PROB_FAIR, COUNT_UNKNOWN);

    RegionNode *merge = new (C) RegionNode(3);
    kit.gvn().set_type(merge, Type::CONTROL);
    i = new (C) PhiNode(merge, TypeInt::INT);
    kit.gvn().set_type(i, TypeInt::INT);
    sign = new (C) PhiNode(merge, TypeInt::INT);
    kit.gvn().set_type(sign, TypeInt::INT);

    merge->init_req(1, __ IfTrue(iff));
    i->init_req(1, __ SubI(__ intcon(0), arg));
    sign->init_req(1, __ intcon('-'));
    merge->init_req(2, __ IfFalse(iff));
    i->init_req(2, arg);
    sign->init_req(2, __ intcon(0));

    kit.set_control(merge);

    C->record_for_igvn(merge);
    C->record_for_igvn(i);
    C->record_for_igvn(sign);
  }

  // for (;;) {
  // q = i / 10;
  // r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ...
  // buf [--charPos] = digits [r];
  // i = q;
  // if (i == 0) break;
  // }

  {
   // (code corresponding to this loop omitted)
  }

  // (a lot of code omitted)
}

As you can see, an "i < 0" check is explicitly introduced here in the intermediate representation. The key is the CmpI node: its logic appears to go wrong, so although i is clearly less than 0, execution ends up in the branch for i >= 0. The character '0' is then added directly to i, and the result is wrong.
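The effect of skipping the sign check can be reproduced in plain Java. The sketch below is a simplified model of the digit loop shown in the comments of the C2 code above. It is not HotSpot or JDK code, the names are hypothetical, and it assumes each digit is produced as '0' + r.

```java
// Simplified model of the digit loop from the C2 comments above.
// The names are hypothetical; this is not HotSpot or JDK code.
class DigitModel {
    // Converts i to decimal text. When handleSign is false, the "i < 0"
    // branch is skipped, which mimics the miscompiled check.
    static String toDecimal(int i, boolean handleSign) {
        StringBuilder sb = new StringBuilder();
        boolean negative = false;
        if (handleSign && i < 0) {
            negative = true;
            i = -i; // overflow for Integer.MIN_VALUE is ignored in this sketch
        }
        for (;;) {
            int q = i / 10;
            int r = i - q * 10;          // remainder; negative if i is negative
            sb.append((char) ('0' + r)); // digit character, assumed to be '0' + r
            i = q;
            if (i == 0) break;
        }
        if (negative) sb.append('-');
        return sb.reverse().toString(); // digits were produced last-to-first
    }

    public static void main(String[] args) {
        System.out.println("i = " + toDecimal(-1, true));  // sign handled
        System.out.println("i = " + toDecimal(-1, false)); // sign check skipped
    }
}
```

With the sign check skipped and i = -1, the quotient is 0 and the remainder is -1, so the single "digit" is '0' - 1, which is the character '/'.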

Why does this CmpI go wrong? With the c2visualizer tool you can see that, in the GVN phase, this CmpI was merged with the CmpI of the loop above. GVN stands for Global Value Numbering. The name sounds grand, but it is really just expression deduplication.

Here, the two CmpI nodes have exactly the same inputs: the variable i and the integer constant 0. That makes the two CmpI nodes identical, so during optimization the compiler merges them into one.
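The idea of merging identical expressions can be illustrated with a toy value-numbering table. This is only a sketch under simplifying assumptions: nodes are keyed by operator and input names, and none of the names below correspond to HotSpot's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy value numbering; all names are hypothetical and unrelated to HotSpot.
class ToyGvn {
    // An expression node: an operator plus the names of its two inputs.
    static final class Node {
        final String op, in1, in2;
        Node(String op, String in1, String in2) {
            this.op = op;
            this.in1 = in1;
            this.in2 = in2;
        }
    }

    private final Map<String, Node> table = new HashMap<>();

    // Returns the existing node for (op, in1, in2) if there is one,
    // otherwise creates and records a new node.
    Node make(String op, String in1, String in2) {
        String key = op + "(" + in1 + "," + in2 + ")";
        return table.computeIfAbsent(key, k -> new Node(op, in1, in2));
    }

    public static void main(String[] args) {
        ToyGvn gvn = new ToyGvn();
        Node loopCmp = gvn.make("CmpI", "i", "0"); // from the loop test
        Node signCmp = gvn.make("CmpI", "i", "0"); // from the "i < 0" check
        // Both requests yield the very same node object.
        System.out.println(loopCmp == signCmp);
    }
}
```

After this kind of merging, the loop test and the sign check both depend on a single shared compare node, which is harmless as long as nobody later rewrites that node on behalf of only one of its users.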

Up to this point there is still no problem. But the compiler then applies a special transformation to empty loops. It can compute directly that the value of i after the empty loop is -1, and it sees that the empty loop body does nothing, so it simply replaces both operands of the CmpI with -1 so that the loop is never entered. A later constant propagation pass can then eliminate this CmpI entirely. This is exactly where the CmpI gets into trouble: it is forced to False so that the loop does not execute, and i is replaced directly by its value at the end of the loop. But the CmpI that was merged into it a moment ago is swallowed along with it.

This directly causes the value i = -1 to be carried into the i >= 0 branch. The fix is therefore simple: when transforming the CmpI, check whether it has other users (outputs), and if so, make a copy of it first.
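The difference between rewriting a shared node in place and copying it first can be shown with a toy model. This sketch follows the description in this article only. It is not the actual patch, and every name in it is hypothetical.

```java
// Toy model of a shared compare node; names are hypothetical, not HotSpot code.
class SharedCmpModel {
    static final class Cmp {
        int left, right;
        Cmp(int left, int right) { this.left = left; this.right = right; }
        Cmp copy() { return new Cmp(left, right); }
        boolean isLess() { return left < right; }    // used by the sign check "i < 0"
        boolean isGreater() { return left > right; } // used by the loop test "i > 0"
    }

    // Forces the loop test to false by overwriting both operands with the
    // final value. If cloneFirst is set and the node has more than one user,
    // a private copy is rewritten instead of the shared node.
    static Cmp removeEmptyLoop(Cmp cmp, int users, int finalValue, boolean cloneFirst) {
        Cmp target = (cloneFirst && users > 1) ? cmp.copy() : cmp;
        target.left = finalValue;
        target.right = finalValue;
        return target;
    }

    // Returns what the sign check "i < 0" evaluates to after loop removal.
    static boolean signCheckAfterRemoval(boolean cloneFirst) {
        int finalValue = -1;                  // value of i after the empty loop
        Cmp shared = new Cmp(finalValue, 0);  // one node, two users after GVN
        Cmp loopCmp = removeEmptyLoop(shared, 2, finalValue, cloneFirst);
        if (loopCmp.isGreater()) {
            throw new IllegalStateException("loop test should be false");
        }
        return shared.isLess();               // should be true, since -1 < 0
    }

    public static void main(String[] args) {
        System.out.println("rewrite in place: i < 0 is " + signCheckAfterRemoval(false));
        System.out.println("copy first:       i < 0 is " + signCheckAfterRemoval(true));
    }
}
```

In the in-place variant the sign check ends up comparing -1 with -1 and wrongly reports that i is not negative. With the copy, the sign check still compares -1 with 0.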

The issue and patch for this bug are here:

https://bugs.openjdk.java.net/projects/JDK/issues/JDK-8231988?filter=allissues

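The JBS system contains no detailed analysis, only the final patch, so I wrote up this summary and posted it here. As you can see, even a very simple test case goes through all kinds of complex transformations and optimizations inside the compiler. An optimization in one phase may affect a later phase, which is why compiler bugs are often obscure. On the other hand, that is also what makes them interesting.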


Origin www.cnblogs.com/yunxi520/p/12373332.html