How Java code runs on the machine: getting started with the basics of Java development

What a computer can execute directly is machine instruction code, or machine code for short. Machine code is binary, so the computer recognizes it directly, but it is so far removed from human language that people find it hard to understand and remember. High-level languages were created to close this gap: people write programs in a high-level language, and the programs are then interpreted or compiled into machine code.

Python, for example, is an interpreted language. A Python program needs no separate compilation step and can be run directly from its source code. Behind the scenes, the Python interpreter converts the source code into bytecode and then hands that bytecode to the Python Virtual Machine (PVM) for execution.

C is a typical compiled language: a compiler must first translate the program into machine code. For example, we usually use gcc to compile C programs:

$ gcc hello.c # compile

$ ./a.out # execute

hello world!

So is Java an interpreted language or a compiled language?

Java has the characteristics of both a compiled language and an interpreted language. After writing a Java program, the programmer uses javac to compile it into bytecode class files that the JVM can use. The JVM then loads the class files and interprets the bytecode instruction by instruction. While the program runs, hot code is compiled into machine code by the just-in-time (JIT) compiler.

Source code to bytecode

Java source code lives in files with the .java suffix. Many other high-level languages, such as Groovy and Kotlin, are also built on the JVM. Source code is written for people: it is easy to read, understand, and maintain.

Compiling the source code produces bytecode, which is meant for the JVM and is easy for it to parse and verify. Bytecode files have the .class suffix, and their format is defined by the JVM specification. With the specification at hand a person can just about read bytecode, but it is much harder to follow than Java source code.
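
To make the difference between source code and bytecode concrete, here is a minimal sketch. The class name Adder and its add method are hypothetical, chosen only for illustration. The comments show the kind of instructions the javap disassembler that ships with the JDK prints for such a method.

```java
// Adder.java
// Compile to bytecode:  javac Adder.java   (produces Adder.class)
// Disassemble:          javap -c Adder
// Run on the JVM:       java Adder
public class Adder {
    // For this method, javap -c prints bytecode along these lines:
    //   0: iload_0   // push the first int argument
    //   1: iload_1   // push the second int argument
    //   2: iadd      // pop both values, push their sum
    //   3: ireturn   // return the int on top of the operand stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```

The one-line Java expression becomes four stack-machine instructions, which is readable with some effort but clearly further from how people think.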

Java differs from Python here. In Python you do not have to produce bytecode files yourself (although Python does offer a way to do so): compilation happens automatically, and you generally do not notice it. Java compiles to bytecode files ahead of time so that the JVM can read them directly, which saves load time and improves efficiency. Bytecode also makes reverse engineering somewhat harder, which gives the source code a little protection (although class files can still be decompiled).

Anyone familiar with the JVM knows its "class loading process", a well-worn topic that interviewers love to ask about. The class loading process covers the whole life of a class inside the JVM: reading the class file, preparing the class for use, and finally unloading it.
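
One small part of this process can be observed from plain Java code: loading a class and initializing it are separate steps. The sketch below uses hypothetical names (ClassInitDemo, Lazy) and records events in a list. It relies on the three-argument Class.forName, whose second argument controls whether the class is initialized.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        // The static initializer runs once, when the class is initialized.
        static { EVENTS.add("initialized"); }
        static int value() { return 42; }
    }

    static List<String> demo() {
        try {
            // Load the class but ask the JVM not to initialize it yet.
            Class.forName(Lazy.class.getName(), false, ClassInitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        EVENTS.add("loaded");
        // First active use: the static initializer runs before value() does.
        Lazy.value();
        EVENTS.add("used");
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On the first call the list reads loaded, initialized, used, which shows that the static initializer waits until the class is actually used.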

Class files are therefore organized around classes, which makes them somewhat different from .java files. If we declare multiple classes in one Java file, compiling it with javac produces multiple class files. For example, suppose One.java contains:

public class One {
    // non-static inner classes
    public class OneInner {}
    private class OnePrivateInner {}
    // static nested classes
    public static class OneStaticInner {}
    private static class OneprivateStaticInner {}
}

// a second top-level class in the same file
class Two {}

After compiling with javac, there are 6 class files:

➜ $ ls

'One$OneInner.class' 'One$OneStaticInner.class' One.class Two.class

'One$OnePrivateInner.class' 'One$OneprivateStaticInner.class' One.java
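
The same experiment can be reproduced programmatically. The following sketch writes the One.java source above into a temporary directory, compiles it with the JDK's built-in compiler API (javax.tools), and lists the resulting class files. The class name ClassFileCount and the temporary directory prefix are assumptions made for this example, and it needs a full JDK rather than a bare JRE.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class ClassFileCount {
    // The same declarations as One.java in the text.
    static final String SOURCE = String.join("\n",
        "public class One {",
        "    public class OneInner {}",
        "    private class OnePrivateInner {}",
        "    public static class OneStaticInner {}",
        "    private static class OneprivateStaticInner {}",
        "}",
        "class Two {}");

    static List<String> compileAndList() {
        try {
            Path dir = Files.createTempDirectory("one-demo");
            Path src = dir.resolve("One.java");
            Files.write(src, SOURCE.getBytes(StandardCharsets.UTF_8));

            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("a JDK is required, not just a JRE");
            }
            // Equivalent to: javac -d <dir> <dir>/One.java
            int status = javac.run(null, null, null, "-d", dir.toString(), src.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed");
            }

            // Collect the names of all generated .class files.
            try (Stream<Path> files = Files.list(dir)) {
                return files.map(p -> p.getFileName().toString())
                            .filter(name -> name.endsWith(".class"))
                            .sorted()
                            .collect(Collectors.toList());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> classFiles = compileAndList();
        classFiles.forEach(System.out::println);
        System.out.println(classFiles.size() + " class files");
    }
}
```

Each nested class gets its own file named Outer$Inner.class, and the second top-level class Two gets Two.class even though it has no source file of its own.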

Bytecode to machine code

Load and use bytecode

As mentioned earlier, the JVM loads the class files, and the loaded Java classes are stored in the method area. Execution starts from the main method of the specified class, which serves as the entry point. At run time the virtual machine executes the code held in the method area and uses the heap and the stack to store runtime data.

Whenever a method is entered, the Java virtual machine creates a stack frame on the current thread's stack to hold the method's local variables and bytecode operands. The size of this stack frame is calculated in advance.

When a method exits, whether it returns normally or abnormally, the Java virtual machine pops the current stack frame off the current thread's stack and discards it.
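
The push and pop of stack frames can be made visible with a stack trace, which lists one entry per frame on the current thread. This is a minimal sketch with hypothetical method names (depth, inner, outer, extraFrames). It compares the number of frames when depth() is called directly and when it is reached through two extra calls.

```java
public class StackFrameDemo {
    // Number of frames on the current thread's stack, including depth() itself.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    static int inner() { return depth(); }

    static int outer() { return inner(); }

    // How many extra frames exist when depth() is reached via outer() -> inner().
    static int extraFrames() {
        int direct = depth();  // frames: ..., extraFrames, depth
        int nested = outer();  // frames: ..., extraFrames, outer, inner, depth
        return nested - direct;
    }

    public static void main(String[] args) {
        int before = depth();
        System.out.println("extra frames while nested: " + extraFrames());
        // Once outer() and inner() have returned, their frames are gone again.
        System.out.println("same depth after returning: " + (before == depth()));
    }
}
```

The two extra frames belong to outer() and inner(); after they return, the stack is back at its earlier depth.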

The Java virtual machine has to translate bytecode into machine code before the machine can execute it. This happens in two ways. One is interpretation, which translates and executes bytecode instructions one at a time. The other is just-in-time (JIT) compilation, which compiles all the bytecode of a method into machine code and then executes it.

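
Whether a JIT compiler is at work can be checked from inside a running program. The sketch below uses the standard java.lang.management API to ask the JVM about its compilation system after running a deliberately repetitive workload. The class name JitInfo and the workload sizes are arbitrary choices for illustration.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {
    // A small method with a loop inside, called many times by workload().
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    static long workload() {
        long total = 0;
        for (int call = 0; call < 20_000; call++) {
            total += sumTo(1_000);
        }
        return total;
    }

    public static void main(String[] args) {
        // Returns null if this JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        System.out.println("result = " + workload());
        if (jit == null) {
            System.out.println("no JIT compilation system available");
        } else if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + " has spent about "
                    + jit.getTotalCompilationTime() + " ms compiling");
        }
    }
}
```

Running the same class with the -Xint option forces the JVM to interpret only, which gives a rough feel for how much the JIT compiler contributes.
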
Tiered compilation

How do these two execution modes cooperate?

The HotSpot virtual machine includes several just-in-time compilers: C1, C2, and Graal. Graal is an experimental just-in-time compiler that replaces C2 when enabled with the parameters -XX:+UnlockExperimentalVMOptions -XX:+UseJVMCICompiler.

C1 and C2 each have strengths and weaknesses and suit different scenarios. Before Java 7, you had to choose one of them. C1 compiles quickly, but the code it generates runs at only average speed, so it is typically used for programs that run briefly or that care about startup performance. C2 compiles more slowly but generates faster code, so it is typically used for long-running programs or programs that need peak performance, which usually run on the server side. Indeed, the parameter that selects C1 is -client and the parameter that selects C2 is -server, which matches these scenarios.

Java 7 introduced tiered (layered) compilation, which combines the startup performance of C1 with the peak performance of C2. C1 and C2 produce different machine code for the same method: code compiled by C2 runs more than 30% faster than code compiled by C1, but the faster the machine code, the longer it takes to compile. Tiered compilation is a compromise: moderately hot code is compiled quickly, while the hottest code still receives the most thorough optimization.

Hot code

So how does the JVM decide which code is hot?

The JVM collects runtime information about each method, mainly the number of invocations and the number of loop back-edges. Just-in-time compilation is triggered when the sum of the invocation count and the back-edge count exceeds a specified threshold.

Note: The back-edge count can be understood simply as the number of loop iterations executed inside the method, for example by for or while loops.
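
The role of the back-edge count shows up in a method that is called only once but loops many times. The sketch below uses hypothetical names (BackEdgeDemo, countDown) and an arbitrary iteration count. When it is run with the HotSpot option -XX:+PrintCompilation, the log would be expected to contain an entry for countDown even though the method was invoked a single time. As far as I know, entries flagged with % mark compilations triggered from inside a running loop.

```java
public class BackEdgeDemo {
    // Invoked only once from main, so its invocation count stays tiny;
    // the loop, however, jumps back to its start on every iteration.
    static long countDown(long iterations) {
        long steps = 0;
        while (iterations-- > 0) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        // Try: java -XX:+PrintCompilation BackEdgeDemo
        System.out.println(countDown(50_000_000L));
    }
}
```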

Before tiered compilation appeared, this threshold was specified by the parameter -XX:CompileThreshold. Its default value was 1500 with C1 and 10000 with C2.

When tiered compilation is enabled, the JVM uses a different set of thresholds whose size is adjusted dynamically: the JVM multiplies each threshold by a coefficient s. This coefficient is positively correlated with the number of methods currently waiting to be compiled and negatively correlated with the number of compilation threads.
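
The effect of the coefficient s can be illustrated with a small model. The formula used here, s = queueSize / (loadFeedback * compilerThreads) + 1, is an assumption based on my reading of comments in the HotSpot sources. The values 5 and 2000 are illustrative stand-ins, and the real policy checks several thresholds per tier. Treat this as a sketch of the idea, not as HotSpot's implementation.

```java
public class TieredThresholdSketch {
    // Assumed scaling: grows with the compile queue, shrinks with more threads.
    static double scale(int queueSize, int loadFeedback, int compilerThreads) {
        return (double) queueSize / (loadFeedback * compilerThreads) + 1.0;
    }

    // Simplified check: hot when invocations + back-edges exceed the scaled threshold.
    static boolean isHot(long invocations, long backEdges, long baseThreshold, double s) {
        return invocations + backEdges > baseThreshold * s;
    }

    public static void main(String[] args) {
        int loadFeedback = 5;       // illustrative value
        long baseThreshold = 2000;  // illustrative value

        double idle = scale(0, loadFeedback, 1);   // empty compile queue
        double busy = scale(20, loadFeedback, 1);  // 20 methods waiting, 1 thread
        double more = scale(20, loadFeedback, 2);  // same queue, 2 threads

        System.out.println("s when idle: " + idle);
        System.out.println("s when busy: " + busy);
        System.out.println("s with more threads: " + more);
        System.out.println("hot when idle? " + isHot(1500, 1000, baseThreshold, idle));
        System.out.println("hot when busy? " + isHot(1500, 1000, baseThreshold, busy));
    }
}
```

With an empty queue the threshold is unchanged. A long queue raises it, so fewer methods qualify until the compiler catches up, and adding compilation threads lowers it again.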

Compilation threads

By default, the total number of compilation threads is scaled according to the number of processors. The Java virtual machine divides these threads between C1 and C2 in a ratio of 1:2, with at least one thread each. On a quad-core machine, for example, there are 3 compilation threads in total: one for C1 and two for C2.

Note: When the machine has very few resources, C1 and C2 may get just 1 thread each.
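
The arithmetic above can be worked through in code. The scaling formula in this sketch, total = max(log2(cpus) * log2(log2(cpus)) * 3 / 2, 2), is an assumption based on my reading of the HotSpot sources and may differ between JVM versions. The 1:2 split with at least one thread each follows the description above.

```java
public class CompilerThreadSketch {
    // floor(log2(n)) for n >= 1
    static int log2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    // Assumed scaling of the total compiler thread count with the CPU count.
    static int totalThreads(int cpus) {
        int logCpu = log2(cpus);
        int logLogCpu = log2(Math.max(logCpu, 1));
        return Math.max(logCpu * logLogCpu * 3 / 2, 2);
    }

    // One third of the threads go to C1, but never fewer than one.
    static int c1Threads(int cpus) {
        return Math.max(totalThreads(cpus) / 3, 1);
    }

    // The remaining threads go to C2, but never fewer than one.
    static int c2Threads(int cpus) {
        return Math.max(totalThreads(cpus) - c1Threads(cpus), 1);
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 4, 8}) {
            System.out.println(cpus + " CPUs: C1=" + c1Threads(cpus)
                    + ", C2=" + c2Threads(cpus));
        }
    }
}
```

For 4 CPUs this model yields 3 threads split into one for C1 and two for C2, matching the quad-core example, and for a single CPU it yields one thread each.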

You can inspect the compilation threads with Arthas. Arthas reports both their ID and their priority as -1. Threads that we create ourselves in Java code have priorities from 1 to 10, so the -1 values mark the compilation threads as JVM-internal threads rather than ordinary Java threads.

Summary

In short, how does a Java program run on a machine? First, a programmer writes Java code, which is compiled into class files; multiple class files are usually packaged into a jar or war package. The JVM then loads the class files and at first interprets their bytecode. As the program runs, the JVM keeps checking whether each method has become hot, based on its invocation count and loop back-edge count. If it has, tiered compilation kicks in: a compilation thread compiles the method into machine code, which then runs directly on the machine.


Origin blog.csdn.net/xuezhangmen/article/details/132035953