Understanding the Java Virtual Machine Architecture

Original address: http://www.importnew.com/18689.html

1 Overview

As we all know, Java supports platform independence, security, and network mobility. The Java platform consists of the Java virtual machine and the Java core classes, which together give pure Java programs a uniform programming interface regardless of the underlying operating system. It is the Java virtual machine that makes the promise of "compile once, run anywhere" possible.

1.1 Java program execution flow

The execution of Java programs depends on the compilation environment and the running environment. The conversion of source code into executable machine code is completed by the following process:



The core of Java technology is the Java virtual machine, because all Java programs run on the virtual machine. The operation of the Java program requires the cooperation of the Java virtual machine, the Java API and the Java Class file. A Java virtual machine instance is responsible for running a Java program. When a Java program is started, a virtual machine instance is created. When the program ends, the virtual machine instance also dies.



Java is cross-platform because a virtual machine exists for each supported platform.

1.2 Java Virtual Machine

The main task of the Java virtual machine is to load class files and execute the bytecode they contain. As the figure below shows, the Java virtual machine includes a class loader, which loads class files from both the program and the Java API. Only those Java API classes that the running program actually needs are loaded, and the bytecode is then executed by the execution engine.



When a Java virtual machine is implemented in software on a host operating system, Java programs interact with the host by calling native methods. Java methods are written in the Java language, compiled into bytecode, and stored in class files. Native methods are written in C, C++, or assembly language, compiled into processor-specific machine code, and stored in a dynamic link library in a platform-specific format. Native methods are therefore the connection between a Java program and the underlying host operating system.

Since the Java virtual machine does not know how a class file was created or whether it has been tampered with, it implements a class file verifier to ensure that the types defined in the class file can be used safely. The class file verifier ensures the robustness of the program through four separate passes:

Class file structure check
Type data semantic check
Bytecode verification
Symbolic reference verification

The Java virtual machine also enforces several built-in safety mechanisms at run time. These mechanisms, which as features of the Java programming language make Java programs robust, are also features of the Java virtual machine:

Type-safe reference conversion
Structured memory access
Automatic garbage collection
Array bounds checking
Null reference checking
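
Three of the checks listed above can be observed from ordinary Java code. The following is a minimal sketch; the class name SafetyChecks and its helper methods are made up for this illustration and are not part of any library.

```java
final class SafetyChecks {
    /** Runs an action and returns the simple name of the exception it throws, or "none". */
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** Array bounds checking: index 3 is past the last valid index (2). */
    static String outOfBounds() {
        int[] data = new int[3];
        return thrownBy(() -> data[3] = 1);
    }

    /** Null reference checking: calling a method through a null reference. */
    static String nullAccess() {
        String text = null;
        return thrownBy(() -> text.length());
    }

    /** Type-safe reference conversion: a String cannot be cast to Integer. */
    static String badCast() {
        Object value = "not a number";
        return thrownBy(() -> { Integer n = (Integer) value; });
    }

    public static void main(String[] args) {
        System.out.println(outOfBounds());
        System.out.println(nullAccess());
        System.out.println(badCast());
    }
}
```

In each case the virtual machine raises an exception instead of letting the program read or write memory it should not touch.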

1.3 Java virtual machine data types

The Java virtual machine performs computations on values of specific data types. These fall into two categories, primitive types and reference types, as shown in the figure below:



The boolean type is a special case. When the compiler translates Java source code into bytecode, it uses int or byte to represent boolean. In the Java virtual machine, false is represented by 0 and true by any non-zero integer. As in the Java language, the ranges of the Java virtual machine's primitive types are the same everywhere, regardless of the host platform: a long is a 64-bit signed two's complement integer on every virtual machine.
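
The fixed width of long can be checked directly from Java code. This is a small sketch; the class name PrimitiveRanges is only illustrative.

```java
final class PrimitiveRanges {
    /** Adding 1 to the largest long wraps around to the smallest, as two's complement arithmetic implies. */
    static long wrapAround() {
        return Long.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println("bits in a long: " + Long.SIZE);
        System.out.println("bits in an int: " + Integer.SIZE);
        System.out.println("wraps to MIN_VALUE: " + (wrapAround() == Long.MIN_VALUE));
    }
}
```

The same values are expected on every platform, because the widths belong to the virtual machine's definition and not to the host hardware.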

The returnAddress primitive type is used to implement finally clauses in Java programs. It is not available to Java programmers; its value points to the opcode of a virtual machine instruction.

2 Architecture

In the Java Virtual Machine Specification, the behavior of a virtual machine instance is described in terms of subsystems, memory areas, data types, and instructions, which together represent an abstract virtual machine's internal architecture.



2.1 Class file

A Java class file contains all the information about one class or interface. The class file format is built from the following basic types:

u1 1 byte, unsigned
u2 2 bytes, unsigned
u4 4 bytes, unsigned
u8 8 bytes, unsigned

For more detail, see Oracle's official specification for Java SE 7: The Java® Virtual Machine Specification.

A class file contains:

ClassFile {
    u4 magic; // magic number 0xCAFEBABE; identifies a Java class file
    u2 minor_version; // minor version number
    u2 major_version; // major version number
    u2 constant_pool_count; // constant pool count (number of entries plus one)
    cp_info constant_pool[constant_pool_count-1]; // constant pool
    u2 access_flags; // class- or interface-level access flags (combined with bitwise OR)
    u2 this_class; // this class: index of a class constant in the constant pool
    u2 super_class; // superclass: index of a class constant in the constant pool
    u2 interfaces_count; // number of direct superinterfaces
    u2 interfaces[interfaces_count]; // superinterface indexes into the constant pool
    u2 fields_count; // number of fields
    field_info fields[fields_count]; // field table
    u2 methods_count; // number of methods
    method_info methods[methods_count]; // method table
    u2 attributes_count; // number of attributes
    attribute_info attributes[attributes_count]; // attribute table
}
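
The first four items of this structure can be read with java.io.DataInputStream, which reads big-endian values just as the class file format stores them. The sketch below is illustrative only: the class name ClassFileHeader and its methods are invented here, and it assumes the class file of java.lang.Object can be obtained as a resource, which holds on common JDKs.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeader {
    final int magic;             // u4
    final int minorVersion;      // u2
    final int majorVersion;      // u2
    final int constantPoolCount; // u2

    private ClassFileHeader(int magic, int minor, int major, int cpCount) {
        this.magic = magic;
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    /** Reads the first four ClassFile items from a stream positioned at the start of a class file. */
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();              // u4
        int minor = data.readUnsignedShort();    // u2
        int major = data.readUnsignedShort();    // u2
        int cpCount = data.readUnsignedShort();  // u2
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    /** Convenience wrapper: reads the header of the class file behind an already loaded class. */
    static ClassFileHeader of(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("no class file found for " + type);
            }
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = of(Object.class);
        System.out.printf("magic=0x%08X version=%d.%d constant_pool_count=%d%n",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount);
    }
}
```

The printed version numbers depend on the JDK that compiled the class, so only the magic number is the same everywhere.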

2.2 Class loader subsystem

The class loader subsystem is responsible for finding and loading type information. The Java virtual machine has two kinds of class loaders: system class loaders and user-defined class loaders. The former are part of the Java virtual machine implementation; the latter are part of the Java program.



Bootstrap class loader: loads the Java core library. It is implemented in native code and does not inherit from java.lang.ClassLoader.
Extensions class loader: loads the Java extension libraries. A Java virtual machine implementation provides an extension library directory, and this class loader finds and loads Java classes from it.
Application class loader: loads Java classes from the class path (CLASSPATH) of the Java application. Generally, the classes of a Java application are loaded by it. It can be obtained with ClassLoader.getSystemClassLoader().

In addition to the class loaders provided by the system, developers can implement their own class loader by extending the java.lang.ClassLoader class to meet special needs.
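
The loader hierarchy can be inspected at run time by following getParent(). In the sketch below, the names LoaderChain and DelegatingLoader are hypothetical. It also relies on a common implementation convention, which the API documentation permits but does not require: the bootstrap class loader is represented as null.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {
    /** Lists a loader and its parents, innermost first; the null loader is reported as "bootstrap". */
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap");
        return names;
    }

    /** A minimal user-defined loader: it adds no behavior and only delegates to its parent. */
    static final class DelegatingLoader extends ClassLoader {
        DelegatingLoader(ClassLoader parent) {
            super(parent);
        }
    }

    /** Loads a class by name through the given loader, wrapping the checked exception. */
    static Class<?> load(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader custom = new DelegatingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(chainOf(custom));
        System.out.println(load(custom, "java.lang.String") == String.class);
    }
}
```

Because the custom loader delegates to its parents first, asking it for java.lang.String yields the same Class object that the core library loader already created.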

The class loader subsystem involves several other components of the Java virtual machine and classes from the java.lang library. The methods defined by ClassLoader provide programs with an interface to access the class loader mechanism. In addition, for each loaded type, the Java virtual machine creates an instance of the java.lang.Class class for it to represent the type. Like other objects, user-defined class loaders and instances of the Class class are placed in the heap area in memory, while the loaded type information is located in the method area.

In addition to locating and importing binary class files, the class loader subsystem is also responsible for verifying the correctness of imported classes, allocating and initializing memory for class variables, and resolving symbolic references. These actions must be performed in the following order:

Loading (find and load the type's binary data)
Linking (verification: ensure the correctness of the imported type; preparation: allocate memory for class variables and initialize them to default values; resolution: convert the symbolic references in the type into direct references)
Initialization (class variables are initialized to their proper initial values)
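
The gap between loading and initialization can be made visible with the three-argument Class.forName, whose second argument says whether the class should be initialized. The following is a sketch under that assumption; InitOrderDemo and its nested class Lazy are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static final class Lazy {
        static int value = 42;                    // not a compile-time constant
        static {
            LOG.add("Lazy initialized");          // runs during the initialization step
        }
    }

    /** Loads Lazy without initializing it, then triggers initialization by reading a static field. */
    static List<String> run() {
        try {
            String name = InitOrderDemo.class.getName() + "$Lazy";
            // initialize=false: the class is loaded, but its static initializer does not run yet.
            Class<?> type = Class.forName(name, false, InitOrderDemo.class.getClassLoader());
            LOG.add("loaded " + type.getSimpleName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        int v = Lazy.value;                       // first active use triggers initialization
        LOG.add("read value " + v);
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call the log records the load, then the static initializer, then the field read. A class is initialized only once per class loader, so later calls do not repeat the middle entry.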

2.3 Method area

In the Java virtual machine, information about the type being loaded is stored in memory in a method area. When the virtual machine loads a certain type, it uses the class loader to locate the corresponding class file, then reads the class file and transfers it to the virtual machine, and then the virtual machine extracts the type information in it and stores the information in the method area. The method area can also be collected by the garbage collector, because the virtual machine allows the dynamic extension of Java programs through user-defined class loaders.

The following information is stored in the method area:

The fully qualified name of the type (for example, java.lang.Object)
The fully qualified name of the type's direct superclass
Whether the type is a class or an interface
The type's access modifiers (some subset of public, abstract, and final)
An ordered list of the fully qualified names of any direct superinterfaces
The type's constant pool (an ordered set of constants: direct constants [string, integer, and floating-point constants] and symbolic references to other types, fields, and methods)
Field information (field name, type, and modifiers)
Method information (method name, return type, number and types of parameters, and modifiers)
All class (static) variables except constants
A reference to class ClassLoader (for each loaded type, the virtual machine must keep track of whether it was loaded by the bootstrap class loader or by a user-defined class loader)
A reference to class Class (for each loaded type, the virtual machine creates an instance of java.lang.Class. For example, given a reference to an object of class java.lang.Integer, calling getClass() on that reference returns the Class object that represents java.lang.Integer)
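
Much of this type information is visible to programs through java.lang.Class and the reflection API. The sketch below collects a few items for the Integer example mentioned above; the class name TypeInfo and the map keys are illustrative choices, and reporting a null class loader as "bootstrap" follows a common implementation convention.

```java
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

final class TypeInfo {
    /** Collects a few pieces of type information for the class of the given object. */
    static Map<String, String> describe(Object obj) {
        Class<?> type = obj.getClass();
        Class<?> parent = type.getSuperclass();
        ClassLoader loader = type.getClassLoader();

        Map<String, String> info = new LinkedHashMap<>();
        info.put("name", type.getName());
        info.put("superclass", parent == null ? "none" : parent.getName());
        info.put("kind", type.isInterface() ? "interface" : "class");
        info.put("modifiers", Modifier.toString(type.getModifiers()));
        info.put("superinterfaces", String.valueOf(type.getInterfaces().length));
        info.put("loader", loader == null ? "bootstrap" : loader.getClass().getName());
        return info;
    }

    public static void main(String[] args) {
        describe(Integer.valueOf(7)).forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```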

2.4 Heap

All class instances and arrays (arrays are full-fledged objects in the Java virtual machine) that a Java program creates at run time are placed in the same heap. Since a Java virtual machine instance has only one heap, all threads share it. Note that the Java virtual machine has an instruction that allocates objects on the heap but no instruction that frees memory, because the virtual machine hands that task to the garbage collector. The Java Virtual Machine specification does not mandate a garbage collector; it only requires that virtual machine implementations manage their own heap space "in some way". For example, an implementation could have a fixed-size heap and simply throw an OutOfMemoryError when the space is full, without ever reclaiming garbage objects, and still comply with the specification.
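
Two of these points, that arrays are objects and that all threads share one heap, can be illustrated with a short sketch. The class name SharedHeap and its methods are invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedHeap {
    /** Two threads increment the same heap object; the result shows they saw one shared instance. */
    static int countWithTwoThreads(int perThread) {
        AtomicInteger counter = new AtomicInteger();   // a single object on the shared heap
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.incrementAndGet();
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    /** An array can be referenced as an Object, and its class has Object as its superclass. */
    static boolean arraysAreObjects() {
        Object array = new int[4];
        return array instanceof int[] && int[].class.getSuperclass() == Object.class;
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(1000));
        System.out.println(arraysAreObjects());
    }
}
```

An AtomicInteger is used so that the concurrent increments are not lost; with a plain int field the object would still be shared, but the final count could be lower.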

The Java Virtual Machine specification does not specify how Java objects are represented on the heap, leaving it up to the implementer of the virtual machine to decide how to design it. A possible heap design is as follows:



One such design divides the heap into two parts: a handle pool and an object pool. An object reference is a native pointer to an entry in the handle pool. The advantage of this design is that it makes heap defragmentation easier: when an object in the object pool is moved, only the handle's pointer to the object needs to be updated with the new address. The disadvantage is that each access to an object's instance variables requires following two pointers.
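
The handle-pool idea can be modeled with two arrays. This is a toy simulation, not how any particular virtual machine lays out its heap: the class HandleHeap is hypothetical, each "object" is a single int, and the caller is assumed to pick a free slot when moving an object.

```java
final class HandleHeap {
    private final int[] objectPool = new int[8];   // simulated object storage, one int per "object"
    private final int[] handlePool = new int[8];   // handle index -> current slot in objectPool
    private int handleCount = 0;
    private int nextFreeSlot = 0;

    /** Stores a value as a new "object" and returns its handle, which plays the role of the reference. */
    int allocate(int value) {
        objectPool[nextFreeSlot] = value;
        handlePool[handleCount] = nextFreeSlot++;
        return handleCount++;
    }

    /** Reads an object: one lookup in the handle pool, then one in the object pool. */
    int read(int handle) {
        return objectPool[handlePool[handle]];
    }

    /** Moves an object to another (free) slot, as a compacting collector might; only the handle changes. */
    void move(int handle, int newSlot) {
        int oldSlot = handlePool[handle];
        if (newSlot == oldSlot) {
            return;
        }
        objectPool[newSlot] = objectPool[oldSlot];
        objectPool[oldSlot] = 0;
        handlePool[handle] = newSlot;
    }

    public static void main(String[] args) {
        HandleHeap heap = new HandleHeap();
        int ref = heap.allocate(7);
        heap.move(ref, 5);
        System.out.println(heap.read(ref));   // the reference still reaches the moved object
    }
}
```

Code holding the handle never learns that the object moved, which is the benefit described above; the two array lookups in read correspond to the two pointer accesses that are its cost.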

2.5 Java stack

Whenever a thread is started, the Java virtual machine allocates a Java stack for it. The Java stack consists of stack frames, and each stack frame holds the state of one Java method call. When a thread calls a Java method, the virtual machine pushes a new stack frame onto the thread's Java stack; when the method returns, that frame is popped. The Java stack stores the state of the thread's Java method calls, including local variables, parameters, return values, and intermediate results of operations. The Java virtual machine has no registers for holding intermediate values; its instruction set uses the Java stack instead. This design keeps the instruction set as compact as possible and makes the Java virtual machine easier to implement on platforms with few general-purpose registers. The stack-based architecture also helps the dynamic compilers and just-in-time compilers that some virtual machines use to optimize code at run time.
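
The push-on-call behavior can be observed indirectly through Thread.getStackTrace(), which reports one element per frame. This is a sketch only: the class StackDepth is made up, and the API documentation allows a virtual machine to omit frames from a stack trace in some circumstances, so the exact count is typical rather than guaranteed.

```java
final class StackDepth {
    /** Recurses n times, then reports how many frames the current thread's stack trace contains. */
    static int framesAtDepth(int n) {
        if (n == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return framesAtDepth(n - 1);
    }

    /** Number of additional frames observed after n nested calls, compared with no nesting. */
    static int extraFrames(int n) {
        return framesAtDepth(n) - framesAtDepth(0);
    }

    public static void main(String[] args) {
        System.out.println(extraFrames(5));
    }
}
```

Each nested call contributes one more frame, and all of them are gone again once the calls return.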

2.5.1 Stack frame

A stack frame consists of a local variable area, an operand stack, and a frame data area. When the virtual machine invokes a Java method, it obtains the sizes of the method's local variable area and operand stack from the type information of the corresponding class, allocates memory for the stack frame accordingly, and pushes the frame onto the Java stack.

2.5.1.1 Local variable area
The local variable area is organized as a zero-based array of words. Bytecode instructions access its data by index. Values of type int, float, reference, and returnAddress occupy one entry in the array. Values of type byte, short, and char are converted to int before being stored and also occupy one entry. Values of type long and double occupy two consecutive entries.
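
The slot rule can be turned into a small calculation over a method's parameter types. The helper below is hypothetical (LocalSlots and firstSlots are names chosen here). It also relies on a detail from the Java Virtual Machine Specification that the text above does not state: in an instance method, entry 0 holds the reference to this, so parameters start at entry 1.

```java
final class LocalSlots {
    /** Returns, for each parameter, the index of the first local variable entry it occupies. */
    static int[] firstSlots(boolean isStatic, Class<?>... parameterTypes) {
        int[] slots = new int[parameterTypes.length];
        int next = isStatic ? 0 : 1;   // entry 0 of an instance method holds `this`
        for (int i = 0; i < parameterTypes.length; i++) {
            slots[i] = next;
            boolean wide = parameterTypes[i] == long.class || parameterTypes[i] == double.class;
            next += wide ? 2 : 1;      // long and double take two consecutive entries
        }
        return slots;
    }

    public static void main(String[] args) {
        int[] slots = firstSlots(true, int.class, long.class, float.class, double.class, Object.class);
        System.out.println(java.util.Arrays.toString(slots));
    }
}
```

For a static method with parameters (int, long, float, double, Object), the long and the double each push the following parameter one entry further along.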



2.5.1.2 Operand stack
Like the local variable area, the operand stack is organized as an array of words, but it is accessed through the standard stack operations, push and pop, rather than by index. Apart from the program counter, which program instructions cannot access directly, the Java virtual machine has no registers; its instructions take their operands from the operand stack, so the machine is stack-based rather than register-based. The virtual machine uses the operand stack as its work area: most instructions pop values from it, operate on them, and push the result back.
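
The interplay between local variables and the operand stack can be sketched with a toy frame. The class MiniFrame is invented for this illustration; its methods are named after the real instructions iload, iadd, and istore, but they only imitate the behavior for int values.

```java
final class MiniFrame {
    private final int[] locals;   // local variable area
    private final int[] stack;    // operand stack
    private int top = 0;          // number of values currently on the operand stack

    MiniFrame(int maxLocals, int maxStack) {
        this.locals = new int[maxLocals];
        this.stack = new int[maxStack];
    }

    /** Pushes the value of a local variable onto the operand stack. */
    void iload(int index) {
        stack[top++] = locals[index];
    }

    /** Pops the top of the operand stack into a local variable. */
    void istore(int index) {
        locals[index] = stack[--top];
    }

    /** Pops two ints, adds them, and pushes the sum. */
    void iadd() {
        int right = stack[--top];
        int left = stack[--top];
        stack[top++] = left + right;
    }

    /** Imitates the sequence iload_0, iload_1, iadd, istore_2 for locals[2] = locals[0] + locals[1]. */
    static int add(int a, int b) {
        MiniFrame frame = new MiniFrame(3, 2);
        frame.locals[0] = a;
        frame.locals[1] = b;
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        frame.istore(2);
        return frame.locals[2];
    }

    public static void main(String[] args) {
        System.out.println(add(100, 98));
    }
}
```

Note that iadd itself names no operands: both inputs and the result travel through the operand stack.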

2.5.1.3 Frame data area
In addition to the local variable area and the operand stack, a Java stack frame needs a frame data area to support constant pool resolution, normal method return, and exception dispatch. Whenever the virtual machine executes an instruction that requires constant pool data, it reaches that data through the pointer to the constant pool held in the frame data area. The frame data area also helps the virtual machine handle the normal or abrupt completion of a Java method. If the method completes normally with a return, the virtual machine must restore the stack frame of the calling method, including setting the program counter to the instruction that follows the one that invoked the method; if the method returns a value, the virtual machine pushes it onto the calling method's operand stack. To handle abrupt completion during method execution, the frame data area also holds a reference to the method's exception table.

2.6 Program Counter

For a running Java program, each thread has its own program counter, also called the PC register. The program counter can hold either a native pointer or a returnAddress. While a thread executes a Java method, the program counter holds the address of the instruction currently being executed, as the Java Virtual Machine Specification defines it. This address can be a native pointer or an offset relative to the start of the method's bytecode. If the thread is executing a native method, the value of the program counter is undefined.

2.7 Native method stack

Every native method interface uses some kind of native method stack. When a thread calls a Java method, the virtual machine creates a new stack frame and pushes it onto the Java stack. When the thread calls a native method, however, the virtual machine leaves the Java stack unchanged and does not push a new stack frame onto it; it simply dynamically links to the specified native method and invokes it directly.

The method area and heap are shared by all threads in the virtual machine instance. When the virtual machine loads a class file, it parses the type information from the binary data contained in the class file, and then puts the type information into the method area. When a program runs, the virtual machine places all objects created by the program at runtime on the heap.

Like other runtime memory areas, the memory area occupied by the native method stack can dynamically expand or contract as needed.

3 Execution engine

In the Java virtual machine specification, the behavior of the execution engine is defined using the instruction set. The designer implementing the execution engine will decide how to execute the bytecode, and the implementation can take the form of interpretation, just-in-time compilation, or direct execution using instructions on the chip, or a mix of them.

The execution engine can be understood as an abstract specification, a concrete implementation, or a running instance. The abstract specification defines the behavior of the execution engine in terms of the instruction set. Concrete implementations may use a variety of techniques: software, hardware, or a combination of the two. As a run-time instance, the execution engine is a thread.

Each thread of a running Java program is an instance of an independent virtual machine execution engine. From the beginning to the end of the thread's life cycle, it is either executing bytecode or executing native methods.

3.1 Instruction set

The bytecode stream of a method consists of a sequence of Java virtual machine instructions. Each instruction consists of a single-byte opcode followed by zero or more operands. The opcode identifies the operation to perform; the operands give the Java virtual machine the additional information it needs to execute the opcode. When the virtual machine executes an instruction, it may use an entry in the current constant pool, a value in a local variable of the current frame, or a value at the top of the current frame's operand stack.

The abstract execution engine executes one bytecode instruction at a time. Each thread (execution engine instance) of a program running in the Java virtual machine performs this cycle. The execution engine fetches an opcode and, if the opcode has operands, fetches those as well. It performs the action specified by the opcode and its operands, and then fetches the next opcode. Execution continues in this way until the thread completes, either by returning from its initial method or by failing to catch a thrown exception.
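
The fetch-and-execute cycle described above can be sketched as a loop over a byte array. The class TinyInterpreter is a made-up illustration that handles only four instructions; the opcode values used for bipush, iadd, imul, and ireturn are the real ones, but everything else (no frames, no constant pool, a fixed-size stack) is a simplification.

```java
final class TinyInterpreter {
    static final int BIPUSH = 0x10;    // push the following signed byte as an int
    static final int IADD = 0x60;      // pop two ints, push their sum
    static final int IMUL = 0x68;      // pop two ints, push their product
    static final int IRETURN = 0xAC;   // return the int on top of the stack

    /** Executes the bytecode stream until an ireturn and returns the value it yields. */
    static int run(byte[] code) {
        int[] stack = new int[16];
        int top = 0;
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xFF;          // fetch the opcode
            switch (opcode) {
                case BIPUSH:
                    stack[top++] = code[pc++];       // fetch the one-byte operand (sign-extended)
                    break;
                case IADD: {
                    int right = stack[--top];
                    int left = stack[--top];
                    stack[top++] = left + right;
                    break;
                }
                case IMUL: {
                    int right = stack[--top];
                    int left = stack[--top];
                    stack[top++] = left * right;
                    break;
                }
                case IRETURN:
                    return stack[--top];
                default:
                    throw new IllegalArgumentException("unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        // bipush 2, bipush 3, iadd, bipush 4, imul, ireturn
        byte[] code = {0x10, 2, 0x10, 3, 0x60, 0x10, 4, 0x68, (byte) 0xAC};
        System.out.println(run(code));
    }
}
```

The sample program computes (2 + 3) * 4. Only bipush carries an operand in the instruction stream; the arithmetic instructions find theirs on the operand stack.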

4 Native method interface

The Java native interface, or JNI (Java Native Interface), is designed for portability. The native method interface allows native methods to do the following:

Pass and return data
Manipulate instance variables
Manipulate class variables and call class methods
Manipulate arrays
Lock objects on the heap
Load new classes
Throw exceptions
Catch exceptions thrown by Java methods called from native methods
Catch asynchronous exceptions thrown by the virtual machine
Indicate to the garbage collector that an object is no longer needed

Reference:

"In-depth Java Virtual Machine"
