Walk into the Java Virtual Machine

Part of this article is taken directly from the third edition of Zhou Zhiming's "In-Depth Understanding of the Java Virtual Machine".

What is the JVM?
Baidu Encyclopedia explains: A virtual machine is an abstract computer, realized by simulating various computer functions on an actual computer. The Java virtual machine has its own complete hardware architecture, such as a processor, stack, and registers, as well as a corresponding instruction set. The Java virtual machine hides the details of the specific operating system platform, so a Java program only needs to be compiled into the object code (bytecode) that runs on the Java virtual machine, and it can then run on multiple platforms without modification.

This article approaches the Java virtual machine through the Java memory areas, that is, the runtime data areas of the Java virtual machine.
While executing a Java program, the Java virtual machine divides the memory it manages into several different data areas. Each area has its own purpose and its own times of creation and destruction. Some areas exist for as long as the virtual machine process is running, while others are created and destroyed as user threads start and end. According to the "Java Virtual Machine Specification", the memory managed by the Java virtual machine includes the following runtime data areas.

1. Program counter

The Program Counter Register is a small area of memory that can be regarded as a line-number indicator for the bytecode being executed by the current thread. In the conceptual model of the Java virtual machine (which describes the common appearance of all virtual machines; a specific Java virtual machine does not necessarily follow the conceptual model exactly and may use more efficient means to achieve the same result), the bytecode interpreter selects the next bytecode instruction to execute by changing the value of this counter. The counter is the indicator of program control flow: basic functions such as branching, looping, jumping, exception handling, and thread resumption all rely on it.

Multithreading in the Java virtual machine is implemented by switching threads in turn and allocating processor execution time among them, so at any given moment a processor (or a core, on a multi-core processor) executes the instructions of only one thread. To resume at the correct execution position after a thread switch, each thread therefore needs its own program counter. The counters of different threads do not affect each other and are stored independently; this kind of memory area is called "thread-private" memory (it is not shared with other threads).

If the thread is executing a Java method, the counter records the address of the virtual machine bytecode instruction being executed. If the thread is executing a native method, the counter value is undefined. This is the only memory area for which the "Java Virtual Machine Specification" does not specify any OutOfMemoryError condition.

2. Java virtual machine stack

Like the program counter, the Java Virtual Machine Stack is thread-private, and its life cycle is the same as that of its thread. The virtual machine stack describes the thread memory model of Java method execution: each time a method is executed, the Java virtual machine creates a stack frame to store the local variable table, the operand stack, dynamic linking information, the method exit, and other information. The span from a method being called to its completion corresponds to a stack frame being pushed onto and then popped off the virtual machine stack.

People often divide the Java memory areas loosely into heap memory and stack memory. This division is inherited directly from the memory layout of traditional C and C++ programs, and it is rather coarse for the Java language: the actual division of memory areas is more complicated than this. Still, the popularity of this division indirectly shows that the areas programmers care about most, and that are most closely tied to object memory allocation, are the "heap" and the "stack". For the "heap", refer to the figure for now; it is introduced later. The "stack" usually refers to the virtual machine stack described here, or more often just to the local variable table within the virtual machine stack.

The local variable table stores the Java virtual machine's primitive data types known at compile time (the eight primitive Java data types), object references (the reference type, which is not the object itself; it may be a pointer to the starting address of the object, or it may point to a handle representing the object or to another location associated with the object), and the returnAddress type (which points to the address of a bytecode instruction).

The storage space for these data types in the local variable table is expressed in local variable slots. The 64-bit long and double types occupy two slots, and every other data type occupies one. The memory required by the local variable table is determined at compile time. When a method is entered, the amount of local variable space it needs in its stack frame is fully known, and the size of the local variable table does not change while the method runs. "Size" here means the number of variable slots; how much memory the virtual machine actually uses to implement one slot is entirely up to the specific virtual machine.
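As a rough illustration of slot counting, the sketch below uses a hypothetical static method `add` (the class and method names are not from the original text). The parameters of a static method occupy consecutive slots starting at 0, so the comments show where each value would be expected to live. The slot shown for the local variable `sum` is an assumption about typical javac output; it can be checked with `javap -v`.

```java
// Hypothetical example for illustration; names are not from the article.
final class SlotLayoutDemo {
    // Static method: there is no `this`, so parameters start at slot 0.
    //   a   (int)    -> slot 0         (one slot)
    //   b   (long)   -> slots 1 and 2  (two slots)
    //   c   (double) -> slots 3 and 4  (two slots)
    //   sum (long)   -> typically slots 5 and 6 (assumption about javac's allocation)
    static long add(int a, long b, double c) {
        long sum = a + b + (long) c; // the double is truncated toward zero
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2L, 3.9)); // prints 6
    }
}
```

In an instance method, slot 0 would hold `this` and the parameters would start at slot 1.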

The "Java Virtual Machine Specification" specifies two exceptional conditions for this memory area: if the stack depth requested by a thread is greater than the depth allowed by the virtual machine, a StackOverflowError is thrown; if the Java virtual machine stack can be expanded dynamically and enough memory cannot be obtained when it expands, an OutOfMemoryError is thrown.
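The first condition is easy to provoke. The following minimal sketch (class and method names are made up for illustration) recurses without a termination condition, so each call pushes another stack frame until the virtual machine throws StackOverflowError. The depth reached depends on the JVM, the frame size, and the thread stack size (on HotSpot the `-Xss` option), so no particular number should be expected.

```java
// Sketch: provoke StackOverflowError with unbounded recursion and report the depth reached.
final class StackDepthDemo {
    private static int depth = 0;

    // Each call pushes a new stack frame onto the current thread's virtual machine stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the number of nested calls made before the JVM refused to go deeper.
    static int measureMaxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Catching an Error is done here only for demonstration purposes.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + measureMaxDepth() + " nested calls");
    }
}
```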

3. Native method stack

The native method stack (Native Method Stack) plays a role very similar to that of the virtual machine stack. The difference is that the virtual machine stack serves the Java methods (that is, bytecode) executed by the virtual machine, while the native method stack serves the native methods used by the virtual machine.

The "Java Virtual Machine Specification" places no mandatory requirements on the language, usage, or data structures of the methods in the native method stack, so a specific virtual machine is free to implement it as needed. Some Java virtual machines (HotSpot, for example) even merge the native method stack and the virtual machine stack into one. Like the virtual machine stack, the native method stack throws StackOverflowError when the stack depth overflows and OutOfMemoryError when stack expansion fails.

4. Java heap

For Java applications, the Java heap is the largest piece of memory managed by the virtual machine. The Java heap is a memory area shared by all threads and is created when the virtual machine starts. Its only purpose is to store object instances, and almost all object instances in the Java world are allocated here. The "Java Virtual Machine Specification" describes the Java heap as follows: "All object instances and arrays should be allocated on the heap".

The Java heap is the memory area managed by the garbage collector, so some materials also call it the "GC heap". From the perspective of memory reclamation, most modern garbage collectors are designed around generational collection theory, so terms such as "new generation", "old generation", "permanent generation", "Eden area", "From area", and "To area" often appear in discussions of the Java heap. These divisions are only common features or design styles of certain garbage collectors, not the inherent memory layout of a Java virtual machine, and still less a further subdivision of the Java heap laid down by the "Java Virtual Machine Specification".
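One way to see that these region names belong to the collector rather than to the JVM itself is to ask the running virtual machine which collectors it uses and which memory pools each one manages. The sketch below uses the standard `java.lang.management` API; the class name is made up for illustration, and the names printed depend entirely on the JVM and the collector selected at startup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: list the active garbage collectors and the memory pools each one manages.
final class CollectorRegionsDemo {
    // Returns one line per collector in the form "<collector name> -> [<pool names>]".
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " -> " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Running the same program with a different collector is expected to produce a different set of pool names, which is consistent with the point that the layout is a collector design choice.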

From the perspective of memory allocation, the Java heap shared by all threads can be divided into multiple thread-private allocation buffers (Thread Local Allocation Buffers, TLAB) to make object allocation more efficient. However, no matter from which angle or in what way the heap is divided, the nature of what it stores does not change: every region can hold only object instances. The Java heap is subdivided only to reclaim memory better or to allocate memory faster.

According to the "Java Virtual Machine Specification", the Java heap may occupy physically discontiguous memory, but logically it should be treated as contiguous, just as files stored on disk do not each have to occupy contiguous space. For large objects (typically arrays), however, most virtual machine implementations are likely to require contiguous memory, for the sake of simple implementation and efficient storage.

The Java heap can be implemented as either fixed-size or expandable, but current mainstream Java virtual machines all implement it as expandable (controlled by the -Xmx and -Xms parameters). If there is no memory left in the Java heap to complete an instance allocation and the heap cannot be expanded any further, the Java virtual machine throws an OutOfMemoryError.
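The effect of these parameters can be observed from inside a program. The following minimal sketch (the class name is made up for illustration) reads the heap figures exposed by `java.lang.Runtime`: `maxMemory()` is the upper bound the virtual machine will attempt to use, `totalMemory()` is the amount currently reserved, and `freeMemory()` is the unused part of that reservation.

```java
// Sketch: print the heap limits that the running JVM reports.
final class HeapSizeDemo {
    // Upper bound the JVM will try to use for the heap (Long.MAX_VALUE if there is no limit).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM; it can grow up to the maximum.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("max heap:       " + maxHeapBytes() / mib + " MiB");
        System.out.println("committed heap: " + committedHeapBytes() / mib + " MiB");
        System.out.println("free in heap:   " + Runtime.getRuntime().freeMemory() / mib + " MiB");
    }
}
```

Starting the program with, for example, `java -Xms64m -Xmx256m HeapSizeDemo.java` should change the reported values accordingly; the exact numbers vary by JVM and collector.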

5. Method area

The method area, like the Java heap, is a memory area shared by all threads. It stores data such as type information loaded by the virtual machine, constants, static variables, and the code cache produced by the just-in-time compiler. Although the "Java Virtual Machine Specification" describes the method area as a logical part of the heap, it has the alias "non-heap" to distinguish it from the Java heap (the figure also shows the permanent generation, which has since been abolished). In JDK 7, the string constant pool and static variables that used to live in the permanent generation were moved out. In JDK 8, the permanent generation was replaced by Metaspace, which is implemented in native memory, and the content still remaining in the JDK 7 permanent generation (mainly type information) was moved into Metaspace (oop-klass model).

The "Java Virtual Machine Specification" places very loose constraints on the method area. Like the Java heap, it does not need contiguous memory and may be either fixed-size or expandable; an implementation may even choose not to perform garbage collection in it. Garbage collection is indeed relatively rare in this area, but data that enters the method area is not "permanent" in the way the name "permanent generation" suggests. Memory reclamation here mainly targets the constant pool and the unloading of types. In general, it is hard to get satisfactory results from collection in this area, especially for type unloading, whose conditions are quite strict, yet reclaiming this area is sometimes genuinely necessary.

According to the "Java Virtual Machine Specification", if the method area cannot meet the new memory allocation requirements, an OutOfMemoryError exception will be thrown.
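To see which "non-heap" pools a particular JVM actually exposes, the standard `java.lang.management` API can be queried. This is a minimal sketch with a made-up class name; the pool names it prints (on a JDK 8 or later HotSpot one of them is typically named Metaspace) are implementation details rather than anything the specification defines.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: report the used bytes of every memory pool the JVM classifies as non-heap.
final class NonHeapPoolsDemo {
    // Maps each non-heap pool name to its used bytes (-1 if the pool reports no usage).
    static Map<String, Long> nonHeapUsage() {
        Map<String, Long> usage = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // skip heap pools
            }
            MemoryUsage u = pool.getUsage(); // may be null if the pool is no longer valid
            usage.put(pool.getName(), u == null ? -1L : u.getUsed());
        }
        return usage;
    }

    public static void main(String[] args) {
        nonHeapUsage().forEach((name, used) ->
                System.out.println(name + ": " + used + " bytes used"));
    }
}
```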

Supplement: runtime constant pool

The runtime constant pool is part of the method area. Besides the description of the class version, fields, methods, and interfaces, a class file also contains a constant pool table (Constant Pool Table), which holds the literals and symbolic references generated at compile time. This content is placed in the runtime constant pool of the method area after the class is loaded.

The Java virtual machine has strict rules for the format of each part of the class file (including, naturally, the constant pool). For example, what kind of data each byte stores must conform to the specification before the file can be recognized, loaded, and executed by the virtual machine. For the runtime constant pool, however, the "Java Virtual Machine Specification" makes no detailed requirements, and virtual machines from different vendors can implement this memory area as they see fit. Generally speaking, in addition to the symbolic references described in the class file (a symbolic reference points to another constant in the constant pool, which may in turn point to yet another constant; once resolved, for example to an object instance, it becomes a direct reference, that is, a memory address), the direct references translated from those symbolic references are also stored in the runtime constant pool.

Another important feature that distinguishes the runtime constant pool from the class file constant pool is that it is dynamic. The Java language does not require constants to be produced only at compile time; that is, it is not only the content preset in the class file constant pool that can enter the runtime constant pool of the method area. New constants can also be added to the pool at run time. The use of this feature most familiar to developers is the intern() method of the String class.
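The behavior of intern() can be shown with a small sketch (the class name and string values are made up for illustration). A string assembled at run time is a separate object from the equal compile-time literal, while `intern()` returns the canonical pooled instance, which here is the literal's instance.

```java
// Sketch: a string built at run time versus its interned, pooled counterpart.
final class InternDemo {
    // True when intern() maps a run-time string to the same instance as the literal.
    static boolean internReturnsPooledInstance() {
        String literal = "jvm-area"; // compile-time constant, already in the pool
        String built = new StringBuilder("jvm-").append("area").toString(); // new object
        return built != literal && built.intern() == literal;
    }

    public static void main(String[] args) {
        String literal = "jvm-area";
        String built = new StringBuilder("jvm-").append("area").toString();
        System.out.println(built == literal);          // false: two distinct objects
        System.out.println(built.intern() == literal); // true: same pooled instance
    }
}
```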

Since the runtime constant pool is part of the method area, it is naturally limited by the memory of the method area. When the constant pool can no longer obtain memory, an OutOfMemoryError is thrown.

Tips: An OOM in a running system usually arises when objects are created faster than the garbage collector can reclaim them, or when they stay reachable and cannot be reclaimed at all, so heap consumption keeps growing. New objects are first allocated in the Eden area of the young generation. When Eden fills up, a Young GC is triggered and the surviving objects are copied into one of the survivor areas (From and To, which swap roles after each collection). Objects that survive enough collections, or that do not fit in the survivor area, are promoted to the old generation. When the old generation fills up, a Full GC is triggered, and if there is still not enough memory after that, an OutOfMemoryError is thrown.

Origin blog.csdn.net/qq_38108719/article/details/115286466