JVM runtime data area

While executing a Java program, the Java virtual machine divides the memory it manages into several data areas. Each area has its own purpose and its own creation and destruction time: some exist for the lifetime of the virtual machine process, while others are created and destroyed along with user threads. According to the "Java Virtual Machine Specification (Java SE 7 Edition)", the memory managed by the Java virtual machine includes the runtime data areas described below.

1.1> Program Counter

The Program Counter Register is a small memory area that can be seen as a line-number indicator for the bytecode executed by the current thread. In the conceptual model of the virtual machine (only the conceptual model; real virtual machines may use more efficient implementations), the bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. Branches, loops, jumps, exception handling, thread resumption, and other basic functions all rely on this counter.
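The conceptual model above can be sketched with a toy interpreter. This is only an illustration: the opcodes (PUSH, ADD, DEC_JNZ, HALT), the loop-counter register, and the class name are invented here and are not real JVM bytecodes or HotSpot code. The point is that both sequential execution and the loop are expressed purely as updates to `pc`.

```java
// Toy stack-machine interpreter. The instruction set is hypothetical and
// exists only to show how a program counter drives execution.
class ToyInterpreter {
    static final int PUSH = 0, ADD = 1, DEC_JNZ = 2, HALT = 3;

    // PUSH 0; loop: PUSH 5; ADD; DEC_JNZ loop; HALT
    static final int[] PROGRAM = {PUSH, 0, PUSH, 5, ADD, DEC_JNZ, 2, HALT};

    // Runs the code and returns the value on top of the operand stack at HALT.
    static int run(int[] code, int loopCount) {
        int[] stack = new int[16];
        int sp = 0;              // operand stack pointer
        int pc = 0;              // program counter: index of the next instruction
        int counter = loopCount; // hypothetical loop-counter register
        while (true) {
            int op = code[pc];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc + 1];
                    pc += 2;     // sequential flow: skip opcode and operand
                    break;
                case ADD:
                    stack[sp - 2] = stack[sp - 2] + stack[sp - 1];
                    sp--;
                    pc += 1;
                    break;
                case DEC_JNZ:
                    counter--;
                    // A loop is nothing more than writing a smaller value into pc.
                    pc = (counter != 0) ? code[pc + 1] : pc + 2;
                    break;
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(PROGRAM, 3)); // adds 5 three times: prints 15
    }
}
```

If each thread ran its own copy of `run`, each would need its own `pc` variable, which is the reason the real program counter is thread-private.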

Because the Java virtual machine implements multithreading by switching between threads and allocating processor time to each in turn, at any given moment a processor (or a core, on a multi-core processor) executes instructions from only one thread. To resume at the correct position after a thread switch, each thread needs its own program counter. The counters of different threads are stored independently and do not affect one another. This kind of memory area is called "thread-private" memory.

If the thread is executing a Java method, this counter records the address of the virtual machine bytecode instruction being executed; if the thread is executing a native method, the counter's value is undefined. This is the only memory area for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.

1.2> Java virtual machine stack

Like the program counter, the Java virtual machine stack (Java Virtual Machine Stacks) is thread-private and has the same life cycle as its thread. The virtual machine stack describes the memory model of Java method execution: each time a method is executed, a stack frame (Stack Frame) is created to store the local variable table, operand stack, dynamic link, method exit, and other information. The process of each method from invocation to completion corresponds to a stack frame being pushed onto and then popped off the virtual machine stack.

People often divide Java memory into heap memory (Heap) and stack memory (Stack). This division is crude; the actual layout of Java memory areas is far more complex. Its popularity only shows that the areas most programmers care about are the ones most closely tied to object memory allocation. The "stack" in this division refers to the virtual machine stack, or more precisely to the local variable table within it.

The local variable table stores the primitive data types known at compile time (byte, short, int, long, float, double, char, boolean), object references (the reference type, which is not the object itself; it may be a pointer to the object's starting address, or a handle or other location associated with the object), and the returnAddress type (which points to the address of a bytecode instruction).

The 64-bit long and double types occupy 2 local variable slots (Slots); all other data types occupy 1. The memory required by the local variable table is determined at compile time. When a method is entered, the amount of local variable space it needs in its frame is completely fixed, and the size of the local variable table does not change while the method runs.
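The slot rule above can be turned into a small calculation. The sketch below counts only the slots taken by the receiver and the parameters (a method's other local variables add more); the class and method names are made up for illustration.

```java
// Counts local variable slots used by a method's receiver and parameters.
// long and double take two slots each; every other type takes one.
class LocalSlots {
    static int parameterSlots(boolean isStatic, Class<?>... paramTypes) {
        int slots = isStatic ? 0 : 1; // slot 0 holds `this` for instance methods
        for (Class<?> type : paramTypes) {
            slots += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method m(int, long, Object, double): 1 + 1 + 2 + 1 + 2 slots.
        System.out.println(
            parameterSlots(false, int.class, long.class, Object.class, double.class)); // prints 7
    }
}
```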

The Java Virtual Machine Specification defines two exceptional conditions for this area. If the stack depth requested by a thread exceeds the depth the virtual machine allows, a StackOverflowError is thrown. If the virtual machine stack can be dynamically expanded (most current Java virtual machines can expand it dynamically, although the specification also allows fixed-size stacks) and enough memory cannot be obtained during expansion, an OutOfMemoryError is thrown.
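The first condition is easy to observe with unbounded recursion. The following is a minimal sketch with invented names; the depth it reports depends on the JVM, the thread's stack size, and the size of each frame, so no particular number should be expected.

```java
// Recurses until the thread's stack is exhausted and reports how deep it got.
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // every call pushes one more stack frame
    }

    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are popped as the error propagates back to this handler.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("StackOverflowError after " + measureDepth() + " frames");
    }
}
```

On HotSpot, running the same program with a different per-thread stack size (the -Xss option) should change the reported depth.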

Note: the stack frame is the basic data structure that supports method execution at run time.

1.3> Native method stack

The native method stack (Native Method Stack) plays a role very similar to that of the virtual machine stack. The difference is that the virtual machine stack serves the Java methods (that is, bytecode) the virtual machine executes, while the native method stack serves the native methods the virtual machine uses. The virtual machine specification does not mandate the language, usage, or data structures of methods in the native method stack, so each virtual machine is free to implement it as it sees fit. Some virtual machines (such as the Sun HotSpot virtual machine) even combine the native method stack and the virtual machine stack into one. Like the virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError.

1.4> Java heap

For most applications, the Java heap (Java Heap) is the largest piece of memory managed by the Java virtual machine. It is shared by all threads and created when the virtual machine starts. Its sole purpose is to store object instances, and almost all object instances are allocated here. The Java Virtual Machine Specification states that all object instances and arrays must be allocated on the heap. However, as JIT compilers have developed and escape analysis has matured, optimizations such as stack allocation and scalar replacement have brought subtle changes, so "all objects are allocated on the heap" is gradually becoming less absolute.

The Java heap is the main area managed by the garbage collector, so it is often called the "GC heap". From the perspective of memory reclamation, because most collectors use generational collection, the heap can be subdivided into the young generation and the old generation, and in more detail into the Eden space, the From Survivor space, the To Survivor space, and so on. From the perspective of memory allocation, the thread-shared heap may be carved into multiple thread-private allocation buffers (Thread Local Allocation Buffer, TLAB). However it is divided, the content stored is the same: every region still holds object instances. The purpose of further division is only to reclaim memory more effectively or allocate it faster.

According to the Java Virtual Machine Specification, the Java heap can be located in physically discontiguous memory as long as it is logically contiguous, much like disk space. An implementation may make it fixed-size or expandable; current mainstream virtual machines make it expandable (controlled by -Xmx and -Xms). If the heap has no memory left to complete an instance allocation and cannot be expanded further, an OutOfMemoryError is thrown.
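A running program can inspect its own heap limits through java.lang.Runtime. The sketch below only reads the values; how closely maxMemory() tracks -Xmx and how totalMemory() relates to -Xms depends on the JVM and the collector, so treat the correspondence as approximate. The class name is invented.

```java
// Reads the heap sizes the JVM reports for the current process.
class HeapSizes {
    // Returns {max, total, free} in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] {rt.maxMemory(), rt.totalMemory(), rt.freeMemory()};
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024 * 1024;
        System.out.println("max   (upper bound the heap may grow to): " + s[0] / mb + " MB");
        System.out.println("total (currently reserved for the heap):  " + s[1] / mb + " MB");
        System.out.println("free  (unused part of total):             " + s[2] / mb + " MB");
    }
}
```

Starting the program with, for example, `java -Xms64m -Xmx256m HeapSizes` should show the effect of the two options on the reported values.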

1.5> Method area

The method area (Method Area), like the Java heap, is a memory area shared by all threads. It stores data such as class information loaded by the virtual machine, constants, static variables, and code compiled by the just-in-time (JIT) compiler. Although the Java Virtual Machine Specification describes the method area as a logical part of the heap, it has the alias Non-Heap, which distinguishes it from the Java heap.
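The heap versus non-heap distinction is visible through the standard java.lang.management API, which tags every memory pool with a MemoryType. The sketch below just lists the pools; the pool names it prints vary by JVM, version, and collector (on recent HotSpot releases the non-heap pools typically include Metaspace and the code cache rather than a permanent generation). The class name is invented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

// Lists the JVM's memory pools, grouped by the HEAP / NON_HEAP tag.
class MemoryPools {
    static int count(MemoryType type) {
        int n = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + "\t" + pool.getName());
        }
    }
}
```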

The Java Virtual Machine Specification places very loose restrictions on the method area. Like the Java heap, it does not require contiguous memory and may be fixed-size or expandable; in addition, an implementation may choose not to garbage-collect it. Garbage collection is comparatively rare in this area, but data that enters the method area is not "permanent" in the way the name "permanent generation" suggests. Reclamation here mainly targets the constant pool and the unloading of types. The results are generally unsatisfying, especially for type unloading, whose conditions are quite strict, but reclaiming this area is nonetheless necessary.

According to the Java virtual machine specification, when the method area cannot meet the memory allocation requirements, an OutOfMemoryError exception will be thrown.

1.6> Runtime constant pool

The runtime constant pool (Runtime Constant Pool) is part of the method area. Besides descriptive information such as the class version, fields, methods, and interfaces, a Class file also contains a constant pool table (Constant Pool Table), which stores the literals and symbolic references generated at compile time. After the class is loaded, this content is placed in the runtime constant pool in the method area.

The Java virtual machine strictly specifies the format of every part of the Class file (including the constant pool, of course). The data stored in each byte must conform to the specification before the virtual machine will recognize, load, and execute it. For the runtime constant pool, however, the Java Virtual Machine Specification imposes no detailed requirements, and virtual machines from different vendors can implement this memory area as they see fit. In general, though, the runtime constant pool stores the direct references resolved from symbolic references in addition to the symbolic references described in the Class file.

Another important feature that distinguishes the runtime constant pool from the Class file constant pool is that it is dynamic. The Java language does not require constants to be generated only at compile time. In other words, content need not be preset in the Class file constant pool to enter the runtime constant pool in the method area; new constants can also be added to the pool at run time. The feature developers use most often for this is the intern() method of the String class.
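A short example of intern() at work, using invented names. A string built at run time is a distinct object from the literal with the same characters, but intern() returns the canonical pooled instance, which for these characters is the literal itself.

```java
// Shows that intern() maps a runtime-built string to the pooled instance.
class InternDemo {
    static final String LITERAL = "runtime-pool"; // string literals are interned

    static boolean sameBeforeIntern() {
        String built = new String(LITERAL.toCharArray()); // new object on the heap
        return built == LITERAL;
    }

    static boolean sameAfterIntern() {
        String built = new String(LITERAL.toCharArray());
        return built.intern() == LITERAL;
    }

    public static void main(String[] args) {
        System.out.println(sameBeforeIntern()); // prints false
        System.out.println(sameAfterIntern());  // prints true
    }
}
```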

Because the runtime constant pool is part of the method area, it is naturally limited by the method area's memory. When the constant pool can no longer obtain memory, an OutOfMemoryError is thrown.

1.7> Direct memory

Direct memory (Direct Memory) is not part of the virtual machine's runtime data areas, nor is it a memory area defined in the Java Virtual Machine Specification. However, it is used frequently and can also cause an OutOfMemoryError.

JDK 1.4 added the NIO (New Input/Output) classes, which introduced an I/O model based on channels (Channel) and buffers (Buffer). NIO can allocate off-heap memory directly through native function libraries and then operate on it through a DirectByteBuffer object stored in the Java heap, which serves as a reference to that memory. In some scenarios this significantly improves performance because it avoids copying data back and forth between the Java heap and the native heap.
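From application code, direct memory is normally obtained through ByteBuffer.allocateDirect. The sketch below, with an invented class name, allocates a small direct buffer and round-trips a value through it; the ByteBuffer object itself lives on the Java heap while its backing storage is outside it.

```java
import java.nio.ByteBuffer;

// Allocates off-heap memory through NIO and reads back what was written.
class DirectBufferDemo {
    static boolean allocatesDirect() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }

    static int roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16); // 16 bytes of native memory
        buffer.putInt(0, value);   // absolute write at byte offset 0
        return buffer.getInt(0);   // absolute read from the same offset
    }

    public static void main(String[] args) {
        System.out.println(allocatesDirect());  // prints true
        System.out.println(roundTrip(42));      // prints 42
    }
}
```

On HotSpot, the total size of such buffers can be capped with -XX:MaxDirectMemorySize, which is worth setting alongside -Xmx when sizing a server process.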

Allocating direct memory is not limited by the size of the Java heap. Since it is still memory, however, it is limited by the total native memory (RAM plus swap space or paging file) and by the processor's address space. When configuring virtual machine parameters, server administrators usually set -Xmx and similar parameters according to the actual memory but often overlook direct memory. The sum of all memory areas can then exceed the physical memory limit (both physical and operating-system-level limits), which leads to an OutOfMemoryError during dynamic expansion.

2.1> Creation of objects

When the virtual machine encounters a new instruction, it first checks whether the instruction's operand can locate a symbolic reference to a class in the constant pool, and whether the class represented by that symbolic reference has been loaded, resolved, and initialized. If not, the corresponding class loading process must be performed first.

After the class loading check passes, the virtual machine allocates memory for the new object. The amount of memory the object needs is fully determined once the class is loaded, so allocating space for the object amounts to carving a block of a known size out of the Java heap. If memory in the Java heap is perfectly regular, with all used memory on one side, all free memory on the other, and a pointer in between marking the boundary, then allocating memory only requires moving the pointer toward the free side by a distance equal to the object's size. This allocation method is called "bump the pointer". If memory in the Java heap is not regular and used and free blocks are interleaved, simple pointer bumping is impossible. The virtual machine must instead maintain a list recording which memory blocks are available; on allocation, it finds a block in the list large enough for the object instance and updates the list. This method is called "free list". Which method is used depends on whether the Java heap is regular, which in turn depends on whether the garbage collector in use compacts the heap. Therefore, with collectors that include a compaction step, such as Serial and ParNew, the system allocates by bumping the pointer, while with collectors based on the Mark-Sweep algorithm, such as CMS, it usually uses a free list.
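The pointer-moving allocation described above can be modeled in a few lines. This is a single-threaded simulation over integer offsets, not HotSpot code, and the class name is invented; it shows why allocation in a compacted heap is so cheap: one comparison and one addition.

```java
// Simulates pointer-moving allocation in a "regular" heap: everything below
// `top` is used, everything from `top` up to `capacity` is free.
class BumpAllocator {
    private final int capacity;
    private int top; // boundary between used and free memory

    BumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the new block, or -1 if it does not fit.
    int allocate(int size) {
        if (size <= 0 || capacity - top < size) {
            return -1;
        }
        int start = top;
        top += size; // move the pointer by the object's size
        return start;
    }

    public static void main(String[] args) {
        BumpAllocator heap = new BumpAllocator(64);
        System.out.println(heap.allocate(16)); // prints 0
        System.out.println(heap.allocate(24)); // prints 16
        System.out.println(heap.allocate(32)); // prints -1 (only 24 bytes left)
    }
}
```

A free-list allocator would replace the single `top` field with a collection of free blocks that must be searched and updated on every allocation, which is the extra cost a non-compacting collector pays.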

Besides how to divide the available space, there is another problem to consider: object creation is extremely frequent in the virtual machine, and even just modifying the location a pointer refers to is not thread-safe under concurrency. The virtual machine might be allocating memory for object A, and before the pointer has been updated, object B is allocated using the original pointer. There are two solutions. One is to synchronize the memory allocation action; in practice, the virtual machine uses CAS with retry on failure to make the update atomic. The other is to give each thread its own space to allocate in: each thread pre-allocates a small block of memory in the Java heap, called a Thread Local Allocation Buffer (TLAB). A thread that needs memory allocates it from its own TLAB, and synchronization is needed only when the TLAB is used up and a new one must be allocated. Whether the virtual machine uses TLABs is controlled by the -XX:+/-UseTLAB parameter.
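The first solution, CAS with retry, can be sketched with java.util.concurrent.atomic. This is a simulation with invented names rather than the virtual machine's actual allocator: several threads share one allocation pointer, and a thread that loses the race simply re-reads the pointer and tries again.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Thread-safe pointer-moving allocation using compare-and-set with retry.
class CasBumpAllocator {
    private final int capacity;
    private final AtomicInteger top = new AtomicInteger(0);

    CasBumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the new block, or -1 if it does not fit.
    int allocate(int size) {
        while (true) {
            int start = top.get();
            if (size <= 0 || capacity - start < size) {
                return -1;
            }
            if (top.compareAndSet(start, start + size)) {
                return start; // this thread won the race for [start, start + size)
            }
            // Another thread moved the pointer first: loop and retry.
        }
    }

    int used() {
        return top.get();
    }

    // Lets several threads allocate concurrently and returns the bytes used.
    static int runThreads(int threads, int perThread, int size) {
        CasBumpAllocator heap = new CasBumpAllocator(threads * perThread * size);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    heap.allocate(size);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return heap.used();
    }

    public static void main(String[] args) {
        // 4 threads x 1000 allocations x 8 bytes: no allocation is lost.
        System.out.println(runThreads(4, 1000, 8)); // prints 32000
    }
}
```

A TLAB avoids even this contention: a thread would take one large block through the shared path and then allocate inside it with a plain, unsynchronized pointer.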

After memory allocation completes, the virtual machine initializes the allocated memory (excluding the object header) to zero values. If a TLAB is used, this step can be performed earlier, when the TLAB itself is allocated. This ensures that the object's instance fields can be used in Java code without being assigned initial values: the program reads the zero value corresponding to each field's data type.
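The effect of this zeroing step is visible from ordinary Java code: fields that are never assigned still have well-defined values. A minimal sketch with an invented class:

```java
// Instance fields that are never assigned read as the zero value of their type.
class ZeroValues {
    int count;
    long total;
    double ratio;
    boolean flag;
    char letter;
    Object ref;

    static String describe() {
        ZeroValues z = new ZeroValues();
        return z.count + " " + z.total + " " + z.ratio + " " + z.flag
                + " " + (int) z.letter + " " + z.ref;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints: 0 0 0.0 false 0 null
    }
}
```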

Next, the virtual machine sets the necessary information on the object, such as which class the object is an instance of, how to find the class's metadata, the object's hash code, and the object's GC generational age. This information is stored in the object header (Object Header). Depending on the virtual machine's current running state, such as whether biased locking is enabled, the object header is set up differently.

Once this work is done, a new object has been created from the virtual machine's point of view. From the Java program's point of view, however, object creation has only just begun: the <init> method has not yet run, and all fields are still zero. So in general (depending on whether an invokespecial instruction follows in the bytecode), the new instruction is followed by execution of the <init> method, which initializes the object as the programmer intended. Only then is a truly usable object fully constructed.
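The gap between "memory allocated and zeroed" and "<init> finished" can be observed from Java source. In the sketch below (names invented), the superclass constructor calls an overridden method before the subclass's field initializer has run, so it sees the zero value rather than 42.

```java
// A superclass constructor observes a subclass field before the subclass's
// part of <init> has assigned it.
class InitOrder {
    static class Base {
        final int seenByBase;

        Base() {
            seenByBase = read(); // runs before Derived's field initializers
        }

        int read() {
            return -1;
        }
    }

    static class Derived extends Base {
        int value = 42; // assigned in Derived's <init>, after Base's <init> returns

        @Override
        int read() {
            return value;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenByBase); // prints 0
        System.out.println(d.value);      // prints 42
    }
}
```

Disassembling a call site such as `new Derived()` with `javap -c` shows the `new` instruction followed by `invokespecial` on `<init>`, matching the two stages described above.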

2.2> Object memory layout

In the HotSpot virtual machine, the layout of objects stored in memory can be divided into three areas: object header (Header), instance data (Instance Data) and alignment padding (Padding).

The object header in the HotSpot virtual machine contains two parts. The first part stores the object's own runtime data, such as the hash code, GC generational age, lock state flags, the lock held by a thread, the biased thread ID, and the bias timestamp. This part is 32 bits long on a 32-bit virtual machine and 64 bits long on a 64-bit virtual machine (with compressed pointers disabled), and is officially called the "Mark Word". An object needs to store more runtime data than a 32-bit or 64-bit bitmap structure can record, yet the object header is an extra storage cost unrelated to the data the object itself defines. For the sake of space efficiency, the Mark Word is therefore designed as a non-fixed data structure that stores as much information as possible in very little space, reusing its storage according to the object's state. For example, in a 32-bit HotSpot virtual machine, when the object is unlocked, 25 of the Mark Word's 32 bits store the object's hash code, 4 bits store the object's generational age, 2 bits store the lock flag, and 1 bit is fixed at 0. In the other states (lightweight locked, heavyweight locked, GC marked, biasable), the same bits hold different content.
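The 25 + 4 + 1 + 2 bit split for the unlocked state can be illustrated with plain bit arithmetic. The sketch below is a model, not a way to read real object headers. It assumes the fields are ordered from high bits to low bits as hash, age, the fixed bit, lock flag, and that the unlocked lock flag is binary 01; both are assumptions taken from HotSpot's conventional 32-bit layout rather than from the text above, and the class name is invented.

```java
// Packs and unpacks a modeled 32-bit Mark Word for the unlocked state.
// Assumed layout (high to low): hash:25 | age:4 | fixed bit:1 | lock flag:2
class MarkWord32 {
    static int pack(int hash25, int age, int fixedBit, int lockFlag) {
        return ((hash25 & 0x1FFFFFF) << 7)
                | ((age & 0xF) << 3)
                | ((fixedBit & 0x1) << 2)
                | (lockFlag & 0x3);
    }

    static int hash(int mark) {
        return mark >>> 7;        // top 25 bits
    }

    static int age(int mark) {
        return (mark >>> 3) & 0xF; // next 4 bits
    }

    static int fixedBit(int mark) {
        return (mark >>> 2) & 0x1; // 1 bit, 0 for an unlocked object
    }

    static int lockFlag(int mark) {
        return mark & 0x3;         // lowest 2 bits
    }

    public static void main(String[] args) {
        int mark = pack(0x1ABCDEF, 5, 0, 1); // assumed unlocked flag: binary 01
        System.out.println(Integer.toHexString(hash(mark))); // prints 1abcdef
        System.out.println(age(mark));                       // prints 5
        System.out.println(lockFlag(mark));                  // prints 1
    }
}
```

Because the age field has only 4 bits in this layout, it can hold values from 0 to 15.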

The other part of the object header is the type pointer, a pointer to the object's class metadata, which the virtual machine uses to determine which class the object is an instance of. Not every virtual machine implementation keeps a type pointer in the object data; in other words, looking up an object's metadata does not necessarily go through the object itself. In addition, if the object is a Java array, the object header must also record the array length, because the virtual machine can determine the size of an ordinary Java object from its metadata but cannot determine the size of an array from the array's metadata.

The instance data part that follows is the information the object actually stores, that is, the contents of the fields of every type defined in the program code. Fields must be recorded whether they are inherited from a parent class or defined in the subclass. The storage order is affected by the virtual machine's allocation strategy parameter (FieldsAllocationStyle) and by the order in which fields are defined in the Java source code. The default allocation strategy of the HotSpot virtual machine is longs/doubles, ints, shorts/chars, bytes/booleans, oops (Ordinary Object Pointers). As this strategy shows, fields of the same width are always allocated together. Subject to that condition, fields defined in a parent class appear before those of the subclass. If the CompactFields parameter is true (the default), narrower fields of the subclass may also be inserted into gaps among the parent class's fields.

The third part, alignment padding, does not necessarily exist and has no special meaning; it only serves as a placeholder. The automatic memory management system of the HotSpot VM requires every object's starting address to be a multiple of 8 bytes, which means every object's size must be a multiple of 8 bytes. The object header is exactly a multiple of 8 bytes (1 or 2 times), so when the instance data part is not aligned, alignment padding fills the gap.
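The padding needed for a given object size follows from rounding the size up to the next multiple of 8. A small sketch of that arithmetic, with invented names and the 8-byte alignment taken from the paragraph above:

```java
// Rounds object sizes up to the 8-byte alignment and computes the padding.
class ObjectAlignment {
    static final int ALIGNMENT = 8; // bytes; must be a power of two for the mask trick

    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) & ~(long) (ALIGNMENT - 1);
    }

    static long padding(long size) {
        return alignUp(size) - size;
    }

    public static void main(String[] args) {
        // An 8-byte header plus one int field is 12 bytes of content.
        System.out.println(alignUp(12)); // prints 16
        System.out.println(padding(12)); // prints 4
        System.out.println(padding(16)); // prints 0 (already aligned)
    }
}
```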

2.3> Access positioning of objects

Objects are created to be used, and a Java program manipulates specific objects on the heap through reference data on the stack. The Java Virtual Machine Specification defines the reference type only as a reference to an object; it does not define how the reference should locate and access the object's actual position in the heap, so the object access method also depends on the virtual machine implementation. The two mainstream access methods today are handles and direct pointers.

With handle access, a block of memory in the Java heap is set aside as a handle pool. The reference stores the address of the object's handle, and the handle contains the addresses of the object's instance data and of its type data.

With direct pointer access, the reference stores the object's address directly, and the layout of the object in the Java heap must account for how to reach the type data.

Each of the two access methods has its own advantages. The biggest advantage of handle access is that the reference stores a stable handle address. When an object is moved (which is very common during garbage collection), only the instance data pointer in the handle changes; the reference itself does not need to be modified.

The biggest advantage of direct pointer access is speed: it saves the cost of one pointer indirection. Because object access is so frequent in Java, this cost adds up to a considerable part of execution time. Sun HotSpot uses the second method, direct pointers, for object access. Across software development as a whole, however, it is also very common for languages and frameworks to use handles.
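The trade-off between the two access methods can be shown with a toy model in which the heap is an int array and an "object" is a single cell. Everything here (class name, array sizes, one-cell objects) is a simplification chosen for illustration. Reading through a handle costs two array lookups, but moving an object updates only the handle pool; a direct address is a single lookup, but it goes stale when the object moves.

```java
// Toy model: a heap of int cells plus a handle pool that maps handles to addresses.
class HandleHeap {
    private final int[] heap = new int[32];
    private final int[] handlePool = new int[4]; // handle index -> object address
    private int nextHandle;

    // Stores a value at the given address and returns a handle to it.
    int newObject(int address, int value) {
        heap[address] = value;
        handlePool[nextHandle] = address;
        return nextHandle++;
    }

    // Handle access: first look up the address, then read the object.
    int read(int handle) {
        return heap[handlePool[handle]];
    }

    // Direct access: the reference is the address itself.
    int readDirect(int address) {
        return heap[address];
    }

    // Simulates the collector moving an object: only the handle is updated.
    void move(int handle, int newAddress) {
        int oldAddress = handlePool[handle];
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = 0;
        handlePool[handle] = newAddress;
    }

    public static void main(String[] args) {
        HandleHeap h = new HandleHeap();
        int ref = h.newObject(3, 99);
        h.move(ref, 10);
        System.out.println(h.read(ref));       // prints 99: the handle still works
        System.out.println(h.readDirect(3));   // prints 0: the old direct address is stale
        System.out.println(h.readDirect(10));  // prints 99
    }
}
```

With direct pointers, a moving collector has to find and rewrite every reference to the moved object, which is the price paid for the faster access path.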

 
