How the Java virtual machine works: operating mechanism, memory management, garbage collection, and the four reference types

This article covers the following main points:

One, what the JVM is and what it does

Two, JVM operating mechanism

Three, Java memory management mechanism

Four, Java garbage collection

Five, Java garbage collection algorithms

Six, the four reference types in Java

One, what the JVM is and what it does

The JVM is a virtual computer that runs compiled Java code.

The Java virtual machine comprises a bytecode instruction set, a set of registers, a stack, a garbage-collected heap, and a method storage area.

Java source code is first compiled into .class bytecode files (each class declared in a source file produces its own .class file). The JVM then loads the resulting .class files, and its bytecode interpreter translates the JVM bytecode instructions into machine code for the particular machine it runs on.

Java source files -> compiler -> bytecode files -> JVM -> machine code

So the most important role of the JVM is to provide a matching interpreter for each operating system. As long as an operating system has a JVM built for it, code produced by the Java compiler can run there. This is why Java can be compiled once and run everywhere.

A JVM instance starts when a Java program begins executing, keeps running while the program runs, and stops when the program ends.

Two, JVM operating mechanism

1, JVM architecture

First, the Java compiler compiles Java source files (.java extension) into bytecode files (.class extension). The JVM's class loader then loads the bytecode of each class, and once loaded, the bytecode is handed to the JVM execution engine to run (a process that also includes compiling bytecode into machine code). Before the bytecode runs, it goes through four verification passes that ensure the class is well defined and type safe. At run time the JVM also checks for null references and out-of-bounds array access, and it performs automatic garbage collection. Throughout program execution, the JVM needs space to store the program's data and related information. This space is generally called the Runtime Data Area, which is what we usually mean by JVM memory.

Class loaders fall into three kinds (the layout described here is that of JDK 8 and earlier):

  • The bootstrap class loader does not inherit from ClassLoader; it is part of the virtual machine and is implemented in native code. It loads the Java core libraries, including all classes in jre/lib/rt.jar under JAVA_HOME;

  • The extension class loader loads the Java extension libraries from the JVM's extension directories, including the jar files in jre/lib/ext under JAVA_HOME or in the directories specified by -Djava.ext.dirs;

  • The application class loader, returned by ClassLoader.getSystemClassLoader(), loads the classes on the Java classpath.
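The parent chain of class loaders can be inspected from ordinary code. The following is a minimal sketch; the class name ClassLoaderHierarchyDemo and the helper loaderChain are illustrative and not part of any library. The concrete loader class names it prints differ between JDK versions.

```java
import java.util.ArrayList;
import java.util.List;

class ClassLoaderHierarchyDemo {

    /** Walks from the given loader up through its parents and records each one. */
    static List<String> loaderChain(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        // A null parent stands for the bootstrap class loader, which is part of the JVM itself.
        chain.add("bootstrap (represented as null)");
        return chain;
    }

    public static void main(String[] args) {
        // Start from the application class loader, which loads classes from the classpath.
        loaderChain(ClassLoader.getSystemClassLoader()).forEach(System.out::println);

        // Core classes such as String come from the bootstrap loader, so this prints null.
        System.out.println(String.class.getClassLoader());
    }
}
```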

The class loading process:

1, Loading: locate and load the binary data. The JVM obtains the binary byte stream that defines a class from the class's fully qualified name, converts the static storage structure represented by the byte stream into run-time data structures in the method area, and creates a java.lang.Class object in the Java heap to represent the class. This object serves as the entry point for accessing the class's data in the method area.

2, Verification: ensures that the information in the class file's byte stream meets the requirements of the current virtual machine. It proceeds in four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

3, Preparation: the stage that formally allocates memory for class variables (static variables) and sets their initial values. This memory is allocated in the method area.

4, Resolution: the stage in which the virtual machine replaces the symbolic references in the constant pool with direct references.

5, Initialization: initializes class variables and other resources according to the code the programmer wrote, that is, executes the class constructor <clinit>() method.
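That loading and initialization are separate stages can be observed with Class.forName, whose three-argument form lets the caller load a class without initializing it. This is a small sketch under that assumption; ClassInitDemo, Lazy, and EVENTS are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ClassInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static int value = 42;   // assigned during initialization, inside <clinit>
        static {
            EVENTS.add("Lazy.<clinit> ran");
        }
    }

    static List<String> run() throws ClassNotFoundException {
        String name = ClassInitDemo.class.getName() + "$Lazy";
        // initialize = false: the class is loaded, but <clinit> is not executed yet.
        Class.forName(name, false, ClassInitDemo.class.getClassLoader());
        EVENTS.add("after load: initialized=" + EVENTS.contains("Lazy.<clinit> ran"));
        // Reading a non-constant static field is a first active use and triggers initialization.
        int v = Lazy.value;
        EVENTS.add("after first use: value=" + v);
        return EVENTS;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        run().forEach(System.out::println);
        // after load: initialized=false
        // Lazy.<clinit> ran
        // after first use: value=42
    }
}
```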

Three, Java memory management mechanism

JVM memory is divided into five separate areas: heap memory, stack memory, the method area, the native method stack, and the program counter.

Heap memory:

       1, stores objects themselves, including the memory for their member variables

       2, its memory is allocated dynamically (while the program runs)

       3, a JVM has only one heap, and the data in it is shared by all threads

       4, heap memory is reclaimed by the JVM's GC

       5, it is the area most prone to OOM (OutOfMemoryError)

Stack memory:

       1, stores a method's local variables of primitive types and its references to objects. Only the reference is kept here; the object itself lives in heap memory.

       2, each thread has its own stack; the data in a stack is private and cannot be accessed from other stacks.

       3, a stack can hold multiple stack frames, and each stack frame corresponds to one method invocation.

       4, stack memory is released automatically when the method ends, and the memory created in it is reclaimed automatically

Every time a method is called, a stack frame is allocated for that call. The frame mainly stores the primitive-type variables the method defines and its return result. If the method creates an object, the object itself is allocated on the heap, while the reference to it is kept in the stack frame. When the method finishes, its stack frame is popped and reclaimed, the lifetime of all its local variables ends, and the references they held cease to exist. An object created in the method that has no other references pointing to it therefore becomes unreferenced, so the next time the GC scans, the object is treated as garbage and the memory it occupies is released.
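The difference between a value held in a stack frame and an object held on the heap shows up when both are passed to another method. A minimal sketch, with StackAndHeapDemo and bump as made-up names for the illustration:

```java
class StackAndHeapDemo {

    static void bump(int copy, int[] shared) {
        copy++;        // changes only the local copy in this method's own stack frame
        shared[0]++;   // changes the heap array that both frames refer to
    }

    static int[] demo() {
        int local = 1;        // primitive value stored in demo()'s frame
        int[] onHeap = {1};   // the reference is in the frame, the array is on the heap
        bump(local, onHeap);
        return new int[] {local, onHeap[0]};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("local = " + result[0]);   // local = 1
        System.out.println("heap  = " + result[1]);   // heap  = 2
    }
}
```

Once bump returns, its frame (including copy) is gone, while the array survives because demo still holds a reference to it.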

Native method stacks:

       The native method stack works much like the JVM stack. It stores the local variable table, the operand stack, and similar information for native methods (C / C++).

Method area:

      The method area stores data about classes that the virtual machine has already loaded:

     1, it is shared by all threads

     2, stored data: type information, constants, static variables, code compiled by the just-in-time (JIT) compiler, and the like.

Program Counter:

      In the JVM's conceptual model, the bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. Branches, loops, jumps, exception handling, thread resumption, and other basic functions all rely on the counter.

     JVM multithreading is implemented by threads taking turns and being allocated processor execution time. So that a thread can resume at the correct position after a switch, each thread has its own independent program counter.

Creating and initializing Java objects:

After a Java object is created, it has its own area in heap memory, and the next step is to initialize it. An object is typically initialized by a constructor, a special method that has the same name as the class and no return value. If a class defines no constructor, the compiler automatically generates a default constructor that takes no parameters. If you define any constructor (with or without parameters), the compiler no longer generates the default one. A constructor can be overloaded (with argument lists that differ in number, type, or order). One constructor can also call another, but only once, and the call must come first in the constructor body, otherwise the compiler reports an error.
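Constructor overloading and one constructor delegating to another with this(...) can be sketched as follows. ChainedPoint is a hypothetical class used only for this example.

```java
class ChainedPoint {
    final int x;
    final int y;

    // No-argument constructor: delegates to the two-argument one.
    // The this(...) call appears once, as the first statement.
    ChainedPoint() {
        this(0, 0);
    }

    // Overload with a different parameter list.
    ChainedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public static void main(String[] args) {
        ChainedPoint origin = new ChainedPoint();
        ChainedPoint p = new ChainedPoint(3, 4);
        System.out.println(origin.x + "," + origin.y);   // 0,0
        System.out.println(p.x + "," + p.y);             // 3,4
    }
}
```

Because ChainedPoint declares its own constructors, the compiler does not add a default one; removing the no-argument constructor would make `new ChainedPoint()` a compile error.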

How are class members initialized, and in what order? In Java, every variable must be properly initialized before it is used. A local variable of a method that is read without having been initialized causes a compile-time error. A member variable of a class, however, receives a default initial value from the system even if you never assign one: for example, char and int fields default to 0, and an uninitialized object reference defaults to null.
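The default values of fields can be checked directly. A short sketch, where DefaultValuesDemo and its field names are illustrative:

```java
class DefaultValuesDemo {
    int count;       // defaults to 0
    char letter;     // defaults to '\u0000', whose numeric value is 0
    boolean flag;    // defaults to false
    String name;     // object references default to null

    public static void main(String[] args) {
        DefaultValuesDemo d = new DefaultValuesDemo();
        System.out.println(d.count);          // 0
        System.out.println((int) d.letter);   // 0
        System.out.println(d.flag);           // false
        System.out.println(d.name);           // null

        // A local variable gets no default. Uncommenting the next two lines
        // fails to compile: "variable local might not have been initialized".
        // int local;
        // System.out.println(local);
    }
}
```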

Summary of the class member initialization order: static first, then instance members, then constructors; parent class before subclass; members at the same level run in the order they are written.

       1, the parent class's static variables and static code blocks run first, then the subclass's static variables and static code blocks
       2, next the parent class's instance variables and instance code blocks run, followed by the parent class constructor
       3, then the subclass's instance variables and instance code blocks run, followed by the subclass constructor
       4, static initialization happens before instance initialization; it is performed only when first needed and only once

Note: a subclass constructor, whether or not it takes arguments, calls the parent class's no-argument constructor by default. If the parent class has no no-argument constructor, the subclass constructor must explicitly call one of the parent's parameterized constructors with the super keyword; otherwise the code does not compile.
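The initialization order can be traced by logging from each block. The sketch below uses made-up classes Parent and Child; Parent deliberately has no no-argument constructor, so Child has to call super(...) explicitly.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("parent static block"); }
        { LOG.add("parent instance block"); }

        Parent(String name) {
            LOG.add("parent constructor");
        }
    }

    static class Child extends Parent {
        static { LOG.add("child static block"); }
        { LOG.add("child instance block"); }

        Child() {
            super("child");   // required: Parent has no no-argument constructor
            LOG.add("child constructor");
        }
    }

    static List<String> run() {
        new Child();
        return LOG;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
        // parent static block
        // child static block
        // parent instance block
        // parent constructor
        // child instance block
        // child constructor
    }
}
```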

Four, Java garbage collection

Java manages memory automatically, and this involves two main tasks: automatically allocating memory for objects and automatically reclaiming the memory that objects occupy. The memory region both tasks concern is the heap. Garbage collection is a prominent feature of the Java language: it helps prevent memory leaks and keeps memory in effective use, so Java programmers largely no longer need to manage memory by hand when writing programs.

The main garbage collection process: when a GC is triggered (which can happen at any time), the garbage collector scans the memory area, and if it finds an object that nothing references any longer, it reclaims the memory the object occupies. While the GC runs, all threads other than the GC threads are blocked, and they continue only after the triggered GC completes.

This process raises three questions:

1, which memory can be reclaimed (two classic algorithms decide whether an object can be reclaimed: reference counting and reachability analysis)

2, when memory is reclaimed (when the GC scans)

3, how memory is reclaimed (using a garbage collection algorithm, covered below)

Reclaiming the method area:

Memory reclamation in the method area mainly targets two things: recycling the constant pool and unloading types. Recycling unused constants is very similar to recycling objects in the Java heap. Take literals in the constant pool as an example: suppose the string "abc" has entered the constant pool, but no String object in the system currently has the value "abc". In other words, no String object references the "abc" constant in the pool, and nothing else references this literal either. If memory reclamation happens at this point and it is necessary, the "abc" constant is removed from the constant pool. Symbolic references to other classes (interfaces), methods, and fields in the constant pool are handled similarly.

Deciding whether a constant is an "unused constant" is relatively simple, whereas the conditions for deciding whether a class is a "useless class" are much stricter. A class must meet all three of the following conditions to be considered a "useless class":

  • All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap;

  • The ClassLoader that loaded the class has been reclaimed;

  • The java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed through reflection anywhere.

How to determine whether an object can be recycled?

1, reference counting: judging by the number of references to an object

The reference counting algorithm decides whether an object can be reclaimed by the number of references to it.

Reference counting was an early garbage collection strategy. In this approach, every object instance on the heap has a reference count. When an object is created and assigned to a reference variable, its reference count is set to 1. Whenever another variable is assigned a reference to the object, the count is incremented by 1 (for a = b, the counter of the instance that b refers to is incremented by 1). When a reference to the instance goes out of scope or is set to a new value, the count is decremented by 1. In particular, when an object instance is garbage collected, the reference counts of all the instances it references are decremented by 1. Any instance whose reference count is 0 can be garbage collected.

A reference counting collector runs quickly and is interleaved with program execution, which suits real-time environments where the program must not be interrupted for long. However, it has difficulty handling circular references between objects. In the sample program below, the reference counts of objA and objB never reach 0, so under reference counting these two objects could never be reclaimed.

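public class ReferenceCountingGC {

    public Object instance = null;

    public static void testGC() {
        ReferenceCountingGC objA = new ReferenceCountingGC();
        ReferenceCountingGC objB = new ReferenceCountingGC();

        // The two objects reference each other in a cycle, so under reference
        // counting the counts of objA and objB could never drop to 0.
        objB.instance = objA;
        objA.instance = objB;

        // Drop the only external references; both objects are now unreachable.
        objA = null;
        objB = null;

        // Request a collection (a hint only; the JVM decides whether to run it).
        System.gc();
    }

    public static void main(String[] args) {
        testGC();
    }
}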

The last two assignments set objA and objB to null, so the objects they pointed to can no longer be accessed. But because the two objects reference each other, their reference counters are not 0, and a reference counting garbage collector would never reclaim them.
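The counting itself can be simulated by hand to see why the cycle is never freed. This is a toy model, not how any JVM works internally: Node, retain, and release are hypothetical names, and the counts are maintained manually.

```java
class RefCountSketch {

    /** A toy object that tracks how many references point at it. */
    static class Node {
        final String name;
        int refCount;
        Node field;

        Node(String name) {
            this.name = name;
        }
    }

    /** Records a new reference to n and returns n. */
    static Node retain(Node n) {
        if (n != null) n.refCount++;
        return n;
    }

    /** Records that one reference to n has gone away. */
    static void release(Node n) {
        if (n != null) n.refCount--;
    }

    static int[] run() {
        Node a = retain(new Node("objA"));   // variable objA refers to it: count 1
        Node b = retain(new Node("objB"));   // variable objB refers to it: count 1
        b.field = retain(a);                 // objB.instance = objA: objA count 2
        a.field = retain(b);                 // objA.instance = objB: objB count 2
        release(a);                          // objA = null: objA count 1
        release(b);                          // objB = null: objB count 1
        return new int[] {a.refCount, b.refCount};
    }

    public static void main(String[] args) {
        int[] counts = run();
        // Neither count reaches 0, so a pure reference counting collector never frees them.
        System.out.println(counts[0] + " " + counts[1]);   // 1 1
    }
}
```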

2, reachability analysis: judging by whether a reference chain reaches the object

The reachability analysis algorithm decides whether an object can be reclaimed by whether a reference chain reaches it.

Reachability analysis comes from graph theory in discrete mathematics. The program treats all reference relationships as a graph and starts from a set of objects called "GC Roots", searching downward from these nodes. The paths traversed by the search are called reference chains. When no reference chain connects an object to any GC Root (in graph theory terms, the object is unreachable from the GC Roots), the object is proven to be unusable. In Java, the objects that can serve as GC Roots include the following:

  • Objects referenced from the virtual machine stack (the local variable table in stack frames);

  • Objects referenced by static class fields in the method area;

  • Objects referenced by constants in the method area;

  • Objects referenced by native methods in the native method stack.
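Reachability analysis amounts to a graph traversal from the roots. The sketch below models objects as strings and references as a map of edges; this representation and the names ReachabilitySketch and reachable are assumptions made for the illustration, not JVM data structures.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {

    /** Returns every object reachable from the roots by following reference edges. */
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.pop();
            if (seen.add(obj)) {   // first visit: follow its outgoing references
                work.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return seen;
    }

    static Set<String> demo() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("A"));      // a GC Root, e.g. a local variable
        refs.put("A", Arrays.asList("B"));
        refs.put("objA", Arrays.asList("objB"));   // a cycle that no root reaches
        refs.put("objB", Arrays.asList("objA"));
        return reachable(refs, Arrays.asList("root"));
    }

    public static void main(String[] args) {
        System.out.println(demo());   // [root, A, B]
    }
}
```

objA and objB reference each other but are absent from the result, so unlike reference counting, this approach treats the cycle as garbage.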

Five, Java garbage collection algorithms

1, mark-sweep algorithm

This is the most basic garbage collection algorithm: it is the easiest to implement and the simplest in concept. Mark-sweep has two phases, a mark phase and a sweep phase. The mark phase marks all the objects that need to be reclaimed, and the sweep phase reclaims the space the marked objects occupy.

The mark-sweep algorithm has two main problems:

  • Efficiency: neither the mark phase nor the sweep phase is very efficient;

  • Space: mark-sweep does not move objects and leaves surviving objects where they are, so sweeping produces a large number of discontiguous memory fragments. With too much fragmentation, a running program that later needs to allocate a large object may not find enough contiguous memory and has to trigger another garbage collection early.
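The fragmentation problem can be seen in a toy heap. In this sketch the heap is an array of slots, each holding the name of the object that occupies it, and the mark phase is assumed to have already produced the set of surviving objects. MarkSweepSketch and its methods are illustrative names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepSketch {

    /** Sweep phase: frees every slot whose owner did not survive. Nothing is moved. */
    static String[] sweep(String[] heap, Set<String> live) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && !live.contains(after[i])) {
                after[i] = null;
            }
        }
        return after;
    }

    /** Length of the longest run of contiguous free (null) slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "A", "B", "C", "C", "D", "E", "E"};
        Set<String> live = new HashSet<>(Arrays.asList("B", "D"));

        String[] after = sweep(heap, live);
        System.out.println(Arrays.toString(after));   // [null, null, B, null, null, D, null, null]
        System.out.println(largestFreeRun(after));    // 2
    }
}
```

Six of the eight slots are free after the sweep, yet an object needing three contiguous slots cannot be placed.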

2, copying algorithm

The copying algorithm was proposed to address the shortcomings of mark-sweep. It divides the available memory into two halves of equal size and uses only one of them at a time. When that half runs out, the surviving objects are copied to the other half, and the used half is then cleaned out in one pass, so memory fragmentation does not readily occur.

Advantages and disadvantages of the copying algorithm:

Advantages: it runs efficiently and is not prone to memory fragmentation

Disadvantages: it pays a high price in memory, because the usable memory is cut in half. Its efficiency also depends heavily on the number of surviving objects: if many objects survive, the efficiency of the copying algorithm drops sharply.
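A toy version of the copy step, under the same simplified model of a heap as an array of named slots (an assumption for illustration; CopyingSketch and collect are made-up names):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingSketch {

    /** Copies surviving objects from the active half into the idle half, packed from index 0. */
    static String[] collect(String[] from, Set<String> live) {
        String[] to = new String[from.length];
        int next = 0;
        for (String slot : from) {
            if (slot != null && live.contains(slot)) {
                to[next++] = slot;
            }
        }
        return to;   // the caller now treats the whole `from` half as free
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D"};
        Set<String> live = new HashSet<>(Arrays.asList("B", "D"));

        String[] to = collect(from, live);
        System.out.println(Arrays.toString(to));   // [B, D, null, null]
    }
}
```

The work done is proportional to the number of survivors, which is why the approach suits areas where few objects survive, and the free space in the new half is one contiguous block.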

3, mark-compact algorithm

The mark-compact algorithm was proposed to address the shortcomings of the copying algorithm and make full use of memory. Its mark phase is the same as in mark-sweep, but after marking it does not clean up the reclaimable objects directly. Instead, it moves the surviving objects toward one end and then clears the memory beyond the end boundary.
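The compaction step can be sketched on a toy heap modeled as an array of named slots, with the surviving set assumed to come from a previous mark phase. MarkCompactSketch and compact are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactSketch {

    /**
     * Slides surviving slots toward index 0 in place, clears everything past
     * the boundary, and returns the boundary (the first free index).
     */
    static int compact(String[] heap, Set<String> live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live.contains(heap[i])) {
                heap[boundary++] = heap[i];   // boundary <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, boundary, heap.length, null);
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "A", "B", "C", "C", "D", "E", "E"};
        Set<String> live = new HashSet<>(Arrays.asList("B", "D"));

        int boundary = compact(heap, live);
        System.out.println(Arrays.toString(heap));   // [B, D, null, null, null, null, null, null]
        System.out.println(boundary);                // 2
    }
}
```

All free space ends up in one contiguous block, and no second half of memory has to be kept idle.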

4, generational collection algorithm

In a large system, as more objects and method variables are created over time, the heap holds more and more objects, and analyzing them one by one to decide whether each can be reclaimed is inevitably inefficient. Generational collection rests on one observation: different objects have different life cycles (survival times). Objects with different life cycles are placed in different areas of the heap, and applying a different collection strategy to each area improves the JVM's efficiency. Contemporary commercial virtual machines all use generational collection: the young generation, where few objects survive, uses the copying algorithm; the old generation, where most objects survive, uses the mark-sweep or mark-compact algorithm. The Java heap is generally divided into three parts, the young generation, the old generation, and the permanent generation, as described below:

1) Young Generation

  The goal of the young generation is to collect short-lived objects as quickly as possible. Under normal circumstances, all newly created objects are first placed in the young generation. Its memory is divided in an 8:1:1 ratio into one Eden area and two survivor areas (survivor0 and survivor1), and most objects are created in Eden. During garbage collection, the live objects in Eden are first copied to survivor0 and Eden is emptied. When survivor0 is also full, the live objects in Eden and survivor0 are copied to survivor1, and Eden and survivor0 are emptied. survivor0 is now empty, so survivor0 and survivor1 swap roles (the next collection scans Eden and survivor1), which keeps one survivor area empty at all times, and the cycle repeats. In particular, when survivor1 does not have enough room for the live objects from Eden and survivor0, the live objects are placed directly in the old generation. If the old generation is also full, a Full GC is triggered, which collects both the young generation and the old generation. Note that a young generation GC is also called a Minor GC. Minor GCs occur relatively frequently, and they are not necessarily triggered only when Eden is full.
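The 8:1:1 split is simple arithmetic. The sketch below computes the three sizes for a given young generation size; the method name layout and the parameter survivorRatio (Eden size divided by the size of one survivor area, in the spirit of HotSpot's -XX:SurvivorRatio option) are assumptions for this example, and a real JVM may round or align the sizes differently.

```java
class YoungGenLayoutSketch {

    /** Splits a young generation of youngBytes into {eden, survivor0, survivor1}. */
    static long[] layout(long youngBytes, int survivorRatio) {
        // Eden is survivorRatio parts and each survivor area is 1 part,
        // so the whole young generation is (survivorRatio + 2) parts.
        long survivor = youngBytes / (survivorRatio + 2);
        long eden = youngBytes - 2 * survivor;
        return new long[] {eden, survivor, survivor};
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] parts = layout(100 * mb, 8);
        System.out.println("eden      = " + parts[0] / mb + " MB");   // eden      = 80 MB
        System.out.println("survivor0 = " + parts[1] / mb + " MB");   // survivor0 = 10 MB
        System.out.println("survivor1 = " + parts[2] / mb + " MB");   // survivor1 = 10 MB
    }
}
```

Since one survivor area is always kept empty, only 90% of the young generation holds objects at any moment, a far smaller cost than the 50% of a plain two-half copying scheme.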


2) Old Generation

  The old generation stores objects with longer life cycles. As described above, an object that is still alive in the young generation after N garbage collections is moved to the old generation. The old generation is also larger than the young generation (the ratio of young to old is roughly 1:2). When the old generation is full, a Major GC (Full GC) is triggered. Because objects in the old generation live a long time, Full GCs occur relatively infrequently.


3) Permanent Generation

  The permanent generation mainly stores static data such as Java classes and methods. It has no significant impact on garbage collection, but some applications generate or load classes dynamically, for example through reflection, dynamic proxies, or bytecode frameworks such as CGLib. In such cases the permanent generation needs to be set relatively large to hold the classes added at run time. (Since JDK 8, HotSpot has replaced the permanent generation with Metaspace, which is allocated in native memory.)

Six, the four reference types in Java

1, strong references

Strong references are everywhere in program code: they are references like "Object obj = new Object()". As long as a strong reference exists, the garbage collector never reclaims the referenced object. When memory is insufficient, the JVM would rather throw an OutOfMemoryError than reclaim strongly referenced objects. If you want the JVM to reclaim a strongly referenced object, set its reference to null, and the JVM will reclaim the object at an appropriate time, provided no other strong references to it remain.

2, soft references

Soft references describe objects that are useful but not required. During a GC scan, objects associated only with soft references are reclaimed only when memory is insufficient. If memory is still insufficient after that reclamation, an out-of-memory error is thrown. Since JDK 1.2, the SoftReference class has been provided to implement soft references.

Soft references are well suited to caching. While memory is sufficient, values are obtained directly through the soft reference without querying the real data source, which can significantly improve website performance. When memory runs short, the JVM is allowed to reclaim the objects, which evicts them from the cache, and only then is the data queried from the real source again.
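A cache along these lines can be sketched with java.lang.ref.SoftReference. SoftCache and its loader function (which stands in for the "real data source") are hypothetical names; this minimal version is not thread-safe and never removes map entries whose values have been reclaimed.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();
    private final Function<K, V> loader;   // stands in for the real data source

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        // get() returns null if the value was never cached or the GC has reclaimed it.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);                  // fall back to the real source
            map.put(key, new SoftReference<>(value));   // cache it softly
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<>(key -> {
            System.out.println("loading " + key + " from the real source");
            return "value-of-" + key;
        });
        String first = cache.get("a");    // loads from the source
        String second = cache.get("a");   // served from the cache while `first` is strongly held
        System.out.println(first == second);   // true
    }
}
```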

3, weak references

Weak references also describe non-essential objects, but they are weaker than soft references: an object associated only with weak references survives only until the next garbage collection. When the garbage collector runs, it reclaims objects associated only with weak references regardless of whether memory is sufficient. Since JDK 1.2, the WeakReference class has been provided to implement weak references.

As can be seen, an object associated only with a weak reference is reclaimed the next time the garbage collector runs.

Weak references can prevent memory leaks in callbacks. A callback is often an anonymous inner class, and a non-static inner class holds a strong reference to its outer class instance by default. If a thread still holds the callback at the moment the outer object should be reclaimed, the JVM cannot collect the outer object, which results in a memory leak.
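One way to apply this is to make the callback a static nested class that holds its target through a WeakReference. The sketch below uses made-up classes Screen and TitleCallback; whether and when the GC actually clears the reference depends on the JVM, so the example only relies on the behavior while the target is still strongly held.

```java
import java.lang.ref.WeakReference;

class WeakCallbackSketch {

    /** Hypothetical "outer" object that should stay collectable. */
    static class Screen {
        String title = "home";
    }

    /** A static nested callback: it has no implicit strong reference to any outer instance. */
    static class TitleCallback implements Runnable {
        private final WeakReference<Screen> screenRef;
        String lastSeen = "(none)";

        TitleCallback(Screen screen) {
            this.screenRef = new WeakReference<>(screen);
        }

        @Override
        public void run() {
            Screen screen = screenRef.get();   // null once the Screen has been collected
            lastSeen = (screen == null) ? "(screen gone)" : screen.title;
        }
    }

    public static void main(String[] args) {
        Screen screen = new Screen();
        TitleCallback callback = new TitleCallback(screen);
        callback.run();
        // `screen` is still strongly referenced here, so the callback can see it.
        System.out.println(callback.lastSeen.equals(screen.title));   // true
    }
}
```

A thread may keep the callback alive for as long as it likes; the Screen becomes collectable as soon as its own strong references are gone.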

4, phantom references

A phantom reference is the weakest kind of reference. Whether an object has a phantom reference has no effect at all on its lifetime, and the object may be reclaimed by the virtual machine at any time. A phantom reference cannot be used on its own; it must be used together with a reference queue. Since JDK 1.2, the PhantomReference class has been provided to implement phantom references.

When the garbage collector is about to reclaim an object and finds that the object has a phantom reference, it adds the phantom reference to the associated reference queue. The program can check whether the phantom reference has been enqueued to learn whether the referenced object is about to be reclaimed, and if so, do any necessary cleanup work.
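The pairing of a phantom reference with a reference queue looks like this in code. PhantomReferenceDemo and track are illustrative names. System.gc() is only a request, so the sketch does not assume the reference is enqueued within the wait.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class PhantomReferenceDemo {

    /** Registers a phantom reference for the resource with the given queue. */
    static PhantomReference<Object> track(Object resource, ReferenceQueue<Object> queue) {
        return new PhantomReference<>(resource, queue);
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> phantom = track(resource, queue);

        // A phantom reference never hands the object back: get() always returns null.
        System.out.println(phantom.get());   // null

        resource = null;   // drop the only strong reference
        System.gc();       // a hint; the JVM may or may not collect right away

        // Wait up to one second for the collector to enqueue the reference.
        Reference<?> enqueued = queue.remove(1000);
        if (enqueued == phantom) {
            System.out.println("resource is being reclaimed: run cleanup here");
        } else {
            System.out.println("not enqueued within the timeout");
        }
    }
}
```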

 

Origin blog.csdn.net/huyinda/article/details/104769688