Shang Silicon Valley-Song Hongkang-JVM Complete Notes-JVM Part 1_Memory and Garbage Collection

Foreword

1. JVM and Java architecture

1. Introduction to Java and JVM

TIOBE Language Popularity Ranking
https://www.tiobe.com/tiobe-index/
There is no best programming language in the world, only the most suitable programming language for specific application scenarios.

  • Java: a cross-platform language
  • JVM: a cross-language platform
  • With the official release of Java 7, the designers of the Java virtual machine, through the JSR-292 specification, made it largely possible to run programs written in non-Java languages on the Java virtual machine platform.
  • The Java virtual machine does not care which programming language the program it runs was written in. It only cares about the "bytecode" file. In other words, the Java virtual machine is language-independent and is not bound for life to the Java language: as long as the compiled output of another language conforms to the JVM's internal instruction set, symbol table, and other auxiliary information, it is a valid bytecode file that the virtual machine can recognize, load, and run.
  • Bytecode
  • The Java bytecode we usually talk about is bytecode compiled from the Java language. Strictly speaking, any bytecode that can execute on the JVM platform has the same format, so it should collectively be called JVM bytecode.
  • Different compilers can produce bytecode files in the same format, and a bytecode file can also run on different JVMs.
  • The Java virtual machine has no necessary connection to the Java language. It is tied only to a specific binary file format, the Class file format. A Class file contains the Java virtual machine instruction set (also called bytecode, Bytecodes), a symbol table, and some other auxiliary information.
  • Multi-language mixed programming
  • Multi-language mixed programming on the Java platform is becoming mainstream, and solving domain-specific problems with domain-specific languages is one direction in which software development is responding to increasingly complex project requirements.
  • Imagine a project in which parallel processing is written in Clojure, the presentation layer uses JRuby/Rails, and the middle layer is written in Java. Each application layer is implemented in a different programming language, and the interfaces are transparent to the developers of each layer. The languages interact without difficulty, as conveniently as using the native API of one's own language, because in the end they all run on one virtual machine.
  • For languages other than Java that run on the Java virtual machine, system-level, low-level support is growing rapidly. A series of projects and feature improvements centered on JSR-292 (such as the Da Vinci Machine project, the Nashorn engine, the invokedynamic instruction, and the java.lang.invoke package) are driving the Java virtual machine's evolution from a "Java language virtual machine" to a "multi-language virtual machine".
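As a small illustration of the java.lang.invoke package mentioned above, the following sketch looks up String.length() as a method handle and calls it. The class name MethodHandleDemo and the helper lengthViaHandle are made up for this example; the MethodHandles, MethodHandle, and MethodType calls are standard JDK API (Java 7 and later).

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleDemo {
    // Hypothetical helper: calls String.length() through a method handle.
    static int lengthViaHandle(String s) throws Throwable {
        MethodHandle length = MethodHandles.lookup()
                .findVirtual(String.class, "length", MethodType.methodType(int.class));
        // invokeExact requires the call site to match (String) -> int exactly.
        return (int) length.invokeExact(s);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(lengthViaHandle("bytecode")); // 8
    }
}
```

Method handles are one of the building blocks that dynamic languages on the JVM can use to link calls at run time instead of at compile time.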

2. Major events in the development of Java

  • OpenJDK and Oracle JDK

Before JDK 11, Oracle JDK contained some closed-source features that were not in OpenJDK. From JDK 11 onward, the OpenJDK and Oracle JDK code bases can be considered essentially identical.

3. Virtual machine and Java virtual machine

  • Virtual machine
  • A virtual machine (Virtual Machine) is a virtual computer: a piece of software that executes a series of virtual computer instructions. Broadly, virtual machines can be divided into system virtual machines and program virtual machines.
  • The well-known VirtualBox and VMware are system virtual machines. They fully simulate a physical computer and provide a software platform that can run a complete operating system.
  • The typical program virtual machine is the Java virtual machine, which is designed specifically to execute a single computer program. The instructions executed in the Java virtual machine are called Java bytecode instructions.
  • Whether it is a system virtual machine or a program virtual machine, the software running on it is limited to the resources the virtual machine provides.
  • Java virtual machine
  • The Java virtual machine is a virtual computer that executes Java bytecode. It has its own independent operating mechanism, and the bytecode it runs is not necessarily compiled from the Java language.
  • All languages on the JVM platform share the cross-platform portability, excellent garbage collectors, and reliable just-in-time compilers that the Java virtual machine provides.
  • The core of Java technology is the Java Virtual Machine (JVM), because all Java programs run inside it.

- Function:

  • The Java virtual machine is the runtime environment for binary bytecode. It loads bytecode and interprets or compiles it into machine instructions for the underlying platform. Every Java instruction is defined in detail in the Java Virtual Machine Specification: how operands are fetched, how they are processed, and where the result is placed.

- Features:

  1. Compile once, run everywhere.
  2. Automatic memory management.
  3. Automatic garbage collection.
  • The location of the JVM

The JVM runs on top of the operating system and has no direct interaction with the hardware.

4. The overall structure of the JVM

  • HotSpot VM is one of the representatives of high-performance virtual machines currently on the market.
  • It adopts an architecture where an interpreter and a just-in-time compiler coexist.
  • Today, the runtime performance of Java programs has been transformed and can compete with that of C/C++ programs.

How would you describe your overall understanding of the JVM?

  • Class loading subsystem
  • Runtime data area (here we focus on the stack, the heap, and the method area)
  • Execution engine (interpreter and JIT compiler coexist)
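To see which virtual machine and execution mode a program is actually running on, the standard system properties can be printed. This is a minimal sketch: VmInfoDemo and vmName are names chosen for the example, java.vm.name is a standard system property, and java.vm.info is assumed to be populated the way HotSpot does it (it may be null on other VMs).

```java
class VmInfoDemo {
    // Name of the running virtual machine, e.g. a HotSpot-based VM.
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println("VM name: " + vmName());
        // On HotSpot this usually mentions "mixed mode", i.e. interpreter plus JIT compiler.
        System.out.println("VM info: " + System.getProperty("java.vm.info"));
    }
}
```

On HotSpot, starting the program with -Xint (interpreter only) or -Xcomp (compile first) changes the reported mode accordingly.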

5. Java code execution process

[Figure: the Java code execution process]

6. Architecture model of JVM

The instruction stream output by the Java compiler is essentially a stack-based instruction set architecture; the other kind of instruction set architecture is register-based. The differences between the two architectures are as follows:

  • Features of the stack-based architecture:
  1. Simpler to design and implement, suitable for resource-constrained systems.
  2. Avoids the register allocation problem: instructions are issued in zero-address form.
  3. Most instructions in the instruction stream are zero-address instructions whose execution relies on the operand stack. The instruction set is smaller and the compiler is easy to implement.
  4. Needs no hardware support, so it is more portable and easier to implement across platforms.
  • Features of the register-based architecture:
  1. Typical examples are the x86 binary instruction set, as on traditional PCs, and Android's Dalvik virtual machine.
  2. The instruction set architecture depends entirely on the hardware and has poor portability.
  3. Excellent performance and more efficient execution.
  4. Completing an operation takes fewer instructions.
  • In most cases, register-based instruction sets are dominated by one-address, two-address, and three-address instructions, whereas stack-based instruction sets are dominated by zero-address instructions.
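The zero-address style can be seen by compiling a tiny method and disassembling it with javap -c. The sketch below uses a made-up class StackArchDemo; the bytecode in the comment is what javac typically emits for this method, and exact listings may differ slightly between compiler versions.

```java
class StackArchDemo {
    static int add() {
        int i = 2;
        int j = 3;
        int k = i + j;
        return k;
    }

    // "javap -c StackArchDemo" typically shows for add():
    //   iconst_2    push the constant 2 onto the operand stack
    //   istore_0    pop it into local variable 0 (i)
    //   iconst_3    push the constant 3
    //   istore_1    pop it into local variable 1 (j)
    //   iload_0     push i
    //   iload_1     push j
    //   iadd        pop both, push their sum (no operand addresses in the instruction)
    //   istore_2    pop the sum into local variable 2 (k)
    //   iload_2     push k
    //   ireturn     return the top of the stack
    public static void main(String[] args) {
        System.out.println(add()); // 5
    }
}
```

A register-based machine could express the same addition in one or two instructions that name their operands, which is the trade-off described above: fewer instructions, but an instruction set tied to the register layout.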

Summary:

  • Because Java is designed to be cross-platform, its instructions are stack-based. Different platforms have different CPU architectures, so the instruction set cannot be designed around registers. The advantages are portability, a small instruction set, and a compiler that is easy to implement. The disadvantage is lower performance: achieving the same function requires more instructions.
  • Today the embedded platform is no longer the mainstream platform for Java programs (more precisely, the host environment of the HotSpot VM is no longer limited to embedded platforms), so why not switch to a register-based architecture?

7. JVM life cycle

Starting the virtual machine:

  • The Java virtual machine starts by having the bootstrap class loader create an initial class, which is specified by the particular implementation of the virtual machine.

Running the virtual machine:

  • A running Java virtual machine has one clear mission: to execute Java programs.
  • It starts running when the program starts executing and stops when the program ends.
  • When a so-called Java program is executing, what is actually running is a process called the Java virtual machine.

Exit of the virtual machine:

The virtual machine exits in the following situations:

  • The program ends normally.
  • The program terminates abnormally because it encounters an exception or error during execution.
  • The Java virtual machine process is terminated because of an operating system error.
  • A thread calls the exit method of the Runtime class or the System class, or the halt method of the Runtime class, and the Java security manager permits the exit or halt operation.
  • In addition, the JNI (Java Native Interface) specification describes how the JVM exits when the JNI Invocation API is used to load or unload the Java virtual machine.
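The difference between exit and halt can be sketched with a shutdown hook. ExitDemo and registerHook are names invented for this example; Runtime.addShutdownHook, Runtime.exit, and Runtime.halt are standard JDK methods. The sketch assumes no security manager is installed.

```java
class ExitDemo {
    // Registers a hook that the JVM runs during an orderly shutdown.
    static Thread registerHook() {
        Thread hook = new Thread(() -> System.out.println("shutdown hook running"));
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerHook();
        System.out.println("main is about to call exit");
        // exit() starts the shutdown sequence, so the hook above runs.
        // System.exit(0) is equivalent to Runtime.getRuntime().exit(0).
        Runtime.getRuntime().exit(0);
        // Runtime.getRuntime().halt(0) would instead terminate the JVM
        // immediately, without running shutdown hooks.
    }
}
```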

8. JVM development history

  • Sun Classic VM
  • As early as 1996, when Java 1.0 was released, Sun shipped a Java virtual machine called Sun Classic VM. It was the world's first commercial Java virtual machine, and it was retired completely in JDK 1.4.
  • This virtual machine provides only an interpreter internally.
  • To use a JIT compiler, you had to plug one in. But once the JIT compiler was in use, it took over the virtual machine's execution system and the interpreter no longer worked: the interpreter and the compiler could not work together.
  • HotSpot now has this virtual machine built in.
  • Exact VM
  • To solve the problems of the previous virtual machine, Sun provided this virtual machine in JDK 1.2.
  • Exact Memory Management: exact memory management.
  • It can also be called Non-Conservative/Accurate Memory Management.
  • The virtual machine knows what type of data is stored at a given location in memory.
  • It already had the rudiments of a modern high-performance virtual machine:
  1. Hot-spot detection
  2. A mixed working mode of compiler and interpreter
  • It was used only briefly, on the Solaris platform; other platforms still used the Classic VM.
  1. It was short-lived and was eventually replaced by the HotSpot virtual machine.
  • Sun's HotSpot VM
  • History of HotSpot
  1. It was originally designed by a small company called "Longview Technologies".
  2. In 1997 the company was acquired by Sun, and in 2009 Sun was acquired by Oracle.
  3. In JDK 1.3, HotSpot VM became the default virtual machine.
  • HotSpot currently holds a dominant position in the market.
  1. Whether in JDK 6, which is still widely used, or JDK 8, which is used even more widely, the default virtual machine is HotSpot.
  2. It is the default virtual machine of both Sun/Oracle JDK and OpenJDK.
  3. Therefore, the virtual machine discussed by default in this course is HotSpot.
  • It is used everywhere from servers and desktops to mobile and embedded devices.
  1. It uses counters to find the code most worth compiling, then triggers just-in-time compilation or on-stack replacement.
  2. The compiler and interpreter work together to balance fast program response time against the best execution performance.
  • BEA's JRockit
  • Focused on server-side applications
  1. It can pay less attention to program startup speed, so JRockit contains no interpreter; all code is compiled by a just-in-time compiler and then executed.
  • Numerous industry benchmarks show that the JRockit JVM is the fastest JVM in the world.
  1. Users of JRockit products have seen significant performance improvements (typically over 70%) and hardware cost reductions (50%).
  • Strength: a comprehensive portfolio of Java runtime solutions
  1. JRockit Real Time, JRockit's solution for latency-sensitive applications, provides JVM response times in milliseconds or microseconds, which suits financial, military command, and telecommunications network applications.
  2. The MissionControl service suite is a set of tools for monitoring, managing, and analyzing applications in production with very low overhead.
  • In 2008, BEA was acquired by Oracle.
  • Oracle stated its intention to integrate the strengths of the two virtual machines, work that was largely completed in JDK 8. The integration ports JRockit's best features onto HotSpot.
  • IBM J9
  • Full name: IBM Technology for Java Virtual Machine, abbreviated IT4J; internal code name: J9.
  • Its market positioning is close to HotSpot's: a multi-purpose VM for server-side, desktop, and embedded use.
  • Widely used in IBM's own Java products.
  • It is currently one of the three most influential commercial virtual machines and is also claimed to be the fastest Java virtual machine in the world.
  • Around 2017, IBM open-sourced the J9 VM under the name OpenJ9 and handed it over to the Eclipse Foundation, where it is known as Eclipse OpenJ9.
  • KVM and CDC/CLDC HotSpot
  • Oracle's two virtual machines in the Java ME product line are the CDC/CLDC HotSpot Implementation VMs.
  • KVM (Kilobyte) is an early product that preceded CLDC-HI.
  • Their position in the mobile field is now awkward, since the smartphone market is split between Android and iOS.
  • KVM is simple, lightweight, and highly portable, and keeps its own market on lower-end devices:
  1. Intelligent controllers and sensors
  2. Mobile phones for the elderly and feature phones in economically underdeveloped regions
  • The principle shared by all virtual machines: compile once, run everywhere.
  • Azul VM
  • The three "high-performance Java virtual machines" above run on general-purpose hardware platforms.
  • Azul VM and BEA Liquid VM, by contrast, are bound to specific hardware platforms: dedicated virtual machines in which hardware and software work together.
  • They are the elite among high-performance Java virtual machines.
  • Azul VM is a Java virtual machine that Azul Systems built by heavily modifying HotSpot; it runs on Azul Systems' proprietary Vega hardware.
  • Each Azul VM instance can manage hardware resources of at least dozens of CPUs and hundreds of GB of memory. It provides a garbage collector with controllable GC pauses over a huge memory range, thread scheduling optimized for the proprietary hardware, and similar features.
  • In 2010, Azul Systems began shifting from hardware to software and released its own Zing JVM, which provides features close to those of the Vega system on general-purpose x86 platforms.
  • Liquid VM
  • One of the elite among high-performance Java virtual machines.
  • Developed by BEA, it runs directly on BEA's own hypervisor.
  • Liquid VM is what is now called JRockit VE (Virtual Edition). It needs no operating system support; put differently, it implements the necessary functions of a dedicated operating system itself, such as thread scheduling, the file system, and network support.
  • With the end of JRockit development, the Liquid VM project was also discontinued.
  • Apache Harmony
  • Apache also launched Apache Harmony, a Java runtime platform compatible with JDK 1.5 and JDK 1.6.
  • It was an open-source JVM jointly developed by IBM and Intel. Squeezed by OpenJDK, which is also open source, and with Sun firmly refusing to let Harmony obtain JCP certification, it was finally retired in 2011, and IBM turned to participating in OpenJDK.
  • Although Apache Harmony never saw large-scale commercial use, its Java class library code was absorbed into the Android SDK.
  • Microsoft JVM
  • Microsoft developed the Microsoft JVM to support Java Applets in the IE3 browser.
  • It ran only on the Windows platform, but at the time it was the best-performing Java VM on Windows.
  • In 1997, Sun sued Microsoft for trademark infringement and unfair competition, and Microsoft paid Sun a large sum in compensation. Microsoft removed its VM in Windows XP SP3. Today the JDK installed on Windows uses HotSpot.
  • TaobaoJVM
  • Released by the AliJVM team. Alibaba is the strongest user of Java in China. Its business covers cloud computing, finance, logistics, e-commerce, and many other fields, and it has to solve the combined problems of high concurrency, high availability, and distributed systems. It has released a large number of open-source products.
  • Based on OpenJDK, it developed its own customized version, AlibabaJDK, abbreviated AJDK. It is the cornerstone of Alibaba's entire Java stack.
  • It is built on the OpenJDK HotSpot VM and is the first optimized, deeply customized, open-source, high-performance server-edition Java virtual machine released in China.
  1. The innovative GCIH (GC invisible heap) technology implements off-heap storage: Java objects with long life cycles are moved from the heap to outside the heap, and the GC does not manage the Java objects inside GCIH. This reduces how often the GC collects and improves its collection efficiency.
  2. Objects in GCIH can also be shared among multiple Java virtual machine processes.
  3. It uses the crc32 instruction to implement JVM intrinsics, reducing the overhead of JNI calls.
  4. A Java profiling tool and diagnostic assistance based on PMU hardware.
  5. ZenGC for big data scenarios.
  • Taobao VM performs well in Alibaba's products, but it depends heavily on Intel CPUs at the hardware level, trading compatibility for performance.
  1. It is now deployed on Taobao and Tmall, where it has fully replaced the official Oracle JVM.

2. Class loading subsystem

1. The role of the class loader subsystem

  • The class loader subsystem loads Class files from the file system or the network. A Class file carries a specific file identifier at its beginning.
  • The ClassLoader is responsible only for loading class files; whether they can run is decided by the Execution Engine.
  • The loaded class information is stored in a memory area called the method area. Besides class information, the method area also stores runtime constant pool information, which may include string literals and numeric constants (this constant information is the in-memory mapping of the constant pool section of the Class file).

2. The role of the ClassLoader

  • The class file exists on the local disk. Think of it as a template the designer has drawn on paper; when the program runs, the template is loaded into the JVM, which instantiates n identical instances from it.
  • Once loaded into the JVM, the class file is called the DNA metadata template and is placed in the method area.
  • Going from the .class file to the JVM to the final metadata template requires a transport tool, the Class Loader, which plays the role of a courier.

3. Class loading process

3.1 Loading phase

  • Obtain the binary byte stream that defines the class, using its fully qualified name.
  • Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.
  • Generate a java.lang.Class object representing this class in memory, as the access entry for the class's various data in the method area.
  • After the class loader has loaded the .class file into the metaspace, a java.lang.Class object is created in the heap to encapsulate the class's data structure in the method area. The Class object is created during class loading, and each class corresponds to one object of type Class.
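The point that each loaded class has a single Class object can be checked directly. In this minimal sketch, ClassObjectDemo and sameClassObject are illustrative names; getClass() and the .class literal are standard Java.

```java
class ClassObjectDemo {
    // Two different String instances still share one java.lang.Class object.
    static boolean sameClassObject() {
        String a = "first";
        String b = new String("second");
        return a.getClass() == b.getClass() && a.getClass() == String.class;
    }

    public static void main(String[] args) {
        System.out.println(sameClassObject());       // true
        System.out.println(String.class.getName());  // java.lang.String
    }
}
```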

Supplement: ways to load .class files

  • Load directly from the local system
  • Obtain over the network; typical scenario: Web Applet
  • Read from a zip archive, which later became the basis of the jar and war formats
  • Generate at runtime by computation; the most common use is dynamic proxy technology
  • Generate from other files; typical scenario: JSP applications
  • Extract .class files from a proprietary database, which is relatively rare
  • Obtain from encrypted files, a typical protection against decompiling Class files
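The "generated at runtime" source in the list above can be illustrated with a JDK dynamic proxy, whose class has no .class file on disk and is created while the program runs. This is a sketch only: RuntimeGeneratedClassDemo, Greeter, and newGreeter are hypothetical names, while java.lang.reflect.Proxy and InvocationHandler are standard JDK API.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class RuntimeGeneratedClassDemo {
    interface Greeter {
        String greet(String name);
    }

    // Asks the JDK to generate and load a class implementing Greeter at run time.
    static Greeter newGreeter() {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("greet")) {
                return "hello, " + args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));                   // hello, JVM
        System.out.println(Proxy.isProxyClass(greeter.getClass())); // true
        // The generated class name is chosen by the JDK and varies between runs and versions.
        System.out.println(greeter.getClass().getName());
    }
}
```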

3.2 Linking

3.2.1 Verify

  • The purpose is to ensure that the information in the Class file's byte stream meets the requirements of the current virtual machine, so that the loaded class is correct and does not endanger the virtual machine itself.
  • It mainly includes four kinds of verification: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
  • Format check: whether the file starts with the magic number 0xCAFEBABE, whether the major and minor versions are within the range supported by the current Java virtual machine, whether each data item has the correct length, and so on.
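The first items of the format check can be reproduced by hand. The sketch below reads the first eight bytes of the class file for java.lang.Object; ClassFileHeaderDemo and readObjectClassHeader are made-up names, and the code assumes the runtime exposes Object.class as a readable resource, which is the usual case.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassFileHeaderDemo {
    // Returns {magic, minor_version, major_version} from the class file of java.lang.Object.
    static int[] readObjectClassHeader() throws IOException {
        try (InputStream raw = Object.class.getResourceAsStream("Object.class")) {
            if (raw == null) {
                throw new IOException("Object.class is not readable as a resource on this runtime");
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // 4 bytes, big-endian
            int minor = in.readUnsignedShort();  // 2 bytes
            int major = in.readUnsignedShort();  // 2 bytes
            return new int[] { magic, minor, major };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] header = readObjectClassHeader();
        // The magic number prints as 0xCAFEBABE; the version numbers depend on the JDK in use.
        System.out.println(String.format("magic=0x%08X minor=%d major=%d",
                header[0], header[1], header[2]));
    }
}
```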

3.2.2 Prepare

  • Allocate memory for class variables and set each variable's default initial value, that is, its zero value.
  • This does not include static fields declared final: a final constant is assigned at compile time and is explicitly initialized during the preparation phase.
  • Instance variables are not allocated or initialized here. Class variables are allocated in the method area, while instance variables are allocated in the Java heap along with the object.
  • Note: the JVM does not support the boolean type directly. Internally, boolean is implemented as int. Since the default value of int is 0, the default value of boolean is correspondingly false.
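The zero values set during preparation are visible for any static field that has no explicit initializer. A minimal sketch, with PrepareDefaultsDemo and its field names chosen only for illustration:

```java
class PrepareDefaultsDemo {
    // No explicit initializers: these fields keep the zero values set during preparation.
    static int count;
    static boolean flag;
    static double ratio;
    static Object ref;

    public static void main(String[] args) {
        System.out.println(count); // 0
        System.out.println(flag);  // false
        System.out.println(ratio); // 0.0
        System.out.println(ref);   // null
    }
}
```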

3.2.3 Resolve

  • The process of converting symbolic references in the constant pool into direct references.
  • In practice, resolution often takes place after the JVM has finished initialization.
  • A symbolic reference is a set of symbols that describes the referenced target. The literal form of symbolic references is clearly defined in the Class file format of the "Java Virtual Machine Specification". A direct reference is a pointer straight to the target, a relative offset, or a handle that locates the target indirectly.
  • Resolution mainly applies to classes or interfaces, fields, class methods, interface methods, method types, and so on, corresponding to CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, etc. in the constant pool.
  • Symbolic references include: fully qualified names of classes and interfaces, names and descriptors of fields, and names and descriptors of methods.

What are symbolic references and direct references?

  1. A classroom has an empty seat with a sign on it that says "Xiao Ming's seat" (symbolic reference). Xiao Ming then comes in, sits down, and throws the sign away (the symbolic reference is replaced by a direct reference).
  2. When we cook, we read the steps in the recipe (the symbolic reference); when we actually cook, the process we carry out is the direct reference.
  3. Example: the bytecode for the output operation System.out.println():
    invokevirtual #24 <java/io/PrintStream.println>

Take methods as an example: the Java virtual machine prepares a method table for each class and lists all of the class's methods in it. To call a method of a class, you only need to know the method's offset in the method table to call it directly. Resolution converts the symbolic reference into the position of the target method in the class's method table, so that the method can be called.

3.3 Initialization

  • Assign the class variables their correct initial values.
  • The initialization phase is the process of executing the class initializer method <clinit>().
public class ClassInitTest {
    private static int num = 1; // assignment to a class variable

    // statements in the static code block
    static {
        num = 2;
        number = 20;
        System.out.println(num);
        // System.out.println(number); // error: illegal forward reference
    }

    // Linking (prepare): number = 0 --> initialization: 20 --> 10
    private static int number = 10;

    public static void main(String[] args) {
        System.out.println(ClassInitTest.num);    // 2
        System.out.println(ClassInitTest.number); // 10
    }
}

  • This method does not need to be defined by hand. The javac compiler generates it automatically by collecting the assignments to all class variables in the class and merging them with the statements in the static code blocks.
  • The instructions in the <clinit>() method execute in the order in which the statements appear in the source file.
  • <clinit>() is different from the class's constructor. (Note: from the virtual machine's perspective, a constructor is <init>().)
  • If the class has a parent class, the JVM guarantees that the parent's <clinit>() has finished executing before the child's <clinit>() executes (parent before child, static first).
public class ClinitTest1 {

    static class Father {
        public static int A = 1;
        static {
            A = 2;
        }
    }

    static class Son extends Father {
        public static int B = A;
    }

    public static void main(String[] args) {
        // This prints 2, which shows that the parent class has been fully initialized first
        System.out.println(Son.B);
    }

}
  • The virtual machine must ensure that the <clinit>() method of a class is locked and synchronized when multiple threads are involved.
  • The Java compiler does not generate a <clinit>() initialization method for every class. Which classes, once compiled, have no <clinit>() method in their bytecode file?
  1. A class that declares no class variables and has no static code block
  2. A class that declares class variables but neither initializes them with explicit initialization statements nor uses a static code block to initialize them
  3. A class whose fields of primitive type are declared static final and whose initialization statements are compile-time constant expressions (a static final field whose assignment involves no method or constructor call is assigned during the linking phase)
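/**
 * @author TANGZHI
 * @create 2021-01-01 18:49
 * Scenarios in which the Java compiler does not generate a <clinit>() method
 */

public class InitializationTest1 {
    // Scenario 1: non-static fields never cause <clinit>() to be generated,
    // whether or not they are explicitly assigned
    public int num = 1;

    // Scenario 2: a static field without an explicit assignment does not cause <clinit>() to be generated
    public static int num1;

    // Scenario 3: a field of primitive type declared static final and assigned a compile-time
    // constant does not cause <clinit>() to be generated
    public static final int num2 = 1;
}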
  • The combination of static and final: for a field declared static final, an explicit assignment of a primitive or String value that involves no method or constructor call is carried out in the preparation step of the linking phase.

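import java.util.Random;

/**
 * @author TANGZHI
 * @create 2021-01-01
 *
 * Question: in which phase is the explicit assignment of a static final field carried out?
 * Case 1: in the preparation step of the linking phase
 * Case 2: in <clinit>() during the initialization phase
 *
 * Conclusion:
 * Assigned in the preparation step of the linking phase:
 * 1. For a field of primitive type declared static final, an explicit assignment (a constant
 *    assigned directly, not a method call) is usually carried out in the preparation step.
 * 2. For a String declared static final and assigned a literal, the explicit assignment is
 *    usually carried out in the preparation step.
 * Assigned in <clinit>() during the initialization phase:
 *    every case other than the preparation-step cases above.
 *
 * Final conclusion: for a field declared static final, an explicit assignment of a primitive or
 * String value that involves no method or constructor call is carried out in the preparation
 * step of the linking phase.
 */
public class InitializationTest2 {
    // assigned in <clinit>() during the initialization phase
    public static int a = 1;

    // assigned in the preparation step of the linking phase
    public static final int INT_CONSTANT = 10;

    // assigned in <clinit>() during the initialization phase
    public static final Integer INTEGER_CONSTANT1 = Integer.valueOf(100);

    // assigned in <clinit>() during the initialization phase
    public static Integer INTEGER_CONSTANT2 = Integer.valueOf(1000);

    // assigned in the preparation step of the linking phase
    public static final String s0 = "helloworld0";

    // assigned in <clinit>() during the initialization phase
    public static final String s1 = new String("helloworld1");

    // assigned in <clinit>() during the initialization phase
    public static String s2 = "helloworld2";

    // assigned in <clinit>() during the initialization phase
    public static final int NUM1 = new Random().nextInt(10);
}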
  • Can a call to <clinit>() cause a deadlock?
  1. The virtual machine guarantees that a class's <clinit>() method is correctly locked and synchronized in a multi-threaded environment. If several threads initialize a class at the same time, only one thread executes the class's <clinit>() method; the other threads block and wait until the active thread has finished executing <clinit>().
  2. Because <clinit>() is thread-safe and guarded by a lock, a long-running operation in a class's <clinit>() method may leave multiple threads blocked and lead to deadlock. Such deadlocks are hard to find, because no lock information appears to be available for them.
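The "only one thread executes <clinit>()" guarantee can be observed by counting how often a static block runs when several threads touch the same class. This is an illustrative sketch: ClinitOnceDemo, Lazy, INIT_COUNT, and runThreads are invented names, and the thread count of 4 is arbitrary.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClinitOnceDemo {
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    static class Lazy {
        static int value;
        static {
            // Runs as part of Lazy's <clinit>(), under the JVM's initialization lock.
            INIT_COUNT.incrementAndGet();
            value = 42;
        }
    }

    // Starts several threads that all trigger initialization of Lazy,
    // then reports how many times the static block ran.
    static int runThreads() throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                int seen = Lazy.value; // first active use initializes Lazy
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return INIT_COUNT.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runThreads()); // 1
    }
}
```

The same lock is what makes a slow or blocking static initializer dangerous: every other thread that needs the class waits until the initializing thread is done.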

4. Class loader classification

  • The JVM supports two types of class loaders: the bootstrap class loader (Bootstrap ClassLoader) and user-defined class loaders (User-Defined ClassLoader).
  • Conceptually, a user-defined class loader is a class loader written by the developer of a program. The Java Virtual Machine Specification does not define it that way, however: it classifies every class loader derived from the abstract class ClassLoader as a user-defined class loader.
  • However class loaders are categorized, there are only ever three that we commonly meet in a program: the bootstrap class loader, the extension class loader, and the system (application) class loader.

The relationship among the bootstrap, extension, system, and user-defined class loaders is one of containment. It is not a layering of upper and lower levels, nor an inheritance relationship between parent and child classes.

4.1 The loader that comes with the virtual machine

Bootstrap class loader (Bootstrap ClassLoader)

  • This class loader is implemented in C/C++ and is embedded inside the JVM.
  • It loads Java's core libraries (JAVA_HOME/jre/lib/rt.jar, resources.jar, or the contents of the sun.boot.class.path path), which provide the classes the JVM itself needs.
  • It does not inherit from java.lang.ClassLoader and has no parent loader.
  • It loads the extension class loader and the application class loader and is designated as their parent class loader.
  • For security reasons, the bootstrap class loader loads only classes whose package names start with java, javax, sun, and the like.

Extension class loader (Extension ClassLoader)

  • Written in Java and implemented by sun.misc.Launcher$ExtClassLoader.
  • Derived from the ClassLoader class.
  • Its parent class loader is the bootstrap class loader.
  • It loads class libraries from the directories specified by the java.ext.dirs system property, or from the jre/lib/ext subdirectory (the extension directory) of the JDK installation directory. If user-created JARs are placed in this directory, the extension class loader loads them automatically as well.
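The two search paths named above can be inspected through system properties. This sketch assumes a JDK 8 style runtime for meaningful output: on JDK 9 and later the extension mechanism was removed and both properties are normally unset, which the code reports instead of failing. BuiltinLoaderPaths and describe are names chosen for the example.

```java
class BuiltinLoaderPaths {
    // Formats a system property, noting when the running JDK does not define it.
    static String describe(String property) {
        String value = System.getProperty(property);
        return property + " = " + (value == null ? "(not set on this JDK)" : value);
    }

    public static void main(String[] args) {
        System.out.println(describe("sun.boot.class.path")); // bootstrap loader search path
        System.out.println(describe("java.ext.dirs"));       // extension loader directories
        // Core classes are loaded by the bootstrap loader, which Java code sees as null.
        System.out.println(String.class.getClassLoader());   // null
    }
}
```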

4.2 User-defined class loader

  • In Java's daily application development, class loading is almost performed by the cooperation of the above three types of loaders. When necessary, we can also customize the class loader to customize the way the class is loaded. Why custom classloader?
  • Loading classes in isolation
  • Modify the way the class is loaded
  • Extended load source
  • Prevent source code leakage

User-defined class loader implementation steps:

  • Developers can implement their own class loaders by extending the abstract class java.lang.ClassLoader to meet special needs.
  • Before JDK 1.2, a custom class loader always extended ClassLoader and overrode the loadClass() method. Since JDK 1.2, overriding loadClass() is no longer recommended; the custom loading logic should go in the findClass() method instead.
  • If the requirements are not too complicated, you can extend URLClassLoader directly. This avoids writing findClass() and the code that obtains the bytecode stream yourself, so the custom class loader is more concise.
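
The steps above can be sketched as follows. This is a minimal illustration, not a production loader: the class name DirClassLoader, the directory "extra-classes", and the helper canLoad() are hypothetical names chosen for this example. Only findClass() is overridden, so the inherited loadClass() still delegates to the parent loaders first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example: loads .class files from a directory that is not on the classpath.
public class DirClassLoader extends ClassLoader {

    private final Path root; // assumed directory that holds compiled classes

    public DirClassLoader(Path root, ClassLoader parent) {
        super(parent);
        this.root = root;
    }

    // loadClass() calls this only after the parent loaders have failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path file = root.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] bytes = Files.readAllBytes(file);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    // Demonstration helper: reports whether the given loader can load the class.
    public static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader =
                new DirClassLoader(Paths.get("extra-classes"), ClassLoader.getSystemClassLoader());
        // Core classes are resolved by the parent chain, never by findClass().
        System.out.println(canLoad(loader, "java.lang.String"));
        // A class that neither the parents nor the directory provide cannot be loaded.
        System.out.println(canLoad(loader, "com.example.Missing"));
    }
}
```

Because the loading logic lives in findClass(), core classes such as java.lang.String are still served by the parent loaders, and the directory is consulted only for classes the parents cannot find.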
public class ClassLoaderDemo {

    public static void main(String[] args) {

        // Get the system (application) class loader
        ClassLoader classloader1 = ClassLoader.getSystemClassLoader();
        // sun.misc.Launcher$AppClassLoader@18b4aac2
        System.out.println(classloader1);

        // Get the extension class loader
        // sun.misc.Launcher$ExtClassLoader@424c0bc4
        System.out.println(classloader1.getParent());

        // Get the bootstrap class loader: null
        System.out.println(classloader1.getParent().getParent());

        // Get the context ClassLoader of the current thread
        ClassLoader classloader2 = Thread.currentThread().getContextClassLoader();

        // sun.misc.Launcher$AppClassLoader@18b4aac2
        System.out.println(classloader2);

        String[] strArr = new String[10];
        ClassLoader classLoader3 = strArr.getClass().getClassLoader();

        // null, which means the bootstrap class loader is used
        System.out.println(classLoader3);

        ClassLoaderDemo[] refArr = new ClassLoaderDemo[10];
        // sun.misc.Launcher$AppClassLoader@18b4aac2
        System.out.println(refArr.getClass().getClassLoader());

        int[] intArr = new int[10];
        // null: if the element type is a primitive type, the array class has no class loader
        System.out.println(intArr.getClass().getClassLoader());
    }
}

4.3 Instructions for using ClassLoader

  • The ClassLoader class is an abstract class, and all subsequent class loaders inherit from ClassLoader (excluding the startup class loader)

Ways to get ClassLoader

Method 1: get the ClassLoader of the current class
clazz.getClassLoader()

Method 2: get the context ClassLoader of the current thread
Thread.currentThread().getContextClassLoader()

Method 3: get the system ClassLoader
ClassLoader.getSystemClassLoader()

Method 4: get the caller's ClassLoader
DriverManager.getCallerClassLoader()

4.4 Parent delegation mechanism

The Java virtual machine loads class files on demand: only when a class needs to be used is its class file loaded into memory to generate a Class object. When loading the class file of a class, the Java virtual machine uses the parent delegation model, in which the request is first handed to the parent class loader for processing. It is a task delegation pattern.

working principle

  • If a class loader receives a class loading request, it does not load it first, but delegates the request to the parent class loader for execution;
  • If the parent class loader still has its parent class loader, it will further delegate upwards, recurse in turn, and the request will eventually reach the top-level startup class loader;
  • If the parent class loader can complete the class loading task, it will return successfully. If the parent class loader cannot complete the loading task, the child loader will try to load it by itself. This is the parent delegation mode.

example

  • When we load jdbc.jar for a database connection, the first thing to know is that JDBC is built on the SPI mechanism. The SPI core classes and interfaces are loaded through parent delegation, ultimately by the bootstrap class loader. The implementation classes in jdbc.jar are then loaded by "reverse delegation" through the thread context class loader.

Advantage

  • Avoid duplicate loading of classes
  • Protect the security of the program and prevent the core API from being tampered with at will
    • Custom class: java.lang.String
    • Custom class: java.lang.ShkStart (error: preventing the creation of classes beginning with java.lang)

Sandbox Security Mechanism

  • Suppose we define our own java.lang.String class. When it is loaded, the request is first delegated to the bootstrap class loader, which loads the class that ships with the JDK (java\lang\String.class in rt.jar). The error message says there is no main method because the String class from rt.jar was loaded, not ours. This protects the Java core source code, and it is the sandbox security mechanism.
  • As shown in the figure, although we defined a custom String class under the java.lang package, the bootstrap class loader (which loads classes whose package names start with java, javax, sun, etc.) loads the String from the core class library, and that String has no main method.
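
The protection of the java.* packages can also be observed directly. The sketch below is an illustration under assumptions: ProhibitedPackageDemo, OpenLoader, and isRejected() are hypothetical names, and the byte array is deliberately empty because the name check is expected to happen before the bytes are parsed.

```java
// Hypothetical example: tries to define classes by name from a user-defined loader.
public class ProhibitedPackageDemo {

    // A loader that exposes the protected defineClass() so the demo can call it.
    static class OpenLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Returns true if the JVM refuses the class name itself for a user-defined loader.
    public static boolean isRejected(String className) {
        try {
            new OpenLoader().define(className, new byte[0]);
            return false;
        } catch (SecurityException e) {
            // Thrown for names in the java.* packages
            System.out.println(e.getMessage());
            return true;
        } catch (LinkageError e) {
            // The name was accepted; the empty byte array is simply not a valid class file
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected("java.lang.ShkStart"));
        System.out.println(isRejected("com.example.ShkStart"));
    }
}
```

The first call is rejected because of its package name alone, while the second fails only because the bytes are not a real class file.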

insert image description here

5. Other

How to determine whether two class objects are the same

In the JVM, there are two necessary conditions to indicate whether two class objects are the same class:

  • The full class name of the class must match, including the package name.
  • The ClassLoader (referring to the ClassLoader instance object) that loads this class must be the same.

In other words, in the JVM, even if two Class objects originate from the same class file and are loaded by the same virtual machine, they are not equal as long as the ClassLoader instances that loaded them are different.
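
A minimal sketch of this rule follows. ClassIdentityDemo, Payload, and IsolatingLoader are hypothetical names. The sketch assumes the compiled class file of Payload can be read as a resource through the loader that loaded the demo class, which holds for a normal classpath run.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example: the same class file defined by two different loader instances.
public class ClassIdentityDemo {

    public static class Payload { }

    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader() {
            super(null); // parent is the bootstrap loader, which cannot see Payload
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = ClassIdentityDemo.class.getClassLoader().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // True if two loaders yield classes with equal names but different identities.
    public static boolean sameNameDifferentClass() {
        try {
            String name = Payload.class.getName();
            Class<?> a = new IsolatingLoader().loadClass(name);
            Class<?> b = new IsolatingLoader().loadClass(name);
            return a.getName().equals(b.getName()) && a != b && a != Payload.class;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameNameDifferentClass());
    }
}
```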

a reference to the class loader

The JVM must know whether a type was loaded by the boot loader or by the user class loader. If a type is loaded by a user class loader, the JVM will save a reference to the class loader as part of the type information in the method area. When resolving a reference from one type to another, the JVM needs to ensure that the class loaders for both types are the same.

Active and passive use of classes

The use of classes by Java programs is divided into: active use and passive use.

Active use can be divided into seven situations:

  • Create an instance of the class
  • Access a static variable of a class or interface, or assign a value to the static variable
  • Calling a static method of a class
  • Reflection (eg: Class.forName("com.atguigu.Test"))
  • Initialize a subclass of a class
  • The class marked as the startup class when the Java virtual machine starts
  • The dynamic language support provided since JDK 7: if the resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle and the class it refers to has not been initialized, that class is initialized.

In addition to the above seven cases, other ways of using Java classes are regarded as passive use of the class, which will not lead to the initialization of the class.
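
The difference can be made visible with static initializer blocks, which run only when a class is initialized. The sketch below uses hypothetical names (InitDemo, Parent, Child, Consts). The three passive uses it shows (creating an array of the type, reading a compile-time constant, and reading a parent's static field through the child) are common examples that the text above does not list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: records which classes get initialized.
public class InitDemo {

    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int value = 1;
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class Consts {
        static final int MAX = 10; // compile-time constant, inlined by the compiler
        static { LOG.add("Consts"); }
    }

    public static List<String> run() {
        Child[] array = new Child[2]; // passive: creating an array does not initialize Child
        int max = Consts.MAX;         // passive: a constant read does not initialize Consts
        int value = Child.value;      // the field is declared in Parent, so only Parent is initialized
        new Child();                  // active: creating an instance initializes Child
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Parent, Child]
    }
}
```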

3. Runtime data areas

1. Runtime data area

1.1 Overview

This section mainly talks about the runtime data area, which is the part in the figure below, which is the stage after the class loading is completed
insert image description here

After the earlier stages (loading -> verification -> preparation -> resolution -> initialization), the execution engine runs our classes, and while doing so it works with the runtime data area.

Memory is a very important system resource. It is the intermediate warehouse and bridge between the hard disk and the CPU, and it carries the real-time operation of the operating system and applications. The JVM memory layout defines how memory is requested, allocated, and managed while Java runs, which keeps the JVM running efficiently and stably. Different JVMs differ somewhat in how they divide and manage memory. Based on the JVM specification, let's discuss the classic JVM memory layout.

The data we get through disk or network IO needs to be loaded into the memory first, and then the CPU gets the data from the memory to read, that is to say, the memory acts as a bridge between the CPU and the disk

The Java virtual machine defines several types of runtime data areas that will be used during the running of the program, some of which will be created when the virtual machine starts and destroyed when the virtual machine exits. Others are one-to-one correspondence with threads, and these data areas corresponding to threads will be created and destroyed as threads start and end.

Gray areas are private to a single thread; red areas are shared by multiple threads. That is:

  • Per thread: program counter, virtual machine stack, and native method stack.
  • Shared between threads: heap and off-heap memory (permanent generation or metaspace, code cache).

insert image description here
There is only one Runtime instance per JVM. It represents the runtime environment and corresponds to the box in the middle of the memory structure diagram.

1.2 Threads

  • A thread is a unit of execution in a program. JVM allows an application to have multiple threads of execution in parallel. In the Hotspot JVM, each thread maps directly to a native thread of the operating system.
  • When a Java thread is ready to execute, an operating system native thread is also created at the same time. After the execution of the Java thread terminates, the native thread is also recycled.
  • The operating system is responsible for scheduling all threads to any available CPU. Once the native thread is successfully initialized, it calls the run() method in the Java thread.
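
As a small illustration of the last point, the sketch below starts a thread and records which thread actually executed run(). ThreadStartDemo and runOnNewThread() are hypothetical names used only here.

```java
// Hypothetical example: run() executes on the newly started thread, not on the caller.
public class ThreadStartDemo {

    // Starts a thread and returns the name of the thread that executed run().
    public static String runOnNewThread(String threadName) {
        final String[] executedOn = new String[1];
        Thread worker = new Thread(() -> executedOn[0] = Thread.currentThread().getName(), threadName);
        worker.start(); // asks the JVM to create and start the underlying native thread
        try {
            worker.join(); // wait until run() has finished and the thread has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executedOn[0];
    }

    public static void main(String[] args) {
        System.out.println("caller: " + Thread.currentThread().getName());
        System.out.println("run() executed on: " + runOnNewThread("worker-1"));
    }
}
```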

1.3 JVM system thread

  • If you use the console or any debugging tool, you can see that there are many threads running in the background. These background threads do not include the main thread calling public static void main(String[] args) and all threads created by the main thread itself.
  • In the HotSpot JVM, the main background system threads are the following:
    • Virtual machine thread: its operations only run once the JVM has reached a safepoint. They have to run in a separate thread because they all require the JVM to reach a safepoint, where the heap does not change. The operations executed by this thread include "stop-the-world" garbage collection, thread stack dumps, thread suspension, and biased lock revocation.
    • Periodic task thread: this thread embodies timer events (such as interrupts) and is generally used to schedule periodic operations.
    • GC thread: This thread provides support for different kinds of garbage collection behaviors in the JVM.
    • Compiler thread: This thread compiles bytecode into native code at runtime.
    • Signal Dispatch Thread: This thread receives signals and sends them to the JVM, which handles them internally by calling appropriate methods.
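
Some of these background threads can be seen from inside a program. The sketch below (SystemThreadsDemo and liveThreadNames() are hypothetical names) lists the threads visible to Java code. Which names appear depends on the JVM and its version; internal threads such as GC or compiler threads may not be listed at all.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example: prints the names of the live threads visible to Java code.
public class SystemThreadsDemo {

    public static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Besides "main", the output usually contains JVM-created threads as well.
        for (String name : liveThreadNames()) {
            System.out.println(name);
        }
    }
}
```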

2. Program counter (PC register)

The name of the Program Counter Register in the JVM is borrowed from the CPU's registers, which store the context information related to instructions; the CPU can only run once data has been loaded into its registers. Here, however, it is not a physical register in the general sense. Translating it as PC counter (or instruction counter, also called a program hook) may be more appropriate and less likely to cause misunderstanding. The PC register in the JVM is an abstract simulation of the physical PC register.

effect

  • The PC register stores the address of the next instruction, that is, the instruction code about to be executed. The execution engine reads the next instruction from it.
  • Features: thread-private, no memory overflow.
  • Physically, a program counter is implemented in registers, the fastest unit in the entire CPU.
  • It is the only area for which the Java Virtual Machine Specification defines no OOM condition.

  • It's such a small memory space that it's almost negligible. It is also the fastest storage area.
  • In the JVM specification, each thread has its own program counter, which is private to the thread, and its life cycle is consistent with the life cycle of the thread.
  • Only one method is executing in a thread at any time, the so-called current method. The program counter stores the JVM instruction address of the Java method being executed by the current thread; if a native method is being executed, its value is undefined.
  • It is an indicator of program control flow. Basic functions such as branching, looping, jumping, exception handling, and thread recovery all rely on this counter.
  • When the bytecode interpreter works, it changes the value of this counter to select the next bytecode instruction to be executed.
  • It is the only area in the Java Virtual Machine Specification that does not specify any OutOfMemoryError condition.

What is the use of using the PC register to store the bytecode instruction address? Why use the PC register to record the execution address of the current thread?

  • Because the CPU needs to switch each thread continuously, after switching back at this time, you have to know where to start to continue execution.
  • The bytecode interpreter of the JVM needs to change the value of the PC register to clarify what bytecode instruction should be executed next.

Why is the PC register set as private?

  • As we know, with multithreading only one thread executes on a core during any given time slice. The CPU keeps switching between tasks, so threads are constantly interrupted and resumed. How do we make sure nothing goes wrong? To accurately record the address of the bytecode instruction each thread is executing, the natural solution is to give every thread its own PC register, so that threads can proceed independently without interfering with each other.
  • Because of CPU time slicing, when many threads execute concurrently, a processor (or one core of a multi-core processor) executes only one instruction of one thread at any given moment.
  • Each thread therefore gets its own program counter and stack frames when it is created, and the program counters of different threads do not affect each other.

CPU time slice

  • The CPU time slice is the time allocated by the CPU to each program, and each thread is allocated a time period, called its time slice.
  • On the macro level: we can open multiple applications at the same time, and each program runs in parallel and runs at the same time.
  • But at the micro level: Since there is only one CPU, it can only process a part of the program's requirements at a time. How to deal with fairness, one way is to introduce time slices, and each program will execute in turn.

4. Virtual machine stack

4.1. Virtual machine stack overview

4.1.1 Background

  • Due to the cross-platform design, Java instructions are designed based on the stack. The CPU architecture of different platforms is different, so it cannot be designed as register-based.
  • The advantage is that it is cross-platform, the instruction set is small, and the compiler is easy to implement. The disadvantage is that the performance decreases, and more instructions are needed to realize the same function.

4.1.2 Stack and Heap in Memory

  • The stack is the unit of runtime, while the heap is the unit of storage
  • The stack solves the running problem of the program, that is, how the program executes, or how to process data.
  • The heap solves the problem of data storage, that is, how and where to put data

4.1.3 Basic content of the virtual machine stack

What is the Java virtual machine stack?

The Java Virtual Machine Stack was also known as the Java stack in the early days. Each thread creates a virtual machine stack when the thread is created. The stack stores stack frames (Stack Frame), one for each Java method call, and it is private to the thread.

life cycle

The life cycle is consistent with the thread

effect

In charge of the operation of the Java program, it saves the local variables and partial results of the method, and participates in the calling and returning of the method.

The characteristics of the stack

  • The stack is a fast and efficient way to allocate storage; its access speed is second only to the program counter.
  • The JVM performs only two direct operations on the Java stack:
    • Pushing a frame when a method is invoked
    • Popping the frame when the method finishes
  • The stack has no garbage collection problem (but it can overflow)

Exceptions that may occur on the stack

  • The Java Virtual Machine Specification allows the size of the Java stack to be dynamic or fixed.
    • If a fixed-size Java virtual machine stack is used, the stack capacity of each thread can be chosen independently when the thread is created. If the stack size requested by the thread exceeds the maximum capacity allowed by the Java virtual machine stack, the Java virtual machine throws a StackOverflowError.
    • If the Java virtual machine stack can be expanded dynamically but cannot obtain enough memory when trying to expand, or if there is not enough memory to create the virtual machine stack for a new thread, the Java virtual machine throws an OutOfMemoryError.
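public class StackErrorTest {

    public static void main(String[] args) {
        test();
    }

    public static void test() {
        test();
    }
}
// Throws: Exception in thread "main" java.lang.StackOverflowError
// The program keeps calling itself recursively without an exit condition, so frames keep being pushed onto the stack.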

Set the stack memory size

  • The -Xss option sets the maximum stack space of a thread. The size of the stack directly determines the maximum reachable depth of function calls.
  • How to set the size of the stack memory? -Xss size (i.e. -XX:ThreadStackSize)
    • The default is generally 512k-1024k, depending on the operating system (before JDK 5 the default stack size was 256k; since JDK 5 it is 1024k).
public class StackDeepTest {

    private static int count = 0;

    public static void recursion() {
        count++;
        recursion();
    }

    public static void main(String[] args) {
        try {
            recursion();
        } catch (Throwable e) {
            // Prints how deep the recursion got before the stack overflowed
            System.out.println("deep of calling=" + count);
            e.printStackTrace();
        }
    }
}
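
Besides the global -Xss option, a stack size can also be requested for a single thread through the Thread constructor that takes a stackSize argument. The sketch below (ThreadStackSizeDemo, DepthProbe, and maxDepth() are hypothetical names) measures the recursion depth reached with two different requests. The requested size is only a suggestion to the JVM, so the measured depths vary by platform and may not differ at all.

```java
// Hypothetical example: recursion depth under different requested per-thread stack sizes.
public class ThreadStackSizeDemo {

    static class DepthProbe implements Runnable {
        int depth;

        private void recurse() {
            depth++;
            recurse();
        }

        @Override
        public void run() {
            try {
                recurse();
            } catch (StackOverflowError e) {
                // Expected: the stack of this thread is exhausted
            }
        }
    }

    // Runs unbounded recursion on a thread created with the requested stack size (in bytes).
    public static int maxDepth(long stackSizeBytes) {
        DepthProbe probe = new DepthProbe();
        Thread t = new Thread(null, probe, "depth-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("512k requested: depth=" + maxDepth(512L * 1024));
        System.out.println("4m requested:   depth=" + maxDepth(4L * 1024 * 1024));
    }
}
```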

4.2 The storage unit of the stack

4.2.1 What is stored in the stack?

  • Each thread has its own stack, and the data in the stack exists in the format of a stack frame (Stack Frame) .
  • Each method being executed on this thread corresponds to a stack frame (Stack Frame).
  • A stack frame is a memory block and a data set that maintains various data information during method execution.

4.2.2 Principle of stack operation

  • The JVM performs only two direct operations on the Java stack, pushing and popping stack frames, following the "first in, last out" / "last in, first out" principle.
  • In an active thread, there will only be one active stack frame at a point in time. That is, only the stack frame (stack top stack frame) of the currently executing method is valid. This stack frame is called the current stack frame (Current Frame), and the method corresponding to the current stack frame is the current method (Current Method). The class that defines this method is the current class (Current Class).
  • All bytecode instructions run by the execution engine only operate on the current stack frame.
  • If other methods are called in this method, the corresponding new stack frame will be created and placed on the top of the stack to become the new current frame.

  • The stack frames contained in different threads are not allowed to have mutual references, that is, it is impossible to refer to the stack frame of another thread in a stack frame.
  • If the current method calls other methods, when the method returns, the current stack frame will return the execution result of this method to the previous stack frame, and then, the virtual machine discards the current stack frame, making the previous stack frame become the current stack frame again.
  • A Java method can return in two ways: a normal return, using a return instruction, or by throwing an exception. Either way, the stack frame is popped.
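
The push and pop behavior described above can be observed by counting the frames in a stack trace. In the sketch below, FramePopDemo and its methods are hypothetical names, and the length of a Throwable's stack trace is used as an approximation of the current stack depth.

```java
// Hypothetical example: the frame count before, during, and after method calls.
public class FramePopDemo {

    // Number of frames on the current thread's stack, including this helper's own frame.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    static int depthInsideCall() {
        return depth(); // one extra frame (depthInsideCall) is on the stack here
    }

    static void failingCall() {
        throw new IllegalStateException("abnormal exit");
    }

    // Returns {before, inside a call, after a normal return, after an exceptional exit}.
    public static int[] measure() {
        int before = depth();
        int inside = depthInsideCall();
        int afterReturn = depth();
        try {
            failingCall();
        } catch (IllegalStateException e) {
            // The frame of failingCall() was popped when the exception propagated
        }
        int afterThrow = depth();
        return new int[] {before, inside, afterReturn, afterThrow};
    }

    public static void main(String[] args) {
        int[] m = measure();
        System.out.println("before=" + m[0] + " inside=" + m[1]
                + " afterReturn=" + m[2] + " afterThrow=" + m[3]);
    }
}
```

The depth grows by one inside the nested call and is back to the starting value after both the normal return and the exceptional exit.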

4.2.3 The internal structure of the stack frame

Each stack frame stores:

  • Local Variables Table (Local Variables)
  • Operand Stack (or expression stack)
  • Dynamic Linking (or a method reference to the runtime constant pool)
  • Method return address (Return Address) (or the definition of normal or abnormal method exit)
  • Some additional information

The stack under each parallel thread is private, so each thread has its own stack, and there are many stack frames in each stack. The size of the stack frame is mainly determined by the local variable table and the operand stack.

4.3 Local Variables Table (Local Variables)

  • The local variable table is also called the local variable array.
  • It is defined as a numeric array and mainly stores method parameters and the local variables defined in the method body. The data types include the basic data types, object references (reference), and the returnAddress type.
  • Since the local variable table is built on the thread's stack, it is private data of the thread, so there is no data safety problem.
  • The required capacity of the local variable table is determined at compile time and stored in the maximum local variables item (max_locals) of the method's Code attribute. The size of the local variable table does not change while the method runs.
  • The number of nested method calls is limited by the size of the stack. Generally speaking, the larger the stack, the more nested calls are possible. For a given function, the more parameters and local variables it has, the larger its local variable table and therefore its stack frame. Its calls then take up more stack space, which reduces the number of nested calls.
  • The variables in the local variable table are only valid during the current method call. When the method executes, the virtual machine uses the local variable table to pass the argument values to the parameter variables. When the method call ends, the local variable table is destroyed together with the method's stack frame.
// Inspect with: javap -v ClassName.class, or use jclasslib
import java.util.Date;

public class LocalVariableTest {

    public static void main(String[] args) {
        LocalVariableTest test = new LocalVariableTest();
        int num = 10;
        test.test1();
    }

    public static void test1() {
        Date date = new Date();
        String name = "xiaozhi";
    }
}

The screenshot of jclasslib is as follows:
insert image description here
insert image description here
insert image description here

4.3.1 Understanding about Slot

  • Local variable table, the most basic storage unit is Slot (variable slot)
  • Parameter values are always stored starting at index 0 of the local variable array, which ends at index (array length - 1).
  • Various basic data types (8 types), reference type (reference), and returnAddress type variables known at compile time are stored in the local variable table.
  • In the local variable table, types within 32 bits only occupy one slot (including the returnAddress type), and types of 64 bits (long and double) occupy two slots.
  • byte, short, and char are converted to int before storage, boolean is also converted to int, 0 means false, non-zero means true.
  • The JVM will assign an access index to each Slot in the local variable table, through which the local variable value specified in the local variable table can be successfully accessed
  • When an instance method is called, its method parameters and local variables defined inside the method body will be copied to each slot in the local variable table in order
  • To access a 64-bit local variable (long or double) in the local variable table, use the first of its two indexes.
  • If the current frame is created by a construction method or an instance method, then the object reference this will be stored in the slot with index 0, and the rest of the parameters will continue to be arranged in the order of the parameter list.
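
The slot rules above can be turned into a small calculation. The sketch below (SlotLayoutDemo and parameterSlots() are hypothetical names) computes the starting slot index of each parameter from the parameter types. It models only the parameters, not the local variables declared in the method body.

```java
import java.util.Arrays;

// Hypothetical example: computes the starting slot index of each method parameter.
public class SlotLayoutDemo {

    public static int[] parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        int[] slots = new int[parameterTypes.length];
        int next = isStatic ? 0 : 1; // slot 0 holds 'this' for instance methods and constructors
        for (int i = 0; i < parameterTypes.length; i++) {
            slots[i] = next;
            Class<?> type = parameterTypes[i];
            // long and double occupy two slots, every other type occupies one
            next += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method (long, int, double, String): this=0, long=1-2, int=3, double=4-5, String=6
        System.out.println(Arrays.toString(
                parameterSlots(false, long.class, int.class, double.class, String.class)));
        // Static method (int, long): int=0, long=1-2
        System.out.println(Arrays.toString(parameterSlots(true, int.class, long.class)));
    }
}
```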

4.3.2 Slot reuse

The slots in the local variable table of a stack frame can be reused. If a local variable has gone out of scope, a new local variable declared after that scope is likely to reuse the slot of the expired variable, which saves resources.

public class SlotTest {

    public void localVar1() {
        int a = 0;
        System.out.println(a);
        int b = 0;
    }

    public void localVar2() {
        {
            int a = 0;
            System.out.println(a);
        }
        // b reuses the slot of a here
        int b = 0;
    }
}

4.3.3. Static variables vs. local variables

  • After the parameter list is allocated, it is allocated according to the order and scope of the variables defined in the method body.
  • We know that class variables have two chances to be initialized: the first is in the "preparation" phase, when system initialization sets each class variable to its zero value; the second is in the "initialization" phase, when the variable is given the initial value the programmer defined in code.
  • Unlike class variables, the local variable table has no system initialization, which means that once a local variable is defined it must be initialized manually, otherwise it cannot be used.
// This code is wrong: a local variable cannot be used before it is assigned.
public void test() {
    int i;
    System.out.println(i);
}

Supplementary Note

  • In the stack frame, the part most closely related to performance tuning is the local variable table described above. When a method is executed, the virtual machine uses the local variable table to pass the arguments to the method.
  • The variables in the local variable table are also important garbage collection root nodes, as long as the objects directly or indirectly referenced in the local variable table will not be recycled.

4.4. Operand Stack

  • In addition to the local variable table, each independent stack frame also contains a last-in-first-out (Last-In-First-Out) operand stack, which can also be called an expression stack (Expression Stack).
  • During method execution, bytecode instructions write data to or read data from the operand stack, that is, push and pop.
    • Some bytecode instructions push values onto the operand stack; others pop operands off the stack, use them, and push the result back onto the stack.
    • For example: operations such as copying, swapping, and summing

public void testAddOperation() {
    byte i = 15;
    int j = 8;
    int k = i + j;
}

public void testAddOperation();
  Code:
     0: bipush 15
     2: istore_1
     3: bipush 8
     5: istore_2
     6: iload_1
     7: iload_2
     8: iadd
     9: istore_3
    10: return
  • The operand stack is mainly used to save the intermediate results of the calculation process, and at the same time as a temporary storage space for variables during the calculation process.
  • The operand stack is a working area of the JVM execution engine. When a method starts to execute, a new stack frame is created for it, and the operand stack of this method is initially empty.
  • Each operand stack will have a clear stack depth for storing values. The maximum depth required is defined at compile time and stored in the Code attribute of the method, which is the value of max_stack.
  • Any element in the stack can be any Java data type
    • The 32bit type occupies a stack unit depth
    • The 64bit type occupies two stack unit depths
  • The operand stack does not access data by accessing the index, but can only complete a data access through standard push and pop operations
  • If the called method has a return value, its return value will be pushed into the operand stack of the current stack frame, and the next bytecode instruction to be executed in the PC register will be updated.
  • The data types of the elements in the operand stack must strictly match the sequence of bytecode instructions. This is verified by the compiler during compilation and verified again during the data flow analysis of the class verification stage in the class loading process.
  • In addition, we say that the interpretation engine of the Java virtual machine is a stack-based execution engine, where the stack refers to the operand stack.

4.5. Code Tracking

public void testAddOperation() {
    byte i = 15;
    int j = 8;
    int k = i + j;
}

Use the javap command to decompile the class file: javap -v ClassName.class

public void testAddOperation();
  Code:
     0: bipush 15
     2: istore_1
     3: bipush 8
     5: istore_2
     6: iload_1
     7: iload_2
     8: iadd
     9: istore_3
    10: return
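
To follow the listing step by step, the sketch below simulates it with a toy interpreter. This is an illustration only, not how HotSpot is implemented: MiniInterpreter is a hypothetical class, it supports just the five instruction kinds that appear in the listing, and its pc counts instructions rather than byte offsets.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical toy interpreter for the bytecode of testAddOperation().
public class MiniInterpreter {

    // Executes the instructions and returns the final local variable table.
    public static int[] run(String[] code, int maxLocals) {
        int[] locals = new int[maxLocals];        // local variable table
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int pc = 0;                                // index of the next instruction
        while (pc < code.length) {
            String[] parts = code[pc].split(" ");
            String op = parts[0];
            if (op.equals("bipush")) {
                stack.push(Integer.parseInt(parts[1]));
            } else if (op.startsWith("istore_")) {
                locals[op.charAt(7) - '0'] = stack.pop();
            } else if (op.startsWith("iload_")) {
                stack.push(locals[op.charAt(6) - '0']);
            } else if (op.equals("iadd")) {
                int right = stack.pop();
                int left = stack.pop();
                stack.push(left + right);
            } else if (op.equals("return")) {
                break;
            } else {
                throw new IllegalArgumentException("unsupported instruction: " + op);
            }
            System.out.println("pc=" + pc + " " + code[pc]
                    + " locals=" + Arrays.toString(locals) + " stack=" + stack);
            pc++; // select the next instruction
        }
        return locals;
    }

    public static void main(String[] args) {
        String[] code = {
            "bipush 15", "istore_1", "bipush 8", "istore_2",
            "iload_1", "iload_2", "iadd", "istore_3", "return"
        };
        // Slot 0 is reserved for 'this'; slots 1 to 3 hold i, j, and k.
        System.out.println(Arrays.toString(run(code, 4))); // [0, 15, 8, 23]
    }
}
```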

4.6. Top-of-Stack Caching Technology

  • As mentioned earlier, the zero-address instructions used by a stack-based virtual machine are more compact, but an operation needs more push and pop instructions to complete, which also means more instruction dispatches and more memory reads and writes.
  • Since operands are stored in memory, frequent memory reads and writes inevitably affect execution speed. To solve this problem, the designers of the HotSpot JVM proposed top-of-stack caching (ToS, Top-of-Stack Caching), which caches the top elements of the stack in registers of the physical CPU. This reduces the number of memory reads and writes and improves the efficiency of the execution engine.

4.7. Dynamic Linking

  • Dynamic link, method return address, additional information: Some places are called frame data area
  • Each stack frame contains a reference into the runtime constant pool for the method to which the frame belongs. This reference exists so that the code of the current method can be dynamically linked (Dynamic Linking). For example: the invokedynamic instruction
  • When a Java source file is compiled into a bytecode file, all variable and method references are stored as symbolic references (Symbolic Reference) in the constant pool of the class file. For example: when describing that a method calls another method, it is represented by a symbolic reference pointing to the method in the constant pool, then the function of dynamic linking is to convert these symbolic references into direct references to the calling method.

Why do we need a runtime constant pool?
The role of the constant pool is to provide some symbols and constants to facilitate the identification of instructions.

4.8. Method call

In the JVM, the conversion of symbolic references to direct references calling methods is related to the method's binding mechanism.

Static link:

When a bytecode file is loaded into the JVM, if the called target method is known at compile time and remains unchanged at runtime, the process of converting the symbolic reference of the called method into a direct reference is called static linking.

Dynamic link:

If the called method cannot be determined at compile time, the symbolic reference of the called method can only be converted into a direct reference at runtime. Since this conversion is dynamic, it is called dynamic linking.

Static linking and dynamic linking are not nouns, but verbs, which is the key to understanding.

The binding mechanism of the corresponding method is: early binding (Early Binding) and late binding (Late Binding). Binding is a process where a symbolic reference to a field, method, or class is replaced by a direct reference, which happens only once.

Early binding:

Early binding means that if the called target method is known at compile time and remains unchanged at runtime, the method can be bound to the type it belongs to. Since it is clear which method is the target, static linking can be used to convert the symbolic reference into a direct reference.

Late binding:

If the called method cannot be determined at compile time, the related method can only be bound according to the actual type at program runtime. This binding method is also called late binding.
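
A minimal sketch of the two binding modes described above. The class names Shape, Circle and BindingDemo are made up for illustration. The static call is fixed by the declared type at compile time, while the instance call is chosen by the actual type of the object at runtime.

```java
class Shape {
    // Static method: the target is known at compile time (early binding).
    static String kind() {
        return "shape";
    }

    // Ordinary instance method: the target depends on the runtime type (late binding).
    String name() {
        return "shape";
    }
}

class Circle extends Shape {
    // Hides Shape.kind(); static methods cannot be overridden.
    static String kind() {
        return "circle";
    }

    @Override
    String name() {
        return "circle";
    }
}

class BindingDemo {
    // Always calls Shape.kind(), whatever objects exist at runtime.
    static String earlyBound() {
        return Shape.kind();
    }

    // Calls the name() of whatever class the object really is.
    static String lateBound(Shape s) {
        return s.name();
    }

    public static void main(String[] args) {
        Shape s = new Circle();
        System.out.println(earlyBound());   // static call, resolved from the declared type
        System.out.println(lateBound(s));   // virtual call, resolved from the actual type
    }
}
```

Running javap -c on BindingDemo should show invokestatic for the first call and invokevirtual for the second.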

With the emergence of high-level languages, there are more and more object-oriented programming languages similar to Java. Although these languages differ in syntax, they share one thing: they all support object-oriented features such as encapsulation, inheritance, and polymorphism. Because such languages are polymorphic, they naturally have two binding methods: early binding and late binding.

Any ordinary method in Java actually behaves like a virtual function, equivalent to a virtual function in C++ (in C++, you need to use the keyword virtual to define one explicitly). If you do not want a method in a Java program to behave like a virtual function, you can mark it with the keyword final.

4.8.1 Virtual and non-virtual methods

Non-virtual method:

  • If the method determines the specific calling version at compile time, this version is immutable at runtime. Such methods are called non-virtual methods.
  • Static methods, private methods, final methods, instance constructors, and superclass methods (called through super) are all non-virtual methods. All other methods are called virtual methods.

Non-virtual methods can be resolved in the resolution phase of class loading. The following is an example of a non-virtual method:

class Father {
    public static void print(String str) {
        System.out.println("father " + str);
    }

    private void show(String str) {
        System.out.println("father " + str);
    }
}

class Son extends Father {
}

public class VirtualMethodTest {
    public static void main(String[] args) {
        // Static method: the version to call is fixed at compile time
        Son.print("coder");
        // Father fa = new Father();
        // fa.show("atguigu.com"); // private method, not accessible from this class
    }
}

The following method call instructions are provided in the virtual machine:

Ordinary calling instructions:

  • invokestatic: invokes a static method; the unique method version is determined in the resolution phase
  • invokespecial: invokes <init> methods, private methods, and superclass methods; the unique method version is determined in the resolution phase
  • invokevirtual: call all virtual methods
  • invokeinterface: call interface method

Dynamic call instruction:

  • invokedynamic: dynamically parse out the method that needs to be called, and then execute

The first four instructions are fixed inside the virtual machine, and the way they resolve a method call cannot be influenced by the programmer, while the invokedynamic instruction lets the user determine the method version. The methods invoked by invokestatic and invokespecial are called non-virtual methods, and the rest (except those modified by final) are called virtual methods.

/**
 * Test of non-virtual and virtual methods in resolved calls
 */
class Father {

    public Father() {
        System.out.println("Father default constructor");
    }

    public static void showStatic(String s) {
        System.out.println("Father show static" + s);
    }

    public final void showFinal() {
        System.out.println("Father show final");
    }

    public void showCommon() {
        System.out.println("Father show common");
    }
}

public class Son extends Father {

    public Son() {
        super();
    }

    public Son(int age) {
        this();
    }

    public static void main(String[] args) {
        Son son = new Son();
        son.show();
    }

    // Not an override of the parent method, because static methods cannot be overridden
    public static void showStatic(String s) {
        System.out.println("Son show static" + s);
    }

    private void showPrivate(String s) {
        System.out.println("Son show private" + s);
    }

    public void show() {
        // invokestatic
        showStatic(" big-head son");
        // invokestatic
        super.showStatic(" big-head son");
        // invokespecial (as compiled by JDK 8)
        showPrivate(" hello!");
        // invokespecial
        super.showCommon();
        // invokevirtual: the method is declared final and cannot be overridden by a subclass,
        // so it is also considered a non-virtual method
        showFinal();
        // Virtual methods follow
        // invokevirtual: no explicit super, so it is treated as virtual because a subclass may override it
        showCommon();
        info();
        MethodInterface in = null;
        // invokeinterface: the implementing class is unknown and must override the method.
        // Note: in is null, so this call throws NullPointerException at runtime;
        // the example is meant for inspecting the bytecode.
        in.methodA();
    }

    public void info() {
    }
}

interface MethodInterface {
    void methodA();
}

About the invokedynamic instruction

  • The JVM bytecode instruction set has always been relatively stable. It was not until Java 7 that the invokedynamic instruction was added, an improvement made to support "dynamically typed languages" on the JVM.
  • However, Java 7 provided no way to generate invokedynamic instructions directly from Java source; a low-level bytecode tool such as ASM was needed. Only with Java 8's lambda expressions, which are compiled to invokedynamic, did Java source gain a direct way to generate the instruction.
  • The dynamic language support added in Java 7 is in essence a change to the Java virtual machine specification, not to the rules of the Java language. This part is relatively complicated and adds a new kind of method call to the virtual machine. The most direct beneficiaries are the compilers of dynamic languages that run on the Java platform.
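
As a rough illustration of the points above, the sketch below calls String.length() in two ways. The first uses the java.lang.invoke method handle API introduced in Java 7, which is the runtime machinery that invokedynamic builds on. The second uses a Java 8 lambda, whose creation site javac compiles to an invokedynamic instruction. The class name InvokeDynamicSketch is made up for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class InvokeDynamicSketch {

    // Looks up String.length() at runtime and calls it through a method handle.
    static int lengthViaMethodHandle(String s) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call site types to match (String)int exactly.
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // The lambda below is created through an invokedynamic instruction in the class file.
    static int lengthViaLambda(String s) {
        Function<String, Integer> length = str -> str.length();
        return length.apply(s);
    }

    public static void main(String[] args) {
        System.out.println(lengthViaMethodHandle("hello"));
        System.out.println(lengthViaLambda("hello"));
    }
}
```

Running javap -c -p on the compiled class should show invokedynamic inside lengthViaLambda.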

Dynamically typed language and statically typed language

  • The difference between a dynamically typed language and a statically typed language lies in whether types are checked at compile time or at runtime. If they are checked at compile time, the language is statically typed; otherwise it is dynamically typed.
  • To put it more plainly, a statically typed language checks the type information of the variable itself, while a dynamically typed language checks the type information of the variable's value. The variable has no type information, only its value does, and this is an important feature of dynamic languages.
  • Java is a statically typed language (although lambda expressions add some dynamic features to it); JS and Python are dynamically typed languages.

Java: String info = "Xiao Zhi"; // statically typed language

JS: var name = "Xiao Zhi";
    var name = 10; // dynamically typed language

Python: info = 130 # an even more thoroughly dynamic language

The essence of method overriding in the Java language:

  • The pc register changes every time an instruction is executed, whereas the return address is always the address following the call in the caller and does not change.
  • Find the actual type of the object referenced by the first element at the top of the operand stack, denoted C.
  • If a method that matches the descriptor and simple name in the constant pool is found in type C, an access permission check is performed. If it passes, the direct reference to this method is returned and the search ends; if not, a java.lang.IllegalAccessError is thrown.
  • Otherwise, following the inheritance relationship from bottom to top, repeat the search and verification of the previous step for each parent class of C.
  • If no suitable method is found, a java.lang.AbstractMethodError is thrown.
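
The lookup steps above can be imitated with reflection. This is only a sketch of the search order, not how the JVM implements invokevirtual: it skips the access permission check, and the classes Base, Middle, Leaf and the helper resolve are made up for illustration.

```java
import java.lang.reflect.Method;

class OverrideLookupSketch {

    static class Base {
        public String greet() {
            return "base";
        }
    }

    static class Middle extends Base {
    }

    static class Leaf extends Middle {
        @Override
        public String greet() {
            return "leaf";
        }
    }

    // Starts at the actual type C and walks up the superclass chain until a class
    // declaring a method with the given name and parameter types is found.
    static Class<?> resolve(Class<?> actualType, String name, Class<?>... paramTypes) {
        for (Class<?> c = actualType; c != null; c = c.getSuperclass()) {
            try {
                Method m = c.getDeclaredMethod(name, paramTypes);
                return m.getDeclaringClass();
            } catch (NoSuchMethodException e) {
                // Not declared in this class: continue with the parent class.
            }
        }
        // Mirrors the last step of the description above.
        throw new AbstractMethodError(name);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Leaf.class, "greet").getSimpleName());
        System.out.println(resolve(Middle.class, "greet").getSimpleName());
    }
}
```

For Leaf the search stops immediately; for Middle it has to climb one level to Base, which is the cost the virtual method table in the next section avoids.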

Introduction to IllegalAccessError

  • The program attempted to access or modify a field, or call a method, that it does not have permission to access. Normally this is caught as a compile error. If the error occurs at runtime, it indicates that a class has undergone an incompatible change.

4.8.2 Method call: virtual method table

  • In object-oriented programming, dynamic dispatch is used very frequently. Searching the class's method metadata for a suitable target on every dynamic dispatch could hurt execution efficiency. Therefore, to improve performance, the JVM creates a virtual method table in the method area for each class (non-virtual methods do not appear in the table) and uses an index into this table instead of a search.
  • Each class has a virtual method table, which stores the actual entry point of each method.
  • When is the virtual method table created?
    • The virtual method table is created and initialized during the linking phase of class loading. After the initial values of the class's variables have been prepared, the JVM also initializes the method table of the class.


interface Friendly {
    void sayHello();
    void sayGoodbye();
}

class Dog {
    public void sayHello() {}
    public String toString() { return "Dog"; }
}

class Cat implements Friendly {
    public void eat() {}
    public void sayHello() {}
    public void sayGoodbye() {}
    protected void finalize() {}
}

class CockerSpaniel extends Dog implements Friendly {
    public void sayHello() { super.sayHello(); }
    public void sayGoodbye() {}
}
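
To make the idea of an indexed table concrete, here is a toy model of how a virtual method table could be built for the Dog and CockerSpaniel classes above. It is a simplification under stated assumptions: Object is given only two virtual methods, entries are plain strings of the form "DeclaringClass.method", and the helper buildVtable is hypothetical. A real JVM stores method entry points, not strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VtableSketch {

    // Copies the parent's table so inherited methods keep their slot index,
    // then overwrites the slot of each overridden method and appends new ones.
    static List<String> buildVtable(List<String> parentTable, String className, String... declared) {
        List<String> table = new ArrayList<>(parentTable);
        for (String method : declared) {
            int slot = -1;
            for (int i = 0; i < table.size(); i++) {
                if (table.get(i).endsWith("." + method)) {
                    slot = i;
                    break;
                }
            }
            String entry = className + "." + method;
            if (slot >= 0) {
                table.set(slot, entry);   // override: same index, new target
            } else {
                table.add(entry);         // new virtual method: new index
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> object = Arrays.asList("Object.hashCode", "Object.toString");
        List<String> dog = buildVtable(object, "Dog", "sayHello", "toString");
        List<String> spaniel = buildVtable(dog, "CockerSpaniel", "sayHello", "sayGoodbye");
        System.out.println(dog);
        System.out.println(spaniel);
    }
}
```

Because sayHello sits at the same index in both tables, a call through a Dog reference can jump straight to that slot without searching the class hierarchy.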


4.9. Method return address (return address)

  • Stores the value of the pc register of the caller of this method. There are two ways a method can end:
    • normal completion
    • an unhandled exception occurs and the method exits abnormally
  • Whichever way the method exits, control returns to the place where the method was called. When the method exits normally, the value of the caller's pc register is used as the return address, that is, the address of the instruction following the one that called the method. When the method exits through an exception, the return address is determined through the exception table, and this information is generally not saved in the stack frame.
  • In essence, a method exit is the process of popping the current stack frame. At this point it is necessary to restore the caller's local variable table and operand stack, push the return value onto the operand stack of the caller's stack frame, set the PC register value, and so on, so that the caller can continue to execute.
  • The difference between normal completion and abnormal completion is that a method that exits through an exception does not produce any return value for its caller.

After a method starts executing, there are only two ways to exit the method:

  • When the execution engine encounters any bytecode instruction that returns from a method (a return instruction), the return value is passed to the caller. This is called normal completion.
    • Which return instruction is used after a method completes normally depends on the actual data type of the method's return value.
    • The return instructions are ireturn (used when the return value is boolean, byte, char, short, or int), lreturn (long), freturn (float), dreturn (double), and areturn (reference types). In addition, the return instruction is used by methods declared void, instance initialization methods, and class and interface initialization methods.
  • An exception is encountered during the execution of the method and is not handled within the method; that is, no matching exception handler is found in the method's exception table. The method then exits, which is called abnormal completion.
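
The two exit paths can be observed with a small example. The comments name the return instruction javac is expected to emit for each method (verify with javap -c); the class ReturnDemo and its methods are made up for illustration. The last two methods show that an abnormal exit hands no return value to the caller.

```java
class ReturnDemo {

    static int intValue()       { return 1; }      // ireturn
    static long longValue()     { return 2L; }     // lreturn
    static float floatValue()   { return 3.0f; }   // freturn
    static double doubleValue() { return 4.0; }    // dreturn
    static String refValue()    { return "ref"; }  // areturn
    static void nothing()       { }                // return

    // Exits abnormally: there is no handler in this method's exception table.
    static int failing() {
        throw new IllegalStateException("not handled here");
    }

    // The assignment never happens because failing() produces no return value;
    // the handler is found in this method's exception table instead.
    static int callFailing() {
        int result = -1;
        try {
            result = failing();
        } catch (IllegalStateException e) {
            // handled by the caller
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(intValue());
        System.out.println(callFailing());
    }
}
```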


The handlers for exceptions thrown during method execution are recorded in an exception table, so that the handling code can be found easily when an exception occurs.


4.10 Some additional information

Some additional information related to the implementation of the Java virtual machine is also allowed to be carried in the stack frame. For example: information that provides support for program debugging.

Stack related interview questions

  • What about stack overflow? Stack overflow: StackOverflowError
    • There is no GC in the stack, but OOM and StackOverflowError can occur
    • A simple example: calling the main method from within the main method keeps pushing stack frames until the stack overflows
  • The size of the stack can be fixed or dynamically expandable
    • If it is fixed, a StackOverflowError is thrown when the stack depth a thread needs exceeds the limit
    • If it is dynamically expandable, an OOM error (java.lang.OutOfMemoryError) is thrown when enough memory cannot be obtained during expansion
  • Example of stack overflow? (StackOverflowError)
    • Set the size of the stack with -Xss
  • Can adjusting the stack size ensure that there will be no overflow?
    • No. A larger stack only reduces the likelihood of overflow. The stack cannot grow without limit, so there is no guarantee that overflow will never occur
  • Is a larger stack allocation always better?
    • No. It lowers the probability of OOM for a while, but because total memory is limited it takes space away from other threads
  • Does garbage collection involve the virtual machine stack?
    • No. Garbage collection only involves the method area and the heap, both of which may also overflow
    • The program counter only records the address of the next instruction to run; it has neither overflow nor garbage collection
    • The virtual machine stack and the native method stack only involve pushing and popping frames; they may overflow, but there is no garbage collection
  • Are local variables defined in methods thread-safe?
    • It depends on the case. If the object is created inside the method and dies inside it without being returned to the outside, it is thread-safe; otherwise it is not.
/**
 * Are local variables defined in methods thread-safe? It depends on the case.
 *
 * @author shkstart
 * @create 15:53
 */
public class LocalVariableThreadSafe {

    // s1 is declared in a thread-safe way: it is thread-private, and an s1 created
    // inside the thread cannot be accessed by other threads
    public static void method1() {
        // StringBuilder itself is not thread-safe
        StringBuilder s1 = new StringBuilder();
        s1.append("a");
        s1.append("b");
        // ...
    }

    // Operations on stringBuilder are not thread-safe,
    // because it is passed in from outside and may be used by multiple threads
    public static void method2(StringBuilder stringBuilder) {
        stringBuilder.append("a");
        stringBuilder.append("b");
        // ...
    }

    // Operations on stringBuilder are not thread-safe, because the StringBuilder is returned
    // and may then be shared with other threads
    public static StringBuilder method3() {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("a");
        stringBuilder.append("b");
        return stringBuilder;
    }

    // Operations on stringBuilder are thread-safe: stringBuilder.toString() is returned,
    // which creates a new String, so stringBuilder itself cannot be shared with other threads
    public static String method4() {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("a");
        stringBuilder.append("b");
        return stringBuilder.toString();
        // Conclusion: a local variable that is created and dies inside the method is thread-safe
    }
}
runtime data area        Is there an Error    Is there a GC
program counter          no                   no
virtual machine stack    yes (SOE)            no
native method stack      yes                  no
method area              yes (OOM)            yes
heap                     yes                  yes
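
The StackOverflowError case from the interview questions above can be reproduced with unbounded recursion. This is a small sketch; the class StackDepthDemo is made up for illustration, and the depth it reaches depends on the JVM, the platform, and the -Xss setting, so no specific number is given.

```java
class StackDepthDemo {

    private static int depth = 0;

    // Each call pushes a new stack frame; there is no exit condition on purpose.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before the stack overflowed.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread's stack is exhausted; the frames are popped while unwinding.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames pushed before overflow: " + measureDepth());
    }
}
```

Running it with different stack sizes, for example java -Xss256k StackDepthDemo and java -Xss1m StackDepthDemo, should show a larger depth for the larger stack, but an overflow in both cases.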

5. Native method interface and native method stack

5.1. Native method interface

What are native methods?

Simply put, a Native Method is an interface for Java to call non-Java code. A Native Method is a Java method whose implementation is written in a non-Java language, such as C. This feature is not unique to Java; many other programming languages have a similar mechanism. For example, in C++ you can use extern "C" to tell the C++ compiler that you are calling a C function.

When a native method is defined, no implementation body is provided (somewhat like a method in a Java interface), because the body is implemented outside Java in another language.

The role of the native interface is to let Java integrate with other programming languages; it was originally intended for integrating C/C++ programs.

// The native modifier can be combined with other Java modifiers, except abstract
public class IHaveNatives {

    public native void methodNative1(int x);

    public native static long methodNative2();

    private native synchronized float methodNative3(Object o);

    native void methodNative4(int[] ary) throws Exception;
}
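
Native methods already exist in the JDK's own classes, and reflection can show which ones they are. The sketch below checks the native modifier of a method; the class NativeModifierDemo and its helper isNative are made up for illustration. On OpenJDK/HotSpot-based JDKs, Object.hashCode() is declared native, but that is an implementation detail and other JDKs may differ.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeModifierDemo {

    // Reports whether the named method of the given class carries the native modifier.
    static boolean isNative(Class<?> type, String name, Class<?>... paramTypes) {
        try {
            Method m = type.getDeclaredMethod(name, paramTypes);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Implemented outside Java on HotSpot-based JDKs.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        // Implemented in plain Java.
        System.out.println("String.length native?  " + isNative(String.class, "length"));
    }
}
```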

Why use Native Method?

Java is very convenient to use, but some kinds of tasks are hard to implement in Java, and problems arise when we care about the efficiency of the program.

Interaction with the Java environment

Sometimes Java applications need to interact with the environment outside of Java, which is the main reason for the existence of native methods. You can think of the situation when Java needs to exchange information with some underlying system, such as an operating system or some hardware. The native method is just such a communication mechanism: it provides us with a very concise interface, and we don't need to understand the tedious details outside the Java application.

Interaction with the operating system

The JVM supports the Java language itself and its runtime library. It is the platform on which Java programs live, and it consists of an interpreter (which interprets bytecode) and some libraries linked to native code. However, it is not a complete system, and it often depends on the support of an underlying system, usually a powerful operating system. Through native methods, Java can implement the interaction between the JRE and the underlying system, and parts of the JVM itself are written in C. Also, if we want to use operating system features that the Java language itself does not wrap, we need native methods as well.

Sun’s Java

Sun's interpreter is implemented in C, which allows it to interact with the outside world like an ordinary C program. Most of the JRE is implemented in Java, and it also interacts with the outside world through some native methods. For example, the setPriority() method of the class java.lang.Thread is implemented in Java, but it calls the native method setPriority0() of the same class. This native method is implemented in C and built into the JVM. On the Windows 95 platform, this native method eventually calls the Win32 setPriority() API. This is a native method implementation provided directly by the JVM. More often, the native method is provided by an external dynamic link library and then called by the JVM.

status quo

At present, native methods are used less and less, except in hardware-related applications such as driving printers from Java programs or managing production equipment from Java systems, which are relatively rare in enterprise-level applications. Communication between heterogeneous systems is now well developed, for example through Socket communication or Web Services, so this topic is not covered further here.

5.2. Native method stack

  • The Java virtual machine stack is used to manage the calls of Java methods, and the native method stack is used to manage the calls of native methods.
  • The native method stack is also thread-private.
  • The native method stack may be implemented with a fixed size or as dynamically expandable memory. (It behaves the same as the virtual machine stack with regard to memory errors.)
    • If the stack size requested by a thread exceeds the maximum size allowed by the native method stack, the Java virtual machine throws a StackOverflowError.
    • If the native method stack can be dynamically expanded but cannot obtain enough memory when trying to expand, or if there is not enough memory to create the native method stack for a new thread, the Java virtual machine throws an OutOfMemoryError.
  • Native methods are implemented in the C language.
  • Specifically, native methods are registered in the Native Method Stack, and the native method library is loaded when the Execution Engine executes them.


  • When a thread calls a native method, it enters a whole new world that is no longer restricted by the virtual machine. It has the same permissions as the virtual machine.
    • Native methods can access the runtime data areas inside the virtual machine through the native method interface.
    • They can even use the registers of the local processor directly.
    • They can allocate any amount of memory directly from the heap in native memory.
  • Not all JVMs support native methods, because the Java virtual machine specification does not mandate the language, implementation, or data structures of the native method stack. A JVM product that does not intend to support native methods does not need to implement a native method stack.
  • In the HotSpot JVM, the native method stack and the virtual machine stack are combined into one.

6. Heap

6.1. Core overview of Heap

The heap is unique to a JVM process: one process corresponds to one JVM instance with a single heap, while the process contains multiple threads that all share the same heap space.


  • There is only one heap in a JVM instance, and the heap is the core area of Java memory management.
  • The Java heap is created when the JVM starts, and its size is determined at that point. It is the largest memory space managed by the JVM.
    • The size of the heap memory can be adjusted.
  • The "Java Virtual Machine Specification" stipulates that the heap may reside in physically discontinuous memory, but logically it should be treated as continuous.
  • All threads share the Java heap, within which thread-private buffers (Thread Local Allocation Buffer, TLAB) can also be set aside.
  • The description of the Java heap in the "Java Virtual Machine Specification" is: all object instances and arrays should be allocated on the heap at runtime. (The heap is the run-time data area from which memory for all class instances and arrays is allocated)
  • Arrays and objects may never be stored on the stack, because the stack frame only holds a reference to the location of the object or array on the heap.
  • After a method ends, the objects it created in the heap are not removed immediately; they are removed only during garbage collection.
  • The heap is the key area in which the GC (Garbage Collector) performs garbage collection.


6.2. Setting the heap memory size and OOM

The Java heap area is used to store Java object instances. The size of the heap is set when the JVM starts, and you can set it through the options "-Xms" and "-Xmx".
"-Xms" sets the initial memory of the heap area, equivalent to -XX:InitialHeapSize
"-Xmx" sets the maximum memory of the heap area, equivalent to -XX:MaxHeapSize
Once the memory used in the heap area exceeds the maximum specified by "-Xmx", an OutOfMemoryError is thrown.
Usually -Xms and -Xmx are configured with the same value, so that the JVM does not have to recompute and resize the heap after the garbage collector has cleaned it up, which improves performance.
By default, the initial memory size is the physical memory size / 64, and the maximum memory size is the physical memory size / 4.

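/**
 * 1. Parameters for setting the heap size
 * -Xms sets the initial memory size of the heap (young generation + old generation)
 *      -X  is a JVM runtime option
 *      ms  stands for memory start
 * -Xmx sets the maximum memory size of the heap (young generation + old generation)
 *
 * 2. Default heap size
 *    Initial memory size: physical memory size / 64
 *    Maximum memory size: physical memory size / 4
 * 3. Manual setting: -Xms600m -Xmx600m
 *    In development it is recommended to set the initial and maximum heap size to the same value.
 *
 * 4. Viewing the settings: option 1: jps  /  jstat -gc <process id>
 *                          option 2: -XX:+PrintGCDetails
 * @author shkstart
 * @create 2020  20:15
 */
public class HeapSpaceInitial {
    public static void main(String[] args) {

        // Total amount of heap memory in the Java virtual machine
        long initialMemory = Runtime.getRuntime().totalMemory() / 1024 / 1024;
        // Maximum amount of heap memory the Java virtual machine will attempt to use
        long maxMemory = Runtime.getRuntime().maxMemory() / 1024 / 1024;

        System.out.println("-Xms : " + initialMemory + "M");
        System.out.println("-Xmx : " + maxMemory + "M");

//        System.out.println("System memory size: " + initialMemory * 64.0 / 1024 + "G");
//        System.out.println("System memory size: " + maxMemory * 4.0 / 1024 + "G");

        // Keep the process alive so it can be inspected with jps / jstat
        try {
            Thread.sleep(1000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}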

-Xmx600M -Xms600M -XX:+PrintGCDetails


Why is it only 575M when it is clearly set to 600M?
Only one of the two survivor areas in the new generation holds objects at any time while the other stays free, so the reported total leaves out one survivor area (25600K here).
(179200+409600)/1024=575
(179200+409600+25600)/1024=600

In the jstat output, S0C is the total capacity of the S0 area and S0U is the used capacity of the S0 area.

Install the Visual GC plug-in for jvisualvm
After opening jvisualvm, click Tools -> Plugins -> Available Plugins and find the Visual GC version that matches your JDK.
JDK8 is used here, and the corresponding Visual GC version is 2.1.2. Installing the plug-in directly from within jvisualvm no longer works, so download it from the official website and install it manually:
https://visualvm.github.io/download.html

On that page, follow the plug-in link and find the entry for your JDK version.

Find Visual GC 2.1.2 and download it.

After the download is complete, put the installation package in the bin directory of the JDK installation directory.

Then open jvisualvm and install the plug-in.

 

 
6.3. Young Generation and Old Generation

The Java objects stored in the JVM can be divided into two categories:
One is short-lived transient objects, which are created and destroyed very quickly.
The other has a very long life cycle, which in some extreme cases can match the life cycle of the JVM itself.
If the Java heap area is further subdivided, it can be divided into the young generation (YoungGen) and the old generation (OldGen). The young generation can be divided into Eden space, Survivor0 space and Survivor1 space (sometimes also called the from area and the to area).

Configuring the proportion of the new generation and the old generation in the heap: in general the default can be used and there is no need to change it. The default is -XX:NewRatio=2, which means that the new generation occupies 1 part, the old generation occupies 2 parts, and the new generation is 1/3 of the entire heap. You can change it to -XX:NewRatio=4, which means that the new generation occupies 1 part, the old generation occupies 4 parts, and the new generation is 1/5 of the entire heap.

In HotSpot, the default ratio of the Eden space to the two survivor spaces is 8:1:1. Developers can adjust this ratio through the option "-XX:SurvivorRatio", for example -XX:SurvivorRatio=8.
Almost all Java objects are created in the Eden area, and most of them are also destroyed in the new generation.

A dedicated IBM study shows that 80% of the objects in the new generation "die young".
You can use the option "-Xmn" to set the maximum memory size of the new generation. This parameter is generally left at its default value.

-XX:NewRatio : Set the ratio of the new generation to the old generation. The default value is 2.
-XX:SurvivorRatio: Set the ratio of the Eden area to the Survivor area in the new generation. The default value is 8
-XX:-UseAdaptiveSizePolicy: Turn off the adaptive memory allocation policy (not used temporarily)
-Xmn: Set the maximum memory of the new generation. (generally not set)
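
The arithmetic behind -XX:NewRatio and -XX:SurvivorRatio can be written out as a short calculation. This is a sketch of the nominal ratios only; the class GenerationSizes is made up for illustration, and a real JVM aligns the sizes and may change them when the adaptive size policy is enabled, so observed values can differ.

```java
class GenerationSizes {

    // NewRatio = old / young, so the young generation is heap / (NewRatio + 1).
    static long youngGen(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    static long oldGen(long heap, int newRatio) {
        return heap - youngGen(heap, newRatio);
    }

    // SurvivorRatio = Eden / one survivor space, so Eden : S0 : S1 = ratio : 1 : 1.
    static long survivor(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    static long eden(long young, int survivorRatio) {
        return young - 2 * survivor(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 600;                      // as with -Xms600m -Xmx600m
        long young = youngGen(heapMb, 2);       // default NewRatio
        System.out.println("young    = " + young + "M");
        System.out.println("old      = " + oldGen(heapMb, 2) + "M");
        System.out.println("eden     = " + eden(young, 8) + "M");
        System.out.println("survivor = " + survivor(young, 8) + "M each");
    }
}
```

With a 600M heap and the default ratios this gives 200M for the young generation, 400M for the old generation, 160M for Eden, and 20M for each survivor space.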

6.4. Graphical object allocation process


An overview of the object allocation process:

Allocating memory for new objects is a very rigorous and complex task. JVM designers need to consider not only how and where to allocate memory, but also, because the allocation algorithm is closely tied to the reclamation algorithm, whether the GC will leave the memory space fragmented after reclaiming memory.
1) New objects are placed in the Eden area first. This area has a size limit.
2) When Eden is full and the program needs to create another object, the JVM's garbage collector performs a garbage collection (Minor GC) on the Eden area, destroying the objects in Eden that are no longer referenced by other objects. The new object is then placed in Eden.
3) The objects that survive in Eden are moved to Survivor 0.
4) If garbage collection is triggered again, the objects that survived last time and were placed in Survivor 0 are moved to Survivor 1 if they are still not collected.
5) On the next garbage collection they are moved back to Survivor 0, then to Survivor 1 again, and so on.
6) When are objects promoted to the old generation? The number of collections can be configured. The default is 15, and it is set with the parameter -XX:MaxTenuringThreshold=<N>.
7) The old generation is relatively quiet. When its memory runs short, another GC, the Major GC, is triggered to clean up the old generation.
8) If the old generation still cannot hold the object after a Major GC, an OOM exception occurs.
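
The steps above can be condensed into a toy model that reports where an object lives after surviving a given number of Minor GCs. It is a deliberate simplification with assumptions that are not in the notes: the first copy always goes to Survivor 0, promotion happens once the number of survived collections exceeds the threshold, and early promotion (for example when a survivor space is too small) is ignored. The class TenuringSketch is made up for illustration.

```java
class TenuringSketch {

    // Returns the area holding an object that has survived the given number of Minor GCs.
    static String locationAfter(int survivedCollections, int maxTenuringThreshold) {
        if (survivedCollections == 0) {
            return "Eden";              // freshly allocated
        }
        if (survivedCollections > maxTenuringThreshold) {
            return "Old";               // promoted to the old generation
        }
        // Survivors are copied back and forth between the two survivor spaces.
        return (survivedCollections % 2 == 1) ? "Survivor0" : "Survivor1";
    }

    public static void main(String[] args) {
        int threshold = 15;             // default value mentioned above
        for (int gcs = 0; gcs <= 16; gcs++) {
            System.out.println(gcs + " -> " + locationAfter(gcs, threshold));
        }
    }
}
```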

Summary:
For the survivor areas s0 and s1: they swap roles after each copy, and whichever one is empty is the "to" area.
Regarding garbage collection: it happens frequently in the new generation, rarely in the old generation, and almost never in the permanent generation / metaspace.

6.5. Minor GC、Major GC、Full GC

When the JVM performs GC, it does not collect all three memory areas (new generation, old generation, method area) together every time. Most of the time, collection refers to the new generation.
In the HotSpot VM implementation, GC is divided into two types according to the area collected: partial collection (Partial GC) and full heap collection (Full GC).
Partial collection: garbage collection that does not cover the entire Java heap. It is further divided into:
Young generation collection (Minor GC / Young GC): garbage collection of the new generation only (Eden, S0, S1).
Old generation collection (Major GC / Old GC): garbage collection of the old generation only.
Currently, only the CMS GC collects the old generation separately.
Note that Major GC is often confused with Full GC, so it is necessary to distinguish whether old generation collection or full heap collection is meant.
Mixed collection (Mixed GC): garbage collection of the entire new generation and part of the old generation. Currently, only the G1 GC has this behavior.
Full heap collection (Full GC): garbage collection of the entire Java heap and the method area.
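
To see which collectors the running JVM uses for the different kinds of collection, the standard java.lang.management API can be queried. This is a small sketch; the class CollectorInfo is made up for illustration, and the names printed depend on the JVM and the selected garbage collector, so none are listed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorInfo {

    // Names of the collectors registered in this JVM (typically one for the
    // young generation and one for the old generation or the whole heap).
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Comparing the collection counts of the beans in a long-running program is one way to observe that young generation collections are far more frequent than the others.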

Young generation GC (Minor GC) trigger mechanism:
When the young generation runs out of space, a Minor GC is triggered. "Young generation full" here means that the Eden area is full; a full Survivor area does not trigger GC. (Every Minor GC cleans up the memory of the young generation: when it cleans the Eden area, it also cleans up the garbage in the Survivor area.)
Because most Java objects die young, Minor GC is very frequent and generally fast. This definition is clear and easy to understand.
Minor GC causes STW (Stop The World): user threads are suspended and resume only after the garbage collection ends.

Old generation GC (Major GC/Full GC) trigger mechanism:
This refers to GC that occurs in the old generation. When objects are removed from the old generation, we say that a "Major GC" or "Full GC" has occurred.
A Major GC is often accompanied by at least one Minor GC (but not always: the collection strategy of the Parallel Scavenge collector includes a path that performs a Major GC directly).
That is, when the old generation runs short of space, a Minor GC is attempted first. If there is still not enough space afterwards, a Major GC is triggered.
Major GC is generally more than 10 times slower than Minor GC, and its STW time is longer. If memory is still insufficient after a Major GC, an OOM is reported.

Full GC trigger mechanism (described in detail later). There are five situations that trigger a Full GC:
(1) System.gc() is called: the system suggests a Full GC, but it is not necessarily performed.
(2) Insufficient space in the old generation.
(3) Insufficient space in the method area.
(4) The average size of objects entering the old generation after Minor GC is larger than the available memory in the old generation.
(5) When copying from Eden and survivor space0 (From Space) to survivor space1 (To Space), an object is larger than the available memory of To Space, so it is transferred to the old generation, and the available memory of the old generation is smaller than the object.
Note: Full GC should be avoided as far as possible in development and tuning, so that pause times stay short.




6.6. Heap space generation idea

Why does the Java heap need to be divided into generations? Would it not work without them? Studies show that different objects have different life cycles, and 70%-99% of objects are temporary.
The new generation: consists of Eden and two survivors of the same size (also known as from/to, s0/s1), and to is always empty.
Old generation: stores objects that survive multiple GCs in the new generation.

In fact, the heap could work entirely without generations. The only reason for generations is to optimize GC performance. Without them, all objects sit together, like putting everyone in a school into one classroom. During GC, the collector has to find which objects are useless, so the whole heap is scanned. Yet many objects are short-lived. With generations, newly created objects are placed in one area, and GC collects that area of short-lived objects first, which frees a lot of space.

6.7. Memory allocation strategy (or object promotion (Promotion) rules)


If an object is born in Eden, survives the first Minor GC, and fits in a survivor space, it is moved to the survivor space and its age is set to 1. Each time the object survives another Minor GC in the survivor area, its age increases by 1. When its age reaches a certain threshold (15 by default, although it actually differs between JVMs and collectors), it is promoted to the old generation.

The age threshold for promoting an object to the old generation can be set by the option -XX:MaxTenuringThreshold.

The principles of object allocation for different age groups are as follows:
Objects are allocated in Eden first.
Large objects are allocated directly to the old generation. Try to avoid too many large objects in the program.
Long-lived objects are allocated to the old generation.

Dynamic object age judgment
If the total size of all objects of the same age in the survivor area is greater than half of the survivor space, objects of that age or older can enter the old generation directly, without waiting to reach the age set by MaxTenuringThreshold.
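
The dynamic age rule can be sketched as a small calculation over a per-age size table. This sketch assumes the sizes are accumulated from the youngest age upward until the running total exceeds half of the survivor space, which is one common reading of the rule. The class and parameter names are hypothetical and this is not HotSpot source code.

```java
// Hypothetical sketch of dynamic tenuring threshold selection.
class TenuringThreshold {
    /**
     * sizesByAge[a] holds the total bytes of surviving objects with age a (index 0 is unused).
     * Returns the age from which objects are promoted to the old generation.
     */
    static int compute(long[] sizesByAge, long survivorCapacity, int maxTenuringThreshold) {
        long desired = survivorCapacity / 2; // "half of the survivor space"
        long total = 0;
        for (int age = 1; age < sizesByAge.length; age++) {
            total += sizesByAge[age];
            if (total > desired) {
                // Objects of this age or older no longer need to wait for the maximum age.
                return Math.min(age, maxTenuringThreshold);
            }
        }
        // The survivor space is not under pressure: keep the configured maximum.
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long[] sizes = {0, 300, 300, 100};
        System.out.println(compute(sizes, 1000, 15));
    }
}
```

With a 1000-byte survivor space and 300 bytes each at ages 1 and 2, the running total passes 500 at age 2, so objects aged 2 or more are promoted even though MaxTenuringThreshold is 15.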

Space allocation guarantee, -XX:HandlePromotionFailure

6.8. Allocating memory for objects: TLAB

Why is there a TLAB (Thread Local Allocation Buffer)?
The heap is shared by all threads, and any thread can access shared data in the heap.
Because object instances are created very frequently in the JVM, carving memory out of the heap in a concurrent environment is not thread-safe.
To prevent multiple threads from operating on the same address, mechanisms such as locking are needed, which slows down allocation.

What is TLAB?
From the perspective of the memory model rather than garbage collection, the Eden area is subdivided further: the JVM allocates a private buffer for each thread inside the Eden space.
When multiple threads allocate memory at the same time, using the TLAB avoids a series of thread-safety problems and also improves allocation throughput, so this allocation method can be called a fast allocation strategy.
As far as I know, all JVMs derived from OpenJDK provide a TLAB design.

Re-explanation of TLAB:
Although not all object instances can successfully allocate memory in TLAB, JVM does use TLAB as the first choice for memory allocation.
Developers can control whether the TLAB is enabled through the option "-XX:+UseTLAB" (use "-XX:-UseTLAB" to disable it).
By default, the TLAB is very small, occupying only 1% of the entire Eden space. The percentage of Eden occupied by the TLAB can be set through the option "-XX:TLABWasteTargetPercent".
Once the object fails to allocate memory in the TLAB space, the JVM will try to use the locking mechanism to ensure the atomicity of data operations, thereby directly allocating memory in the Eden space.
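
The idea of a thread-private fast path with a synchronized fallback can be illustrated with a toy model. The sketch below is only a simulation with made-up names (Eden, Tlab) and addresses represented as plain long offsets. The shared path uses a compare-and-set loop instead of a lock, and a thread that exhausts its buffer simply requests a new one; both are assumptions of this sketch rather than a description of HotSpot internals.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of TLAB allocation; all names are hypothetical.
class TlabSketch {
    /** Shared Eden: every thread competes for `top`, so the update must be atomic. */
    static final class Eden {
        private final long end;
        private final AtomicLong top = new AtomicLong(0);

        Eden(long size) { this.end = size; }

        /** Returns the start offset of the allocated block, or -1 if Eden is full. */
        long allocate(long size) {
            while (true) {
                long oldTop = top.get();
                long newTop = oldTop + size;
                if (newTop > end) return -1;
                if (top.compareAndSet(oldTop, newTop)) return oldTop;
            }
        }
    }

    /** Thread-private buffer carved out of Eden: a plain pointer bump, no synchronization. */
    static final class Tlab {
        private final Eden eden;
        private final long tlabSize;
        private long top;
        private long end;

        Tlab(Eden eden, long tlabSize) { this.eden = eden; this.tlabSize = tlabSize; }

        long allocate(long size) {
            if (top + size <= end) {            // fast path inside the private buffer
                long addr = top;
                top += size;
                return addr;
            }
            if (size > tlabSize) {              // too large for any buffer: use shared Eden
                return eden.allocate(size);
            }
            long start = eden.allocate(tlabSize); // buffer exhausted: request a fresh one
            if (start < 0) return -1;
            top = start + size;
            end = start + tlabSize;
            return start;
        }
    }
}
```

Each thread would own one Tlab instance, so most allocations touch only thread-local fields and the atomic update on Eden happens once per buffer rather than once per object.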

6.9. Summary of the parameter settings of the heap space

Commonly used jvm parameters for testing heap space:
-XX:+PrintFlagsInitial: View the default initial values of all parameters
-XX:+PrintFlagsFinal: View the final values of all parameters (they may have been modified and no longer be the initial values)

To view the value of a specific parameter:
jps: list the currently running Java processes
jinfo -flag SurvivorRatio process_id (the process id reported by jps)

-Xms: Initial heap memory (1/64 of physical memory by default)
-Xmx: Maximum heap memory (1/4 of physical memory by default)
-Xmn: Set the size of the new generation (both initial and maximum value)
-XX:NewRatio: Configure the ratio of the new generation to the old generation in the heap
-XX:SurvivorRatio: Set the ratio of Eden to S0/S1 in the new generation
-XX:MaxTenuringThreshold: Set the maximum age of objects in the new generation
-XX:+PrintGCDetails: Output a detailed GC log
Print brief GC information: ① -XX:+PrintGC ② -verbose:gc
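
To check from inside a program which heap sizes and options actually took effect, the Runtime and RuntimeMXBean APIs can be queried. This is a minimal sketch; the class name HeapSettings is an assumption, and totalMemory() only approximates the currently committed heap.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper that prints the heap sizes seen by the running JVM.
class HeapSettings {
    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Roughly the currently committed heap; starts near -Xms.
        System.out.println("total heap MB: " + toMb(rt.totalMemory()));
        // Upper bound of the heap; corresponds to -Xmx.
        System.out.println("max heap MB: " + toMb(rt.maxMemory()));
        // Options passed on the command line, such as -Xms or -XX:SurvivorRatio.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
    }
}
```

Running it once with and once without -Xms/-Xmx makes the effect of the two options visible directly.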

-XX:HandlePromotionFailure: Whether to set the space allocation guarantee

Before a Minor GC occurs, the virtual machine checks whether the maximum available contiguous space in the old generation is greater than the total size of all objects in the new generation.
If it is greater, this Minor GC is safe.
If it is smaller, the virtual machine checks whether the -XX:HandlePromotionFailure setting allows the guarantee to fail.
If HandlePromotionFailure=true, it goes on to check whether the maximum available contiguous space in the old generation is greater than the average size of objects promoted to the old generation in previous collections.
If it is greater, a Minor GC is attempted, although this Minor GC is still risky;
if it is smaller, a Full GC is performed instead.
If HandlePromotionFailure=false, a Full GC is performed instead.
After JDK6 Update 24, the HandlePromotionFailure parameter no longer affects the virtual machine's space allocation guarantee policy. The OpenJDK source code still defines the HandlePromotionFailure parameter, but the code no longer uses it. The rule after JDK6 Update 24 is that a Minor GC is performed as long as the contiguous space in the old generation is larger than the total size of objects in the new generation or the average size of previous promotions; otherwise a Full GC is performed.
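
The rule that applies after JDK6 Update 24 fits into a single condition. The sketch below restates it as a function; the class, enum and parameter names are hypothetical and only mirror the wording of the paragraph above.

```java
// Hypothetical restatement of the space allocation guarantee rule (after JDK6 Update 24).
class PromotionGuarantee {
    enum Decision { MINOR_GC, FULL_GC }

    static Decision decide(long oldMaxContiguousFree, long youngObjectsTotal, long avgPromotedSize) {
        // A Minor GC is allowed if the old generation could absorb either the whole new
        // generation or the average amount promoted by previous collections.
        if (oldMaxContiguousFree > youngObjectsTotal || oldMaxContiguousFree > avgPromotedSize) {
            return Decision.MINOR_GC;
        }
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(60, 100, 50));
        System.out.println(decide(40, 100, 50));
    }
}
```

In the first call the old generation cannot hold the whole new generation (60 < 100) but can hold an average promotion (60 > 50), so a Minor GC is chosen; in the second call neither condition holds and a Full GC is chosen.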

6.10. Is the heap the only option for allocating object storage?

In "In-depth Understanding of Java Virtual Machine", there is such a description about Java heap memory:
With the development of JIT compilers and the gradual maturing of escape analysis, on-stack allocation and scalar replacement optimizations will bring some subtle changes, and allocating all objects on the heap is gradually becoming less "absolute".
In the Java virtual machine, it is common knowledge that objects are allocated in the Java heap. There is a special case, however: if escape analysis (Escape Analysis) finds that an object does not escape the method, the object may be optimized to be allocated on the stack. This removes the need for heap allocation and garbage collection, and it is the most common off-heap storage technique.
In addition, the TaoBaoVM mentioned earlier, a deep customization of OpenJDK, implements off-heap storage with its innovative GCIH (GC invisible heap) technology: Java objects with long life cycles are moved out of the heap, and the GC does not manage the Java objects inside GCIH, which reduces GC frequency and improves GC efficiency.

Escape Analysis Overview
Moving an object from heap allocation to stack allocation requires escape analysis.
This is a cross-function, global data flow analysis algorithm that can effectively reduce synchronization load and heap allocation pressure in Java programs.
Through escape analysis, the Java HotSpot compiler can analyze the scope in which a new object's reference is used and decide whether to allocate the object on the heap.
The basic behavior of escape analysis is to analyze the dynamic scope of objects:
When an object is defined in a method and is only used inside that method, no escape has occurred.
When an object is defined in a method and is referenced by an external method, it has escaped, for example when it is passed elsewhere as a call argument.

Objects that have not escaped can be allocated on the stack, and the stack space is released when the method finishes executing.

/**
 * Escape analysis
 *
 *  A quick way to judge whether an object escapes: check whether the object created with new could be used outside the method.
 * @author shkstart
 * @create 2020 4:00 PM
 */
public class EscapeAnalysis {

    public EscapeAnalysis obj;

    /*
    The method returns an EscapeAnalysis object: escape occurs
     */
    public EscapeAnalysis getInstance(){
        return obj == null? new EscapeAnalysis() : obj;
    }
    /*
    Assigns a value to a member field: escape occurs
     */
    public void setObj(){
        this.obj = new EscapeAnalysis();
    }
    //Question: what if the obj reference were declared static? Escape would still occur.

    /*
    The object's scope is limited to the current method: no escape
     */
    public void useEscapeAnalysis(){
        EscapeAnalysis e = new EscapeAnalysis();
    }
    /*
    Reads the value of a member variable: escape occurs
     */
    public void useEscapeAnalysis1(){
        EscapeAnalysis e = getInstance();
        //getInstance().xxx() would escape as well
    }
}

Parameter setting:
Since JDK 6u23, escape analysis has been enabled by default in HotSpot. If you are using an earlier version, you can
explicitly enable escape analysis with the option "-XX:+DoEscapeAnalysis"
and view the results of escape analysis with the option "-XX:+PrintEscapeAnalysis".

Conclusion:
In development, if a local variable will do, do not use a variable defined outside the method.

Using escape analysis, the compiler can optimize the code as follows:
1) Allocation on the stack. Convert heap allocation to stack allocation. If an object is allocated in a subroutine and a pointer to the object never escapes, the object may be a candidate for stack allocation rather than heap allocation.
2) Synchronization elision. If an object is found to be accessible from only one thread, operations on that object need not be synchronized.
3) Separating objects, or scalar replacement. Some objects can be accessed without existing as one contiguous memory structure, so part (or all) of the object may be kept in CPU registers instead of memory.

Stack allocation for code optimization

According to the results of escape analysis during compilation, the JIT compiler finds that if an object does not escape the method, it may be optimized to be allocated on the stack. After the allocation is completed, continue to execute in the call stack, and finally the thread ends, the stack space is reclaimed, and the local variable object is also reclaimed. This eliminates the need for garbage collection.

Common stack allocation scenarios
These were explained in the escape analysis section. The scenarios where escape occurs are assigning values to member variables, returning objects from methods, and passing instance references.

Synchronization omission for code optimization (lock elimination)

The cost of thread synchronization is quite high, and synchronization reduces concurrency and performance.
When dynamically compiling a synchronized block, the JIT compiler can use escape analysis to determine whether the lock object used by the block can only be accessed by one thread and has not been published to other threads. If it has not been published, the JIT compiler removes the synchronization when compiling the block, which can greatly improve concurrency and performance. This removal of synchronization is called synchronization elision, also known as lock elimination.
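
A typical candidate for lock elimination is a synchronized block whose lock object is created inside the method and never leaves it. The sketch below uses a made-up class and method name; whether the JIT actually removes the lock depends on the JVM and its settings, and the bytecode still contains the monitor instructions either way.

```java
// Hypothetical example of a lock that escape analysis may prove to be thread-local.
class LockElisionExample {
    static int sumWithPrivateLock(int a, int b) {
        Object lock = new Object(); // created here and never published to another thread
        synchronized (lock) {       // no other thread can ever contend for this monitor
            return a + b;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumWithPrivateLock(2, 3));
    }
}
```

After elimination the method behaves as if it were written without the synchronized block, simply returning a + b.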

Scalar replacement for code optimization
A scalar (Scalar) is a piece of data that cannot be decomposed into smaller data. The primitive data types in Java are scalars.
In contrast, data that can be decomposed is called an aggregate. Objects in Java are aggregates because they can be decomposed into other aggregates and scalars.
In the JIT stage, if escape analysis finds that an object will not be accessed from outside, JIT optimization breaks the object up into the member variables it contains and uses them in its place. This process is known as scalar replacement.
Scalar replacement parameter setting:
Parameter -XX:+EliminateAllocations: enables scalar replacement (on by default), allowing objects to be broken up and allocated on the stack.
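
The effect of scalar replacement can be pictured by writing the "before" and "after" forms by hand. In this sketch the Point class and both method names are illustrative assumptions; the second method only shows conceptually what the JIT may produce and is not something the programmer needs to write.

```java
// Hypothetical before/after illustration of scalar replacement.
class ScalarReplacementExample {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** As written: an aggregate is created but never escapes the method. */
    static int asWritten(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    /** Conceptual result: the aggregate is replaced by its scalar fields as local variables. */
    static int afterReplacement(int x, int y) {
        int px = x;
        int py = y;
        return px + py;
    }

    public static void main(String[] args) {
        System.out.println(asWritten(3, 4) == afterReplacement(3, 4));
    }
}
```

Because px and py are plain locals, they can live in the stack frame or in registers, and no Point object has to be allocated or collected.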

The test code calls alloc 100 million times in the main function to create objects. Since each User instance occupies about 16 bytes, the cumulative allocation reaches nearly 1.5 GB. If the heap is smaller than this, GC will inevitably occur. Run the code with the following parameters:
-server -Xmx100m -Xms100m -XX:+DoEscapeAnalysis -XX:+PrintGC -XX:+EliminateAllocations
The parameters used here are as follows:
Parameter -server: starts server mode, because escape analysis can only be enabled in server mode.
Parameter -XX:+DoEscapeAnalysis: enables escape analysis.
Parameter -Xmx100m: specifies a maximum heap size of 100 MB.
Parameter -XX:+PrintGC: prints the GC log.
Parameter -XX:+EliminateAllocations: enables scalar replacement (on by default), allowing objects to be broken up and allocated on the stack. For example, if an object has two fields, id and name, the two fields are treated as two independent local variables.
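
A test of this kind could look like the following sketch. The class name AllocBenchmark, the field values and the run helper are assumptions made for illustration; only the overall shape (a User object with id and name, created in alloc and called 100 million times from main) follows the description above.

```java
// Hypothetical reconstruction of an allocation benchmark for scalar replacement.
class AllocBenchmark {
    static final class User {
        int id;
        String name;
    }

    static void alloc() {
        User u = new User(); // does not escape alloc(), so it is a scalar replacement candidate
        u.id = 5;
        u.name = "user";
    }

    /** Calls alloc() the given number of times and returns the elapsed milliseconds. */
    static long run(int iterations) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            alloc();
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        System.out.println("elapsed ms: " + run(100_000_000));
    }
}
```

Comparing a run with -XX:+EliminateAllocations against one with -XX:-EliminateAllocations, both with -XX:+PrintGC, should show the difference in GC activity, though timings vary by machine and JVM version.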

On-stack allocation is not enabled in HotSpot, but scalar replacement is enabled. Escape analysis is also enabled by default.

 7. Method area

7.1. Interaction between stack, heap and method area

Interview questions from major tech companies
Let's talk about the JVM memory model. What areas are there? What do they do?
Java 8's improvements to memory generations.
Which areas is JVM memory divided into, and what is the role of each area?
JVM memory distribution/memory structure? The difference between the stack and the heap? The structure of the heap? Why two survivor areas?
The allocation ratio of Eden and survivor.
JVM memory partitions: why are there a new generation and an old generation?
Java memory partitions?
The memory structure of the JVM, and the ratio of Eden to survivor.
Why should JVM memory be divided into the new generation, old generation, and permanent generation?
Why is the new generation divided into Eden and survivor?
The JVM memory model and its partitions, detailed down to what is put in each area.
The memory model of the JVM, and what changes were made in Java 8.
Which areas is JVM memory divided into, and what is the role of each area?
Java memory allocation.
Will garbage collection occur in the permanent generation of the JVM? What about the old generation?

7.2. Understanding of method area

The "Java Virtual Machine Specification" clearly states: "Although the method area is logically part of the heap, simple implementations may choose not to either garbage collect or compact it." For the HotSpot JVM, however, the method area also has the alias Non-Heap, whose purpose is to separate it from the heap.
Therefore, the method area is regarded as a memory space independent of the Java heap.


The method area (Method Area), like the Java heap, is a memory area shared by all threads.
The method area is created when the JVM starts, and its physical memory can be discontinuous, just like the Java heap.
The size of the method area, like the heap, can be fixed or expandable.
The size of the method area determines how many classes the system can hold. If the system defines too many classes and the method area overflows, the virtual machine throws an out-of-memory error: java.lang.OutOfMemoryError: PermGen space or java.lang.OutOfMemoryError: Metaspace. Typical causes: loading a large number of third-party jar packages; deploying too many projects in Tomcat (30-50); generating a large number of classes dynamically through reflection.
Closing the JVM releases the memory in this area.

The evolution of the method area in Hotspot
In JDK 7 and earlier, the method area was customarily called the permanent generation. Starting with JDK 8, the permanent generation was replaced by the metaspace.
In JDK 8, classes metadata is now stored in the native heap and this space is called Metaspace.
Strictly speaking, the method area and the permanent generation are not equivalent; they are equivalent only for HotSpot. The "Java Virtual Machine Specification" places no uniform requirements on how the method area is implemented. For example, the concept of a permanent generation does not exist in BEA JRockit or IBM J9.
In hindsight, using the permanent generation was not a good idea: it made Java programs more prone to OOM (exceeding the -XX:MaxPermSize limit).

In JDK 8, the concept of the permanent generation was finally abandoned completely, and the metaspace (Metaspace), implemented in native memory as in JRockit and J9, was used instead.

The metaspace is similar in nature to the permanent generation: both are implementations of the method area in the JVM specification. The biggest difference is that the metaspace is not in memory set aside for the virtual machine but uses native memory.
The permanent generation and the metaspace differ not only in name; the internal structure was also adjusted.
According to the "Java Virtual Machine Specification", if the method area cannot satisfy a new memory allocation request, an OOM exception is thrown.

7.3. Set method area size and OOM

JDK 8 and later:
The size of the metadata area can be specified with the parameters -XX:MetaspaceSize and -XX:MaxMetaspaceSize.

The defaults are platform dependent. On Windows, -XX:MetaspaceSize is 21M and -XX:MaxMetaspaceSize is -1, that is, unlimited.

Unlike the permanent generation, if no size is specified, the virtual machine will by default use up all available system memory. If the metadata area overflows, the virtual machine likewise throws OutOfMemoryError: Metaspace.
-XX:MetaspaceSize: sets the initial metaspace size. For a 64-bit server-side JVM, the default -XX:MetaspaceSize value is 21MB. This is the initial high-water mark. Once it is reached, a Full GC is triggered and unloads useless classes (that is, classes whose class loaders are no longer alive), and then the high-water mark is reset. The new value depends on how much metaspace was freed by the GC. If too little space was freed, the value is raised appropriately, as long as it does not exceed MaxMetaspaceSize. If a lot of space was freed, the value is lowered appropriately.
If the initial high-water mark is set too low, this adjustment can happen many times, and the garbage collector log will show Full GC being invoked repeatedly. To avoid frequent GC, it is recommended to set -XX:MetaspaceSize to a relatively high value.
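
Current metaspace usage can be inspected at run time through the memory pool beans. This is a minimal sketch; the class name MetaspaceUsage is an assumption, and the pool name "Metaspace" is what HotSpot on JDK 8 and later typically reports, so other JVMs may print nothing.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper that reports the memory pools whose name mentions Metaspace.
class MetaspaceUsage {
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!pool.getName().contains("Metaspace")) continue;
            MemoryUsage usage = pool.getUsage();
            if (usage == null) continue; // the pool is no longer valid
            sb.append(pool.getName())
              .append(": used=").append(usage.getUsed())
              .append(", committed=").append(usage.getCommitted())
              .append(", max=").append(usage.getMax()) // -1 means the maximum is undefined
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it with and without -XX:MaxMetaspaceSize shows whether the reported maximum changes from -1 to the configured limit.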

7.4. Internal structure of the method area

 

 Runtime constant pool and constant pool

7.5. Details of the evolution of the method area

 The object instances that static references point to are always stored in the heap.

 7.6. Garbage Collection in the Method Area


Summary

8. Object instantiation, memory layout and access positioning, direct memory

Interview questions from major tech companies
How are objects stored in the JVM? What is in the object header?
What is in the Java object header?

8.1. Object instantiation 

Object instantiation steps:

① When the virtual machine encounters a new instruction, it first checks whether the instruction's operand can locate a symbolic reference to a class in the constant pool in Metaspace, and whether the class represented by that symbolic reference has been loaded, resolved, and initialized (that is, whether the class metadata exists). If not, then under the parent delegation model, the current class loader looks for the corresponding .class file using ClassLoader + package name + class name as the key. If the file is not found, a ClassNotFoundException is thrown; if it is found, the class is loaded and the corresponding Class object is generated.

② First calculate the size of the object, then carve out a block of memory in the heap for it.
If an instance member variable is a reference variable, only space for the reference is allocated, which is 4 bytes.

If memory is regular, the virtual machine uses the bump-the-pointer method (Bump The Pointer) to allocate memory for the object.
All used memory is on one side and free memory is on the other, with a pointer in the middle marking the boundary. Allocating memory just moves the pointer toward the free side by a distance equal to the object's size. If the garbage collector is Serial or ParNew, which are based on compacting algorithms, the virtual machine uses this allocation method. In general, bump-the-pointer is used with collectors that have a compaction phase.

If memory is not regular and used and unused memory are interleaved, the virtual machine uses the free list method to allocate memory for the object.
The virtual machine maintains a list that records which memory blocks are available. On allocation it finds a block in the list that is large enough for the object instance and updates the list. This allocation method is called the "free list" (Free List).

⑤ Which allocation method is chosen depends on whether the Java heap is regular, and that in turn depends on whether the garbage collector in use compacts memory.

⑥ Store data such as the object's class (the class metadata), the object's hash code, GC information, and lock information in the object header. How exactly this is set up depends on the JVM implementation.

⑦ From the Java program's point of view, initialization now formally begins: member variables are initialized, instance initializer blocks are executed, the class constructor is called, and the address of the object in the heap is assigned to the reference variable. Therefore, generally speaking (depending on whether the bytecode contains an invokespecial instruction), the new instruction is followed by execution of the instance initialization method, which initializes the object as the programmer intends; only then is a truly usable object fully created.
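
The two allocation strategies from step ② can be sketched as toy allocators over plain long offsets. Everything here (class names, the first-fit search, returning -1 on failure) is an assumption made for illustration and not how any particular collector is implemented.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy allocators; all names are hypothetical.
class AllocatorSketch {
    /** Bump the pointer: used memory on one side, free memory on the other. */
    static final class BumpAllocator {
        private final long end;
        private long top;

        BumpAllocator(long size) { this.end = size; }

        long allocate(long size) {
            if (top + size > end) return -1; // not enough contiguous free space
            long addr = top;
            top += size;                     // move the boundary toward the free side
            return addr;
        }
    }

    /** Free list: a list of available blocks, searched first-fit. */
    static final class FreeListAllocator {
        private static final class Block {
            long start;
            long size;
            Block(long start, long size) { this.start = start; this.size = size; }
        }

        private final List<Block> free = new ArrayList<>();

        void addFreeBlock(long start, long size) { free.add(new Block(start, size)); }

        long allocate(long size) {
            Iterator<Block> it = free.iterator();
            while (it.hasNext()) {
                Block b = it.next();
                if (b.size >= size) {
                    long addr = b.start;
                    b.start += size;         // shrink the block from the front
                    b.size -= size;
                    if (b.size == 0) it.remove();
                    return addr;
                }
            }
            return -1;                       // no single block is large enough
        }
    }
}
```

The bump allocator does constant work per allocation but needs a compacted heap, while the free list tolerates fragmentation at the cost of a search, which matches the dependency on the collector described in step ⑤.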

8.2. Memory layout of objects

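/**
 * Tests the process of object instantiation
 *  ① Load class metadata - ② Allocate memory for the object - ③ Handle concurrency issues - ④ Default initialization of fields (zero values)
 *  - ⑤ Set up the object header - ⑥ Explicit field initialization, initialization in code blocks, initialization in the constructor
 *
 *  Operations that assign values to an object's fields:
 *  ① Default initialization - ② Explicit initialization / ③ Initialization in a code block - ④ Initialization in the constructor
 * @author shkstart
 * @create 2020  17:58
 */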

 

public class Customer{
    int id = 1001;
    String name;
    Account acct;

    {
        name = "anonymous customer";
    }
    public Customer(){
        acct = new Account();
    }

}
class Account{

}

public class CustomerTest {
    public static void main(String[] args) {
        Customer cust = new Customer();
    }
}

What happened to the main method of CustomerTest ?

8.3. Object access location

Handle access method: 

Direct pointer method:

Advantages and disadvantages of handle access and direct pointer access:

These two object access methods each have their advantages. The biggest advantage of handle access is that the reference stores a stable handle address: when the object is moved (moving objects during garbage collection is very common), only the instance data pointer in the handle changes, and the reference itself does not need to be modified.

The biggest advantage of direct pointer access is speed: it saves the overhead of one pointer dereference. Since object access is very frequent in Java, this overhead adds up to a considerable execution cost. HotSpot mainly uses the second method for object access.
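
The trade-off can be modelled with a toy simulation in which addresses are plain long values. The names HandleTable and moveWithDirectPointers are hypothetical, and a real JVM updates references during GC in far more sophisticated ways; the sketch only contrasts one indirection per access with patching every reference on a move.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the two object access schemes; all names are hypothetical.
class ObjectAccessSketch {
    /** Handle access: references hold a stable handle; the table holds the current address. */
    static final class HandleTable {
        private final Map<Integer, Long> addressByHandle = new HashMap<>();
        private int next = 0;

        int register(long address) {
            addressByHandle.put(next, address);
            return next++;
        }

        /** Every access pays for this extra lookup. */
        long resolve(int handle) {
            return addressByHandle.get(handle);
        }

        /** Moving an object touches only the table entry, never the references. */
        void move(int handle, long newAddress) {
            addressByHandle.put(handle, newAddress);
        }
    }

    /** Direct pointers: each reference stores the address itself, so a move must patch all of them. */
    static int moveWithDirectPointers(long[] references, long oldAddress, long newAddress) {
        int patched = 0;
        for (int i = 0; i < references.length; i++) {
            if (references[i] == oldAddress) {
                references[i] = newAddress;
                patched++;
            }
        }
        return patched;
    }
}
```

With handles, any number of references keep working after move() without being modified; with direct pointers, access needs no lookup but every reference to the moved object has to be rewritten.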

8.4. Direct Memory

9. Execution engine of the JVM

9.1. Execution Engine Overview

Compiling .java files into .class files is called front-end compilation (javac), and compiling bytecode into machine instructions in the execution engine is called back-end compilation.


 

9.2. The process of compiling and executing Java code

 

 

 
9.3. Machine code, instructions, assembly language

A high-level language is first translated into assembly language, and the assembly language is then translated into machine instructions.

9.4. Interpreter

9.5. JIT Compiler

 

 

10. StringTable

10.1. Basic properties of String

10.2. String memory allocation

Use the intern method of String to add the string to the string constant pool. At the same time, this method can be used to prove that the string constant pool in JDK8 is stored in the heap.

import java.util.HashSet;
import java.util.Set;

/**
 * In jdk6:
 * -XX:PermSize=6m -XX:MaxPermSize=6m -Xms6m -Xmx6m
 *
 * In jdk8:
 * -XX:MetaspaceSize=6m -XX:MaxMetaspaceSize=6m -Xms6m -Xmx6m
 * @author shkstart
 * @create 2020  0:36
 */
public class StringTest3 {
    public static void main(String[] args) {
        //Keep references to the pooled strings in a Set so that full gc does not reclaim them
        Set<String> set = new HashSet<String>();
        //The value range of short is enough to make a 6MB PermSize or heap run into OOM.
        short i = 0;
        while(true){
            set.add(String.valueOf(i++).intern());
        }
    }
}



 


10.3. Basic operations

10.4. String concatenation operation

The execution details of s1 + s2 are as follows (the variable s is a temporary name of mine):
① StringBuilder s = new StringBuilder();
② s.append("a")
③ s.append("b")
④ s.toString() --> approximately equal to new String("ab")

Supplement: StringBuilder is used since jdk5.0, and StringBuffer was used before jdk5.0.
1) String concatenation does not necessarily use StringBuilder!
   If both sides of the concatenation operator are string constants or constant references, compile-time optimization is still used, that is, the non-StringBuilder way.
2) When declaring classes, methods, primitive-type variables, and reference-type variables, use final wherever it can be used.
 

package com.atguigu.java1;

import org.junit.Test;

/**
 * String concatenation operations
 * @author shkstart
 * @create 2020  0:59
 */
public class StringTest5 {
    @Test
    public void test1(){
        String s1 = "a" + "b" + "c";//compile-time optimization: equivalent to "abc"
        String s2 = "abc"; //"abc" is always placed in the string constant pool, and its address is assigned to s2
        /*
         * The .java file is finally compiled into .class, and the .class is then executed
         * String s1 = "abc";
         * String s2 = "abc"
         */
        System.out.println(s1 == s2); //true
        System.out.println(s1.equals(s2)); //true
    }

    @Test
    public void test2(){
        String s1 = "javaEE";
        String s2 = "hadoop";

        String s3 = "javaEEhadoop";
        String s4 = "javaEE" + "hadoop";//compile-time optimization
        //If a variable appears on either side of the concatenation operator, the result is equivalent to new String() in the heap, whose content is the concatenation result: javaEEhadoop
        String s5 = s1 + "hadoop";
        String s6 = "javaEE" + s2;
        String s7 = s1 + s2;

        System.out.println(s3 == s4);//true
        System.out.println(s3 == s5);//false
        System.out.println(s3 == s6);//false
        System.out.println(s3 == s7);//false
        System.out.println(s5 == s6);//false
        System.out.println(s5 == s7);//false
        System.out.println(s6 == s7);//false
        //intern(): checks whether the string constant pool contains the value javaEEhadoop; if so, returns the address of javaEEhadoop in the pool;
        //if not, loads a copy of javaEEhadoop into the pool and returns the address of that object.
        String s8 = s6.intern();
        System.out.println(s3 == s8);//true
    }

    @Test
    public void test3(){
        String s1 = "a";
        String s2 = "b";
        String s3 = "ab";
        /*
        Execution details of s1 + s2 below (the variable s is a temporary name of mine):
        ① StringBuilder s = new StringBuilder();
        ② s.append("a")
        ③ s.append("b")
        ④ s.toString()  --> approximately equal to new String("ab")

        Supplement: StringBuilder is used since jdk5.0; StringBuffer was used before jdk5.0
         */
        String s4 = s1 + s2;//
        System.out.println(s3 == s4);//false
    }
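    /*
    1. String concatenation does not necessarily use StringBuilder!
       If both sides of the concatenation operator are string constants or constant references, compile-time optimization is still used, that is, the non-StringBuilder way.
    2. When declaring classes, methods, primitive-type variables, and reference-type variables, use final wherever it can be used.
     */
    @Test
    public void test4(){
        final String s1 = "a";
        final String s2 = "b";
        String s3 = "ab";
        String s4 = s1 + s2;
        System.out.println(s3 == s4);//true
    }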
    //Exercise:
    @Test
    public void test5(){
        String s1 = "javaEEhadoop";
        String s2 = "javaEE";
        String s3 = s2 + "hadoop";
        System.out.println(s1 == s3);//false

        final String s4 = "javaEE";//s4: a constant
        String s5 = s4 + "hadoop";
        System.out.println(s1 == s5);//true

    }

    /*
    Comparing efficiency: appending strings with StringBuilder's append() is far more efficient than concatenating with String!
    Details: ① With StringBuilder's append(): only one StringBuilder object is ever created
          With String concatenation: many StringBuilder and String objects are created
         ② With String concatenation: because more StringBuilder and String objects are created, memory usage is higher; if GC runs, it costs extra time.

     Room for improvement: in real development, if the total length of the strings to be appended is known not to exceed some limit highLevel, it is recommended to instantiate with the constructor:
               StringBuilder s = new StringBuilder(highLevel);//new char[highLevel]
     */
    @Test
    public void test6(){

        long start = System.currentTimeMillis();

//        method1(100000);//4014
        method2(100000);//7

        long end = System.currentTimeMillis();

        System.out.println("Elapsed time (ms): " + (end - start));
    }

    public void method1(int highLevel){
        String src = "";
        for(int i = 0;i < highLevel;i++){
            src = src + "a";//each iteration creates a StringBuilder and a String
        }
//        System.out.println(src);

    }

    public void method2(int highLevel){
        //only one StringBuilder needs to be created
        StringBuilder src = new StringBuilder();
        for (int i = 0; i < highLevel; i++) {
            src.append("a");
        }
//        System.out.println(src);
    }
}



10.5. Use of intern()

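/**
 * Question:
 * How many objects does new String("ab") create? The bytecode shows that there are two.
 *     One object is created in the heap by the new keyword
 *     The other is the object "ab" in the string constant pool. Bytecode instruction: ldc
 *
 * Think about it:
 * What about new String("a") + new String("b")?
 *  Object 1: new StringBuilder()
 *  Object 2: new String("a")
 *  Object 3: "a" in the constant pool
 *  Object 4: new String("b")
 *  Object 5: "b" in the constant pool
 *
 *  A closer look at StringBuilder's toString():
 *      Object 6: new String("ab")
 *       Note that the toString() call does not create "ab" in the string constant pool
 *
 * @author shkstart
 * @create 2020  20:38
 */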
public class StringNewTest {
    public static void main(String[] args) {
//        String str = new String("ab");

        String str = new String("a") + new String("b");
    }
}

/**
 * How can we make sure that the variable s points to data in the string constant pool?
 * There are two ways:
 * Way 1: String s = "shkstart";//define it as a literal
 * Way 2: call intern()
 *         String s = new String("shkstart").intern();
 *         String s = new StringBuilder("shkstart").toString().intern();
 *
 * @author shkstart
 * @create 2020  18:49
 */
public class StringIntern {
    public static void main(String[] args) {

        String s = new String("1");
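        s.intern();//before this call, "1" already exists in the string constant pool
        String s2 = "1";
        System.out.println(s == s2);//jdk6:false   jdk7/8:false


        String s3 = new String("1") + new String("1");//s3 holds the address of new String("11")
        //After the previous line runs, does "11" exist in the string constant pool? Answer: no!
        s3.intern();//puts "11" into the string constant pool. How to read this: jdk6: a new object "11" is created, with a new address.
                                            //         jdk7: no new "11" is created in the pool; instead the pool stores a reference to the heap object new String("11")
        String s4 = "11";//s4 holds the address of the "11" placed in the pool by the previous line
        System.out.println(s3 == s4);//jdk6:false  jdk7/8:true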
    }
}

public class StringIntern1 {
    public static void main(String[] args) {
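        //Extension of the exercise in StringIntern.java:
        String s3 = new String("1") + new String("1");//new String("11")
        //After the previous line runs, does "11" exist in the string constant pool? Answer: no!
        String s4 = "11";//creates the object "11" in the string constant pool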
        String s5 = s3.intern();
        System.out.println(s3 == s4);//false
        System.out.println(s5 == s4);//true
    }
}

/**
 * @author shkstart
 * @create 2020  20:17
 */
public class StringExer1 {
    public static void main(String[] args) {
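        String s = new String("a") + new String("b");//new String("ab")
        //After the previous line runs, "ab" is not in the string constant pool

        String s2 = s.intern();//jdk6: creates a string "ab" in the string pool
                               //jdk8: does not create "ab" in the string pool; instead stores a reference to new String("ab") and returns that reference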

        System.out.println(s2 == "ab");//jdk6:true  jdk8:true
        System.out.println(s == "ab");//jdk6:false  jdk8:true
    }
}

/**
 *
 * @author shkstart
 * @create 2020  20:26
 */
public class StringExer2 {
    public static void main(String[] args) {
        String s1 = new String("ab");//after this runs, "ab" exists in the string constant pool               // false
//        String s1 = new String("a") + new String("b");//after this runs, "ab" is not in the string constant pool  // true
        s1.intern();
        String s2 = "ab";
        System.out.println(s1 == s2);
    }
}
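
The exercises above depend on which object ends up as the canonical pooled instance, and that differs between JDK 6 and JDK 7+. The following minimal sketch sticks to the two properties that hold on every version: a String created with new is never the pooled literal itself, and intern() always returns the pooled instance. The class and method names are illustrative only.

```java
final class InternSketch {
    // new String(...) always yields a distinct heap object, so it is never == to the literal.
    static boolean newIsDistinctFromLiteral() {
        String heap = new String("ab");
        return heap != "ab";
    }

    // intern() returns the canonical pooled instance, which is the object a literal refers to.
    static boolean internReturnsPooled() {
        String heap = new String("a") + new String("b");
        return heap.intern() == "ab";
    }

    public static void main(String[] args) {
        System.out.println(newIsDistinctFromLiteral()); // true
        System.out.println(internReturnsPooled());      // true
    }
}
```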

Intern efficiency test
Conclusion: For a large number of strings in the program, especially when there are many repeated strings, using intern() can save memory space.

/**
 * Testing intern() efficiency: space usage
 *
 * Conclusion: when a program holds a large number of strings, especially many duplicates, using intern() can save memory.
 *
 *
 * @author shkstart
 * @create 2020  21:17
 */
public class StringIntern2 {
    static final int MAX_COUNT = 1000 * 10000;
    static final String[] arr = new String[MAX_COUNT];

    public static void main(String[] args) {
        Integer[] data = new Integer[]{1,2,3,4,5,6,7,8,9,10};

        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_COUNT; i++) {
//            arr[i] = new String(String.valueOf(data[i % data.length]));
            arr[i] = new String(String.valueOf(data[i % data.length])).intern();

        }
        long end = System.currentTimeMillis();
        System.out.println("Elapsed time (ms): " + (end - start));

        try {
            Thread.sleep(1000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.gc();
    }
}
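
The memory saving in StringIntern2 is usually observed with a profiler. As a rough, self-contained alternative, the sketch below counts how many distinct String objects the array would retain with and without intern(). It uses a much smaller count than MAX_COUNT, and the class and method names are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class InternCountSketch {
    // Counts distinct object identities (not equal values) among the created strings.
    static int distinctObjects(int count, boolean useIntern) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        for (int i = 0; i < count; i++) {
            String s = new String(String.valueOf(i % 10));
            identities.add(useIntern ? s.intern() : s);
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(1000, false)); // 1000
        System.out.println(distinctObjects(1000, true));  // 10
    }
}
```

Without intern() every element is its own heap object; with intern() the array only ever references the ten pooled instances, and the temporary copies become garbage.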

10.6. String deduplication in G1


 

11. Concepts related to garbage collection

11.1. Understanding of System.gc()

 11.2. Memory overflow and memory leak

11.3. Stop The World

 
11.4. Concurrency and Parallelism

11.5. Description of safe point and safe area

11.6. Strong references, soft references, weak references, phantom references

strong reference

  

soft reference

SoftReference<User> userSoftRef = new SoftReference<>(new User(1, "songhk"));
//The line above is equivalent to the following three lines
User u1 = new User(1, "songhk");
SoftReference<User> userSoftRef = new SoftReference<>(u1);
u1 = null;//drop the strong reference

import java.lang.ref.SoftReference;

/**
 * Soft reference test: reclaimed when memory is insufficient
 *
 * @author shkstart
 * @create 2020  16:06
 */
public class SoftReferenceTest {
    public static class User {
        public User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public int id;
        public String name;

        @Override
        public String toString() {
            return "[id=" + id + ", name=" + name + "] ";
        }
    }

    public static void main(String[] args) {
        //Create the object and a soft reference to it
//        SoftReference<User> userSoftRef = new SoftReference<User>(new User(1, "songhk"));
        //The line above is equivalent to the following three lines
        User u1 = new User(1,"songhk");
        SoftReference<User> userSoftRef = new SoftReference<User>(u1);
        u1 = null;//drop the strong reference


        //Get the object back from the soft reference
        System.out.println(userSoftRef.get());

        System.gc();
        System.out.println("After GC:");
//        //Get the object from the soft reference after garbage collection
        System.out.println(userSoftRef.get());//heap memory is sufficient, so the softly reachable object is not reclaimed
//
        try {
            //Make the system consider memory scarce
//            byte[] b = new byte[1024 * 1024 * 7];
            byte[] b = new byte[1024 * 7168 - 635 * 1024];
        } catch (Throwable e) {
            e.printStackTrace();
        } finally {
            //Read from the soft reference again
            System.out.println(userSoftRef.get());//before reporting OOM, the garbage collector reclaims softly reachable objects
        }
        }
    }
}

weak reference

The difference between weak and soft references: a weakly reachable object is reclaimed as soon as a GC finds it, whereas a softly reachable object is reclaimed only when memory is insufficient. A GC occurring does not by itself mean that memory is insufficient; that is decided by a separate check in the collector.
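
A minimal sketch of this difference, using the standard java.lang.ref classes. System.gc() is only a request, so the last two lines of output are not guaranteed and depend on the JVM and heap settings. The only result that is certain is that a referent cannot be cleared while a strong reference to it still exists. The class and method names are illustrative.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class WeakVsSoftSketch {
    // While 'payload' is still strongly reachable, the weak reference cannot be cleared.
    static boolean survivesWhileStronglyHeld() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        System.gc();
        return weak.get() == payload;
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileStronglyHeld());

        // No strong references are kept to these referents.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        SoftReference<Object> soft = new SoftReference<>(new Object());
        System.gc(); // a request only; the JVM may ignore it

        // Typically the weak referent is gone and the soft one survives while memory
        // is plentiful, but neither outcome is guaranteed.
        System.out.println("weak cleared: " + (weak.get() == null));
        System.out.println("soft cleared: " + (soft.get() == null));
    }
}
```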

phantom reference

12. Garbage collection algorithm

1. Garbage collection overview


1.1 What is garbage

 

1.2 Why GC is needed


1.3 Java Garbage Collection Mechanism

2. Garbage collection algorithm


2.1 Garbage Marking Algorithm


2.1.1 Reference Counting

 

 
2.1.2 Reachability Analysis Algorithm


 Object finalization mechanism

The finalize() method is of little practical value and is hardly ever used in real work.

Tracing GC Roots with MAT and JProfiler

MAT
MAT is short for Memory Analyzer. It is a powerful Java heap analyzer used to find memory leaks and inspect memory consumption.
MAT is built on Eclipse and is a free performance analysis tool.

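import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Scanner;

public class GCRootsTest {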
    public static void main(String[] args) {
        List<Object> numList = new ArrayList<>();
        Date birth = new Date();

        for (int i = 0; i < 100; i++) {
            numList.add(String.valueOf(i));
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }

        System.out.println("Data added, please continue:");
        new Scanner(System.in).next();
        numList = null;
        birth = null;

        System.out.println("numList and birth set to null, please continue:");
        new Scanner(System.in).next();

        System.out.println("Done");
    }
}

Generate a heap dump file
jmap -dump:format=b,file=C:\Users\qinsshuoyu\Desktop\1.hprof.analysis\log\1.hprof 19420

Before numList and birth are set to null, the ArrayList and Date objects appear among the GC Roots.


After numList and birth are set to null, the ArrayList and Date objects no longer appear among the GC Roots.

The classification of GC Roots in MAT differs slightly from the one given earlier; it is mainly divided into system classes, JNI Global, threads, and monitors.

2.2 Mark-Sweep algorithm: principle, advantages and disadvantages

garbage removal phase


Note that the objects that get marked are the reachable ones, not the garbage, because only reachable objects can be reached by tracing from the GC Roots.
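
As a toy illustration of that point (not how HotSpot is implemented), the sketch below models the heap as a list of nodes: the mark phase flags everything reachable from the roots, and the sweep phase walks the whole heap and reclaims whatever was left unmarked. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    // Mark phase: flag every node reachable from the roots.
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue;
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    // Sweep phase: scan the whole heap, reclaim unmarked nodes, reset marks on survivors.
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(n -> {
            if (!n.marked) {
                reclaimed.add(n.name);
                return true;
            }
            n.marked = false;
            return false;
        });
        return reclaimed;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c); // c and d reference each other but are unreachable from the root
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [c, d]
    }
}
```

The cycle between c and d is reclaimed, which is exactly the case that reference counting cannot handle on its own.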

 

2.3 Copying Algorithm

Especially suitable for scenarios with many garbage objects and few survivors, for example the S0 and S1 areas of the young generation.


2.4 Mark-Compact Algorithm

 

 

2.5 Incremental collection algorithm, partition algorithm

13. Garbage collector

13.1. Garbage Collector Classification


 

 

 

Metrics for evaluating GC performance

 

 

 

 

13.2. Overview of the different garbage collectors

 

 

 

 

 

 

 

In JDK 8, UseParallelGC is the default and is paired with UseParallelOldGC.

Changing the JDK version used to run the program in IntelliJ IDEA

 

 

13.3. Serial and Serial Old Garbage Collectors: Serial Collection

 

 

 

13.4. ParNew Garbage Collector: Parallel Collection

 The bottom layer of ParNew shares a lot of code with Serial

 

 

 

13.5. Parallel and Parallel Old Garbage Collectors: Throughput Priority

 

 

 

 

 

13.6. CMS Collector: Low Latency

 

 

 

 

 

 

 

 

 

 

 

 

 

 

13.7. G1 Collector: Region-Based Generational Collection

 

 

Features (advantages) of the G1 collector

 

 

Disadvantages of the G1 collector:

 

Parameter settings for the G1 collector

 

 

 

 

 

 

Bump-the-pointer:
Within a single Region, data is stored by pointer bumping. In the diagram, "allocated" is the memory already in use, "top" is the position of the pointer, and "unallocated" is the memory not yet used.
TLAB:
Although the heap is partitioned into Regions, each thread still has its own TLAB space, which lets multiple threads allocate objects in parallel.
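
Pointer bumping can be sketched as a toy allocator over a byte array. This is only a model of the idea, not G1's actual Region code; the class, field, and method names are hypothetical.

```java
final class BumpRegionSketch {
    private final byte[] memory;
    // Next free offset: [0, top) is allocated, [top, memory.length) is unallocated.
    private int top;

    BumpRegionSketch(int capacity) {
        memory = new byte[capacity];
    }

    // Returns the start offset of the new block, or -1 if the region is too full.
    int allocate(int size) {
        if (size < 0 || size > memory.length - top) {
            return -1;
        }
        int start = top;
        top += size; // "bump" the pointer past the new block
        return start;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpRegionSketch region = new BumpRegionSketch(64);
        System.out.println(region.allocate(16)); // 0
        System.out.println(region.allocate(32)); // 16
        System.out.println(region.allocate(32)); // -1
        System.out.println(region.used());       // 48
    }
}
```

Because allocation is a single comparison and addition, giving each thread its own private buffer of this kind (the TLAB idea) removes the need to synchronize on a shared pointer.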

 G1 recycling process

 

 

Remembered Set

 

 

 

 

 

 

 

 

 

 

 


13.8. Garbage Collector Summary

 

 

 CMS is deprecated in JDK9.

 

 

/**
 *  -XX:+PrintCommandLineFlags
 *
 *  -XX:+UseSerialGC: the young generation uses Serial GC and the old generation uses Serial Old GC
 *
 *  -XX:+UseParNewGC: the young generation uses ParNew GC
 *
 *  -XX:+UseParallelGC: the young generation uses Parallel GC
 *  -XX:+UseParallelOldGC: the old generation uses Parallel Old GC
 *  Note: each of the two activates the other
 *
 *  -XX:+UseConcMarkSweepGC: the old generation uses CMS GC; the young generation then uses ParNew
 * @author shkstart
 * @create 2020  0:10
 */
public class GCUseTest {
    public static void main(String[] args) {
        ArrayList<byte[]> list = new ArrayList<>();
        while(true){
            byte[] arr = new byte[100];
            list.add(arr);
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
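
Besides -XX:+PrintCommandLineFlags, the collectors in use can also be listed from inside the program through the standard java.lang.management API. This is a minimal sketch; the names it prints depend on the JVM, its version, and the flags used, and the class name is illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectorsSketch {
    // Collects the names reported by the JVM's garbage collector MXBeans.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```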


13.9. GC log analysis

 

The following two are equivalent
-Xms60m -Xmx60m -XX:+PrintGC
-Xms60m -Xmx60m -verbose:gc

import java.util.ArrayList;

public class GCLogTest {
    public static void main(String[] args) {
        ArrayList<byte[]> list = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            byte[] arr = new byte[1024 * 100];//100KB
            list.add(arr);
        }
    }
}
[GC (Allocation Failure)  15289K->13782K(58880K), 0.0044617 secs]
[GC (Allocation Failure)  29081K->29184K(58880K), 0.0046940 secs]
[Full GC (Ergonomics)  29184K->28807K(58880K), 0.0102882 secs]
[Full GC (Ergonomics)  44125K->43710K(58880K), 0.0060180 secs]

GC, Full GC: the type of GC. GC collects only the young generation; Full GC covers the young generation, the old generation, and the permanent generation (Metaspace since JDK 8).
Allocation Failure: the cause of the GC.
15289K->13782K: heap usage before GC and after GC.
58880K: the total heap size.
0.0044617 secs: the duration of the GC.
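
To make the field layout concrete, here is a small sketch that pulls these fields out of a -XX:+PrintGC line with a regular expression. The pattern only covers the simple one-line format shown above, and the class and method names are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcLogLineSketch {
    // Groups: 1 = GC type, 2 = cause, 3 = heap used before (K), 4 = heap used after (K),
    //         5 = total heap (K), 6 = duration in seconds
    private static final Pattern LINE = Pattern.compile(
            "\\[(GC|Full GC) \\(([^)]+)\\)\\s+(\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    // Returns how many KB of heap the collection freed.
    static long freedKb(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized GC log line: " + line);
        }
        return Long.parseLong(m.group(3)) - Long.parseLong(m.group(4));
    }

    // Returns the cause of the collection, e.g. "Allocation Failure".
    static String cause(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized GC log line: " + line);
        }
        return m.group(2);
    }

    public static void main(String[] args) {
        String line = "[GC (Allocation Failure)  15289K->13782K(58880K), 0.0044617 secs]";
        System.out.println(cause(line));   // Allocation Failure
        System.out.println(freedKb(line)); // 1507
    }
}
```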

-Xms60m -Xmx60m -XX:+PrintGCDetails

[GC (Allocation Failure) [PSYoungGen: 15282K->2548K(17920K)] 15282K->13874K(58880K), 0.0417173 secs] [Times: user=0.00 sys=0.00, real=0.04 secs] 
[GC (Allocation Failure) [PSYoungGen: 17847K->2500K(17920K)] 29173K->29028K(58880K), 0.0073599 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[Full GC (Ergonomics) [PSYoungGen: 2500K->0K(17920K)] [ParOldGen: 26528K->28807K(40960K)] 29028K->28807K(58880K), [Metaspace: 3495K->3495K(1056768K)], 0.0135070 secs] [Times: user=0.02 sys=0.00, real=0.02 secs] 
[Full GC (Ergonomics) [PSYoungGen: 15318K->3000K(17920K)] [ParOldGen: 28807K->40709K(40960K)] 44126K->43710K(58880K), [Metaspace: 3496K->3496K(1056768K)], 0.0167859 secs] [Times: user=0.00 sys=0.02, real=0.02 secs] 
Heap
 PSYoungGen      total 17920K, used 10253K [0x00000000fec00000, 0x0000000100000000, 0x0000000100000000)
  eden space 15360K, 66% used [0x00000000fec00000,0x00000000ff603510,0x00000000ffb00000)
  from space 2560K, 0% used [0x00000000ffd80000,0x00000000ffd80000,0x0000000100000000)
  to   space 2560K, 0% used [0x00000000ffb00000,0x00000000ffb00000,0x00000000ffd80000)
 ParOldGen       total 40960K, used 40709K [0x00000000fc400000, 0x00000000fec00000, 0x00000000fec00000)
  object space 40960K, 99% used [0x00000000fc400000,0x00000000febc17f0,0x00000000fec00000)
 Metaspace       used 3502K, capacity 4498K, committed 4864K, reserved 1056768K
  class space    used 387K, capacity 390K, committed 512K, reserved 1048576K

GC, Full GC: the type of GC, as above
Allocation Failure: the cause of the GC
PSYoungGen: the young generation is collected by the Parallel Scavenge collector; shows its size before and after GC
ParOldGen: the old generation is collected by the Parallel Old collector; shows its size before and after GC
Metaspace: the size of the metadata area before and after GC; Metaspace was introduced in JDK 1.8 to replace the permanent generation
secs: the time spent on the GC
Times:
user: the CPU time the garbage collector spent in user mode
sys: the time spent in system calls or waiting for system events
real: the elapsed time from the start to the end of the GC, including time slices used by other processes

-Xms60m -Xmx60m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps

2022-12-16T23:07:42.172+0800: 0.260: [GC (Allocation Failure) [PSYoungGen: 15282K->2544K(17920K)] 15282K->13894K(58880K), 0.0420238 secs] [Times: user=0.00 sys=0.00, real=0.04 secs] 
2022-12-16T23:07:42.213+0800: 0.270: [GC (Allocation Failure) [PSYoungGen: 17843K->2536K(17920K)] 29193K->29112K(58880K), 0.0083888 secs] [Times: user=0.01 sys=0.13, real=0.02 secs] 
2022-12-16T23:07:42.229+0800: 0.278: [Full GC (Ergonomics) [PSYoungGen: 2536K->0K(17920K)] [ParOldGen: 26576K->28807K(40960K)] 29112K->28807K(58880K), [Metaspace: 3494K->3494K(1056768K)], 0.0184285 secs] [Times: user=0.09 sys=0.00, real=0.01 secs] 
2022-12-16T23:07:42.244+0800: 0.301: [Full GC (Ergonomics) [PSYoungGen: 15318K->3000K(17920K)] [ParOldGen: 28807K->40709K(40960K)] 44125K->43710K(58880K), [Metaspace: 3495K->3495K(1056768K)], 0.0148117 secs] [Times: user=0.05 sys=0.06, real=0.02 secs] 

2022-12-16T23:07:42.172+0800: printed by -XX:+PrintGCDateStamps; the date and time at which the entry was logged
0.260: printed by -XX:+PrintGCTimeStamps; the number of seconds since the JVM started

The actual elapsed time of a GC is the real value.

 

 

 

/**
 * Run under jdk7 and jdk8 respectively
 * -Xms20M -Xmx20M -Xmn10M -XX:+PrintGCDetails -XX:SurvivorRatio=8 -XX:+UseSerialGC
 * @author shkstart
 * @create 2020  0:12
 */
public class GCLogTest1 {
    private static final int _1MB = 1024 * 1024;

    public static void testAllocation() {
        byte[] allocation1, allocation2, allocation3, allocation4;
        allocation1 = new byte[2 * _1MB];
        allocation2 = new byte[2 * _1MB];
        allocation3 = new byte[2 * _1MB];
        allocation4 = new byte[4 * _1MB];
    }

    public static void main(String[] args) {
        testAllocation();
    }
}



JDK7

JDK8 

[GC (Allocation Failure) [DefNew: 6431K->695K(9216K), 0.0084041 secs] 6431K->4791K(19456K), 0.0085389 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
Heap
 def new generation   total 9216K, used 7161K [0x00000000fec00000, 0x00000000ff600000, 0x00000000ff600000)
  eden space 8192K,  78% used [0x00000000fec00000, 0x00000000ff2506b0, 0x00000000ff400000)
  from space 1024K,  67% used [0x00000000ff500000, 0x00000000ff5adf38, 0x00000000ff600000)
  to   space 1024K,   0% used [0x00000000ff400000, 0x00000000ff400000, 0x00000000ff500000)
 tenured generation   total 10240K, used 4096K [0x00000000ff600000, 0x0000000100000000, 0x0000000100000000)
   the space 10240K,  40% used [0x00000000ff600000, 0x00000000ffa00020, 0x00000000ffa00200, 0x0000000100000000)
 Metaspace       used 3500K, capacity 4498K, committed 4864K, reserved 1056768K
  class space    used 387K, capacity 390K, committed 512K, reserved 1048576K

The old generation is 40% used, i.e. 4 MB. In JDK 8, the 4 MB object that no longer fit in Eden was placed directly in the old generation. JDK 7 behaves differently: there the object is allocated in Eden.
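
The sizes in the log follow from the flags -Xms20M -Xmx20M -Xmn10M -XX:SurvivorRatio=8. The sketch below redoes that arithmetic. It ignores any alignment the JVM may apply, so treat it as a check of the numbers above rather than a general formula; the class and method names are illustrative.

```java
final class HeapLayoutSketch {
    // With -XX:SurvivorRatio=N the young generation is split Eden : S0 : S1 = N : 1 : 1.
    static long survivorKb(long youngKb, int ratio) {
        return youngKb / (ratio + 2);
    }

    static long edenKb(long youngKb, int ratio) {
        return survivorKb(youngKb, ratio) * ratio;
    }

    // Percentage of the old generation occupied by an object of the given size.
    static long oldUsedPercent(long objectKb, long oldKb) {
        return 100 * objectKb / oldKb;
    }

    public static void main(String[] args) {
        long heapKb = 20 * 1024;   // -Xms20M -Xmx20M
        long youngKb = 10 * 1024;  // -Xmn10M
        int ratio = 8;             // -XX:SurvivorRatio=8
        long oldKb = heapKb - youngKb;

        System.out.println(edenKb(youngKb, ratio));       // 8192
        System.out.println(survivorKb(youngKb, ratio));   // 1024
        // The log reports Eden plus one survivor space as the young generation total.
        System.out.println(edenKb(youngKb, ratio) + survivorKb(youngKb, ratio)); // 9216
        System.out.println(oldKb);                         // 10240
        System.out.println(oldUsedPercent(4 * 1024, oldKb)); // 40
    }
}
```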

 

GCViewer is a runnable jar that can be started by double-clicking it, but its window cannot be resized, the resolution is a poor fit, and it is awkward to use.
GCEasy: https://gceasy.io

-Xms60m -Xmx60m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -Xloggc:./logs/gc.log

public class GCLogTest {
    public static void main(String[] args) {
        ArrayList<byte[]> list = new ArrayList<>();

        for (int i = 0; i < 500; i++) {
            byte[] arr = new byte[1024 * 100];//100KB
            list.add(arr);
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

./ refers to the current directory, i.e. the project root; the log is written to gc.log in the logs folder under it. Copy the file to the desktop, then upload it to GCeasy for analysis.


13.10. New Developments in the Garbage Collector

 

 

 

 

 

 

 

 

Note: This article is a note made by studying the full set of JVM tutorials (detailed explanation of the java virtual machine) by Song Hongkang of Shang Silicon Valley.


Origin blog.csdn.net/chuixue24/article/details/130511497