On the JVM constant pool

Copyright notice: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reproducing it, please attach the original source link and this notice.
Original link: https://blog.csdn.net/u013630349/article/details/102768581

Note that this article is based on the second edition of "In-Depth Understanding of the Java Virtual Machine". It assumes you already know the basics: the JVM runtime data areas, the class file structure, the class loading process, and so on. Where relevant, the text also briefly reviews and summarizes that material.

I. The constant pools in the JVM

  They are: the class file constant pool, the runtime constant pool, the global string constant pool, and the object pools of the primitive wrapper classes.

1. Class file constant pool

  Readers of Chapter 6 of "In-Depth Understanding of the Java Virtual Machine" will know that a class file is a binary stream organized in units of 8-bit bytes. During compilation, the .java files we write are compiled into .class files, binary data stored on disk, and each of them contains a class file constant pool.
  The class file has its own constant pool (not the runtime constant pool), and its contents are determined at compile time. The JVM specification defines the class file structure strictly; only class files that conform to the specification are recognized and loaded by the JVM.
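To make the on-disk layout a little more tangible, here is a minimal sketch that reads the fixed fields in front of the constant pool: the magic number, the version, and constant_pool_count. The class name ClassFileHeader and its read method are made up for this illustration, and the sketch assumes the system class loader can hand out java/lang/Object.class as a resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration: reads the fixed fields that precede the constant pool.
final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // internalName uses '/' separators, for example "java/lang/Object".
    static ClassFileHeader read(String internalName) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(internalName + ".class")) {
            if (raw == null) {
                throw new IllegalArgumentException("class file not found: " + internalName);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // u4 magic, 0xCAFEBABE for every class file
            int minor = in.readUnsignedShort();  // u2 minor_version
            int major = in.readUnsignedShort();  // u2 major_version (52 means Java 8)
            int count = in.readUnsignedShort();  // u2 constant_pool_count (entries are #1 .. count-1)
            return new ClassFileHeader(magic, minor, major, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader header = read("java/lang/Object");
        System.out.printf("magic=0x%08X major=%d constant_pool_count=%d%n",
                header.magic, header.majorVersion, header.constantPoolCount);
    }
}
```

The constant pool entries themselves follow directly after constant_pool_count; javap -v decodes them in a much more readable way than doing it by hand.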

To make this concrete, let us first write a very simple class:

class JavaBean{
    private int value = 1;
    public String s = "abc";
    public final static int f = 0x101;

    public void setValue(int v){
        final int temp = 3;
        this.value = temp + v;
    }

    public int getValue(){
        return value;
    }
}

After compiling it with the javac command, use javap -v to view the compiled file:

class JavaBasicKnowledge.JavaBean
  minor version: 0
  major version: 52
  flags: ACC_SUPER
Constant pool:
   #1 = Methodref          #6.#29         // java/lang/Object."<init>":()V
   #2 = Fieldref           #5.#30         // JavaBasicKnowledge/JavaBean.value:I
   #3 = String             #31            // abc
   #4 = Fieldref           #5.#32         // JavaBasicKnowledge/JavaBean.s:Ljava/lang/String;
   #5 = Class              #33            // JavaBasicKnowledge/JavaBean
   #6 = Class              #34            // java/lang/Object
   #7 = Utf8               value
   #8 = Utf8               I
   #9 = Utf8               s
  #10 = Utf8               Ljava/lang/String;
  #11 = Utf8               f
  #12 = Utf8               ConstantValue
  #13 = Integer            257
  #14 = Utf8               <init>
  #15 = Utf8               ()V
  #16 = Utf8               Code
  #17 = Utf8               LineNumberTable
  #18 = Utf8               LocalVariableTable
  #19 = Utf8               this
  #20 = Utf8               LJavaBasicKnowledge/JavaBean;
  #21 = Utf8               setValue
  #22 = Utf8               (I)V
  #23 = Utf8               v
  #24 = Utf8               temp
  #25 = Utf8               getValue
  #26 = Utf8               ()I
  #27 = Utf8               SourceFile
  #28 = Utf8               StringConstantPool.java
  #29 = NameAndType        #14:#15        // "<init>":()V
  #30 = NameAndType        #7:#8          // value:I
  #31 = Utf8               abc
  #32 = NameAndType        #9:#10         // s:Ljava/lang/String;
  #33 = Utf8               JavaBasicKnowledge/JavaBean
  #34 = Utf8               java/lang/Object

With this command we can see the version number, the class file constant pool, and the compiled bytecode instructions (omitted here for reasons of space). We can now use this class file as a reference for the explanation:

  Since this is a constant pool, it must store "constants". So what does "constant" mean here? The class file constant pool stores two main kinds of constants: literals and symbolic references:

1). Literals

Literals are close to the Java-language notion of a constant. They include:

  • Text strings, that is, the "abc" in the declaration we often write: public String s = "abc";
 #9 = Utf8               s
 #3 = String             #31            // abc
 #31 = Utf8              abc
  • Values of variables declared final, including static variables, instance variables, and local variables
 #11 = Utf8               f
 #12 = Utf8               ConstantValue
 #13 = Integer            257

  A note here: when we say a literal exists in the constant pool, we mean its data value, that is, abc and 0x101 (257). Looking at the constant pool above, these values are indeed present.
  For primitive fields that are not final (and likewise for method-local variables), such as private int value = 1; above, the constant pool keeps only the field descriptor I and the field name value; the literal value itself does not appear in the constant pool.

2). Symbolic references

Symbolic references are mainly a concept from compiler theory. They include the following three kinds of constants:

  • Fully qualified names of classes and interfaces, that is, the java/lang/String inside Ljava/lang/String;, obtained by replacing "." in the class name with "/". They are mainly resolved at run time to obtain a direct reference to the class, for example:
 #5 = Class              #33            // JavaBasicKnowledge/JavaBean
 #33 = Utf8               JavaBasicKnowledge/JavaBean
  • Names and descriptors of fields. A field is a variable declared by a class or interface, including class-level (static) variables and instance-level variables
 #2 = Fieldref           #5.#30         // JavaBasicKnowledge/JavaBean.value:I
 #5 = Class              #33            // JavaBasicKnowledge/JavaBean
 #30 = NameAndType       #7:#8          // value:I

 #7 = Utf8               value
 #8 = Utf8               I

 // These two are local variables; only their names are kept
 #23 = Utf8               v
 #24 = Utf8               temp

As you can see, the names of method-local variables do appear in the class file constant pool, but only their names; the field table, which lies outside the constant pool, does not include local variables.

  • Names and descriptors of methods. A method descriptor resembles the "method signature" used when dynamically registering JNI methods, that is, parameter types + return type:
  #21 = Utf8               setValue
  #22 = Utf8               (I)V

  #25 = Utf8               getValue
  #26 = Utf8               ()I
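The descriptor strings above can be reproduced from Java itself. The following is a small sketch using java.lang.invoke.MethodType from the standard library; the class DescriptorDemo and its descriptor helper are names invented for this illustration.

```java
import java.lang.invoke.MethodType;

// Illustration only: prints the JVM descriptors for a few method shapes.
final class DescriptorDemo {
    // Builds the JVM method descriptor for a return type and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class, int.class));      // (I)V  like setValue
        System.out.println(descriptor(int.class));                  // ()I   like getValue
        System.out.println(descriptor(void.class, String[].class)); // ([Ljava/lang/String;)V like main
    }
}
```

Parameter types are listed inside the parentheses and the return type follows them, which is exactly the shape of the (I)V and ()I entries in the constant pool dump.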

2. Runtime constant pool

  The runtime constant pool is part of the method area, and it is also globally shared. As we know, before the JVM can execute a class, the class must go through loading, linking (verification, preparation, resolution), and initialization. In the first step, the loading phase, the virtual machine has to do three things:

  • Obtain the binary byte stream of the class by its fully qualified name
  • Convert the static storage structure represented by that byte stream into the runtime data structures of the method area
  • Generate a java.lang.Class object in memory that represents the class and serves as the entry point for accessing the class's data in the method area

  Note that a Class object is different from an ordinary instance: the Class object is generated when the class is loaded, whereas an ordinary instance is usually created by calling new.
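The difference between the single Class object and ordinary instances is easy to observe. A minimal sketch follows; ClassObjectDemo and sharesClassObject are illustrative names, not anything from the book.

```java
// Illustration only: two instances created with new are distinct objects,
// but both refer to the one Class object produced when the class was loaded.
final class ClassObjectDemo {
    static boolean sharesClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        StringBuilder first = new StringBuilder();
        StringBuilder second = new StringBuilder();
        System.out.println(first == second);                          // false: two instances
        System.out.println(sharesClassObject(first, second));         // true: one Class object
        System.out.println(first.getClass() == StringBuilder.class);  // true
    }
}
```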

  The second step above, converting the static storage structure represented by the byte stream into the runtime data structures of the method area, includes moving the class file constant pool into the runtime constant pool. Strictly speaking, the JVM specification gives each class or interface its own runtime constant pool, but all of them live in the shared method area ( http://blog.csdn.net/fan2012huan/article/details/52759614 ), and identical string constants from multiple class files end up as a single shared string at run time, which is an optimization.

  The role of the runtime constant pool is to store the symbolic information from the class file constant pool. It holds the symbolic references described in the class file, and during the resolution phase of class loading these symbolic references are translated into direct references (pointers straight to the target), which are also stored in the runtime constant pool.

  Compared with the class file constant pool, an important feature of the runtime constant pool is that it is dynamic. The Java specification does not require constants to be produced only at compile time: not everything in the runtime constant pool comes from the class file constant pool, which is not its only source of data. Code can also generate constants at run time and place them in the pool. The most commonly used example of this is String.intern() (discussed in detail below).

II. Global string constant pool

  The string constant pool is treated separately for two reasons:

  • Unlike the primitive types, String is a final class. Its literals are stored in the class file constant pool, but their runtime behavior differs from that of ordinary constants
  • Starting with JDK 1.7, the string constant pool and class references were moved to the Java heap (separated from the runtime constant pool), so String behaves differently across versions

1. Two ways to create a string object in Java

  Most readers will know these well. There are generally two:

  • String s0 = "hellow";
  • String s1 = new String("hellow");

  We have already seen the first. Declared this way, the literal hellow is fixed at compile time and goes directly into the class file constant pool. At run time the global string constant pool keeps a reference to it; in the end a "hellow" object is still created on the heap, as discussed later.
  The second way uses new String(), that is, it calls the constructor of the String class. The new instruction creates an instance of a class and initializes it, so this string object is determined at run time, and it is created in heap memory.
  Therefore System.out.println(s0 == s1); is bound to print false: the == operator compares the addresses held by the two references. Both s0 and s1 refer to objects on the heap, but certainly not to the same address.
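The comparison just described can be run directly. This is a minimal sketch; the class name LiteralVsNewDemo and the helper sameObject are invented for the example.

```java
// Illustration only: a literal and a new String with equal content are different objects.
final class LiteralVsNewDemo {
    // Reference comparison, the same thing == does on two String variables.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s0 = "hellow";
        String s1 = new String("hellow");
        System.out.println(sameObject(s0, s1)); // false: two distinct heap objects
        System.out.println(s0.equals(s1));      // true: same character content
    }
}
```

This is also why string content should be compared with equals rather than ==.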

Let us look at a few common quiz questions:

String s1 = "Hello";
String s2 = "Hello";
String s3 = "Hel" + "lo";
String s4 = "Hel" + new String("lo");
String s5 = new String("Hello");
String s7 = "H";
String s8 = "ello";
String s9 = s7 + s8;

System.out.println(s1 == s2);  // true
System.out.println(s1 == s3);  // true
System.out.println(s1 == s4);  // false
System.out.println(s1 == s9);  // false

1) s1 == s2

  After the constant pool discussion in the first part, this one should be easy to understand. The literal "Hello" enters the runtime constant pool at run time (the string constant pool was part of it before JDK 1.7), and only one copy of a given literal is kept. Every reference to that literal points to the same string, so the references naturally hold the same address.

2) s1 == s3

  This one is about how the compiler optimizes "+" on String. Although s3 looks like a dynamically concatenated string, every part of the concatenation is a known literal, so the compiler folds the concatenation at compile time. String s3 = "Hel" + "lo"; becomes String s3 = "Hello"; in the class file, and therefore s1 == s3 holds.

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=3, args_size=1
         0: ldc           #2                  // String Hello
         2: astore_1
         3: ldc           #2                  // String Hello
         5: astore_2
         6: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
         9: aload_1
        10: aload_2
        11: if_acmpne     18
        14: iconst_1
        15: goto          19
        18: iconst_0
        19: invokevirtual #4                  // Method java/io/PrintStream.println:(Z)V
        22: return

  Looking at the compiled method, the ldc instruction pushes onto the operand stack twice, both times "Hello"; there is no "Hel" or "lo". Both pushes refer to the same constant pool entry, #2, so the constant pool contains only one "Hello" literal.
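The same folding applies whenever every operand is a compile-time constant, which under the Java Language Specification includes final variables initialized with literals. The sketch below contrasts that with ordinary variables; ConstantFoldingDemo and its two method names are hypothetical.

```java
// Illustration only: compile-time folding versus run-time concatenation.
final class ConstantFoldingDemo {
    static boolean foldedAtCompileTime() {
        final String a = "Hel";   // final + literal initializer: a constant variable
        final String b = "lo";
        String joined = a + b;    // constant expression, compiled as the literal "Hello"
        return joined == "Hello";
    }

    static boolean concatenatedAtRuntime() {
        String a = "Hel";         // not final, so a + b is evaluated at run time
        String b = "lo";
        String joined = a + b;    // produces a new String object
        return joined == "Hello";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());   // true
        System.out.println(concatenatedAtRuntime()); // false
    }
}
```

The second method corresponds to the s7 + s8 case from the quiz: the operands are variables, so nothing can be folded.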

3) s1 != s4

  This is not hard to understand either, but let us look at the compiled bytecode:

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=4, locals=3, args_size=1
         0: ldc           #2                  // String Hello
         2: astore_1
         3: new           #3                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
        10: ldc           #5                  // String Hel
        12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        15: new           #7                  // class java/lang/String
        18: dup
        19: ldc           #8                  // String lo
        21: invokespecial #9                  // Method java/lang/String."<init>":(Ljava/lang/String;)V
        24: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        27: invokevirtual #10                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        30: astore_2
        31: getstatic     #11                 // Field java/lang/System.out:Ljava/io/PrintStream;
        34: aload_1
        35: aload_2
        36: if_acmpne     43
        39: iconst_1
        40: goto          44
        43: iconst_0
        44: invokevirtual #12                 // Method java/io/PrintStream.println:(Z)V
        47: return

  We will not explain every instruction. We can see that "String Hel" and "String lo" do appear here. As explained above, new String("lo") creates a new String object on the heap, and the literal "Hel" leads to another heap object. These two separate heap objects are joined by StringBuilder.append, and finally StringBuilder.toString produces the result (the final value is "Hello"). All of this can be read from the bytecode above. Now look at the StringBuilder.toString method:

@Override
public String toString() {
    // Create a copy, don't share the array
    return new String(value, 0, count);
}

  As you can see, the result of the concatenation is a newly created String object. In other words, s4 points to the String object produced by StringBuilder, while s1 points to a different object, so the addresses of the two objects are of course different.
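That toString hands out a new object each time can be checked with a few lines. This is only a sketch; ToStringCopyDemo and returnsFreshObject are made-up names.

```java
// Illustration only: every call to StringBuilder.toString() yields a new String object.
final class ToStringCopyDemo {
    static boolean returnsFreshObject(StringBuilder builder) {
        String first = builder.toString();
        String second = builder.toString();
        // Different objects, identical content.
        return first != second && first.equals(second);
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder("Hel").append("lo");
        System.out.println(returnsFreshObject(builder));  // true
        System.out.println(builder.toString() == "Hello"); // false: not the interned literal
    }
}
```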

4) s1 != s9

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=5, args_size=1
         0: ldc           #2                  // String Hello
         2: astore_1
         3: ldc           #3                  // String H
         5: astore_2
         6: ldc           #4                  // String ello
         8: astore_3
         9: new           #5                  // class java/lang/StringBuilder
        12: dup
        13: invokespecial #6                  // Method java/lang/StringBuilder."<init>":()V
        16: aload_2
        17: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        20: aload_3
        21: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        24: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        27: astore        4
        29: getstatic     #9                  // Field java/lang/System.out:Ljava/io/PrintStream;
        32: aload_1
        33: aload         4
        35: if_acmpne     42
        38: iconst_1
        39: goto          43
        42: iconst_0
        43: invokevirtual #10                 // Method java/io/PrintStream.println:(Z)V
        46: return

  Judging from the bytecode, this case is the same as case 3): a new object is produced by StringBuilder.append followed by toString. Where exactly that object is allocated is not something the bytecode tells us.

2. Does String s1 = "Hello" actually create an object on the heap?

[Figure: JVM run-time data area.png]

  The figure above shows the structure of the JVM run-time data areas as we commonly understand it, but it is incomplete. To illustrate the concept of the global string constant pool, we need the following figure:

[Figure: JVM run-time data area Plus.png]

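In this figure you can see that the method area is actually contained in a region called "non-heap". Roughly speaking, non-heap contains the permanent generation, and the permanent generation in turn contains the method area and the string constant pool. Let us zoom in so that it is easier to see:

[Figure: Non-heap.png]

  The Interned String area here is the globally shared "string constant pool (String Pool)", which is not the same concept as the runtime constant pool. When we declare String s1 = "Hello"; in code, the information in the class's class file is parsed into the method area in memory during class loading.
  Most of the data in the class file constant pool is loaded into the runtime constant pool, including String literals. At the same time, a reference to the "Hello" string is stored in the string constant pool, which is also in the non-heap region, while the "Hello" object itself is created on the Java heap like every other object.

  When the main thread is about to create s1, the virtual machine first looks in the string pool for a String that equals("Hello"). If one is found, the reference held in the string pool is copied to s1. If none is found, a new object is created on the heap, its reference is interned in the string pool, and that reference is then assigned to s1.
  When strings are created by assigning literals, no matter how many times this happens, as long as the string values are the same they all point to the same object on the heap.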
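Because the string pool is global, this holds even when the equal literals come from different classes. Here is a small sketch with two nested classes; all names in it are invented for the illustration.

```java
// Illustration only: equal literals in different classes resolve to one String object.
final class SharedLiteralDemo {
    static final class First {
        static String greeting = "Hello";
    }

    static final class Second {
        static String greeting = "Hello";
    }

    static boolean sameInstance() {
        // Each nested class has its own class file and its own constant pool entry,
        // yet both fields end up referring to the same interned object.
        return First.greeting == Second.greeting;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // true
    }
}
```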

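The nature of the string constant pool

  This is the right moment to introduce the concept properly: the string constant pool is a table of references to string instances maintained by the JVM. In HotSpot VM it is a global table called StringTable. What the pool maintains are references to string instances, and the underlying C++ implementation is a hash table. The string instances these references point to are called "interned strings", or, as people usually say, "strings that have entered the string constant pool".
  To stress it once more: the runtime constant pool is in the method area (non-heap), while from JDK 1.7 on the string constant pool was moved into the heap, so the two are simply not the same concept.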

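3. When does a String "literal" enter the string constant pool?

The conclusion first: when the ldc instruction is executed. This instruction pushes an int, float, or String constant from the constant pool onto the top of the operand stack.

Among the constant pool entry types that the JVM specification defines for class files, two matter here (this part is best read alongside page 168 of the book):

  • CONSTANT_Utf8_info
  • CONSTANT_String_info

  In HotSpot VM's runtime constant pool, CONSTANT_Utf8_info can represent the names of methods, fields, and so on from the class file. Its structure is as follows:

[Figure: CONSTANT_Utf8_info structure.png]

First comes a 1-byte tag, which marks the entry as a CONSTANT_Utf8_info constant. Next is a 2-byte length, the number of bytes stored. Last is a byte array with 1-byte elements, holding the actual string data of length bytes. Note that "one byte" only describes the element type of the array; the array itself can of course be far longer than one byte. Because CONSTANT_Utf8_info can only express the length as a u2, that is, two bytes, the maximum length is the largest value two bytes can hold, 65535 (this has nothing to do with the 65535 method limit of Android dex bytecode, although the reasoning is the same).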
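The tag, length, bytes layout can be reproduced with the standard library, because DataOutputStream.writeUTF happens to write the same shape: a two-byte length followed by modified UTF-8 bytes. The sketch below is an illustration under that assumption, and Utf8InfoDemo with its helpers is a made-up name, not part of any JVM tooling.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Illustration only: encodes a string the way a CONSTANT_Utf8_info entry is laid out.
final class Utf8InfoDemo {
    static final int CONSTANT_UTF8_TAG = 1;

    static byte[] encode(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(CONSTANT_UTF8_TAG); // u1 tag
            out.writeUTF(text);               // u2 length, then the encoded bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // True if the encoded form fits into the u2 length field (at most 65535 bytes).
    static boolean fits(String text) {
        try {
            encode(text);
            return true;
        } catch (UncheckedIOException e) {
            if (e.getCause() instanceof UTFDataFormatException) {
                return false;
            }
            throw e;
        }
    }

    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode("abc"))); // [1, 0, 3, 97, 98, 99]
        System.out.println(fits(repeat('a', 65535)));       // true
        System.out.println(fits(repeat('a', 65536)));       // false
    }
}
```

The limit is counted in encoded bytes, not characters, so a string with many non-ASCII characters reaches it sooner.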

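  The latter, CONSTANT_String_info, is the entry type for String constants. It does not hold the content of the String constant directly; it holds only an index, and the constant pool entry at that index must be a CONSTANT_Utf8 constant, which is where the string content really lives.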


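  CONSTANT_Utf8 entries are all created during class loading, whereas CONSTANT_String entries are resolved lazily: an entry is resolved only when an ldc instruction referring to it is executed for the first time. While unresolved, HotSpot VM calls its type JVM_CONSTANT_UnresolvedString, and its content is just an index, as in the class file. After resolution, the entry's constant type becomes the final JVM_CONSTANT_String.
  In other words, as far as the HotSpot VM implementation is concerned, when a class is loaded its string literals enter the runtime constant pool of that class but not the global string constant pool (there is no corresponding reference in StringTable and no corresponding object on the heap). Executing an ldc instruction is what triggers the lazy resolution:
  The execution semantics of the ldc bytecode here are: look up the entry at the given index in the current class's runtime constant pool (in HotSpot VM, ConstantPool + ConstantPoolCache); if the entry is not yet resolved, resolve it; then return the resolved content.
  For a String constant, if resolution finds that StringTable already holds a reference to a java.lang.String with matching content, that reference is returned directly. Otherwise, a String object with that content is created on the Java heap, its reference is recorded in StringTable, and the reference is returned.

  So whether an ldc instruction needs to create a new String instance depends entirely on whether StringTable already records a reference to a String with the same content when that ldc instruction is executed for the first time.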
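The lookup-or-create rule can be modelled in a few lines of Java. This is a toy model only: HotSpot's real StringTable is a native C++ structure, and the class ToyStringTable with its resolve method is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the lookup-or-create behaviour described for StringTable.
final class ToyStringTable {
    // Maps string content to the one recorded reference for that content.
    private final Map<String, String> table = new HashMap<>();

    // Mimics resolving a string constant whose content is given by utf8Content.
    String resolve(String utf8Content) {
        String existing = table.get(utf8Content);
        if (existing != null) {
            return existing;                      // hit: reuse the recorded reference
        }
        String created = new String(utf8Content); // miss: stands in for allocating on the heap
        table.put(created, created);              // record the reference
        return created;
    }

    int size() {
        return table.size();
    }

    public static void main(String[] args) {
        ToyStringTable stringTable = new ToyStringTable();
        String first = stringTable.resolve("Hello");
        String second = stringTable.resolve(new String("Hello"));
        System.out.println(first == second);    // true: the second lookup found the first entry
        System.out.println(stringTable.size()); // 1
    }
}
```

Only the first resolve call creates an object; every later call with equal content returns the recorded reference.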

4. Using String.intern()

The official definition of String.intern():

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

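In practice, the content of the String is looked up in StringTable. If an entry exists, its reference is returned; if not, a "reference" to this object is stored in StringTable.

Two examples from "In-Depth Understanding of the Java Virtual Machine" help explain this. The first is on page 57: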

public class RuntimeConstantPoolOOM{
    public static void main(String[] args) {

         String str1 = new StringBuilder("计算机").append("软件").toString();
         System.out.println(str1.intern() == str1);

         String str2 = new StringBuilder("ja").append("va").toString();
         System.out.println(str2.intern() == str2);

    }
}

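On JDK 6 the code above prints false, false; on JDK 7 and later it prints true, false.

  First we use StringBuilder to create a String object with the content "计算机软件" ("computer software"). Because it is built with new, it is created at run time, and no such string existed in the JVM before.
  On JDK 6, intern() copies the first-encountered string instance into the permanent generation and returns a reference to that copy. Starting with JDK 1.7, intern() no longer copies the string instance: it first looks in the constant pool for a reference to an equal string and returns it if found; otherwise it records a reference to the first-encountered instance in the pool and returns that.

  Therefore, on 1.7 there is only one instance of the string "计算机软件", and it lives on the Java heap. After String str1 = new StringBuilder("计算机").append("软件").toString(); has executed, a string object exists on the heap. When str1.intern() is called, no equal string is in the pool yet, so the global string constant pool records a reference to this very object and returns it, which of course satisfies str1.intern() == str1, since both are the same object. For str2, the JVM already holds a "java" string, so new StringBuilder("ja").append("va").toString() creates a new "java" string object, while intern() returns a reference to the first-encountered instance, the "java" string that already existed in the system. The comparison therefore yields false.

  On JDK 6, str1 and str2 point to newly created objects on the Java heap, so they hold Java heap addresses. Calling intern() looks for the string in the constant pool and, if it is not there, puts a copy into the pool and returns that. At that point str1.intern() and str2.intern() hold addresses inside the constant pool. On JDK 6 the constant pool is in the permanent generation, separate from the heap, so the addresses of str1.intern() and str1 naturally differ.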
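The JDK 7+ behavior can be checked without relying on any particular string being absent from the pool. The sketch below builds a string from the current nano time, on the assumption that such content has never been interned before; InternDemo and its helpers are illustrative names, and the first result reflects HotSpot on JDK 7 or later.

```java
// Illustration only: intern() on a fresh string versus a string whose literal already exists.
final class InternDemo {
    // Builds a string at run time whose content is very unlikely to be in the pool already.
    static String freshString() {
        return new StringBuilder("intern-demo-").append(System.nanoTime()).toString();
    }

    static boolean internReturnsSameObject(String s) {
        return s.intern() == s;
    }

    public static void main(String[] args) {
        String fresh = freshString();
        // JDK 7+: the pool records a reference to this very object, so this prints true.
        System.out.println(internReturnsSameObject(fresh));

        String copy = new String("Hello");            // the literal "Hello" is interned first
        System.out.println(copy.intern() == copy);    // false: the pool already holds the literal
        System.out.println(copy.intern() == "Hello"); // true: intern() returns the pooled literal
    }
}
```

The second half mirrors the "java" case: when an equal string is already in the pool, intern() returns that earlier instance rather than the caller's object.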

The second example is on page 56:

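import java.util.ArrayList;
import java.util.List;

public class Test2 {
    public static void main(String[] args) {
        /**
         * First limit the permanent generation's initial and maximum size (10M each)
         * VM args: -XX:PermSize=10M -XX:MaxPermSize=10M
         */

        List<String> list  = new ArrayList<String>();

        // Infinite loop: the list keeps references so the strings are not collected by GC,
        // and intern() makes sure they are added to the constant pool
        int i = 0;
        while (true) {
            // Runs forever; at most it converts the whole int range to strings and puts them in the constant pool
            list.add(String.valueOf(i++).intern());
        }
    }
}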

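On JDK 6 the code above runs out of PermGen memory; on JDK 7 or higher it does not.

  On JDK 6 the constant pool lives in the permanent generation (which is rarely garbage collected). Once the permanent generation size is capped, the endless while loop is bound to fill Perm and cause an out-of-memory error. On JDK 7 the constant pool was moved into the heap (the Java heap), so capping the permanent generation no longer affects the constant pool. The interned strings now consume Java heap instead, so the loop does not fail with a PermGen error; how long it can keep running depends on the heap size.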

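III. Wrapper classes of Java primitive types and their constant pools

  Most of the wrapper classes for Java's primitive types implement the constant pool technique: Byte, Short, Integer, Long, Character, and Boolean. The wrappers of the two floating-point types do not. In addition, the five integral wrapper classes above use the object pool only for values from -128 to 127 (0 to 127 for Character); objects of these classes outside that range are not created or managed by the pool.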

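public class StringConstantPool{

    public static void main(String[] args){
        // Objects of the 5 integral wrapper classes Byte, Short, Integer, Long, Character
        // can come from the constant pool when the value is at most 127
        Integer i1=127;
        Integer i2=127;
        System.out.println(i1==i2);// prints true

        // For values greater than 127, objects are not taken from the constant pool
        Integer i3=128;
        Integer i4=128;
        System.out.println(i3==i4);// prints false
        // The Boolean class also implements the constant pool technique

        Boolean bool1=true;
        Boolean bool2=true;
        System.out.println(bool1==bool2);// prints true

        // The floating-point wrapper classes do not implement the constant pool technique
        Double d1=1.0;
        Double d2=1.0;
        System.out.println(d1==d2); // prints false

    }
}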

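  Before JDK 5.0 it was not allowed to assign a primitive value directly to its wrapper class, as in Integer i = 5;. JDK 5.0 supports this form because the compiler automatically converts the code above into Integer i = Integer.valueOf(5);. This is Java's autoboxing. JDK 5.0 also provides automatic unboxing: Integer i = 5; int j = i;

  Finally, note that what these pools cache are wrapper class objects, not primitive values!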
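Since autoboxing compiles to valueOf, the caches can be probed by calling valueOf directly. The following is a minimal sketch; BoxCacheDemo and its helper methods are invented names, and the result for 128 assumes the default cache size (the upper bound of the Integer cache can be raised with a JVM option).

```java
// Illustration only: valueOf returns cached wrapper objects for small values.
final class BoxCacheDemo {
    static boolean cachedInteger(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    static boolean cachedLong(long value) {
        return Long.valueOf(value) == Long.valueOf(value);
    }

    static boolean cachedCharacter(char value) {
        return Character.valueOf(value) == Character.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(cachedInteger(127));          // true: inside the cache range
        System.out.println(cachedInteger(-128));         // true: the range starts at -128
        System.out.println(cachedInteger(128));          // normally false with default settings
        System.out.println(cachedLong(127L));            // true
        System.out.println(cachedCharacter((char) 127)); // true
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE); // true
    }
}
```

As with strings, wrapper values should be compared with equals; == only happens to work inside the cached range.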


 
