Interviewer: What is String.intern() good for, and how does it relate to the constant pool? A thorough walkthrough

Author: GuoMell
Source: blog.csdn.net/gcoder_/article/details/106644312

0. Background

Java has eight primitive types plus one special type, String. To make them faster to use and cheaper in memory at run time, the platform gives each of them a constant pool, which works like a cache provided at the Java system level.

The constant pools for the eight primitive types are managed by the system. The constant pool for String is special.

There are two main ways to put a string into it:

  • A String declared directly with double quotes is stored in the constant pool.
  • A String that was not declared with double quotes can be added with the intern method of String. intern looks up the current string in the string constant pool and, if it is not there, puts it into the pool.
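
The two routes above can be sketched as follows. This is a minimal illustration, and the class and method names are assumptions made for the example, not part of the original article. It compares references with ==, which is what reveals whether two variables share one pooled object.

```java
// Minimal sketch; InternBasics and sameObject are illustrative names only.
class InternBasics {
    /** Returns true when both references point at the same String object. */
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";           // double-quoted: taken from the string pool
        String copy = new String("hello");  // explicit new: always a distinct heap object
        String pooled = copy.intern();      // looks up (or adds) the pooled instance

        System.out.println(sameObject(literal, "hello")); // true: same pooled literal
        System.out.println(sameObject(literal, copy));    // false: copy is a separate object
        System.out.println(sameObject(literal, pooled));  // true: intern() returns the pooled one
    }
}
```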

1. Constant pool

1.1 What is the constant pool?

The JVM constant pools fall into four groups: the class file constant pool, the runtime constant pool, the global string constant pool, and the object caches of the primitive wrapper classes.
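
For the last group, the wrapper class caches, a small sketch may help. It relies only on the guarantee that Integer.valueOf returns cached objects for values from -128 to 127; the class and method names are illustrative assumptions, and what happens outside that range depends on the JVM and its settings.

```java
// Minimal sketch; WrapperCacheDemo and isCached are illustrative names only.
class WrapperCacheDemo {
    /** True when two boxing calls for the same value yield the same Integer object. */
    static boolean isCached(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(isCached(127));   // true: inside the guaranteed cache range
        System.out.println(isCached(-128));  // true: lower bound of the guaranteed range
        // Outside the range the result is JVM-dependent, so no output is claimed here.
        System.out.println(isCached(1000));
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE); // true: shared constant
    }
}
```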

1.1.0 Method area

The method area stores the structural information of Java classes. When an object is created, its type information lives in the method area and its instance data lives in the heap. Type information covers the constants, static variables, fields, and methods declared in the class; instance data is the object created at run time and the values of its fields.

Garbage collection in this area mainly reclaims constant pool entries and unloads classes; in general, far less memory is recovered here than in the Java heap.

1.1.1 Class file constant pool

A class file is a binary data stream organized in bytes. When Java code is compiled, the source files we write become binary data in the .class file format and are stored on disk; the class file constant pool is part of that data.

The class file constant pool mainly stores two kinds of constants: literals and symbolic references.

Literals: a literal is close to the notion of a constant at the Java language level.

  • Text strings, such as the "abc" in public String s = "abc";
  • Variables declared final, whether static variables, instance variables, or local variables: public final static int f = 0x101;, final int temp = 3;
  • For primitive data (even local variables in methods), such as int value = 1, the constant pool keeps only the field descriptor int and the field name value; the literal value itself is not stored in the constant pool.

Symbolic references: a symbolic reference is a concept from compiler theory.

  • Fully qualified names of classes and interfaces, for example java/lang/String, where the "." of the source-level name is replaced with "/". They are used at run time to resolve direct references to the class
  • Field names and descriptors. A field is a variable declared in a class or interface, either class-level or instance-level
  • Method names and descriptors. A method descriptor is the parameter types plus the return type
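
To make the name and descriptor forms concrete, the sketch below prints them from running code. It uses java.lang.invoke.MethodType from the standard library; the class and helper names are illustrative assumptions introduced for this example.

```java
import java.lang.invoke.MethodType;

// Minimal sketch; DescriptorDemo and its helpers are illustrative names only.
class DescriptorDemo {
    /** Class-file form of a class name: dots become slashes. */
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    /** Method descriptor: parameter types in parentheses, then the return type. */
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(internalName(String.class));                // java/lang/String
        System.out.println(methodDescriptor(int.class, String.class)); // (Ljava/lang/String;)I
        System.out.println(methodDescriptor(void.class));              // ()V
    }
}
```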

1.1.2 Runtime constant pool

When a Java file is compiled, the class file constant pool described above is generated. Before the JVM can execute a class, the class goes through loading, linking (verification, preparation, resolution), and initialization. After the JVM loads the class into memory, it copies the contents of the class file constant pool into the runtime constant pool. The runtime constant pool is therefore the in-memory version of the class file constant pool, and it is part of the method area.

In the resolution phase, symbolic references are replaced with direct references. Resolution consults the string constant pool, that is, the StringTable, so that the strings referenced by the runtime constant pool are the same ones held in the string constant pool.

Compared with the class file constant pool, the key feature of the runtime constant pool is that it is dynamic. The Java language does not require constants to be produced only at compile time, so not everything in the runtime constant pool comes from the class file constant pool. Code can also create constants at run time and put them into the pool; the feature most often used for this is String.intern().
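
The difference between a compile-time constant and a string built at run time can be sketched like this. The class and method names are illustrative assumptions; the behavior shown follows from the rule that only constant expressions are folded into pooled literals by the compiler.

```java
// Minimal sketch; RuntimePoolDemo and concatAtRuntime are illustrative names only.
class RuntimePoolDemo {
    /** Not a constant expression, so the result is a new String built at run time. */
    static String concatAtRuntime(String prefix) {
        return prefix + "b";
    }

    public static void main(String[] args) {
        String compileTime = "a" + "b";         // constant expression: same as the literal "ab"
        String runtime = concatAtRuntime("a");  // new heap object with the contents "ab"

        System.out.println(compileTime == "ab");      // true
        System.out.println(runtime == "ab");          // false
        System.out.println(runtime.intern() == "ab"); // true: intern() maps it to the pooled one
    }
}
```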

1.1.3 String constant pool

In JDK 6 and earlier, the string constant pool is stored in the method area. From JDK 7 on, it is stored in the heap, probably because the method area was too small. HotSpot VM implements the string pool as a StringTable class, a hash table whose default size is 1009. Each HotSpot VM instance has exactly one StringTable, shared by all classes. String constants, each made up of individual characters, are placed in the StringTable.

In JDK 6, the size of the StringTable is fixed at 1009. If too many strings are put into the string pool, hash collisions become frequent and the linked lists in the buckets grow long. A call to String#intern() then has to walk a list entry by entry, and performance drops sharply. In JDK 7, the size of the StringTable can be set with a parameter (-XX:StringTableSize).
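
The effect of a fixed bucket count can be illustrated with a toy model. The sketch below is not HotSpot's StringTable; it is a deliberately simple chained hash table, with a hypothetical class name and a second bucket count (60013) chosen only for comparison, that shows how chains lengthen when many strings share 1009 buckets.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a fixed-size, chained hash table. It is NOT HotSpot's StringTable. */
class ToyStringTable {
    private final List<List<String>> buckets = new ArrayList<>();

    ToyStringTable(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    /** Returns the canonical copy of s, scanning one bucket's chain linearly. */
    String intern(String s) {
        List<String> chain = buckets.get(Math.floorMod(s.hashCode(), buckets.size()));
        for (String existing : chain) {
            if (existing.equals(s)) {
                return existing;   // already pooled: hand back the stored object
            }
        }
        chain.add(s);              // not pooled yet: store this object
        return s;
    }

    /** Length of the longest chain, i.e. the worst-case number of comparisons. */
    int longestChain() {
        int max = 0;
        for (List<String> chain : buckets) {
            max = Math.max(max, chain.size());
        }
        return max;
    }

    public static void main(String[] args) {
        for (int size : new int[] {1009, 60013}) {   // 60013 is an illustrative larger size
            ToyStringTable table = new ToyStringTable(size);
            for (int i = 0; i < 200_000; i++) {
                table.intern("key-" + i);
            }
            System.out.println(size + " buckets -> longest chain " + table.longestChain());
        }
    }
}
```

With 200,000 distinct strings and 1009 buckets, at least one chain must hold 199 or more entries, so every lookup in that bucket pays for a long linear scan.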

String constant pool design ideas:

  • Allocating strings, like allocating any other object, costs time and space. Strings are the most basic data type and are created frequently and in large numbers, so their allocation has a large effect on program performance.

  • To improve performance and reduce memory overhead, the JVM optimizes the instantiation of string constants

    • It creates a string constant pool for strings, similar to a cache
    • When a string constant is created, it first checks whether the string exists in the pool
    • If the string exists, it returns a reference to the pooled instance; if not, it instantiates the string and puts it in the pool
  • The basis for the implementation

    • The optimization works because strings are immutable, so they can be shared without worrying about data conflicts
    • The runtime instance keeps a table in the global string constant pool that always holds a reference to each unique string object in the pool. Because these strings are always referenced, the garbage collector does not reclaim them

2. String.intern() and string constant pool

/**
 * Returns a canonical representation for the string object.
 * <p>
 * A pool of strings, initially empty, is maintained privately by the
 * class <code>String</code>.
 * <p>
 * When the intern method is invoked, if the pool already contains a
 * string equal to this <code>String</code> object as determined by
 * the {@link #equals(Object)} method, then the string from the pool is
 * returned. Otherwise, this <code>String</code> object is added to the
 * pool and a reference to this <code>String</code> object is returned.
 * <p>
 * It follows that for any two strings <code>s</code> and <code>t</code>,
 * <code>s.intern()&nbsp;==&nbsp;t.intern()</code> is <code>true</code>
 * if and only if <code>s.equals(t)</code> is <code>true</code>.
 * <p>
 * All literal strings and string-valued constant expressions are
 * interned. String literals are defined in section 3.10.5 of the
 * <cite>The Java&trade; Language Specification</cite>.
 *
 * @return  a string that has the same contents as this string, but is
 *          guaranteed to be from a pool of unique strings.
 */
public native String intern();

The location of the string constant pool varies with the JDK version. In JDK 6, the pool is in the permanent generation (the method area) and stores the string objects themselves. In JDK 7, the pool is in the heap and stores references to string objects.

In JDK 8, the permanent generation was replaced by Metaspace as the implementation of the method area. These differences lead to a common, classic question; see the following code.

@Test
public void test(){
    String s = new String("2");
    s.intern();
    String s2 = "2";
    System.out.println(s == s2);

    String s3 = new String("3") + new String("3");
    s3.intern();
    String s4 = "33";
    System.out.println(s3 == s4);
}

//jdk6
//false
//false

//jdk7
//false
//true

This code prints false false on JDK 6 but false true on JDK 7. We explain it line by line.

JDK1.6

  • String s = new String("2"); creates two objects: a String object in the heap and a "2" object in the constant pool.
  • s.intern(); looks in the constant pool for an object with the same content as s, finds that the "2" object already exists, and returns the address of the "2" object.
  • String s2 = "2"; is created from a literal: it looks in the constant pool for an object with the same content, finds it, and returns the address of the "2" object.
  • System.out.println(s == s2); from the analysis above, s and s2 point to different objects, so it prints false.

  • String s3 = new String("3") + new String("3"); creates two objects: the String object "33" in the heap and a "3" object in the constant pool. The two anonymous new String("3") objects created along the way are not discussed here.
  • s3.intern(); looks in the constant pool for an object with the same content as s3, finds no "33" object, creates a "33" object in the constant pool, and returns its address.
  • String s4 = "33"; is created from a literal: it looks in the constant pool for an object with the same content, finds it, and returns the address of the "33" object.
  • System.out.println(s3 == s4); from the analysis above, s3 and s4 point to different objects, so it prints false.

JDK1.7

  • String s = new String("2"); creates two objects: a String object in the heap and a "2" object, also in the heap, whose reference is saved in the constant pool.
  • s.intern(); looks in the constant pool for an object with the same content as s, finds that the "2" object already exists, and returns the reference to the "2" object.
  • String s2 = "2"; is created from a literal: it looks in the constant pool for an object with the same content, finds it, and returns the reference to the "2" object.
  • System.out.println(s == s2); from the analysis above, s and s2 point to different objects, so it prints false.

  • String s3 = new String("3") + new String("3"); creates two objects: the String object "33" in the heap and a "3" object, also in the heap, whose reference is saved in the constant pool. The two anonymous new String("3") objects created along the way are not discussed here.
  • s3.intern(); looks in the constant pool for an object with the same content as s3, finds no "33" object, saves the address of the String object that s3 refers to in the constant pool, and returns that address.
  • String s4 = "33"; is created from a literal: it looks in the constant pool for an object with the same content, finds it, and returns its address, which is the reference to the String object that s3 refers to.
  • System.out.println(s3 == s4); from the analysis above, s3 and s4 point to the same object, so it prints true.
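
The JDK 7 behavior described above can be checked directly by asking whether intern() hands back the receiver itself. The sketch below assumes JDK 7 or later; the class and method names are illustrative, and the first string is built from System.nanoTime() only so that it is very unlikely to be in the pool already.

```java
// Minimal sketch for JDK 7 and later; InternIdentityDemo is an illustrative name only.
class InternIdentityDemo {
    /** True when intern() returned the receiver itself rather than another pooled object. */
    static boolean internReturnsSelf(String s) {
        return s.intern() == s;
    }

    public static void main(String[] args) {
        // Built at run time with unique contents, so the pool has no equal string yet.
        String fresh = "demo-" + System.nanoTime();
        // The literal "2" is pooled before this separate object is constructed.
        String shadow = new String("2");

        System.out.println(internReturnsSelf(fresh));  // true: the pool now refers to fresh itself
        System.out.println(internReturnsSelf(shadow)); // false: the pooled "2" is the literal
    }
}
```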

3. Application of String.intern()

When a program reads and assigns a large number of strings, String.intern() can save a great deal of memory.

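import java.util.Random;

// The class name is illustrative; the original snippet shows only the members.
public class InternMemoryDemo {
    static final int MAX = 1000 * 10000;   // 10 million array slots
    static final String[] arr = new String[MAX];

    public static void main(String[] args) throws Exception {
        // Ten random integers stand in for values read from a database.
        Integer[] DB_DATA = new Integer[10];
        Random random = new Random(10 * 10000);
        for (int i = 0; i < DB_DATA.length; i++) {
            DB_DATA[i] = random.nextInt();
        }
        long t = System.currentTimeMillis();
        for (int i = 0; i < MAX; i++) {
            // Without intern: every slot holds its own String object.
            //arr[i] = new String(String.valueOf(DB_DATA[i % DB_DATA.length]));
            // With intern: slots with equal contents share one pooled String.
            arr[i] = new String(String.valueOf(DB_DATA[i % DB_DATA.length])).intern();
        }

        System.out.println((System.currentTimeMillis() - t) + "ms");
        System.gc();
    }
}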

The run parameters are -Xmx2g -Xms2g -Xmn1500M. The code above is a demonstration with two alternative statements, one that uses intern and one that does not. The version that does not use intern creates 10 million strings and takes up about 640 MB.

The version that uses intern creates 1345 strings, occupying about 133 KB in total. The program itself only uses 10 distinct strings, so strictly speaking the difference should be exactly 1,000,000 times. The example is somewhat extreme, but it does show the large space savings that intern can produce.

Because String is immutable, String.intern() in effect maintains a pool of String constants in which each string is unique. We can take advantage of this uniqueness and use the String objects in the pool as locks to control access to resources. For example, if a certain resource of a city may be accessed by only one thread at a time, you can intern the city name and use the pooled String object as the lock, so that only one thread can hold it at a time.

Improper use: fastjson calls intern on every JSON key and caches the keys in the string constant pool. Each read is then very fast, saving time and space, because JSON keys usually do not change. But this does not account for the case where a large number of JSON keys do change, which puts a heavy burden on the string constant pool.
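
The city-name lock can be sketched as below. This is one possible shape under stated assumptions, not a recommended production design: the class name, the helper withCityLock, and the counter standing in for the city's resource are all hypothetical.

```java
// Minimal sketch; CityLockDemo and withCityLock are illustrative names only.
class CityLockDemo {
    /** Runs the action while holding the monitor shared by every equal city name. */
    static void withCityLock(String cityName, Runnable action) {
        synchronized (cityName.intern()) {   // equal names -> same pooled object -> same lock
            action.run();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counter = {0};   // stands in for the city's shared resource
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                // Each call passes a distinct String object with equal contents.
                withCityLock(new String("Beijing"), () -> counter[0]++);
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println(counter[0]);   // 20000
    }
}
```

Because the pool is global to the JVM, any other code that synchronizes on the same interned string shares this monitor, which is worth keeping in mind when choosing lock names.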

Origin blog.csdn.net/youanyyou/article/details/132401474