Java technology - do you really understand the intern() method of the String class

0 Preface

Without further ado, let's look at the following introductory example:

String str1 = new String("SEU")+ new String("Calvin");    
System.out.println(str1.intern() == str1); 
System.out.println(str1 == "SEUCalvin");

 

My JDK version is 1.8, and the output is:

true
true

Add a line of code to the above example:

String str2 = "SEUCalvin"; // a new line of code is added; the rest remains unchanged
String str1 = new String("SEU") + new String("Calvin");
System.out.println(str1.intern() == str1);
System.out.println(str1 == "SEUCalvin");

Running again, the result is:

false
false

Does this seem inexplicable? The newly defined str2 appears to have nothing to do with str1, so how can it change the output for str1? In fact, this is all the work of the intern() method. After reading this article, you will understand. o(∩_∩)o

To be honest, I originally wanted to write a summary of Android memory leaks. After consulting a lot of material, I found that I had to start with Java's OOM, and to discuss Java's OOM I had to discuss the Java virtual machine architecture. If you are not familiar with the JVM, see the article JVM - Java Virtual Machine Architecture. (I have revised that article many times, and I personally feel that it is quite comprehensive and clear. Every time I read it, I gain a new understanding.)

The JVM architecture article also explains that the method area of the JVM runtime data area contains a constant pool. However, from JDK 1.7 onward the string constant pool is placed in the heap, and this change of location affects the behavior of String's intern() method. After studying this in depth, I thought it was worth writing down. To keep this article up to date and to correct possible mistakes promptly, please make sure you are reading the original text and not a carelessly reprinted copy. The link to the original text is: SEU_Calvin's blog.

 

 

1. Why introduce the intern() method

The original intention of the intern() method is to reuse String objects to save memory consumption. This may be a bit abstract, so let's demonstrate it with an example.

import java.util.Random;

// The class name is illustrative; the original snippet showed only the members.
public class InternMemoryDemo {
    static final int MAX = 100000;
    static final String[] arr = new String[MAX];

    public static void main(String[] args) throws Exception {
        // Fill an Integer array of length 10 with random values
        Integer[] sample = new Integer[10];
        Random random = new Random(1000);
        for (int i = 0; i < sample.length; i++) {
            sample[i] = random.nextInt();
        }
        // Record the program start time
        long t = System.currentTimeMillis();
        // Assign 100,000 Strings with or without intern(); the values come from the 10 numbers in the Integer array
        for (int i = 0; i < MAX; i++) {
            arr[i] = new String(String.valueOf(sample[i % sample.length]));
            // arr[i] = new String(String.valueOf(sample[i % sample.length])).intern();
        }
        System.out.println((System.currentTimeMillis() - t) + "ms");
        System.gc();
    }
}

 

This example is simple; it only demonstrates that using intern() consumes less memory than not using it.

First, define an Integer array of length 10 and fill it with random values. Then, in a for loop, assign values taken from the Integer array to the 100,000 elements of a String array. Run the two cases (with and without intern()) separately. In Eclipse, you can set the JVM startup parameter -agentlib:hprof=heap=dump,format=b through Window --> Preferences --> Java --> Installed JREs, so that the hprof file is written to the project directory when the program finishes. Then open the hprof file with the MAT plugin.
The results of the two experiments are as follows:

 

According to the results, without intern() the program created 101762 String objects, whereas with intern() it created only 1772. This supports the conclusion that intern() saves memory.

Careful readers will notice that the program's running time increases when intern() is used. This is because each iteration still executes new String and then additionally performs the intern() operation. On the other hand, the extra memory occupied when intern() is not used makes the GC time much longer than this added cost.
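The object counts can also be approximated without a heap dump. The following is a minimal sketch, not the measurement used above: the class name InternCountSketch and the method distinctObjects are hypothetical, and it only counts the distinct String objects held by the array (using an identity-based set), not every String in the JVM.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Random;
import java.util.Set;

// Hypothetical helper class, introduced only for illustration.
class InternCountSketch {

    // Fills an array of `size` strings drawn from 10 random ints and returns
    // how many distinct String objects the array ends up referencing.
    static int distinctObjects(int size, boolean useIntern) {
        Random random = new Random(1000);
        int[] sample = new int[10];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = random.nextInt();
        }

        String[] arr = new String[size];
        for (int i = 0; i < size; i++) {
            String s = new String(String.valueOf(sample[i % sample.length]));
            arr[i] = useIntern ? s.intern() : s;
        }

        // Identity-based set: two strings count as one only if they are the same object.
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(identities, arr);
        return identities.size();
    }

    public static void main(String[] args) {
        // Every new String is a separate object, so this prints 100000.
        System.out.println("without intern: " + distinctObjects(100_000, false));
        // Interned strings with equal content share one object, so this prints at most 10.
        System.out.println("with intern:    " + distinctObjects(100_000, true));
    }
}
```

The temporary objects created by new String still exist briefly in the intern() case, but they become garbage immediately because the array keeps only the pooled instances.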

 

 

2. In-depth understanding of the intern() method

From JDK 1.7 onward, the constant pool is placed in the heap, which changes the behavior of the intern() method. What exactly is different? Let's look at the following code. This example is widely circulated on the Internet and is pasted here directly; I will explain it with my own understanding:

  1. String s = new String("1");  
  2. s.intern();  
  3. String s2 = "1";  
  4. System.out.println(s == s2);  
  5.   
  6. String s3 = new String("1") + new String("1");  
  7. s3.intern();  
  8. String s4 = "11";  
  9. System.out.println(s3 == s4);  

The output is:

  1. JDK1.6 and below: false false  
  2. JDK1.7 and above: false true  

Then swap lines 2 and 3, and lines 7 and 8, of the above code:

String s = new String("1");
String s2 = "1";
s.intern();
System.out.println(s == s2);

String s3 = new String("1") + new String("1");
String s4 = "11";
s3.intern();
System.out.println(s3 == s4);

The output is:

  1. JDK1.6 and below: false false  
  2. JDK1.7 and above: false false  

 

The following is an analysis of the intern() method based on the above code:

 

2.1 JDK1.6

 

In JDK 1.6, every output is false, because in JDK 1.6 and earlier versions the constant pool is placed in the Perm area (part of the method area). If you are familiar with the JVM, you know that this area is completely separate from the heap.

Strings declared with quotation marks are created directly in the string constant pool, while objects created with new String are placed in the heap. The two therefore always have different memory addresses, and calling intern() does not change that. If you are not clear on the difference between "==" and equals() for the String class, see my blog post Java Interview - Compare the difference between equals and == from the Java heap and stack perspective.

The role of the intern() method in JDK 1.6 is as follows. Take String s = new String("SEU_Calvin") and then call s.intern(): the return value is still the string "SEU_Calvin", so the method looks useless. In fact, in JDK 1.6 it does a little work behind the scenes: it checks whether a string equal to "SEU_Calvin" exists in the string pool. If it exists, the method returns the string in the pool; if it does not, the method adds "SEU_Calvin" to the string pool and then returns a reference to it. However, this is not quite the case in JDK 1.7, as discussed below.
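The check-then-return behavior described above can be observed directly. This is a small sketch under the assumption that the literal and the new String appear in the same program; the class name InternContractSketch is hypothetical.

```java
// Hypothetical class name, introduced only for illustration.
class InternContractSketch {

    // Returns true when intern() on a heap copy yields the pooled literal's object.
    static boolean internReturnsPooledInstance() {
        String literal = "SEU_Calvin";           // comes from the string pool
        String copy = new String("SEU_Calvin");  // a separate object in the heap

        return copy != literal                   // different objects...
                && copy.equals(literal)          // ...with equal contents...
                && copy.intern() == literal;     // ...and intern() returns the pooled one
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledInstance()); // prints true
    }
}
```

Because the literal is already in the pool when intern() runs, this part of the behavior is the same in JDK 1.6 and in later versions.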

 

2.2 JDK1.7

 

For JDK1.7 and above, we will discuss the above two pieces of code separately. Let's look at the first piece of code:

 

The first piece of code is pasted again for easy reference:

String s = new String("1");
s.intern();
String s2 = "1";
System.out.println(s == s2);

String s3 = new String("1") + new String("1");
s3.intern();
String s4 = "11";
System.out.println(s3 == s4);

 

String s = new String("1") creates "1" in the constant pool and a separate String object in the heap.

s.intern() looks up "1" in the constant pool and finds that it already exists, so nothing changes.

String s2 = "1" makes s2 refer to the "1" object in the constant pool.

As a result, s and s2 refer to different objects, so false is printed.

 

 

String s3 = new String("1") + new String("1") creates "1" in the string constant pool and, in the heap, the object that s3 refers to (with content "11"). Note that there is no "11" object in the constant pool at this point.

s3.intern() puts the "11" string of s3 into the string constant pool, where "11" does not yet exist. The JDK 1.6 approach is to create a new "11" object directly in the constant pool.

In JDK 1.7, however, the pool does not need to hold a separate copy of the object; it can store a reference to the object in the heap directly. That reference points to the very object s3 refers to, which means s3.intern() == s3 returns true.

String s4 = "11" would normally create "11" in the constant pool, but it finds that an entry already exists, namely the reference to the object s3 refers to. So s3 == s4 returns true.
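The walk-through above can be reproduced with a runnable sketch. It assumes a HotSpot JVM, JDK 1.7 or later, and uses the arbitrary strings "demo-a" and "demo-bb" instead of "1" and "11" so that nothing else is likely to have pooled them first. The class name InternBeforeLiteralSketch and its members are hypothetical.

```java
// Hypothetical class name, introduced only for illustration.
class InternBeforeLiteralSketch {

    // Each experiment runs exactly once, at class initialization. Running it a
    // second time would find the strings already pooled and change the result.
    static final boolean S_SAME_AS_S2 = newStringCase();
    static final boolean S3_SAME_AS_S4 = concatCase();

    private static boolean newStringCase() {
        String s = new String("demo-a");  // literal is pooled; s is a separate heap object
        s.intern();                       // pool already holds "demo-a"; result is discarded
        String s2 = "demo-a";             // refers to the pooled object
        return s == s2;
    }

    private static boolean concatCase() {
        String s3 = new String("demo-") + new String("bb"); // heap object, not yet pooled
        s3.intern();                      // JDK 1.7+: the pool records a reference to s3 itself
        String s4 = "demo-bb";            // the literal resolves to that pooled reference
        return s3 == s4;
    }

    public static void main(String[] args) {
        System.out.println(S_SAME_AS_S2);  // false
        System.out.println(S3_SAME_AS_S4); // true on HotSpot, JDK 1.7 and later
    }
}
```

The results are cached in static final fields on purpose: the outcome of the second experiment depends on intern() being the first thing to put the string into the pool.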

 

Let's continue to analyze the second piece of code:

The second piece of code is pasted again for easy reference:

String s = new String("1");
String s2 = "1";
s.intern();
System.out.println(s == s2);

String s3 = new String("1") + new String("1");
String s4 = "11";
s3.intern();
System.out.println(s3 == s4);

String s = new String("1") creates "1" in the constant pool and a separate String object in the heap.

String s2 = "1" makes s2 refer to the "1" object in the constant pool; since that object already exists, s2 simply points to it.

s.intern() has no practical effect here, because "1" already exists in the pool.

As a result, s and s2 refer to different objects, so false is printed.

 

 

String s3 = new String("1") + new String("1") creates "1" in the string constant pool and, in the heap, the object that s3 refers to (with content "11"). Note that there is no "11" object in the constant pool at this point.

String s4 = "11" creates "11" directly in the constant pool.

s3.intern() has no practical effect here, because "11" already exists in the pool.

As a result, s3 and s4 refer to different objects, so false is printed.
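The literal-first ordering can be checked in the same way. In this sketch the class name LiteralBeforeInternSketch and the string "order-cc" are hypothetical choices; the second returned value shows that intern() still returns the pooled object even though s3 itself is not that object.

```java
// Hypothetical class name, introduced only for illustration.
class LiteralBeforeInternSketch {

    // Returns { s3 == s4, s3.intern() == s4 } for the literal-first ordering.
    static boolean[] literalFirst() {
        String s3 = new String("order-") + new String("cc"); // heap object "order-cc"
        String s4 = "order-cc";       // the literal places its own object in the pool
        String pooled = s3.intern();  // pool already has "order-cc"; returns s4's object
        return new boolean[] { s3 == s4, pooled == s4 };
    }

    public static void main(String[] args) {
        boolean[] result = literalFirst();
        System.out.println(result[0] + " " + result[1]); // prints: false true
    }
}
```

Unlike the intern-first ordering, this result does not depend on how often the method runs, because the literal is always in the pool before intern() is called.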


 

 

3 Summary

Finally, we are done. Now look back at the introductory example given at the beginning. Is it clear now?

 

String str1 = new String("SEU") + new String("Calvin");
System.out.println(str1.intern() == str1);
System.out.println(str1 == "SEUCalvin");

 

This is the s3.intern() == s3 case from the example above: str1.intern() finds that "SEUCalvin" does not exist in the constant pool, so the pool records a reference to str1. When the literal "SEUCalvin" is later resolved from the constant pool, it also refers directly to str1. Both comparisons therefore return true.

So what about the second piece of code:

 

String str2 = "SEUCalvin"; // a new line of code is added; the rest remains unchanged
String str1 = new String("SEU") + new String("Calvin");
System.out.println(str1.intern() == str1);
System.out.println(str1 == "SEUCalvin");

This is also simple. str2 first creates "SEUCalvin" in the constant pool, so str1.intern() returns the object str2 refers to; you can verify that str1.intern() == str2 returns true. The later literal "SEUCalvin" also refers to the same object as str2. Nothing in the pool refers to the str1 object in the heap, so both comparisons print false.
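A practical consequence is that == on strings depends on when and whether intern() was called, while equals() does not. The sketch below illustrates this under the same setup as the second example; the class name InternSummarySketch and the method compare are hypothetical.

```java
// Hypothetical class name, introduced only for illustration.
class InternSummarySketch {

    // Compares str1 and str2 in three ways and returns the results as text.
    static String compare() {
        String str2 = "SEUCalvin";                               // pooled literal
        String str1 = new String("SEU") + new String("Calvin");  // separate heap object

        boolean sameObject = (str1 == str2);                     // identity of the two objects
        boolean sameAfterIntern = (str1.intern() == str2.intern()); // both map to the pooled object
        boolean sameContent = str1.equals(str2);                 // character-by-character equality

        return sameObject + " " + sameAfterIntern + " " + sameContent;
    }

    public static void main(String[] args) {
        System.out.println(compare()); // prints: false true true
    }
}
```

If the goal is to compare text, equals() gives the expected answer regardless of the order of the statements; comparing with == is only meaningful when both sides are known to be interned.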

