String memory allocation and splicing operations

Basic Features of String

  • String: a string, written as a pair of double quotes ("")
  • Two ways of instantiating a String: String s1 = "hello"; and String s2 = new String("hello");
  • String is declared final and cannot be inherited
  • String implements the Serializable interface, which indicates that strings support serialization
  • String implements the Comparable interface, which indicates that strings can be compared and ordered
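
As a quick illustration of the points above, the following sketch creates a string both ways and exercises the Comparable and Serializable interfaces. The class name StringBasics and its helper methods are made up for this example.

```java
import java.io.Serializable;

public class StringBasics {
    // Content comparison: true when both strings hold the same characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    // Identity comparison: true only when both references point to one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s1 = "hello";             // literal
        String s2 = new String("hello"); // constructor call: always a new object

        System.out.println(sameContent(s1, s2)); // true
        System.out.println(sameObject(s1, s2));  // false

        // Comparable<String>: compareTo orders strings lexicographically.
        System.out.println("apple".compareTo("banana") < 0); // true

        // Serializable: a String can be used wherever a Serializable is expected.
        Serializable ser = s1;
        System.out.println(ser.equals("hello")); // true
    }
}
```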

Changes to the String storage structure in JDK 9

  • In JDK 8 and earlier, String internally defines a final char[] value (two bytes, 16 bits, per character) to store the string data. However, data collected from many different applications shows that strings are a major component of heap usage, and that most String objects contain only Latin-1 characters, which need only one byte of storage each.
  • Starting with JDK 9, the internal representation of the String class changed from a UTF-16 char array to a byte array (byte[]) plus an encoding flag field. The new String class stores characters encoded as either ISO-8859-1/Latin-1 (one byte per character) or UTF-16 (two bytes per character), depending on the content of the string. The encoding flag indicates which encoding is used.
  • Conclusion: String is no longer stored in a char[] but in a byte[] plus an encoding flag, which reduces memory usage at run time and, as a result, reduces GC activity.
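
String's internal byte[] and encoding flag are private, so the sketch below does not inspect them. It only estimates, from the outside, which of the two representations described above would apply: a string whose characters all fit in Latin-1 can use one byte per character, anything else needs two. The class and method names are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

public class CompactStringEstimate {
    // True if every character can be encoded in ISO-8859-1 (Latin-1).
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    // Estimated size of the character data in bytes under the JDK 9+ scheme.
    static int estimatedBytes(String s) {
        return fitsLatin1(s) ? s.length() : 2 * s.length();
    }

    public static void main(String[] args) {
        System.out.println(estimatedBytes("hello"));        // 5
        System.out.println(estimatedBytes("h\u00e9llo"));   // 5 (U+00E9 is in Latin-1)
        System.out.println(estimatedBytes("\u4f60\u597d")); // 4 (two CJK characters, UTF-16)
    }
}
```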

String: Immutable sequence of characters

When a string is assigned a literal value (String str = "Hello";), the string value is placed in the string constant pool. The string constant pool never stores two strings with the same content.
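
A minimal sketch of both properties, pooling of literals and immutability, might look like this. The class name ImmutableDemo and the helper appendDef are hypothetical.

```java
public class ImmutableDemo {
    // "Modifying" a String always produces a new object;
    // the object the caller passed in is left untouched.
    static String appendDef(String s) {
        s += "def";
        return s;
    }

    public static void main(String[] args) {
        String s1 = "Hello";
        String s2 = "Hello";
        System.out.println(s1 == s2); // true: both refer to the single pooled "Hello"

        String s3 = s1;
        s3 += " world";               // s3 now refers to a new object
        System.out.println(s1);       // Hello

        System.out.println(s1.replace('H', 'J')); // Jello (a new String)
        System.out.println(s1);                   // Hello (still unchanged)
    }
}
```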

String constant pool:
The String Pool is a fixed-size Hashtable. If too many Strings are placed in the String Pool, hash collisions become frequent and the linked lists in its buckets grow very long. The direct effect of long linked lists is that calls to String.intern() slow down significantly. Use -XX:StringTableSize to set the length of the StringTable.

  • In JDK 6, the StringTable has a length of 1009, so efficiency drops quickly if the constant pool holds too many strings. The value given to StringTableSize is not restricted.
  • In JDK 7, the default StringTable length is 60013, and the value given to StringTableSize is not restricted.
  • In JDK 8, if you set the StringTable length, 1009 is the minimum value allowed.
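
One way to observe the effect of the table length is to intern many distinct strings and time the run under different -XX:StringTableSize values. The sketch below is an assumption about how such an experiment could be set up. The class name, the key format and the count are arbitrary, and the timings depend heavily on the JDK version and machine.

```java
public class InternTiming {
    // Interns `count` distinct strings and returns the canonical instances.
    // Keeping them in an array prevents them from being collected mid-run.
    static String[] internMany(int count) {
        String[] interned = new String[count];
        for (int i = 0; i < count; i++) {
            interned[i] = ("key-" + i).intern();
        }
        return interned;
    }

    public static void main(String[] args) {
        int count = 200_000;
        long start = System.nanoTime();
        String[] result = internMany(count);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result.length + " strings interned in " + elapsedMs + " ms");
    }
}
```

For example, compare java -XX:StringTableSize=1009 InternTiming with java -XX:StringTableSize=60013 InternTiming on a JDK 8 runtime.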

String memory allocation

The Java language has 8 primitive data types plus one special type, String. To make these types faster and more memory-efficient at run time, Java provides the concept of a constant pool for them.

The constant pool works like a cache provided at the Java system level. The constant pools of the 8 primitive types are managed by the system. The String constant pool is special, and strings get into it in two main ways:

  • String objects declared directly with double quotes are stored directly in the constant pool.
  • A String object that was not declared with double quotes can be placed in the pool by calling the intern() method provided by String.

Since JDK 7, interned strings are no longer allocated in the permanent generation (where garbage collection is infrequent) but in the main part of the Java heap (the young and old generations), alongside the other objects created by the application. This change means more data resides in the main Java heap and less in the permanent generation, so the heap size may need to be adjusted.


The obvious effect of this change can be seen in large applications that load many classes or make heavy use of the String.intern() method.

Example:

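class Memory {

    public static void main(String[] args) {
        int i = 1;                  // primitive local: stored in main()'s stack frame
        Object obj = new Object();  // the object is on the heap; the reference obj is on the stack
        Memory mem = new Memory();  // likewise: heap object, stack reference
        mem.foo(obj);               // a new stack frame is pushed for foo()
    }                               // main()'s frame is popped

    private void foo(Object param) {
        // param and str are references in foo()'s frame; the String returned by toString() is on the heap
        String str = param.toString();
        System.out.println(str);
    }                               // foo()'s frame is popped
}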


String concatenation operation

  • Concatenating a constant with a constant yields a result in the constant pool; this is a compile-time optimization
  • The constant pool never holds two constants with the same content
  • As soon as one operand is a variable, the result is on the heap. Variable concatenation is implemented with StringBuilder
  • If intern() is called on the concatenation result, a string not yet in the constant pool is put into the pool, and the address of the pooled object is returned

Example 1:

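public static void test1() {
    // All operands are constants, so the compiler folds them at compile time
    String s1 = "a" + "b" + "c";
    String s2 = "abc";

    // true: as noted above, s1 and s2 refer to the same value in the string constant pool
    System.out.println(s1 == s2);
}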

Decompiling the class file shows String s1 = "abc";, which confirms that the concatenation is folded at compile time.

Example 2:

public static void test2() {
    String s1 = "javaEE";
    String s2 = "hadoop";

    String s3 = "javaEEhadoop";
    String s4 = "javaEE" + "hadoop";
    String s5 = s1 + "hadoop";
    String s6 = "javaEE" + s2;
    String s7 = s1 + s2;

    System.out.println(s3 == s4); // true: folded at compile time
    System.out.println(s3 == s5); // false: s1 is a variable, so no compile-time folding
    System.out.println(s3 == s6); // false: s2 is a variable, so no compile-time folding
    System.out.println(s3 == s7); // false: s1 and s2 are both variables
    System.out.println(s5 == s6); // false: s5 and s6 are different object instances
    System.out.println(s5 == s7); // false: s5 and s7 are different object instances
    System.out.println(s6 == s7); // false: s6 and s7 are different object instances

    String s8 = s6.intern();
    System.out.println(s3 == s8); // true: after intern(), s8 and s3 both point to "javaEEhadoop" in the string constant pool
}

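Principle of variable concatenation:
When two variables, String s1 = "a"; and String s2 = "b";, are concatenated with s1 + s2, execution proceeds as follows (with javac from JDK 8 and earlier; since JDK 9, javac emits an invokedynamic call to StringConcatFactory instead, and the result is still a new object on the heap):
① StringBuilder s = new StringBuilder();
② s.append("a");
③ s.append("b");
④ s.toString();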

Example 3:

public void test3() {
    String s0 = "ab";
    String s1 = "a";
    String s2 = "b";
    String s3 = s1 + s2;
    System.out.println(s0 == s3); // false: s3 points to an object instance, s0 points to "ab" in the string constant pool
    String s7 = "cd";
    final String s4 = "c";
    final String s5 = "d";
    String s6 = s4 + s5;
    System.out.println(s6 == s7); // true: s4 and s5 are final, so the value of s6 is known at compile time
}

String concatenation does not always use StringBuilder. Operands declared final and initialized with constants are themselves constants, so the compiler folds the concatenation at compile time. Operands not declared final are variables and are concatenated through new StringBuilder. In practice, use final wherever you can.

Example 4:

String concatenation operation performance comparison:

public class Test {

    public static void main(String[] args) {
        int times = 40000;

        long start = System.currentTimeMillis();

        testString(times);            // String: 6963ms
        // testStringBuilder(times);  // StringBuilder: 2ms

        long end = System.currentTimeMillis();
        System.out.println("String: " + (end - start) + "ms");
    }

    public static void testString(int times) {
        String str = "";
        for (int i = 0; i < times; i++) {
            str += "test";
        }
    }

    public static void testStringBuilder(int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append("test");
        }
    }
}

Result: Appending through StringBuilder's append() method is much more efficient than concatenating with String's + operator.

Details: With StringBuilder's append(), only one StringBuilder object is created from beginning to end. With String concatenation, a new StringBuilder and a new String are created on every iteration, which uses a lot of memory, and any garbage collection that results costs additional time.

Use of intern()

intern(): tries to put the string object into the string pool. It first checks whether the string constant pool already contains a string with the same value. If so, it returns the address of the string in the constant pool. If not, it adds the string to the constant pool and returns the corresponding address.

intern() is a native method that calls into underlying C code.

Interning ensures that only one copy of a string exists in memory, which saves memory and can speed up string operations. The value is stored in the String Intern Pool.

Space efficiency: when a program uses a large number of strings, especially many repeated ones, calling intern() can save memory.
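
To make the space argument concrete, the sketch below builds many strings at run time that share only ten distinct values, then counts how many distinct String objects are kept with and without intern(). The class name InternDedup, the value format and the sizes are assumptions chosen for illustration.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class InternDedup {
    // Builds n strings with only 10 distinct values, optionally interning
    // each one, and returns the number of distinct String objects retained.
    static int distinctObjects(int n, boolean intern) {
        String[] data = new String[n];
        for (int i = 0; i < n; i++) {
            String s = new String("value-" + (i % 10)); // always a fresh heap object
            data[i] = intern ? s.intern() : s;
        }
        // IdentityHashMap compares keys with ==, so it counts objects, not contents.
        Map<String, Boolean> identities = new IdentityHashMap<>();
        for (String s : data) {
            identities.put(s, Boolean.TRUE);
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(1000, false)); // 1000
        System.out.println(distinctObjects(1000, true));  // 10
    }
}
```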


Interview question: How many objects will new String("ab") create?

String s = new String("ab"); creates two objects: a new object in the heap space, and the string constant "ab" in the string constant pool (if the constant already exists in the pool, it is not created again).
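
The two objects can be told apart with a reference comparison. This is a minimal sketch; the class and method names are made up for the example.

```java
public class NewStringDemo {
    // Is a String created with new the same object as the pooled literal?
    static boolean newIsPooled() {
        String s = new String("ab"); // new heap object; the literal "ab" is in the pool
        return s == "ab";
    }

    // Does intern() on the heap object return the pooled literal?
    static boolean internIsPooled() {
        String s = new String("ab");
        return s.intern() == "ab";
    }

    public static void main(String[] args) {
        System.out.println(newIsPooled());    // false: heap object and pooled constant differ
        System.out.println(internIsPooled()); // true: intern() returns the pooled constant
    }
}
```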
Interview question: How many objects will new String("a") + new String("b") create?
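
The original answer was given in an image that is not reproduced here, so the following is a reconstruction and should be read as such. It writes out by hand what the StringBuilder principle described earlier implies for JDK 8 and earlier, and counts the objects in comments. The class name ConcatObjectCount is hypothetical.

```java
public class ConcatObjectCount {
    // Hand-written equivalent of: new String("a") + new String("b")
    static String concat() {
        StringBuilder sb = new StringBuilder(); // object 1: the StringBuilder
        sb.append(new String("a"));             // object 2: new String on the heap
                                                // object 3: constant "a" in the pool (if not already there)
        sb.append(new String("b"));             // object 4: new String on the heap
                                                // object 5: constant "b" in the pool (if not already there)
        return sb.toString();                   // object 6: a new String with content "ab";
                                                // toString() does not put "ab" into the pool
    }

    public static void main(String[] args) {
        String first = concat();
        String second = concat();
        System.out.println(first.equals(second)); // true: same content
        System.out.println(first == second);      // false: each call builds a new object
    }
}
```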


Use of intern:
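
The listing for this walkthrough was an image that is not reproduced here. The sketch below is a reconstruction that matches the variable names (s, s2, s3, s4) and the results quoted in the explanation that follows. The class name InternUsage and the split into two methods are assumptions made for illustration, and the second result also assumes that nothing else has interned "11" before the method runs.

```java
public class InternUsage {
    static boolean firstComparison() {
        String s = new String("1");
        s.intern();      // "1" is already in the pool because of the literal
        String s2 = "1";
        return s == s2;  // false on JDK 6 and on JDK 7/8
    }

    static boolean secondComparison() {
        String s3 = new String("1") + new String("1"); // content "11", built at run time
        s3.intern();
        String s4 = "11";
        return s3 == s4; // reported as false on JDK 6 and true on JDK 7/8
    }

    public static void main(String[] args) {
        System.out.println(firstComparison());
        System.out.println(secondComparison());
    }
}
```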

The use of intern in JDK 6: false false
①: String s = new String("1"); creates two objects (the new object and the string constant)
②: s.intern() changes nothing, because "1" already exists in the string constant pool. s points to the object in the heap space and s2 points to "1" in the constant pool, so s == s2 is false
③: s3 records the address of an object equivalent to new String("11"), but no "11" is created in the string constant pool
④: s3.intern() creates a new "11" object in the string constant pool
⑤: s4 points to the "11" in the string constant pool, so s3 and s4 point to different addresses

The use of intern in JDK 7/8: false true

In JDK 7/8, when s3.intern() is called, the object equivalent to new String("11") already exists on the heap, so the constant pool simply records a reference to that object instead of creating a new "11". s4 then points to the reference that the previous line placed in the constant pool, so s3 and s4 point to the same address.

Summary:
①: In JDK 6, if the string is already in the pool, nothing is added and the address of the existing pooled object is returned. If not, a copy of the object is made, put into the string pool, and the address of the pooled copy is returned.
②: In JDK 7/8, if the string is already in the pool, nothing is added and the address of the existing pooled object is returned. If not, a reference to the object is copied into the string pool, and that reference is returned.


Exercise:
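
The exercise listing was an image that is not reproduced here. The following reconstruction is consistent with the variable names (s, s2) and the results described below, but it is an assumption rather than the original code. The class and method names are hypothetical.

```java
public class InternExercise {
    // s2 is whatever intern() returns, so it always matches the literal "ab".
    static boolean internedMatchesLiteral() {
        String s = new String("a") + new String("b"); // content "ab", built at run time
        String s2 = s.intern();
        return s2 == "ab";
    }

    public static void main(String[] args) {
        String s = new String("a") + new String("b");
        String s2 = s.intern();
        System.out.println(s2 == "ab"); // true on JDK 6 and on JDK 7/8
        System.out.println(s == "ab");  // reported as false on JDK 6 and true on JDK 7/8
    }
}
```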
In JDK 6: s.intern() creates a string "ab" in the string constant pool and s2 points to that "ab". Execution result: true false
In JDK 7/8: s.intern() does not create a string "ab"; instead a reference to the existing object (equivalent to new String("ab")) is created in the pool, and both s and s2 point to that address. Execution result: true true


Origin blog.csdn.net/Lzy410992/article/details/118707321