[Java] The difference between String, StringBuffer and StringBuilder

String

Strings are widely used in Java programming. Strings are objects in Java. Java provides the String class to create and manipulate strings.

Note that a String's value is immutable: every operation that appears to modify a String actually produces a new String object, which is inefficient and wastes memory. Consider what happens in memory when we append to a String:

The initial String value is "hello", and the string "world" is then appended to it. "world" needs its own memory in the heap, and the resulting "hello world" needs newly allocated memory as well. Two short strings therefore require three allocations, which is a considerable waste. To handle frequent string modifications, Java provides two further classes, StringBuffer and StringBuilder, for strings that change.

StringBuffer and StringBuilder

When a string needs to be modified repeatedly, use the StringBuffer or StringBuilder class.

Unlike the String class, the objects of the StringBuffer and StringBuilder classes can be modified many times without generating new unused objects.

The StringBuilder class was introduced in Java 5. Its biggest difference from StringBuffer is that StringBuilder's methods are not synchronized, so it is not thread-safe.

Since StringBuilder has a speed advantage compared to StringBuffer, it is recommended to use StringBuilder class in most cases. However, when the application requires thread safety, the StringBuffer class must be used.
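
As a minimal sketch of the difference described above (the class name MutabilityDemo and its helper methods are illustrative and not part of the original article), the following shows that append() changes a StringBuilder in place, while concat() on a String leaves the original untouched and returns a new object:

```java
class MutabilityDemo {
    // Returns true if append() handed back the very same builder object
    // and that object now holds the extended text.
    static boolean builderIsModifiedInPlace() {
        StringBuilder sb = new StringBuilder("hello");
        StringBuilder same = sb.append(" world"); // append returns a reference to this builder
        return same == sb && sb.toString().equals("hello world");
    }

    // Returns true if concat() left the original String unchanged
    // and produced a separate String object for the result.
    static boolean stringIsLeftUnchanged() {
        String s = "hello";
        String t = s.concat(" world"); // a new String; s itself is not modified
        return s.equals("hello") && t.equals("hello world") && s != t;
    }

    public static void main(String[] args) {
        System.out.println(builderIsModifiedInPlace());
        System.out.println(stringIsLeftUnchanged());
    }
}
```

The same in-place behavior applies to StringBuffer, whose append() also returns the buffer it was called on.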

Inheritance structure of the three

The difference between the three:

String is faster in some special cases

In some special cases, concatenation of String objects with + is actually translated into StringBuffer-style appends (StringBuilder since Java 5), so String is no slower than StringBuffer in those cases. In the following example in particular, creating the String is much faster than using StringBuffer:

 String S1 = "This is only a" + " simple" + " test";
 StringBuffer Sb = new StringBuffer("This is only a").append(" simple").append(" test");

You may be surprised at how fast String S1 is created; StringBuffer has no speed advantage here at all. This is actually a compiler trick: because every operand is a constant, String S1 = "This is only a" + " simple" + " test"; is treated exactly as String S1 = "This is only a simple test";

So of course it doesn’t take much time. But what everyone should pay attention to here is that if your string comes from another String object, the speed will not be so fast, for example:

String S2 = "This is only a";
String S3 = " simple";
String S4 = " test";
String S1 = S2 + S3 + S4;

In this case the concatenation cannot be folded at compile time, so it is performed at run time in the usual way.

Creation of String objects

1. A very common way to create an object is to call a constructor, and the String class is no exception: String s=new String("Hello world"); But what is the argument "Hello world"? Is it a String object too? Are we creating a String object from a String object?

2. Of course, there is another, more popular way to create a String object: String s="Hello world"; This looks a bit odd, though. Why does it resemble the assignment of a primitive type (int i=1)?

Before starting to explain these issues, we first introduce some necessary knowledge:

Java class file structure and constant pool

We all know that in order for a Java program to run, a compiler must first compile the source code file into a bytecode file (that is, a .class file). Then it is interpreted and executed by the JVM.

A class file is a binary stream of 8-bit bytes made up of a sequence of compact items, each with a defined meaning. For example, the first 4-byte item in the stream is called magic; its value, 0xCAFEBABE, distinguishes class files from non-class files. The general structure of the class byte stream is shown on the left side of the figure below.
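
The magic item can be checked directly. The sketch below is an illustration only (the class name MagicNumberDemo and the choice of java/lang/Object.class as the file to inspect are assumptions, not from the original article); it reads the first four bytes of a class file shipped with the JDK and returns them as an int:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class MagicNumberDemo {
    // Reads the first 4-byte item ("magic") of the class file for java.lang.Object.
    static int readMagic() {
        try (InputStream in = ClassLoader.getSystemResourceAsStream("java/lang/Object.class")) {
            if (in == null) {
                throw new IllegalStateException("class file resource not found");
            }
            // readInt() consumes four bytes in big-endian order, matching the class file layout.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("magic = 0x%08X%n", readMagic());
    }
}
```

Any other class file on the class path could be substituted for the resource name used here.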

One very important item in the class file is the constant pool. It holds the symbolic information from the source code, with different kinds of information stored in constant tables distinguished by different tags. The right side of the figure above shows the constant tables for the HelloWorld code below, which has four different types of constant table (four kinds of constant pool entry).

public class HelloWorld{  
    void hello(){  
        System.out.println("Hello world");  
    }  
}  

The figure above shows that, after compilation, the "Hello world" string literal in the code is stored in the string constant table of the class constant pool (the red box on the right side of the figure).

How the JVM runs a class file

After the source code is compiled into a class file, the JVM runs it. A class loader first loads the class file, and the JVM then creates a number of in-memory data structures to hold its contents: the class information, the constant pool, the bytecode of each method, the descriptions of the methods and fields, and so on. At run time a stack frame must also be created for each method invocation. To manage all these structures, the JVM organizes them into several "runtime data areas", including the familiar "method area", "heap" and "Java stack".

As mentioned above, every string literal in the Java source code is compiled into a constant table with tag 8 (CONSTANT_String_info) in the class file. When the JVM loads the class file, it creates an in-memory data structure for the constant pool and stores it in the method area. The JVM also automatically creates a String object in the heap for the literal in each CONSTANT_String_info table; this is an interned string object (the term is sometimes rendered as "detained string object"). The CONSTANT_String_info entry is then converted into the direct address of that String object in the heap (constant pool resolution).

The key here is the interned string object. All string constants with the same literal value in the source code share a single interned string object. The JVM maintains this property through an internal data structure that records references to interned strings. In a Java program, you can call String's intern() method to obtain the interned string object for an ordinary string. We will introduce this method later.

Opcode mnemonic instruction

With these two pieces of background, we can now distinguish the two ways of creating a string object by looking at the bytecode instructions:

(1) String s=new String("Hello world"); compiles to the following instructions in the class file (as viewed in MyEclipse):

Class bytecode instruction set code

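0  new java.lang.String [15]  // Allocate space for a String object in the heap and push its address onto the operand stack.
3  dup // Duplicate the value at the top of the operand stack and push the copy, so the stack now holds two references to the new String object.
4  ldc <String "Hello world"> [17] // Push the address of the interned String object in the heap that the constant-pool string "Hello world" refers to.
6  invokespecial java.lang.String(java.lang.String) [19] // Call String's initialization method: pop the two object addresses at the top of the operand stack and initialize the object created by new with the value of the interned String. The duplicated reference left on the stack now refers to the initialized object.
9  astore_1 [s] // Pop the top of the operand stack into local variable slot 1. The value stored is the address of the initialized String object created by new.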

Note: the dup instruction copies the reference to the newly allocated java.lang.String space and pushes the copy onto the top of the stack. Why is this necessary? The invokespecial instruction locates the String(String) constructor through constant pool entry [19], but it also has to know which object is being constructed, so a reference to the previously allocated space must be on the stack for it to use. invokespecial pops that reference together with the argument. The other copy of the reference remains on the stack, and astore_1 then pops it into the local variable.

In fact, before these instructions run, the JVM has already created an interned string in the heap for "Hello world". (Note that every "Hello world" string constant in the source program corresponds to the same interned string in the heap.) The value of the interned string is then used to initialize the new String object that the new instruction created in the heap, and the local variable s stores the address of that new object. Note that the heap managed by the JVM now contains two String objects with the same value: the interned string object and the newly created one. If there is another creation statement, String s1=new String("Hello world");, how many strings with the value "Hello world" are in the heap? The answer is 3: the one interned string plus one object for each new.

(2) String s="Hello world"; compiles to the following instructions in the class file:

Class bytecode instruction set code

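0  ldc <String "Hello world"> [15] // Push the address of the interned String object in the heap that the constant-pool string "Hello world" refers to.
2  astore_1 [str] // Pop the top of the operand stack into local variable slot 1. The value stored is the heap address of the interned string object.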

This is very different from the previous instruction sequence. The local variable s stores the heap address of the interned string that already exists; no new object is created. Now suppose there is another statement, String s1="Hello world";. How many strings with the value "Hello world" are in the heap? The answer is one. And are the addresses stored in the local variables s and s1 the same? Yes: both refer to the same interned string.

Summary

Stripped of its mystique, the String type is actually quite ordinary. What makes it seem mysterious is the existence of the CONSTANT_String_info constant table and the interned string object. With these in hand, we can settle several long-running disputes.

[Dispute 1] Arguments about the equality of strings

// Code 1
String sa=new String("Hello world");
String sb=new String("Hello world");
System.out.println(sa==sb);  // false
// Code 2
String sc="Hello world";
String sd="Hello world";
System.out.println(sc==sd);  // true

In code 1, the local variables sa and sb store the addresses of two distinct String objects that new created in the heap. Although both objects hold the value "Hello world" (the character sequence stored in their char[]), "==" compares the two different heap addresses, so the result is false. In code 2, the local variables sc and sd also store addresses, but both hold the address of the single interned string object in the heap that the constant "Hello world" refers to, so they are naturally equal.
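
To make the distinction concrete, here is a small sketch (the class name EqualityDemo and its methods are illustrative) that contrasts comparing addresses with "==" and comparing character sequences with equals(), which is the usual way to compare string values:

```java
class EqualityDemo {
    // "==" on two separately created String objects compares their heap addresses.
    static boolean sameReference() {
        String sa = new String("Hello world");
        String sb = new String("Hello world");
        return sa == sb;
    }

    // equals() compares the character sequences held by the two objects.
    static boolean sameValue() {
        String sa = new String("Hello world");
        String sb = new String("Hello world");
        return sa.equals(sb);
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // false: two different objects
        System.out.println(sameValue());     // true: same characters
    }
}
```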

[Dispute 2] The inside story of the string "+" operation

// Code 1
String sa = "ab";
String sb = "cd";
String sab=sa+sb;
String s="abcd";
System.out.println(sab==s); // false
// Code 2
String sc="ab"+"cd";
String sd="abcd";
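System.out.println(sc==sd); // true

In code 1, the local variables sa and sb hold the addresses of two interned string objects in the heap. When sa+sb executes, the JVM first creates a StringBuilder object in the heap, initialized with the interned string that sa points to, then calls append to add the interned string that sb points to, then calls the StringBuilder's toString() method to create a String object in the heap, and finally stores the heap address of that new String object in the local variable sab. The local variable s, by contrast, holds the address of the interned string object for the constant "abcd". The addresses in sab and s are therefore different. Note that the heap in code 1 actually contains five objects: three interned string objects, one String object and one StringBuilder object.

In code 2, "ab"+"cd" is merged into the constant "abcd" at compile time, so the identical literal "abcd" corresponds to the same interned string object, and the addresses are naturally the same.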

The three String sisters (String, StringBuffer, StringBuilder)

The mutability of String and StringBuffer

// String
public final class String
{
        private final char value[];

         public String(String original) {
              // copy the characters of original into value[]
         }
}

// StringBuffer
public final class StringBuffer extends AbstractStringBuilder
{
         char value[]; // inherited from the parent class AbstractStringBuilder
         public StringBuffer(String str) {
                 super(str.length() + 16); // call the parent constructor, which creates a value[] array of size str.length()+16
                 append(str); // copy the characters of str into value[]
        }
}

Clearly, value[] in both String and StringBuffer is used to store the character sequence. However:

(1) In String, value[] is a final array that can be assigned only once. For example, new String("abc") makes value[]={'a','b','c'} (see the JDK source of String for how this is implemented), and after that the value[] in this String object can no longer be changed. This is what people mean when they say String is immutable.

Note: Beginners often misunderstand this. Some ask: with String str1=new String("abc"); str1=new String("cba");, hasn't the string str1 changed? To answer, you need to understand the difference between an object reference and the object itself. Briefly, the object itself is the instance data (non-static fields) stored in heap space, while an object reference is the address of that object in the heap. In general, the method area and the Java stack hold object references, not the data of the object itself.
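
The following sketch illustrates the note above (the class name ReferenceDemo and the variable alias are illustrative). Reassigning str1 only makes the variable point at a different object; the original "abc" object is unchanged, as a second reference to it shows:

```java
class ReferenceDemo {
    // Returns the content seen through a second reference after str1 is reassigned.
    static String aliasAfterReassignment() {
        String str1 = new String("abc");
        String alias = str1;       // a second reference to the same "abc" object
        str1 = new String("cba");  // str1 now refers to a different object
        return alias;              // the original object still holds "abc"
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassignment()); // abc
    }
}
```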

(2) The value[] in StringBuffer is a very ordinary array, and a new string can be added to the end of value[] through the append() method. This also changes the content and size of value[].

For example, new StringBuffer("abc") makes value[]={'a','b','c','\0','\0',...} (note that the array is created with length str.length()+16). If you then call append("abc") on this object, value[]={'a','b','c','a','b','c','\0',...}. This is why StringBuffer is called a mutable string. It also shows that the value[] in StringBuffer serves as a string buffer. Its performance for accumulating text is very good, as we will compare later.
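
The initial buffer size can be observed through the public capacity() method. This is a small sketch (the class name CapacityDemo is illustrative); it relies only on the documented behavior that a StringBuffer built from a string starts with a capacity of 16 plus the length of that string:

```java
class CapacityDemo {
    // Capacity right after construction: "abc".length() + 16.
    static int initialCapacity() {
        return new StringBuffer("abc").capacity();
    }

    // Content of the same buffer after appending in place.
    static String contentAfterAppend() {
        StringBuffer sb = new StringBuffer("abc");
        sb.append("abc"); // modifies the buffer itself, no new StringBuffer is created
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity());    // 19
        System.out.println(contentAfterAppend()); // abcabc
    }
}
```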

In summary, when we discuss whether String and StringBuffer are mutable, we are really asking whether the value[] character array inside the object can change, not whether the object reference can change.

The thread safety of StringBuffer and StringBuilder

StringBuffer and StringBuilder can be regarded as twins; their methods differ very little. In terms of thread safety, however, StringBuffer can safely be used by multiple threads at once, because many of its methods are declared with the synchronized keyword in the source code, whereas StringBuilder's are not.

Programmers with multi-threaded programming experience should know synchronized. This keyword is set for the thread synchronization mechanism. Let me briefly explain the meaning of synchronized:

Every object has an associated lock. When thread A calls a synchronized method M on object O, it must first acquire O's lock before it can execute M; otherwise thread A blocks. Once thread A starts executing M, it holds O's lock exclusively, so other threads that need to call a synchronized method on O are blocked. Only after thread A finishes and releases the lock do the blocked threads get a chance to call M. This is the locking mechanism that solves the thread synchronization problem.

Having understood synchronized, you may sense that StringBuffer is much safer than StringBuilder in multi-threaded programming, and that is indeed the case. If multiple threads need to operate on the same string buffer, StringBuffer should be the choice.
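
A minimal sketch of this, assuming a hypothetical helper appendConcurrently (the class name, thread count and iteration count are illustrative): several threads append to one shared StringBuffer, and because append() is synchronized, no append is lost. The same experiment with StringBuilder is deliberately left out, since unsynchronized concurrent appends may lose characters or throw an exception.

```java
class SynchronizedAppendDemo {
    // Appends one character at a time from several threads and returns the final length.
    static int appendConcurrently(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < appendsPerThread; j++) {
                    shared.append('x'); // synchronized: one thread at a time per buffer
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10000)); // 40000
    }
}
```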

Note: Is String also unsafe? In fact, this problem does not exist, String is immutable. The thread can only read a String object specified in the heap, but cannot modify it. Let me ask: What else is unsafe?

The efficiency of String and StringBuffer

StringBuffer and StringBuilder can be described as twins, StringBuilder is newly introduced in 1.5, and its predecessor is StringBuffer. StringBuilder is slightly more efficient than StringBuffer. If thread safety is not considered, StringBuilder should be the first choice. In addition, the main time consumption of the JVM running program is in creating and reclaiming objects.

Comparison of "+" operation between String constant and String variable

▲ Test ① code:
(test code position 1)
String str="";
(test code position 2)
str="Heart"+"Raid";
[Elapsed: 0 ms]

▲ Test ② code:
(test code position 1)
String s1="Heart";
String s2="Raid";
String str="";
(test code position 2)
str=s1+s2;
[Elapsed: 15 to 16 ms]

Conclusion: "+" concatenation of String constants is faster than "+" concatenation of String variables.

Reason: In test ①, "Heart" + "Raid" has already been joined at compile time into the string constant "HeartRaid", which refers to an interned string object in the heap. At run time, the loop only has to fetch the address of that interned string object 10,000 times and store it in the local variable str, which takes very little time.

Comparison of the "cumulative +" connection operation of the String object and the append() cumulative connection operation of the StringBuffer object

▲ Test ① code:
(code position 1)
String s1="Heart";
String s="";
(code position 2)  s=s+s1;
[Elapsed: 4200 to 4500 ms]

▲ Test ② code:
(code position 1)  String s1="Heart";
StringBuffer sb=new StringBuffer();
(code position 2) sb.append(s1);
[Elapsed: 0 ms (with 100,000 iterations, about 16 to 31 ms)]

Conclusion: When accumulating a large number of strings, StringBuffer's append() is far more efficient than cumulative "+" concatenation of String objects.

Reason: In test ①, for s=s+s1 the JVM first creates a StringBuilder and uses its append method to merge the values of the string objects that s and s1 point to. It then calls the StringBuilder's toString() method to create a new String object in the heap holding the merged result, and the local variable s is pointed at that new String object.

Because the value[] in a String object cannot be changed, a new String object must be created to hold the string after every merge. Looping 10,000 times therefore creates 10,000 String objects and 10,000 StringBuilder objects, so the poor efficiency is no surprise.

In test ②, sb.append(s1); only needs to keep growing the buffer's own value[] array to hold s1. No new String or StringBuilder objects are created in the heap during the loop, so the high efficiency is not surprising.
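
The two tests can be written out as a runnable sketch along these lines. The class name ConcatLoopDemo, the helper methods and the loop harness are assumptions for illustration, since the original timing harness is not shown; absolute timings depend on the machine and JVM, so the sketch prints them without claiming particular values:

```java
class ConcatLoopDemo {
    // Test 1 style: cumulative "+" creates a new String on every iteration.
    static String withPlus(String piece, int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + piece;
        }
        return s;
    }

    // Test 2 style: append() keeps writing into the same buffer.
    static String withBuffer(String piece, int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 10000;
        long t0 = System.nanoTime();
        withPlus("Heart", n);
        long t1 = System.nanoTime();
        withBuffer("Heart", n);
        long t2 = System.nanoTime();
        System.out.println("+ loop:      " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("append loop: " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both methods produce the same string; only the number of intermediate objects differs.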

Summary:

(1) For string constants that can be determined at the compile stage, there is no need to create String or StringBuffer objects. The "+" connection operation using string constants is the most efficient.

(2) The append efficiency of the StringBuffer object is higher than the "+" connection operation of the String object.

(3) Constantly creating objects is a major cause of program inefficiency. So can a given string value be made to correspond to only one String object in the heap? Interned strings do exactly this. Besides the string constants in the program, for which the JVM creates interned strings automatically, the intern() method of String can be used as well. When intern() is called, if the current String value is already in the constant pool, the address of the existing interned object is returned. If not, the String value is added to the constant pool and a new interned string object is created.

Scenario-based interview questions

Compile-time replacement

String a = "hello2";  
String b = "hello" + 2; 
System.out.println((a == b));

  The output result is: true. The reason is simple, "hello"+2 has been optimized to "hello2" during compilation, so during runtime, variable a and variable b point to the same object.

Symbolic reference

 String a = "hello2";  
 String b = "hello";  
 String c = b + 2;   
 System.out.println((a == c));

The output result is: false. Because b is referred to through a symbolic reference, String c = b + 2; is not optimized at compile time and b+2 is not treated as a literal, so the object generated this way is actually created on the heap at run time. Therefore, a and c do not point to the same object. This can be confirmed with javap -c.

final modification

String a = "hello2";  
final String b = "hello";    
String c = b + 2;     
System.out.println((a == c));

The output result is: true. For a variable modified by final, a copy of its value is saved in the class file constant pool, so the variable is not accessed through a reference; instead, each use of the final variable is replaced with its actual value at compile time. String c = b + 2; is therefore optimized at compile time into String c = "hello" + 2;. This can be confirmed with javap -c.

Method call

public class Main {
    public static void main(String[] args) {
        String a = "hello2";
        final String b = getHello();
        String c = b + 2;
        System.out.println((a == c));
    }
     
    public static String getHello() {
        return "hello";
    }
}

The output result is false. Although b is modified with final, since its assignment is returned by a method call, its value can only be determined during runtime, so a and c are not pointing to the same object.

String.intern()

public class Main {
    public static void main(String[] args) {
        String a = "hello";
        String b =  new String("hello");
        String c =  new String("hello");
        String d = b.intern();
         
        System.out.println(a==b);
        System.out.println(b==c);
        System.out.println(b==d);
        System.out.println(a==d);
    }
}

Result: false, false, false, true

This involves the String.intern method. In the String class, intern is a native method. Before Java SE 6, intern looked in the runtime constant pool for a string with the same content; if one existed it returned a reference to it, and if not it added the string to the pool and returned a reference to that. Therefore, a and d point to the same object.

String str = new String("abc") How many objects are created?

First we must be clear about what "creating an object" means and when the creation happens. Does this code create 2 objects at run time? It does not. The bytecode executed by the JVM can be obtained by disassembling with javap -c, and in it new is called only once, which means only one object is created.

This is where the question is misleading. At run time this code really does create only one object, namely an "abc" object on the heap. So why does everyone say two objects? The point to clarify is that executing this code and loading the class are different processes. During class loading, an "abc" object is indeed created in the runtime constant pool, but during execution of the code only one String object is created.

Therefore, if the question is rephrased as "how many String objects does String str = new String("abc") involve?", the reasonable answer is two.

Personally, I think that if you meet this question in an interview, you should ask the interviewer whether they mean "how many objects are created while this code executes" or "how many objects are involved", and then answer accordingly.

What is the difference between 1) and 2) of the following code?

public class Main {
    public static void main(String[] args) {
        String str1 = "I";
        //str1 += "love"+"java";        1)
        str1 = str1+"love"+"java";      //2)
    }
}

The efficiency of 1) is higher than that of 2). The "love"+"java" in 1) will be optimized to "lovejava" during compilation, while the one in 2) will not be optimized. The following is the bytecode in two ways:

1) Bytecode:

2) Bytecode

It can be seen that only one append operation was performed in 1), and two append operations were performed in 2).

String.concat()

String s1 = "a";  
String s2 = s1.concat("");  
String s3 = null;  
new String(s1);  

This code involves 3 variables of type String:

1. s1, which points to String instance 1 below

2. s2, which points to the same instance as s1

3. s3, whose value is null and which points to no instance

And 3 String instances:

1. The String instance of the interned string constant for the "a" literal

2. The String instance of the interned string constant for the "" literal (String.concat() is an interesting method: when the argument passed in is an empty string, it returns this, so no additional String instance is created here)

3. A new String instance created by new String(String); no variable points to it.
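
The behavior of concat("") and new String(String) described in the list above can be checked with a short sketch (the class name ConcatEmptyDemo and its methods are illustrative):

```java
class ConcatEmptyDemo {
    // concat("") returns the receiver itself, so both variables refer to one instance.
    static boolean emptyConcatReturnsSameInstance() {
        String s1 = "a";
        String s2 = s1.concat("");
        return s1 == s2;
    }

    // new String(s1) always yields a separate instance with equal content.
    static boolean copyIsDistinctButEqual() {
        String s1 = "a";
        String copy = new String(s1);
        return copy != s1 && copy.equals(s1);
    }

    public static void main(String[] args) {
        System.out.println(emptyConcatReturnsSameInstance()); // true
        System.out.println(copyIsDistinctButEqual());         // true
    }
}
```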

Several String objects

String s1 = new String("xyz");  
String s2 = new String("xyz");  

Only two String instances will be created every time it is executed.

In the Java language, a "new" expression is responsible for creating an instance, and the constructor is called to initialize it. The constructor itself has a void return type; it does not "return a reference to the newly created object". It is the value of the new expression that is a reference to the newly created object.

Correspondingly, in the JVM, the "new" bytecode instruction is only responsible for creating the instance (allocating space, setting the type, setting all fields to their default values, and so on) and pushing a reference to the new object onto the top of the operand stack. At this point the reference is in an uninitialized state and cannot be used directly. If a method contains code that tries to call an instance method through an uninitialized reference, that method fails JVM bytecode verification and the JVM refuses to execute it.

The only thing that can be done with an uninitialized reference is to call the instance constructor through it, which at the class file level appears as the special initialization method "<init>". The actual call instruction is invokespecial, and the required arguments must be pushed onto the operand stack in order before the call. In the bytecode example above, the instructions that push the arguments are dup and ldc: dup pushes the hidden argument (the reference to the newly created instance, which is "this" for the instance constructor) and ldc pushes the first explicitly declared argument (a reference to the "xyz" constant).

After the constructor returns, the reference of the newly created instance can be used normally.

JVM constant pool understanding

The JVM runtime data area consists of five parts: [1] method area [2] heap [3] Java stack [4] PC register [5] native method stack

For String s = "haha"

Its virtual machine instructions:

0:   ldc     #16; //String haha   
2:   astore_1
3:   return

Ldc command format: ldc, index

ldc instruction process:

To execute the ldc instruction, the JVM first looks up the constant pool entry specified by the index. That entry must be a CONSTANT_Integer_info, CONSTANT_Float_info, or CONSTANT_String_info entry. If the entry has not yet been resolved, the JVM resolves it. For "haha" above, the JVM finds a CONSTANT_String_info entry and pushes the reference to the interned String object (produced by resolving the entry) onto the operand stack.

astore_1 instruction format: astore_1

astore_1 instruction process:

To execute the astore_1 instruction, JVM pops a reference type or returnAddress type value from the top of the operand stack, and then stores the value in the local variable specified by index 1, that is, stores the reference type or returnAddress type value in local variable 1.

The process of the return instruction:

Return from the method, the return value is void.

Talk about my personal understanding:

From the execution of the ldc instruction above, we can conclude that the value of s is a copy of the reference to the interned String object (produced by resolving the entry). So my personal understanding is that the value of s is stored on the stack. That covers the value of s; next comes the value "haha". For String s = "haha", the value "haha" is determined when the Java program is compiled. Put simply, it is written into the class file at compile time (you can open the class file with the UE (UltraEdit) editor or another text editing tool and see haha in the bytecode). When a Java program is run, the class file is generated first and then loaded into memory and executed by the JVM. So when the JVM loads this class, how does it allocate space for the value haha, and in which area is it stored?

Constant pool

The virtual machine must maintain a constant pool for each loaded type. The constant pool is an ordered set of the constants used by the type, including direct constants (string, integer, and floating-point constants) and symbolic references to other types, fields, and methods. The value of a String constant lives in the constant pool. In memory, the JVM's constant pool takes the form of tables; for the String type there is a fixed-length CONSTANT_String_info table that stores literal string values. Note that this table stores only literal string values, not symbolic references. This should give a clearer picture of where string values are stored in the constant pool.

Having introduced the JVM constant pool, let us return to the question of where the value "haha" is placed in memory. By the time the class file has been loaded and the execution engine runs the ldc instruction, the JVM has already allocated space for the string haha in the CONSTANT_String_info table of the constant pool. According to the book "Deep into the JAVA Virtual Machine", the constant pool is part of the type information, and the type information of every loaded type is kept in the method area of the JVM memory model. The constant pool therefore resides in the method area, and the method area is in turn allocated by the JVM from the heap. Therefore, the value haha should be stored in heap space.

For String s = new String("haha")

Its JVM instructions:

0:   new     #16; //class String
3:   dup
4:   ldc     #18; //String haha
6:   invokespecial   #20; //Method java/lang/String."<init>":(Ljava/lang/String;)V
9:   astore_1
10:  return

New command format: new indexbyte1, indexbyte2

New instruction process:

To execute the new instruction, the JVM computes (indexbyte1<<8)|indexbyte2 to form an unsigned 16-bit index into the constant pool and looks up the entry at that index, which must be a CONSTANT_Class_info entry. If the entry has not yet been resolved, the JVM resolves it; it must refer to a class. The JVM then allocates enough space from the heap for the new object and sets the object's instance variables to their default values. Finally, the JVM pushes the reference objectref to the new object onto the operand stack.

dup instruction format: dup

Dup instruction process:

To execute the dup instruction, the JVM copies the one-word value at the top of the operand stack and pushes the copy onto the stack. The instruction can duplicate any one-word value at the top of the operand stack, but must never be used to copy half of a two-word value (long or double). In the example above, the reference objectref is duplicated, so the operand stack now holds two references.

Ldc command format: ldc, index

ldc instruction process:

To execute the ldc instruction, the JVM first looks up the constant pool entry specified by the index. That entry must be a CONSTANT_Integer_info, CONSTANT_Float_info, or CONSTANT_String_info entry. If the entry has not yet been resolved, the JVM resolves it. For "haha" above, the JVM finds a CONSTANT_String_info entry and pushes the reference to the interned String object (produced by resolving the entry) onto the operand stack.

Invokespecial instruction format: invokespecial, indexbyte1, indexbyte2

Invokespecial instruction process: Here the instruction is used to call the instance initialization method. The full description is lengthy; see "Deep into the JAVA Virtual Machine" for details. In the example above, the String constructor is called through one of the two references to initialize the object instance, and that reference is popped from the operand stack; the other, identical reference now points to the initialized object.

astore_1 instruction format: astore_1

astore_1 instruction process:

To execute the astore_1 instruction, JVM pops a reference type or returnAddress type value from the top of the operand stack, and then stores the value in the local variable specified by index 1, that is, stores the reference type or returnAddress type value in local variable 1.

The process of the return instruction:

Return from the method, the return value is void.


The six instructions above show that the haha in String s = new String("haha"); is stored in heap space, while s is stored on the stack. That is my analysis of where s and the value haha live in memory. So how many objects does the statement String s = new String("haha"); create? My understanding is that "haha" is itself an object in the constant pool, and when new String() executes at run time, a copy of that object is placed in the heap, with s holding the reference to the copy. So this statement creates 2 String objects.

Application scenarios

  • To work with a small amount of string data, use String;

  • To work with a large amount of data in a string buffer shared by multiple threads, use StringBuffer;

  • To work with a large amount of data in a string buffer used by a single thread, use StringBuilder.


Author: XHSF
link: https://juejin.cn/post/6934023966004609031
Source: Nuggets
 


Origin blog.csdn.net/m0_50180963/article/details/114239909