Study notes on the String class (part 2): string immutability and the string constant pool

This article covers the immutability of String and the string constant pool: how immutability is enforced, how strings are "modified", why strings are designed to be immutable, how string constants are created, how the string constant pool works, and how to add a string to the pool manually.

1. The immutability of strings

A String object in Java is immutable: the content of the string cannot be changed after the object is created.

The documentation comment in the String source code says the same thing: all string literals in Java programs are implemented as instances of this class, and strings are constants whose values cannot be changed after they are created.

1. How to ensure that the string is immutable

A statement in the documentation is not enough to stop a string from being modified. So how does the String class make sure that the content of a created object cannot be changed?

The String class encapsulates two member variables, value and hash.
The string represented by a String object is stored in a character array, so creating a String object also involves a character array object. Each character of the string is stored in that array, and the value field holds the reference to it.
In other words, the value array stores the string content.

The hash field caches the hash code of the string. Its default value is 0. It is covered in more detail in the string constant pool section below.

value is declared private and final, while hash is declared only private. Because value is final, it must be initialized when the String object is instantiated, so it always points to a character array.

Because value is final, it behaves like a constant after instantiation: the reference can no longer be pointed at a different array object. The content of the array that value points to, however, could still be modified.
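
The difference between a final reference and the content it refers to can be seen in a small sketch. The class and field names are made up for illustration; this is not code from the String class.

```java
class FinalArrayDemo {
    // final fixes the reference, not the elements of the array
    private static final char[] VALUE = {'a', 'b', 'c'};

    static String mutateFirst() {
        VALUE[0] = 'x';           // allowed: the array content is changed
        // VALUE = new char[3];   // would not compile: a final reference cannot be reassigned
        return new String(VALUE);
    }

    public static void main(String[] args) {
        System.out.println(mutateFirst()); // xbc
    }
}
```

This is why final alone is not enough for immutability: the array behind the reference also has to be kept out of reach.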

hash is not final, so it can be assigned later.

Both fields are private, so they can only be accessed inside the String class. Outside the class they are reachable only through the member methods that String exposes, and none of those methods modifies the content of value; hash is only ever set to the cached hash code.
Therefore, by design, outside code cannot access or modify the contents of the array that value points to, which means the content of a String cannot be modified.

In short: the content of a string lives in a character array referenced by value.
The private modifier keeps outside code from reaching that array, and none of the methods that String exposes modifies the content of the array.
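
The same design can be sketched with a small class of our own. ImmutableText is a hypothetical name and only imitates the idea (a private final array, a defensive copy, no mutating methods); it is not how String itself is written.

```java
final class ImmutableText {
    private final char[] value;   // reachable only inside this class

    ImmutableText(char[] source) {
        // copy the array so the caller cannot change our content through its own reference
        this.value = source.clone();
    }

    char charAt(int index) {
        return value[index];
    }

    int length() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value);
    }

    public static void main(String[] args) {
        char[] source = {'a', 'b', 'c'};
        ImmutableText text = new ImmutableText(source);
        source[0] = 'x';              // changes only the caller's array
        System.out.println(text);     // abc
    }
}
```

No method hands out the array or writes to it, so once the constructor has finished the content can no longer change.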

2. Modify the string

As mentioned above, strings are String instances whose content is held by value and cannot be modified. So how do the commonly used string manipulation methods "modify" a string?

If you read the source code of any of the commonly used methods, you will find that whenever an operation would change the string, it creates a new String object with the modified content and returns it. Nothing changes in the original string.

Every modification of a String object is carried out on a newly created String object, and the address of that new object is returned.

When an operation leaves the content unchanged, the address of the original string object is generally returned.
For example, if every character in a string is already uppercase, toUpperCase has nothing to change and returns the original string.
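
A short sketch of this behaviour, using only standard String methods. The class name is made up, and Locale.ROOT is passed so that the result does not depend on the default locale.

```java
import java.util.Locale;

class StringModificationDemo {
    public static void main(String[] args) {
        String original = "hello";
        String upper = original.toUpperCase(Locale.ROOT);
        String longer = original.concat(" world");

        System.out.println(original);          // hello (unchanged)
        System.out.println(upper);             // HELLO
        System.out.println(longer);            // hello world
        System.out.println(original == upper); // false: a new object was returned

        String shouting = "HELLO";
        // Nothing needs to change here. Whether the very same object comes back is an
        // implementation detail, so only equality of content is checked.
        System.out.println(shouting.equals(shouting.toUpperCase(Locale.ROOT))); // true
    }
}
```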

3. Why should the string be set immutable

When String objects are used to hold constant strings, the string objects are placed in the string constant pool.
The string constant pool can be regarded as a shared area of string resources.

Once a string object is stored in the string constant pool, the next use of the same string can take it directly from the pool, which saves the time and space that creating a new string object would cost.

If the content of a string could be modified, the modified string might belong at a different position in the string constant pool. Leaving it where it is would allow duplicate strings to be stored later. Moving it would mean relocating the object every time its content is modified, which is complicated and reduces performance.

The hash field caches the hash code computed from the string content. If the content could change, the cached value would become stale and the hash would have to be recalculated after every modification.

With multiple threads, a modifiable string could be changed by several threads at the same time, which is not thread-safe.

Therefore, strings are designed so that their content cannot be modified:

  1. It makes the string object pool easy to implement. If String were mutable, the object pool would have to deal with copy-on-write.
  2. Immutable objects are thread-safe.
  3. Immutable objects make it easy to cache the hash code, so strings can be stored and looked up efficiently as keys.
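
The third point can be illustrated by what goes wrong with a mutable key. MutableKey below is a hypothetical class written only for this sketch; it stands in for "a string whose content could change".

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeyDemo {
    /** Hypothetical mutable "string" used as a map key. */
    static final class MutableKey {
        String text;

        MutableKey(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).text.equals(text);
        }

        @Override
        public int hashCode() {
            return text.hashCode();
        }
    }

    /** Returns true if the entry can still be found after its key was mutated. */
    static boolean foundAfterMutation() {
        Map<MutableKey, Integer> map = new HashMap<>();
        MutableKey key = new MutableKey("a");
        map.put(key, 1);
        key.text = "b";   // the key's hash code no longer matches the one stored in the map
        return map.containsKey(key) || map.containsKey(new MutableKey("a"));
    }

    /** An immutable String key is always found again. */
    static boolean stringKeyFound() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        return map.containsKey("a");
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
        System.out.println(stringKeyFound());     // true
    }
}
```

After the mutation the entry is still in the map but can no longer be reached by either the old or the new content, which is the kind of problem immutable strings avoid.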

2. String constant pool

What is a pool?

A "pool" is a common and important way to improve efficiency in programming. Later on we will meet memory pools, thread pools, database connection pools and so on. As an analogy, consider two ways of paying for living expenses:

  1. Money is tight at home, so living expenses are paid once a month. The payment is sometimes late, and in the worst case you have to ask for it, which is slow.
  2. The family is wealthy, so a full year of living expenses is put on a bank card at once, and the money is available immediately.

Method 2 is an example of pooling: the money sits on the card and is withdrawn as needed, which is very efficient. Database connection pools and thread pools are common examples of the same technique.

To save storage space and improve the running efficiency of programs, Java introduces:

  1. Class file constant pool: after each .java source file is compiled, the .class file stores the literal constants and symbol information of the current class.
  2. Runtime constant pool: when a .class file is loaded, its constant pool is loaded into memory as the runtime constant pool. Each class has its own runtime constant pool.
  3. String constant pool.

Literal constants such as 1, 2, 3, 3.14 and "hello" are used very frequently in Java programs. To make programs run faster and save memory, Java provides constant pools for the 8 primitive data types and for the String class.
In the HotSpot JVM the string constant pool is implemented by the StringTable class. In the JDK versions discussed here it is a fixed-size hash table (a data structure designed for efficient lookup). The location of the string constant pool and its default size differ between JDK versions.

[Figure: location and default size of the string constant pool in different JDK versions]

1. Creation of string constant objects

Zero or more characters enclosed in "" form a string constant. This notation omits the new keyword but still produces a String object that can be assigned to a String reference, and the creation of a string constant goes through the string constant pool.
Look at the following code and think about how many String objects are created and what the code prints.

public static void main(String[] args) {
    String s1 = "hello";
    String s2 = "hello";
    String s3 = new String("hello");
    String s4 = new String("hello");
    System.out.println(s1 == s2);
    System.out.println(s2 == s3);
    System.out.println(s3 == s4);
}

Result: the code above creates three String objects, and the output is true false false.

The content of every string is hello, yet the comparisons give different results, because string constants go through the string constant pool. To understand the output of the code above, we need to look at the string constant pool in more detail.

2. Understand the string constant pool

As mentioned above, the string constant pool is a StringTable, that is, a hash table. For ease of understanding, it is described here as an array plus linked lists.

The string constant pool is an array in which each element is a linked list. Each node of a linked list has three fields: one stores the address of the string object, one stores the hash value of the string object, and one stores the address of the next node.

Before a string constant is created, the string constant pool is searched for a string object with equal content.

If one exists, the address of that string object in the pool is returned directly (the string object in the pool is reused).

If none exists, a String object is created for the string constant, and a new node is created that stores the address of the string object, its hash, and the address of the next node. The address of the new string object is then returned (the string constant has been put into the string constant pool).
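
The lookup-then-insert procedure can be sketched with a toy pool. ToyStringPool and everything in it are hypothetical and written only to illustrate the array-plus-linked-list idea; the sketch borrows String.hashCode for simplicity and is not how the JVM's StringTable is implemented.

```java
class ToyStringPool {
    /** One node of a bucket: the string, its hash, and the next node. */
    private static final class Node {
        final String string;
        final int hash;
        final Node next;

        Node(String string, int hash, Node next) {
            this.string = string;
            this.hash = hash;
            this.next = next;
        }
    }

    // fixed number of buckets; 16 is an arbitrary choice for the sketch
    private final Node[] buckets = new Node[16];

    /** Returns the pooled instance equal to s, adding s itself if none exists yet. */
    String intern(String s) {
        int hash = s.hashCode();   // assumption: a real JVM computes its own hash
        int index = (hash & 0x7fffffff) % buckets.length;
        for (Node n = buckets[index]; n != null; n = n.next) {
            // compare the cheap hash first, the content only when the hashes match
            if (n.hash == hash && n.string.equals(s)) {
                return n.string;   // found: reuse the pooled object
            }
        }
        buckets[index] = new Node(s, hash, buckets[index]);   // not found: add a node
        return s;
    }

    public static void main(String[] args) {
        ToyStringPool pool = new ToyStringPool();
        String a = new String("hello");
        String b = new String("hello");
        System.out.println(pool.intern(a) == a); // true: a was added to the pool
        System.out.println(pool.intern(b) == a); // true: the pooled object is returned
    }
}
```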

How is the position of a string constant in the string constant pool determined?

String literals are fixed at compile time: the compiler records each one in the constant pool of the class file. The JVM adds a literal to the string constant pool when the class first uses it at run time. To find the position, the JVM computes a hash from the content of the string. This happens inside the JVM and does not go through the hashCode method of the String object; the exact hash function is an implementation detail of the JVM.

Each string constant enclosed in "" is therefore mapped by its hash to a position in the string constant pool, that is, to an index of the array of linked lists. The linked list at that index is then searched for a node whose string object is equal to the string constant.

The String class also overrides the hashCode method of the Object class to compute the hash value of a string. This method is not what the string constant pool calls. It serves other hash-based code that needs the hash of a string object, such as HashMap.

public int hashCode() {
    // String source code
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

Source code analysis:

When the overridden hashCode method is called, it checks whether the cached hash is 0 and the length of the character array is greater than 0. If so, it walks the character array and folds each character into the result with h = 31 * h + val[i], then stores the result in the hash field.

If hash is not 0, hashCode has already been computed once. Because the string is immutable, there is no need to repeat the calculation, and the value cached in the hash field is used directly. For an empty string the hash value is 0, which is returned as is.

In this way, the hash computed from the characters of the string content is cached in the hash field.
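
To check the recurrence by hand, the following sketch recomputes the hash of "hello" with the same formula and compares it with String.hashCode. The class and method names are made up for illustration.

```java
class CachedHashDemo {
    /** Same recurrence as in String.hashCode: h = 31 * h + c for every character. */
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("hello"));                       // 99162322
        System.out.println(manualHash("hello") == "hello".hashCode()); // true
        System.out.println("".hashCode());                             // 0
    }
}
```

Step by step for "hello": 104, then 104 * 31 + 101 = 3325, then 103183, then 3198781, and finally 99162322.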

Note: the hash that the string constant pool uses is computed from the same string content as the value produced by the hashCode method, but the pool does not obtain it by calling hashCode. The hash field of the string object is therefore still 0 after the object has been created. Only when hashCode is called is the calculated hash stored in the hash field.

The node in the constant pool stores the hash of the string object it refers to, which makes finding and adding strings more efficient.

For example, when a string constant is inserted later, the hash values in the string constant pool are compared first. Only when the hashes are equal is the content compared.

3. Analyze the results of the above code operation with accompanying drawings

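public static void main(String[] args) {
    String s1 = "hello";                // first use: a String object is created in the pool
    String s2 = "hello";                // found in the pool: the same object is reused
    String s3 = new String("hello");    // new object outside the pool
    String s4 = new String("hello");    // another new object outside the pool
    System.out.println(s1 == s2);       // true
    System.out.println(s2 == s3);       // false
    System.out.println(s3 == s4);       // false
}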

The first "hello": its hash is computed, the corresponding index in the string constant pool is found, and the nodes there are checked for a string object whose content equals hello. Because this is the first use of "hello", no such object exists in the pool yet. A String object with the content hello is created, a node holding it and its hash value is added to the linked list at that index, and the address of the new string object is returned and assigned to s1.

The second "hello": its hash leads to the same position, where a string object with the content hello already exists. No new object is created. The address of the hello object in the string constant pool is returned and assigned to s2.

The third "hello": the literal yields the object created for the first "hello", in the same way as before. The new String() expression then creates a new String object and passes the address of that first object to its constructor. In the end the value reference of the new object points to the same array as the value of the first object, and its hash is also copied from the first object. This is what the corresponding String constructor does: it assigns original.value to this.value and original.hash to this.hash.

Why does this constructor reuse the original character array instead of creating a new array object?
Creating a new array object would cost time and space. Because strings are immutable and value cannot be reached from outside, sharing one character array is safe, covers every use of the string, and saves resources.

The fourth "hello" is handled like the third: a new String object is created, but its value reference points to the character array of the first string object, and its hash is also copied from the first string object.

It follows that s1 and s2 refer to the same hello object, while s3 and s4 refer to two other, distinct String objects whose value fields point to the same character array. In total, three String objects and one character array object are created.

Comparing two reference variables with == compares addresses, so the output is true false false.
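
Since == compares addresses, content has to be compared with equals. A minimal sketch of the difference (the class name is made up for illustration):

```java
class ReferenceVsContentDemo {
    public static void main(String[] args) {
        String s1 = "hello";
        String s3 = new String("hello");
        String s4 = new String("hello");

        System.out.println(s1 == s3);      // false: different objects
        System.out.println(s3 == s4);      // false: different objects
        System.out.println(s1.equals(s3)); // true: same content
        System.out.println(s3.equals(s4)); // true: same content
    }
}
```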

Note: whether a string object is created depends on whether an equal string is found in the string constant pool. The hash used for that lookup is computed by the JVM rather than obtained by calling hashCode.
The constant pool stores this hash, which is derived from the content of the string object, in the node. The hash field of the created string object is left unchanged and is still 0.

To summarize the code above: three String objects are created. s1 and s2 point to the same string object, s3 and s4 point to two different string objects, and the value of every object points to the same character array object.

4. Manually add the string to the string constant pool

Creating a String object from a constant string stores it in the string constant pool, which is efficient and saves space. A string object created in another way can also be added to the string constant pool by calling intern.

intern is a native method (a native method is implemented inside the JVM, in C++ for HotSpot, so its implementation does not appear in the Java source code). It adds the String object to the constant pool if the pool holds no equal string yet, and returns the reference held by the pool.

Consider the following code and compare the results when s1.intern() appears above the declaration of s2 (position 1) and below it (position 2).

public static void main(String[] args) {
    char[] ch = {'1', '2', '3'};
    String s1 = new String(ch);   // creates a string object whose internal value array is a copy of ch
    // s1.intern();               // position 1

    String s2 = "123";
    // s1.intern();               // position 2
    System.out.println(s1 == s2);
}

When s1.intern() is above String s2: a hash is computed for the string object that s1 points to and mapped to an index in the string constant pool. Because the pool does not contain "123" yet, the object that s1 points to is put into the pool. When String s2 = "123"; then runs, the pool already holds that object, so the object that s1 points to is returned and the result is true.

When s1.intern() is below String s2: the call tries to add the object that s1 points to, but the string "123" already exists in the constant pool. intern returns the string that s2 points to, and the object of s1 is not added to the pool. Because the return value is discarded, s1 still points to its own object, so the result is false.

If the statement is s1 = s1.intern(); instead, s1 ends up pointing to the pooled string object that s2 also points to, and the result is true.

Note: the implementation of intern differs slightly between Java 6 and Java 7/8. The results described above are those of Java 7 and later.
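
The two cases that do not depend on statement order can be sketched as follows. The class and method names are made up for illustration; the code only uses String.intern.

```java
class InternDemo {
    /** Keeps the reference returned by intern, so s1 ends up pointing at the pooled object. */
    static boolean keepReturnValue() {
        char[] ch = {'1', '2', '3'};
        String s1 = new String(ch);
        String s2 = "123";
        s1 = s1.intern();
        return s1 == s2;
    }

    /** Discards the reference returned by intern, so s1 still points at its own object. */
    static boolean discardReturnValue() {
        char[] ch = {'1', '2', '3'};
        String s1 = new String(ch);
        String s2 = "123";
        s1.intern();
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println(keepReturnValue());    // true
        System.out.println(discardReturnValue()); // false
    }
}
```

Assigning the result of intern back to the variable is the reliable way to obtain the pooled instance.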

3. Summary

This article introduced the immutability of the String class: how it is enforced (private encapsulation plus exposed methods that never modify the content), how strings are modified (a method that changes the content creates a new String object and applies the change there), and why strings are immutable (the content in the string constant pool never changes, the cached hash stays valid, and String is thread-safe).

It also introduced the string constant pool: why it exists (constant strings with equal content share one object, which saves time and space), how string constants are created, and how to add a string object to the pool manually with the intern method.


Origin blog.csdn.net/lch1552493370/article/details/130476342