String (String, StringBuffer, StringBuilder) advanced analysis

Reprinted from https://segmentfault.com/a/1190000002683782

First, remember the key characteristic of each of the three:

  • String: a string constant (immutable)
  • StringBuffer: a string variable (thread safe)
  • StringBuilder: a string variable (not thread safe)

1. Definition

Looking at the API, you will find that String, StringBuffer, and StringBuilder all implement the CharSequence interface. In the JDK source discussed in this article, all three store their characters in a char array internally. Although all three deal with strings, they handle them differently.

  • String: immutable; its contents cannot be modified after it is created.
  • StringBuffer: a mutable sequence of characters. Like String, it stores an ordered sequence of characters (a char array) in memory. The difference is that the value of a StringBuffer object can change.
  • StringBuilder: essentially the same as StringBuffer, a mutable sequence of characters. The difference is that StringBuffer is thread-safe and StringBuilder is not.
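As a minimal sketch of the difference between immutable and mutable (the class and method names here are made up for illustration), the following shows that calling a "modifying" method on a String leaves the original untouched, while the same kind of call on a StringBuilder changes the object in place:

```java
class MutabilityDemo {
    // concat() returns a NEW String; the original object is never changed.
    static String concatLeavesOriginal() {
        String s = "abc";
        s.concat("def");          // result is discarded; s still refers to "abc"
        return s;
    }

    // append() modifies the StringBuilder itself.
    static String appendChangesBuilder() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("def");         // sb now holds "abcdef"
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatLeavesOriginal()); // abc
        System.out.println(appendChangesBuilder()); // abcdef
    }
}
```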

Usage scenarios

Scenarios for the String class: use String where the string does not change frequently, such as constant declarations and a small number of variable operations.

Scenarios for the StringBuffer class: if you frequently perform string operations (such as concatenation, replacement, and deletion) and run in a multi-threaded environment, consider StringBuffer, for example for XML parsing or HTTP parameter parsing and encapsulation.

Scenarios for the StringBuilder class: if you frequently perform string operations (such as concatenation, replacement, and deletion) and run in a single-threaded environment, consider StringBuilder, for example for SQL statement assembly or JSON encapsulation.
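The multi-threaded scenario can be sketched as follows. This is an illustrative example only (the class name, method name, and thread counts are assumptions, not from the original article): several threads append to one shared StringBuffer, and because its append method is synchronized, no append is lost.

```java
class SharedBufferDemo {
    // Appends one character per iteration from several threads and
    // returns the final length of the shared StringBuffer.
    static int appendFromThreads(int threads, int perThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.append('x'); // synchronized in StringBuffer
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 1000)); // 4000
    }
}
```

Replacing StringBuffer with StringBuilder in this sketch would remove that guarantee, since StringBuilder performs no synchronization.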

Analysis

In terms of performance, every modifying operation on a String produces a new String object, while StringBuilder and StringBuffer only grow a character array. String operations are therefore much slower than StringBuffer and StringBuilder operations when the content changes repeatedly.

Briefly, the main performance difference between String and StringBuffer is that String is an immutable object, so every change to a String is actually equivalent to generating a new String object and then pointing the reference at the new String object. Therefore, it is best not to use String for strings whose content changes frequently, because every object creation affects system performance. Especially when many unreferenced objects accumulate in memory, the JVM's GC will start working, and things will certainly become quite slow.

If you use the StringBuffer class, the result is different. Each operation acts on the StringBuffer object itself, instead of generating a new object and changing the object reference. So in general we recommend StringBuffer, especially if the string object changes frequently.
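A small sketch of the two approaches (names are illustrative). Both methods build the same text, but the first creates a new String on every iteration while the second keeps modifying one builder. No timings are shown, since they depend on the JVM and the input size.

```java
class ConcatLoopDemo {
    // Builds "0,1,2,...," with String +=; every iteration produces a new String.
    static String withString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";
        }
        return s;
    }

    // Builds the same text with a single StringBuilder that is modified in place.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withString(5));  // 0,1,2,3,4,
        System.out.println(withBuilder(5)); // 0,1,2,3,4,
    }
}
```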

In some special cases, concatenation of String objects is actually compiled into StringBuffer-style append calls, so String is no slower than StringBuffer in these cases. In particular, when the following string objects are created, String is much faster than StringBuffer:

  1. String S1 = "This is only a" + " simple" + " test";
  2. StringBuffer Sb = new StringBuffer("This is only a").append(" simple").append(" test");

You will be surprised to find that creating S1 is extremely fast, and StringBuffer has no speed advantage at all here. In fact, this is a trick of the compiler, which folds constant expressions at compile time. To the compiler, this

 
String S1 = "This is only a" + " simple" + " test";

is in fact:

 
String S1 = "This is only a simple test";

So of course it doesn't take much time. But note that if the strings come from other String objects, it is not as fast, for example:

 
String S2 = "This is only a";
String S3 = " simple";
String S4 = " test";
String S1 = S2 + S3 + S4;

In this case, the concatenation is performed at run time in the original way.
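The difference between the two cases can be observed with ==, as in this sketch (class and method names are illustrative). A concatenation of literals is a compile-time constant and refers to the same pooled object as the equivalent single literal. A concatenation of non-final variables is computed at run time and yields a new object.

```java
class ConstantFoldingDemo {
    // Literals only: the whole expression is a compile-time constant.
    static boolean literalsAreSameObject() {
        String s1 = "This is only a" + " simple" + " test";
        return s1 == "This is only a simple test";
    }

    // Non-final variables: the concatenation happens at run time.
    static boolean variablesAreSameObject() {
        String s2 = "This is only a";
        String s3 = " simple";
        String s4 = " test";
        String s1 = s2 + s3 + s4;
        return s1 == "This is only a simple test";
    }

    // The contents are equal in both cases.
    static boolean variablesHaveSameContent() {
        String s2 = "This is only a";
        String s3 = " simple";
        String s4 = " test";
        return (s2 + s3 + s4).equals("This is only a simple test");
    }

    public static void main(String[] args) {
        System.out.println(literalsAreSameObject());    // true
        System.out.println(variablesAreSameObject());   // false
        System.out.println(variablesHaveSameContent()); // true
    }
}
```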

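Also:

About equals and ==

 
  When == is used to compare two objects, it checks whether the two references point to the same block of memory.

Comparing two references to distinct objects outputs false.

Comparing two references to the same object outputs true.

A special case is two Strings created from the same literal, for which == outputs true. This is because of the string buffer pool: the program creates a string buffer pool when it runs.
When a string is created with String s1 = "xyz"; (not with new), the program first looks for an object with the same value in the string buffer pool.
With String s1 = "xyz";, "xyz" is put into the pool first, so when s2 is created from the same literal, the program finds the existing object with the same value
and makes s2 refer to the object "xyz" that s1 refers to.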
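The three cases above can be sketched like this (class and method names are illustrative, not taken from the original article):

```java
class StringPoolDemo {
    // Two objects created with new are distinct, even with equal content.
    static boolean twoNewObjects() {
        String a = new String("xyz");
        String b = new String("xyz");
        return a == b;
    }

    // Two references to one object compare equal with ==.
    static boolean sameReference() {
        String a = new String("xyz");
        String b = a;
        return a == b;
    }

    // Two identical literals refer to the same pooled object.
    static boolean twoLiterals() {
        String s1 = "xyz";
        String s2 = "xyz";
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println(twoNewObjects()); // false
        System.out.println(sameReference()); // true
        System.out.println(twoLiterals());   // true
    }
}
```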

 
 equals()

equals() is a method of Object, and by default it does the same as ==: it compares addresses.
But when equals is overridden, it compares object values by design, and this is the behavior Java intends it to have. The String class overrides this method.

For two distinct String objects with the same content, equals() returns true.
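A short sketch contrasting the default behavior with String's override. The nested class Box is hypothetical and exists only to show a class that inherits Object.equals() unchanged.

```java
class EqualsDemo {
    // Hypothetical class that does NOT override equals().
    static class Box {
        final int value;
        Box(int value) { this.value = value; }
    }

    // Inherited Object.equals() compares references, like ==.
    static boolean defaultEquals() {
        return new Box(1).equals(new Box(1));
    }

    // String.equals() compares content.
    static boolean stringEquals() {
        return new String("hello").equals(new String("hello"));
    }

    public static void main(String[] args) {
        System.out.println(defaultEquals()); // false
        System.out.println(stringEquals());  // true
    }
}
```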

In general, String has one notable characteristic: hashCode() for String is based on its content. If a program contains several String objects that all hold the same character sequence, hashCode() returns the same result for all of them. The two instances produced by two calls to new String("hello") are independent objects, yet they have the same hash code. Note: this holds for Strings, not for arrays, whose hashCode() is not based on content.

 
public class StringHashCode {
    public static void main(String[] args) {
        // Both outputs are the same
        String[] hellos = "Hello Hello".split(" ");
        System.out.println("" + hellos[0].hashCode());
        System.out.println("" + hellos[1].hashCode());
        // Both outputs are the same
        String a = new String("hello");
        String b = new String("hello");
        System.out.println("" + a.hashCode());
        System.out.println("" + b.hashCode());
    }
}

Conclusion

The String class is final and cannot be inherited. The best way to reuse String types is composition rather than inheritance.
String has a length() method; an array has a length field.

String s = new String("xyz"); How many String objects are created?
Two objects: one for the literal "xyz" in the string pool (created only if it is not already there), and one created on the heap with new.
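The two objects can be told apart with ==, as in this sketch (names are illustrative). It relies on String.intern(), which returns the pooled object for a given content.

```java
class NewStringDemo {
    // The object created with new is not the pooled literal.
    static boolean heapObjectIsPooledLiteral() {
        String s = new String("xyz");
        return s == "xyz";
    }

    // intern() returns the pooled object with the same content.
    static boolean internedIsPooledLiteral() {
        String s = new String("xyz");
        return s.intern() == "xyz";
    }

    public static void main(String[] args) {
        System.out.println(heapObjectIsPooledLiteral()); // false
        System.out.println(internedIsPooledLiteral());   // true
    }
}
```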

What is the difference between String, StringBuffer, and StringBuilder?

In most cases: StringBuffer > String

java.lang.StringBuffer is a thread-safe, mutable sequence of characters. A string buffer is like a String, but can be modified. At any point in time it contains some particular sequence of characters, but the length and content of that sequence can be changed through certain method calls. String buffers are safe for use by multiple threads. The methods are synchronized where necessary, so that all operations on any particular instance behave as if they occur in some serial order that is consistent with the order of the method calls made by each of the threads involved.

The main operations on StringBuffer are the append and insert methods, which are overloaded to accept data of any type. Each method effectively converts the given data into a string, and then appends or inserts the characters of that string into the string buffer. The append method always adds these characters to the end of the buffer; the insert method adds the characters at a specified point.

For example, if z refers to a string buffer object whose current content is "start", the method call z.append("le") will cause the string buffer to contain "startle", whereas z.insert(4, "le") will change the string buffer to contain "starlet".
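The "startle" and "starlet" example can be run directly. The sketch below uses a fresh buffer for each call so the two operations do not affect each other (class and method names are illustrative).

```java
class AppendInsertDemo {
    // append adds the characters at the end of the buffer.
    static String appendLe() {
        StringBuffer z = new StringBuffer("start");
        z.append("le");
        return z.toString();
    }

    // insert adds the characters at the given offset (here before index 4).
    static String insertLe() {
        StringBuffer z = new StringBuffer("start");
        z.insert(4, "le");
        return z.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendLe()); // startle
        System.out.println(insertLe()); // starlet
    }
}
```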

In most cases: StringBuilder > StringBuffer

java.lang.StringBuilder is a mutable sequence of characters, new in Java 5.0. This class provides an API compatible with StringBuffer, but with no guarantee of synchronization, so it is intended for single-threaded use. It is designed as a drop-in replacement for StringBuffer in places where the string buffer is used by a single thread (as is generally the case). Where possible, prefer this class, as it is faster than StringBuffer under most implementations. Both are used in basically the same way.


Source code

String, StringBuffer, StringBuilder all implement the CharSequence interface.

 
public interface CharSequence
{
    int length();

    // return the char value at the specified index
    char charAt(int index);

    // return a new CharSequence that is a subsequence
    // of this sequence.
    CharSequence subSequence(int start, int end);

    public String toString();
}
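Because all three classes implement CharSequence, a method written against the interface accepts any of them. A minimal sketch, where countChar is a hypothetical helper and not part of the JDK:

```java
class CharSequenceDemo {
    // Counts occurrences of c using only the CharSequence methods.
    static int countChar(CharSequence seq, char c) {
        int count = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (seq.charAt(i) == c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChar("banana", 'a'));                    // 3
        System.out.println(countChar(new StringBuffer("banana"), 'a'));  // 3
        System.out.println(countChar(new StringBuilder("banana"), 'a')); // 3
    }
}
```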

The source code of String

 
public final class String {
    private final char value[]; // used for character storage
    private int hash;           // cache the hash code for the string
}

There are only two member variables: a final array of type char, and a hash code of type int.

Constructor

 
public String()
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
public String(char value[]) {
    this.value = Arrays.copyOf(value, value.length);
}
public String(char value[], int offset, int count) {
    // after checking whether offset, count, and offset+count are out of bounds
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}

Some utility functions are used here:
copyOf(source[], length) copies length elements starting from position 0 of the source array;
this function is implemented with System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength)).

copyOfRange(T[] original, int from, int to) copies the elements from index from (inclusive) to index to (exclusive).
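A brief sketch of how these two utility methods behave on a char array (class and method names are illustrative):

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Copies the first newLength chars; pads with '\u0000' if newLength is larger.
    static char[] prefix(String s, int newLength) {
        return Arrays.copyOf(s.toCharArray(), newLength);
    }

    // Copies the chars in [from, to).
    static String range(String s, int from, int to) {
        return new String(Arrays.copyOfRange(s.toCharArray(), from, to));
    }

    public static void main(String[] args) {
        System.out.println(new String(prefix("hello", 3))); // hel
        System.out.println(prefix("hello", 7).length);      // 7
        System.out.println(range("hello", 1, 4));           // ell
    }
}
```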

A String can also be constructed from a StringBuffer or a StringBuilder:

 
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

In addition to the constructors, the String class has many methods.
length and isEmpty can be implemented by reading value.length.
charAt(int index):
Obtained by reading the value array. Note that the bounds of the index are checked first.

 
    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

getChars method

 
public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    // bounds checks
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}
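As a usage sketch of getChars (the helper copyRange is hypothetical), the characters from srcBegin up to but not including srcEnd are copied into the destination array:

```java
class GetCharsDemo {
    // Copies src[begin, end) into a new array via String.getChars.
    static String copyRange(String src, int begin, int end) {
        char[] dst = new char[end - begin];
        src.getChars(begin, end, dst, 0);
        return new String(dst);
    }

    public static void main(String[] args) {
        System.out.println(copyRange("hello world", 6, 11)); // world
    }
}
```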

The equals method redefines equals according to semantic equality (equal content, not pointing to the same memory)

 
   public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

If both sides of the comparison point to the same block of memory, they are naturally equal (comparing with == is sufficient);
if the contents are equal, they are also equal. The comparison works as follows:
first, anObject must be of type String (checked with the keyword instanceof);
then the lengths are compared;
if the lengths are equal, the elements are compared one by one, and true is returned if every pair is equal.

There is also a thread-safe comparison with the contents of a StringBuffer,
contentEquals(StringBuffer sb), whose implementation synchronizes on sb.

compareTo():
If A is greater than B, it returns a number greater than 0;
if A is less than B, it returns a number less than 0;
if A equals B, it returns 0.

 
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }
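A few sample calls illustrate the two return paths: a difference at the first mismatching character, or a difference in length when one string is a prefix of the other. The wrapper name sign is an illustrative helper that reduces the result to -1, 0, or 1.

```java
class CompareToDemo {
    // Reduces compareTo's result to its sign.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        System.out.println(sign("apple", "banana")); // -1 ('a' < 'b')
        System.out.println(sign("b", "a"));          // 1
        System.out.println(sign("abc", "abcd"));     // -1 (prefix, shorter)
        System.out.println(sign("abc", "abc"));      // 0
    }
}
```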

regionMatches: tests whether the given regions of the two strings are equal.

 public boolean regionMatches(int toffset, String other, int ooffset,
            int len)
   {
        // check the boundary conditions
        while (len-- > 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
   }
public boolean regionMatches(boolean ignoreCase, int toffset,
            String other, int ooffset, int len)
{   
    while (len-- > 0) {
            char c1 = ta[to++];
            char c2 = pa[po++];
            if (c1 == c2) {
                continue;
            }
            if (ignoreCase) {
                // If characters don't match but case may be ignored,
                // try converting both characters to uppercase.
                // If the results match, then the comparison scan should
                // continue.
                char u1 = Character.toUpperCase(c1);
                char u2 = Character.toUpperCase(c2);
                if (u1 == u2) {
                    continue;
                }
                // Unfortunately, conversion to uppercase does not work properly
                // for the Georgian alphabet, which has strange rules about case
                // conversion.  So we need to make one last check before
                // exiting.
                if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                    continue;
                }
            }
            return false;
        }
        return true;
}
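A usage sketch for both overloads (the wrapper names are illustrative). The region starting at index 6 of "Hello World" is compared with the first five characters of the other string:

```java
class RegionMatchesDemo {
    // Case-sensitive comparison of "Hello World"[6..11) with other[0..5).
    static boolean matchesExact(String other) {
        return "Hello World".regionMatches(6, other, 0, 5);
    }

    // Same comparison, ignoring case.
    static boolean matchesIgnoreCase(String other) {
        return "Hello World".regionMatches(true, 6, other, 0, 5);
    }

    public static void main(String[] args) {
        System.out.println(matchesExact("World"));      // true
        System.out.println(matchesExact("world"));      // false
        System.out.println(matchesIgnoreCase("world")); // true
    }
}
```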

 

startsWith(String prefix, int toffset)
startsWith(String prefix)
endsWith(String suffix)

{
    return startsWith(suffix, value.length - suffix.value.length);
}

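substring(int beginIndex, int endIndex)
After the bounds checks, the method returns either this or a new String built from the requested range. The one-argument overload substring(int beginIndex), for example, ends with:

 return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);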

String concatenation: concat(String str)

        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
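The early return for an empty argument is observable from outside: concatenating the empty string gives back the very same object. A small sketch (names are illustrative):

```java
class ConcatDemo {
    // concat("") returns this, so the reference is unchanged.
    static boolean emptyConcatReturnsSameObject() {
        String s = new String("abc");
        return s.concat("") == s;
    }

    // A non-empty argument produces a new String with the combined content.
    static String chained() {
        return "to".concat("get").concat("her");
    }

    public static void main(String[] args) {
        System.out.println(emptyConcatReturnsSameObject()); // true
        System.out.println(chained());                      // together
    }
}
```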

For StringBuffer and StringBuilder
StringBuffer and StringBuilder both inherit from AbstractStringBuilder, and the underlying logic (such as append) is contained in that class.

 public AbstractStringBuilder append(String str) {
        if (str == null) str = "null";
        int len = str.length();
        ensureCapacityInternal(count + len); // check whether the capacity is sufficient; expand it if not
        str.getChars(0, len, value, count);  // getChars uses the native array copy, which is efficient
        count += len;
        return this;
    }

The underlying storage of StringBuffer is also a char[], and the size of an array is fixed when it is created. What happens if repeated appends exceed the size of the array? Defining one very large array up front would waste space. Instead, like ArrayList, it uses dynamic expansion: each append first checks the capacity, and if the capacity is insufficient, it first allocates a larger array and then copies the contents of the original array into it.
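The expansion can be observed through capacity(). The sketch below (names are illustrative) relies only on the documented default initial capacity of 16 characters. The exact capacity after growth depends on the JDK implementation, so the code does not assume a specific value.

```java
class CapacityDemo {
    // A StringBuilder created with the no-argument constructor starts at 16.
    static int initialCapacity() {
        return new StringBuilder().capacity();
    }

    // Appends the given number of chars and reports the resulting capacity.
    static int capacityAfterAppending(int chars) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < chars; i++) {
            sb.append('x');
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity());          // 16
        System.out.println(capacityAfterAppending(10)); // 16, no expansion needed
        // 17 chars no longer fit, so the internal array has been expanded.
        System.out.println(capacityAfterAppending(17) > 16); // true
    }
}
```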
