Chapter 4: Source Code Analysis of Java's Basic Classes: String (Part 1)

Preface

In the real world, life is inseparable from electricity: it powers everything from everyday human activity to the machines that keep the world running. The Java world has its own electricity, found almost everywhere: the String class. Because it is used so heavily and involves so many details, the String class is covered in two parts. The first part walks through the source code; the second analyzes common difficulties.

Data structure

Among data structures there is one called the "string", which is backed by an array and supports operations such as modification and searching. The String class is the Java implementation of this data structure. It models the string's operations and provides methods for adding to, deleting from, and modifying strings. This article focuses on the String class and its source code. Once you have mastered this class you have mastered one of the most basic tools, and you will use it fluently and confidently in your future development work.

Overview map

Although this class has many methods, they can be grouped by behavior and ordered by priority, so that the analysis can focus on the common and important ones.
[Figures: overview of the String class methods]

Order of analysis in this chapter

The API above looks large, but once divided into modules it falls into just a few areas, as shown in the figure:
[Figure: the String API divided into modules]
This chapter focuses on three parts of the API: commonly used methods, key methods, and special methods. All methods are grouped by function into [Add], [Delete], [Check], [Change], [Split], [Convert], [Compare], and [Other].
I have now been a programmer for eight months. As business requirements became more detailed, and through project practice and my own reflection, I have come to feel that everything boils down to adding, deleting, querying, and changing. Looking into why, I found that these are simply the basic operations on data structures: whether the structure is a linear list, a tree, or a graph, the same operations appear. The String class is Java's implementation of the string, a linear structure, so its basic operations are likewise add, delete, query, and change. On top of these, the API is extended for different data types, return values, and parameters. The classification below reflects only my personal understanding; others may understand some of the APIs differently and divide them differently.

String class

(1) Add

concat(String str)

(2) Delete

trim()

(3) Check

Length:      length()      isEmpty()

Characters:  indexOf(int ch)          indexOf(int ch, int fromIndex)
             indexOf(String str)      indexOf(String str, int fromIndex)
             lastIndexOf(int ch)      lastIndexOf(int ch, int fromIndex)
             lastIndexOf(String str)  lastIndexOf(String str, int fromIndex)
             charAt(int index)

Hash value:  hashCode()

(4) Change

replace(char oldChar, char newChar)
replace(CharSequence target, CharSequence replacement)
replaceFirst(String regex, String replacement)
replaceAll(String regex, String replacement)

(5) Split

Split into an array:    split(String regex)         split(String regex, int limit)

Cut into a substring:   substring(int beginIndex)   subSequence(int beginIndex, int endIndex)
                        substring(int beginIndex, int endIndex)

(6) Conversion

To bytes:        getBytes()      getBytes(Charset charset)
                 getBytes(int srcBegin, int srcEnd, byte dst[], int dstBegin)

To characters:   toCharArray()

To string:       toString()

Case conversion: toUpperCase()    toUpperCase(Locale locale)
                 toLowerCase()    toLowerCase(Locale locale)

(7) Comparison

Whole-string comparison:   equals(Object anObject)      contentEquals(CharSequence cs)
                           compareTo(String anotherString)

Inner-class comparator:    the CaseInsensitiveComparator class

Whole-string or substring comparison:
                           startsWith(String prefix)   startsWith(String prefix, int toffset)
                           endsWith(String suffix)
                           contains(CharSequence s)
                           compareToIgnoreCase(String str)
                           equalsIgnoreCase(String anotherString)

(8) Other

static  join(CharSequence delimiter, CharSequence... elements)
native String intern()       // this native method is related to the constant pool
static  valueOf(Object o)    // the static valueOf methods perform the same operation for different data types

Before we start

These are roughly the key APIs, all of them used fairly often. The list looks long, but many methods are implemented in terms of one another, and some are very simple, so not many need real analysis. Besides the public API there are also constructors and private helper methods that support it; these are analyzed together with the API methods that use them.
The internal state of String is very simple. In the JDK 8 source shown here, it maintains a char array and a cached hash value, which defaults to 0:

 /** The value is used for character storage. */
private final char value[];

/** Cache the hash code for the string */
private int hash; // Default to 0

Everything else the class offers is built on top of this array.
Let's analyze the source code step by step.

1. Add

concat(String str)

The Javadoc describes it like this:

 	* Concatenates the specified string to the end of this string.

Source code implementation:

// The overall idea is quite simple.
public String concat(String str) {
    int otherLen = str.length();   // length of the string to append
    if (otherLen == 0) {
        return this;
    }
    int len = value.length;
    char buf[] = Arrays.copyOf(value, len + otherLen);  // new array sized to hold both strings
    str.getChars(buf, len);   // copy the argument's characters into the tail of the new array, which now holds both
    return new String(buf, true);  // wrap the new array in a String
}
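The two return paths above can be observed from the outside. The following is a minimal sketch; the class name ConcatDemo and the helper returnsSameObject are made up for illustration and are not part of the JDK.

```java
class ConcatDemo {
    // Hypothetical helper: true when concat hands back the receiver itself (no copy made).
    static boolean returnsSameObject(String base, String suffix) {
        return base.concat(suffix) == base;
    }

    public static void main(String[] args) {
        String base = "abc";
        System.out.println(base.concat("def"));            // abcdef
        System.out.println(returnsSameObject(base, ""));   // true: an empty argument returns this
        System.out.println(returnsSameObject(base, "d"));  // false: a new String was built
        System.out.println(base);                          // abc: the receiver is never modified
    }
}
```

Because the receiver is never changed, the result of concat() has to be assigned to something to have any effect.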

2. Delete

trim()

The trim() method removes leading and trailing whitespace. Users often ask: "The data I entered is correct, so why can't it be found, or why do I get an error?" Users do not care whether there are spaces before or after their input, because spaces are invisible in the interface. trim() exists for exactly this situation.
Source code implementation:

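public String trim() {
    int len = value.length;   // tail pointer (moves from the end toward the start)
    int st = 0;               // head pointer (moves from the start toward the end)
    char[] val = value;    /* avoid getfield opcode */

    // The test compares char values: every character less than or equal to ' ' (space) is trimmed.
    // Scan from the front to find the index of the first character to keep.
    while ((st < len) && (val[st] <= ' ')) {
        st++;
    }
    // Same idea from the back: find the end (exclusive) of the last character to keep.
    while ((st < len) && (val[len - 1] <= ' ')) {
        len--;
    }
    // Cut the original string according to the final head and tail positions.
    return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
}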
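Since the test is simply "char value <= ' '", trim() removes tabs and line breaks as well as spaces, but leaves whitespace characters with larger code points alone. A small sketch (the class name TrimDemo and its helper are illustrative only):

```java
class TrimDemo {
    // Hypothetical helper so the behaviour can be checked from outside.
    static String trimmed(String s) {
        return s.trim();
    }

    public static void main(String[] args) {
        // Tab, space and newline are all <= ' ', so they are removed at both ends.
        System.out.println("[" + trimmed("\t hello world \n") + "]");  // [hello world]

        // U+3000 (ideographic space) is greater than ' ', so trim() leaves it in place.
        System.out.println(trimmed("\u3000a").length());               // 2

        // Nothing to trim: the same object comes back, no copy is made.
        String clean = "abc";
        System.out.println(trimmed(clean) == clean);                   // true
    }
}
```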

3. Check

Length:      length()      isEmpty()

Characters:  indexOf(int ch)          indexOf(int ch, int fromIndex)
             indexOf(String str)      indexOf(String str, int fromIndex)
             lastIndexOf(int ch)      lastIndexOf(int ch, int fromIndex)
             lastIndexOf(String str)  lastIndexOf(String str, int fromIndex)
             charAt(int index)

This part is about queries. The length() and isEmpty() methods are very simple, so they are not covered here. This section focuses on locating characters in the string.
There are nine methods related to character lookup; a diagram shows how they relate to one another and what they do.

[Figure: relationships among the indexOf(), lastIndexOf(), and charAt() methods]
As the figure shows, the String class can search a string in two directions. indexOf() traverses from beginning to end, and lastIndexOf() traverses from end to beginning. Each has its own low-level implementation, but the principles are the same, so only the indexOf() group is analyzed here; the lastIndexOf() group works analogously.
indexOf() comes in an int-parameter form and a String-parameter form. The difference is similar to the overloads of Integer's parseInt() method. Let's analyze the two groups of methods derived from these parameter types, together with their underlying implementations. Both indexOf() and lastIndexOf() return the index of the character or substring being searched for, or -1 if it is not found.
charAt(int index) does the reverse: it returns the character at the given index.

(1)indexOf(int ch)

indexOf(int ch) source code:

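// Internally the search starts at index 0 and returns the index of the first occurrence of ch.
// What is ch? As the Javadoc explains, it is a Unicode code point. Every character corresponds
// to a code point, so ch is effectively a character.
* @param   ch          a character (Unicode code point).
*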
public int indexOf(int ch) {
    return indexOf(ch, 0);
}

indexOf(int ch, int fromIndex) source code:

// Compared with the previous overload, there is one more parameter: fromIndex.
* @param   fromIndex   the index to start the search from.
*
public int indexOf(int ch, int fromIndex) {
    final int max = value.length;
    if (fromIndex < 0) {
        fromIndex = 0;  // a negative start index is allowed: it is treated as 0 and no exception is thrown
    } else if (fromIndex >= max) {
        // Note: fromIndex might be near -1>>>1.  (-1>>>1 is Integer.MAX_VALUE, i.e. 2^31 - 1)
        return -1;
    }

    // Character.MIN_SUPPLEMENTARY_CODE_POINT, the smallest supplementary code point, is the dividing line.
    if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
        // handle most cases here (ch is a BMP code point or a
        // negative value (invalid code point))
        final char[] value = this.value;
        for (int i = fromIndex; i < max; i++) {
            if (value[i] == ch) {
                return i;    // found by a linear scan: the index of the char matching code point ch
            }
        }
        return -1;
    } else {
        // at or above the dividing line, delegate to the method below
        return indexOfSupplementary(ch, fromIndex);
    }
}

indexOfSupplementary(int ch, int fromIndex) source code:

private int indexOfSupplementary(int ch, int fromIndex) {
    if (Character.isValidCodePoint(ch)) {
        final char[] value = this.value;
        final char hi = Character.highSurrogate(ch);
        final char lo = Character.lowSurrogate(ch);
        final int max = value.length - 1;
        for (int i = fromIndex; i < max; i++) {
            if (value[i] == hi && value[i + 1] == lo) { // again a linear scan, this time for the surrogate pair of the code point
                return i;                             // return the index where the pair starts; code points are not covered further here
            }
        }
    }
    return -1;
}
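The two branches can be exercised with a string that contains a character outside the Basic Multilingual Plane. This is a small sketch; the class name and the SAMPLE constant are chosen only for the example.

```java
class IndexOfCodePointDemo {
    // U+1F600 lies above the BMP, so it is stored as the surrogate pair D83D DE00.
    static final String SAMPLE = "a\uD83D\uDE00b";

    public static void main(String[] args) {
        System.out.println(SAMPLE.length());          // 4: one code point takes two chars
        System.out.println(SAMPLE.indexOf(0x1F600));  // 1: found through the surrogate-pair search
        System.out.println(SAMPLE.indexOf('b'));      // 3: ordinary BMP search
        System.out.println(SAMPLE.indexOf('a', -5));  // 0: a negative fromIndex is treated as 0
        System.out.println(SAMPLE.indexOf('a', 100)); // -1: a fromIndex past the end finds nothing
    }
}
```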

(2)indexOf(String str)

indexOf(String str) source code:

 * @param   str   the substring to search for.
 * 	// str is the target string; internally this method delegates to indexOf(String str, int fromIndex)
 * 
 public int indexOf(String str) {
    return indexOf(str, 0);
}

indexOf(String str, int fromIndex) source code:

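// This method calls the low-level method directly, so the low-level method is analyzed in detail below.
// fromIndex is the index at which the search starts.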

public int indexOf(String str, int fromIndex) {
    return indexOf(value, 0, value.length,
            str.value, 0, str.value.length, fromIndex);
}

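indexOf(char[] source, int sourceOffset, int sourceCount, char[] target, int targetOffset, int targetCount, int fromIndex) source code: [When this was written, the author did not yet understand the sourceCount and targetCount variables. They are the number of characters, counted from the respective offset, that belong to the source and to the target.]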

/**
 * Code shared by String and StringBuffer to do searches. The
 * source is the character array being searched, and the target
 * is the string being searched for.
 *
 * @param   source       the characters being searched.  // the source string
 * @param   sourceOffset offset of the source string.    // where the source region starts inside the array; fromIndex is relative to it. For example, with "abcdefg", fromIndex 1 and offset 2, the scan starts at array index 1 + 2 = 3.
 * @param   sourceCount  count of the source string.	 // number of characters in the source region
 * @param   target       the characters being searched for.  // the target string
 * @param   targetOffset offset of the target string.		// where the target region starts inside its array
 * @param   targetCount  count of the target string.    // number of characters in the target region
 * @param   fromIndex    the index to begin searching from.  // start index within the source region
 */
static int indexOf(char[] source, int sourceOffset, int sourceCount,char[] target, 
	int targetOffset, int targetCount, int fromIndex) {
	
    if (fromIndex >= sourceCount) {
        return (targetCount == 0 ? sourceCount : -1);
    }
    if (fromIndex < 0) {
        fromIndex = 0;
    }
    if (targetCount == 0) {
        return fromIndex;
    }

    char first = target[targetOffset];
    int max = sourceOffset + (sourceCount - targetCount);

    for (int i = sourceOffset + fromIndex; i <= max; i++) {
        /* Look for first character. */
        if (source[i] != first) {
            while (++i <= max && source[i] != first);
        }

        /* Found first character, now look at the rest of v2 */
        if (i <= max) {
            int j = i + 1;
            int end = j + targetCount - 1;
            for (int k = targetOffset + 1; j < end && source[j]
                    == target[k]; j++, k++);

            if (j == end) {
                /* Found whole string. */
                return i - sourceOffset;
            }
        }
    }
    return -1;
}
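String's own low-level indexOf is not accessible from outside the class, so the roles of the offset and count parameters are easiest to see in a stand-alone copy. The sketch below is a simplified re-implementation written for this article (class name IndexOfSketch is made up); it keeps the same parameters and return convention but uses a plain inner loop instead of the JDK's first-character skip.

```java
class IndexOfSketch {
    // Naive search over the window source[sourceOffset .. sourceOffset + sourceCount).
    // The result is relative to sourceOffset, as in the method shown above.
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
                       char[] target, int targetOffset, int targetCount,
                       int fromIndex) {
        if (fromIndex >= sourceCount) {
            return targetCount == 0 ? sourceCount : -1;
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }
        // Last array index at which the target could still fit inside the window.
        int max = sourceOffset + (sourceCount - targetCount);
        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            int k = 0;
            while (k < targetCount && source[i + k] == target[targetOffset + k]) {
                k++;
            }
            if (k == targetCount) {
                return i - sourceOffset;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] src = "xxabcabcxx".toCharArray();
        char[] tgt = "abc".toCharArray();
        // Window "abcabc": offset 2, count 6.
        System.out.println(indexOf(src, 2, 6, tgt, 0, 3, 0));          // 0
        System.out.println(indexOf(src, 2, 6, tgt, 0, 3, 1));          // 3
        // Whole array as the window.
        System.out.println(indexOf(src, 0, src.length, tgt, 0, 3, 0)); // 2
    }
}
```

For a String, the window is always the whole array, which is why indexOf(String str, int fromIndex) passes 0 and value.length.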

4. Change

Analyzed in detail:      replace(char oldChar, char newChar)

Not analyzed in detail:  replace(CharSequence target, CharSequence replacement)
                         replaceFirst(String regex, String replacement)
                         replaceAll(String regex, String replacement)

All four methods perform replacement; they are APIs extended for different needs. Only replace(char oldChar, char newChar) is discussed here.

(1) replace(char oldChar, char newChar) source code:

// This API replaces individual characters, not substrings.
public String replace(char oldChar, char newChar) {
    if (oldChar != newChar) {
        int len = value.length;
        int i = -1;
        char[] val = value; /* avoid getfield opcode */

        // Scan for the index i of the first occurrence of the target character.
        while (++i < len) {
            if (val[i] == oldChar) {
                break;
            }
        }
        if (i < len) {
            // Copy all characters before that first index into a new array.
            char buf[] = new char[len];
            for (int j = 0; j < i; j++) {
                buf[j] = val[j];
            }
            // Continue from the first occurrence onward, writing newChar wherever oldChar appears.
            while (i < len) {
                char c = val[i];
                buf[i] = (c == oldChar) ? newChar : c;
                i++;
            }
            // Finally, build a String from the new array and return it.
            return new String(buf, true);
        }
    }
    return this;
}

That is roughly the source code, but after reading it I was puzzled.
Why is it so complicated? Couldn't it simply traverse the string and use an if to decide? That looked to me like it would cost less time and space. So I am posting the code I had in mind, and I would welcome an explanation from more experienced readers of what the JDK version gains.

// StringMo is a class I wrote to imitate String; it also maintains a character array internally. Below is its replace method, which I thought could be written directly like this:
public StringMo replace(char oldChar,char newChar){

    if (oldChar == newChar){
        return this;
    }

    char[] arr = val;
    for (int i = 0; i < arr.length; i++) {
        if (arr[i] == oldChar){
            arr[i] = newChar;
        }
    }
    return  new StringMo(val);
}
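One possible answer, offered as a reading of the code rather than as the JDK authors' stated rationale: the simpler version writes into the receiver's own array, because arr is only an alias of val, so the original object changes too. The JDK version never writes to value, and it returns this without allocating when oldChar does not occur at all; both versions are linear in the string length. The sketch below uses a hypothetical, minimal StringMo (the constructor and toString are assumptions, since the original class is not shown) to make the difference visible.

```java
class StringMoDemo {
    // Hypothetical stand-in for the author's StringMo; only what the demo needs.
    static final class StringMo {
        private final char[] val;

        StringMo(char[] chars) {
            this.val = chars.clone();  // assumption: the constructor copies its input
        }

        // The proposal from the text: 'arr' aliases 'val', so the receiver is modified.
        StringMo replaceInPlace(char oldChar, char newChar) {
            if (oldChar == newChar) {
                return this;
            }
            char[] arr = val;
            for (int i = 0; i < arr.length; i++) {
                if (arr[i] == oldChar) {
                    arr[i] = newChar;
                }
            }
            return new StringMo(val);
        }

        // Copy first, then replace: the receiver stays untouched.
        StringMo replaceCopy(char oldChar, char newChar) {
            if (oldChar == newChar) {
                return this;
            }
            char[] buf = val.clone();
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == oldChar) {
                    buf[i] = newChar;
                }
            }
            return new StringMo(buf);
        }

        @Override
        public String toString() {
            return new String(val);
        }
    }

    public static void main(String[] args) {
        StringMo a = new StringMo("banana".toCharArray());
        StringMo b = a.replaceInPlace('a', 'o');
        System.out.println(a + " " + b);  // bonono bonono: the original was changed as well

        StringMo c = new StringMo("banana".toCharArray());
        StringMo d = c.replaceCopy('a', 'o');
        System.out.println(c + " " + d);  // banana bonono: the original is intact
    }
}
```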

5. Split

Split into an array:    split(String regex)         split(String regex, int limit)

Cut into a substring:   substring(int beginIndex)   subSequence(int beginIndex, int endIndex)
                        substring(int beginIndex, int endIndex)

String offers two kinds of splitting: splitting into an array of strings, and cutting out a substring.
Splitting into an array is done by the split() method, which has two overloads. Let's analyze it below.

(1)split(String regex)
// regex is a regular expression, i.e. the splitting condition. For example, "a-b-c-d".split("-") yields a String array containing the four strings a, b, c, d.
public String[] split(String regex) {
    return split(regex, 0);  // the 0 means trailing empty strings are dropped from the array; see the next method for details.
}

(2)split(String regex, int limit)

Unlike split(String regex), this overload has an extra parameter, limit, which the Javadoc describes as follows:

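 * @param  limit
 * 
 * <p>The {@code limit} parameter controls the number of times the pattern is applied
 * and therefore affects the length of the resulting array.
 * If the limit is greater than zero, the pattern is applied at most limit - 1 times,
 * the array's length is no greater than limit, and the array's last entry contains
 * all input beyond the last matched delimiter.
 * If the limit is non-positive, the pattern is applied as many times as possible
 * and the array can have any length.
 * If the limit is zero, the pattern is applied as many times as possible, the array
 * can have any length, and trailing empty strings are discarded.
 *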

In plain terms, limit selects the mode of this method. There are three modes: positive, negative, and zero. A positive limit caps the result at limit elements, that is, at most limit - 1 cuts, even when the string contains more delimiters than that.
The relevant source code is as follows:

public String[] split(String regex, int limit) {

    char ch = 0;
    // The condition is spread over several lines for readability. Its overall structure is (a1 || a2) && b.

    if (((regex.value.length == 1 && ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
    // a1: true if the regex is a single character that is not one of the characters in ".$|()[{^?*+\\".

     (regex.length() == 2 && regex.charAt(0) == '\\' && (((ch = regex.charAt(1))-'0') | ('9'-ch)) < 0 &&
     ((ch-'a')|('z'-ch)) < 0 && ((ch-'A')|('Z'-ch)) < 0))
    // a2: true if the regex has length 2, its first character is a backslash, and the second character ch
    //     satisfies ((ch - '0') | ('9' - ch)) < 0. The sign bit is 0 for non-negative and 1 for negative
    //     values, so the bitwise OR is negative exactly when at least one operand is negative, that is,
    //     when ch lies outside '0'..'9'. The same test excludes 'a'..'z' and 'A'..'Z'.

          &&
      (ch < Character.MIN_HIGH_SURROGATE || ch > Character.MAX_LOW_SURROGATE))
         // b: finally, ch must lie outside the range Character.MIN_HIGH_SURROGATE..Character.MAX_LOW_SURROGATE.
    {
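        int off = 0;
        int next = 0;
        boolean limited = limit > 0;   // mode switch: true only for a positive limit

        ArrayList<String> list = new ArrayList<>();
        // container for the pieces produced by the split
        while ((next = indexOf(ch, off)) != -1) {
        // off starts at 0; next is the index of the next occurrence of ch at or after off.

            // Taken when limit is zero or negative, or when limit is positive and
            // fewer than limit - 1 pieces have been collected so far.
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                // cut out the piece before the delimiter and add it to the list

                off = next + 1;
                // off is where the next search starts.
                // Example: for "a-ab-abc-d".split("-"), off is initially 0 and next is the index of the first "-".
                // The next search therefore starts at next + 1 and looks for the second "-",
                // and so on until next == -1 ends the loop.
            } else {    // last one
                // Taken only when limit is positive: this is the final piece.
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));  // the rest of the string becomes the last piece; with limit == 1 this is the whole string
                off = value.length;   // move the offset to the end
                break;
            }
        }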
        
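        // If no match was found, return this
        // (no split happened, so return an array containing only the original string)
        if (off == 0)
            return new String[]{this};

        // Add remaining segment
        // With a non-positive limit, this adds the piece after the last delimiter: the loop only
        // advances to the last delimiter, not to the end of the string.
        // With a positive limit, this adds the last piece when the loop ended because no delimiter
        // was left (at most limit pieces in total). The else branch above covers the other case,
        // where the string could be cut into more than limit pieces.
        if (!limited || list.size() < limit)
            list.add(substring(off, value.length));

        // Construct result
        int resultSize = list.size();
        // If limit is zero, trailing empty strings are dropped.
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                resultSize--;
            }
        }
        // Finally, build the array and return it.
        String[] result = new String[resultSize];
        return list.subList(0, resultSize).toArray(result);
    }

    // The Pattern.compile path is not examined here.
    return Pattern.compile(regex).split(this, limit);
}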

Finally, a brief summary of these two methods:
split(String regex) simply calls split(regex, 0); the 0 selects one of three modes.
(1) limit > 0: the delimiter is applied at most limit - 1 times, so the array has at most limit elements. If the string could be cut into more than limit pieces, the array has exactly limit elements and the last one holds the unsplit remainder; otherwise all pieces are returned.
(2) limit < 0: the string is split as many times as possible and every piece is kept, including trailing empty strings.
(3) limit = 0: the string is split as many times as possible and trailing empty strings are removed from the result.
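The three modes can be compared side by side on one input. A minimal sketch (the class name SplitLimitDemo and the helper show are illustrative):

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Hypothetical helper: split on "-" with the given limit and render the array as text.
    static String show(String input, int limit) {
        return Arrays.toString(input.split("-", limit));
    }

    public static void main(String[] args) {
        String s = "a-b-c--";
        System.out.println(show(s, 0));   // [a, b, c]        trailing empty strings dropped
        System.out.println(show(s, 2));   // [a, b-c--]       at most 2 elements, remainder kept whole
        System.out.println(show(s, -1));  // [a, b, c, , ]    everything kept
        System.out.println(show(s, 10));  // [a, b, c, , ]    limit larger than the number of pieces
    }
}
```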

(3)substring(int beginIndex)

substring() extracts part of the source string as a new string. The source code is as follows:

// The code is fairly simple, but one line of the Javadoc is particularly important:
* @param      beginIndex   the beginning index, inclusive. 
* (that is, the character at beginIndex is included in the result)
* 
public String substring(int beginIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    int subLen = value.length - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
}

This overload of substring() extracts everything from the given index to the end of the string, including the character at the given index.

(4)substring(int beginIndex, int endIndex)

The code is similar, but the Javadoc is again important.

 * @param      beginIndex   the beginning index, inclusive.
 * @param      endIndex     the ending index, exclusive.
 * // The start index is included and the end index is excluded, so this is a half-open interval [beginIndex, endIndex).
 * 
public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

(5) subSequence(int beginIndex, int endIndex)

It simply calls substring(beginIndex, endIndex), so it uses the same half-open interval (start inclusive, end exclusive).

public CharSequence subSequence(int beginIndex, int endIndex) {
    return this.substring(beginIndex, endIndex);
}
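A short sketch of the half-open interval and the bounds checks in practice (the class name and the SAMPLE constant are chosen for the example):

```java
class SubstringDemo {
    static final String SAMPLE = "hello";

    public static void main(String[] args) {
        System.out.println(SAMPLE.substring(1, 3));         // el: index 1 included, index 3 excluded
        System.out.println(SAMPLE.substring(2));            // llo: from index 2 to the end
        System.out.println(SAMPLE.substring(5).isEmpty());  // true: beginIndex == length is allowed
        System.out.println(SAMPLE.substring(0) == SAMPLE);  // true: the whole string is returned as-is
        System.out.println(SAMPLE.subSequence(1, 3));       // el: same interval as substring(1, 3)

        try {
            SAMPLE.substring(3, 1);                          // negative length
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("beginIndex > endIndex is rejected");
        }
    }
}
```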

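6. Conversion

To bytes:        getBytes()      getBytes(Charset charset)   // encode with the platform default charset, or with the given charset

To characters:   toCharArray()   // convert to a char array

To string:       toString()

Case conversion: toUpperCase()    toUpperCase(Locale locale)     // convert all characters to upper case
                 toLowerCase()    toLowerCase(Locale locale)     // convert all characters to lower case

This part is relatively simple.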
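Simple as they are, the Charset and Locale parameters do change the result, which the following sketch illustrates (the class name and the choice of sample strings are only for the example):

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ConversionDemo {
    public static void main(String[] args) {
        String s = "h\u00e9";  // "h" followed by e-acute (U+00E9)

        // The byte count depends on the charset used for encoding.
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);       // 3
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length);  // 2

        // toCharArray() returns a copy, so changing it does not affect the String.
        char[] chars = s.toCharArray();
        chars[0] = 'H';
        System.out.println(s.charAt(0));                                     // h

        // Case mapping depends on the locale: Turkish maps 'i' to a dotted capital I (U+0130).
        System.out.println("title".toUpperCase(Locale.ROOT));                // TITLE
        String turkish = "title".toUpperCase(Locale.forLanguageTag("tr"));
        System.out.println(turkish.equals("TITLE"));                         // false
    }
}
```

For text that is not shown to users (identifiers, protocol keywords), passing Locale.ROOT avoids surprises from the default locale.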

7. Comparison

The string comparison methods are:

Whole-string comparison:   equals(Object anObject)      contentEquals(CharSequence cs)
                           compareTo(String anotherString)

Inner-class comparator:    the CaseInsensitiveComparator class

Whole-string or substring comparison:
                           startsWith(String prefix)   startsWith(String prefix, int toffset)
                           endsWith(String suffix)
                           contains(CharSequence s)
                           compareToIgnoreCase(String str)
                           equalsIgnoreCase(String anotherString)

Let's first look at the whole-string comparisons.

(1) equals(Object anObject)

Its parameter is an Object, so any object can be passed in. After an identity check, the comparison proceeds in three steps: object type, length, and character-by-character comparison.

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {   // check the object's type
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {  // check the lengths
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])   // compare character by character, in order
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}

(2) contentEquals(CharSequence cs)

Its parameter is a CharSequence, so a String, StringBuffer, or StringBuilder can all be passed in. It therefore checks the argument's type internally; apart from that, the comparison works essentially the same way as equals().

public boolean contentEquals(CharSequence cs) {
    // Argument is a StringBuffer, StringBuilder
    if (cs instanceof AbstractStringBuilder) {
        if (cs instanceof StringBuffer) {
            synchronized(cs) {
               return nonSyncContentEquals((AbstractStringBuilder)cs);
            }
        } else {
            return nonSyncContentEquals((AbstractStringBuilder)cs);
        }
    }
    // Argument is a String
    if (cs instanceof String) {
        return equals(cs);
    }
    // Argument is a generic CharSequence
    char v1[] = value;
    int n = v1.length;
    if (n != cs.length()) {
        return false;
    }
    for (int i = 0; i < n; i++) {
        if (v1[i] != cs.charAt(i)) {
            return false;
        }
    }
    return true;
}
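The practical difference between the two methods shows up as soon as the argument is not a String. A minimal sketch (class name chosen for the example):

```java
class EqualsVsContentEqualsDemo {
    public static void main(String[] args) {
        String s = "abc";
        StringBuilder sb = new StringBuilder("abc");

        System.out.println(s.equals(sb));             // false: a StringBuilder is not a String
        System.out.println(s.contentEquals(sb));      // true: only the characters are compared
        System.out.println(s.equals(sb.toString()));  // true: same type, same characters
        System.out.println(s.contentEquals("abd"));   // false: the third character differs
    }
}
```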

(3) compareTo(String anotherString)

The thing to note about this method is that it returns an int that is a difference. How is this difference calculated?

public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);  
    char v1[] = value;
    char v2[] = anotherString.value;

    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {			
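            return c1 - c2;   // the difference between the char values of the first pair of characters that differ
        }
        k++;
    }
    return len1 - len2; // otherwise return the difference in length; at this point one string is a prefix of the other (or they are equal)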
}
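Both return paths can be checked with a few concrete values. A small sketch (class name chosen for the example):

```java
class CompareToDemo {
    public static void main(String[] args) {
        System.out.println("abc".compareTo("abd"));    // -1: 'c' - 'd'
        System.out.println("a".compareTo("B"));        // 31: 'a' (97) - 'B' (66)
        System.out.println("abc".compareTo("abcde"));  // -2: no differing character, so 3 - 5
        System.out.println("abc".compareTo("abc"));    // 0: same characters, same length
    }
}
```

Only the sign of the result should be relied on for ordering; the magnitude is just a by-product of the subtraction.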

8. Other

join(CharSequence delimiter, CharSequence... elements)
join(CharSequence delimiter, Iterable<? extends CharSequence> elements)

native String intern()       // this native method is related to the constant pool
static  valueOf(Object o)    // the static valueOf methods perform the same operation for different data types

(1) join() method

Let's first look at the join() method. It has two overloads. The delimiter parameter is a CharSequence, and the elements are given either as a variable-length parameter list or as an Iterable.

a.

CharSequence is an interface in the java.lang package that represents a readable sequence of characters. Its commonly used implementations are String, StringBuffer, and StringBuilder, so CharSequence can be regarded as a common supertype of these three classes. When it is used as a parameter type, an instance of any implementing class can be passed in. This is polymorphism in Java, also known as upcasting. So wherever a CharSequence is expected, you can pass a String, a StringBuffer, or a StringBuilder.

b.

Variable-length parameter lists (varargs) were introduced in Java 5. Three dots are added after the parameter type, as in Object..., which means the caller can pass either an Object array or any number of individual Object arguments.
For join(CharSequence delimiter, CharSequence... elements), you can call join("-", "a", "b", "c"), or join("-", "a", "b"), or pass an array, for example a StringBuffer[], as the second argument.

join(CharSequence delimiter, CharSequence... elements)
join(CharSequence delimiter, Iterable<? extends CharSequence> elements)

The join() method uses its first argument, the delimiter, to connect all the strings in the remaining arguments. It is a static method and works like a utility function.
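Both overloads, and the mixing of different CharSequence implementations, can be seen in this sketch (class name chosen for the example):

```java
import java.util.Arrays;
import java.util.List;

class JoinDemo {
    public static void main(String[] args) {
        // Varargs form with individual arguments.
        System.out.println(String.join("-", "a", "b", "c"));  // a-b-c

        // Varargs form with an array; the elements may be different CharSequence implementations.
        CharSequence[] parts = { "x", new StringBuilder("y"), new StringBuffer("z") };
        System.out.println(String.join("/", parts));          // x/y/z

        // Iterable form.
        List<String> list = Arrays.asList("1", "2", "3");
        System.out.println(String.join(", ", list));          // 1, 2, 3
    }
}
```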

(2) valueOf() method group

These are similar to the valueOf methods of the primitive wrapper classes: each overload builds a string from its argument. Some overloads construct a new String, while others delegate to toString() or return a literal. The parameters cover the primitive types, char arrays, and Object; not all overloads are listed here.

public static String valueOf(Object obj) { return (obj == null) ? "null" : obj.toString(); }

public static String valueOf(char data[]) { return new String(data); }

public static String valueOf(char data[], int offset, int count) {
 return new String(data, offset, count);
 }
 
 public static String valueOf(boolean b) { return b ? "true" : "false";}
 ...
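A few of the overloads in use, including how they treat null. This is a small sketch; the class and variable names are chosen for the example.

```java
class ValueOfDemo {
    public static void main(String[] args) {
        Object nothing = null;
        System.out.println(String.valueOf(nothing));     // null (the four-character text "null")
        System.out.println(String.valueOf(3.5));         // 3.5
        System.out.println(String.valueOf(true));        // true

        char[] data = {'a', 'b', 'c', 'd'};
        System.out.println(String.valueOf(data, 1, 2));  // bc: two chars starting at offset 1

        // The char[] overload calls new String(data), which does not accept null.
        char[] noChars = null;
        try {
            String.valueOf(noChars);
        } catch (NullPointerException e) {
            System.out.println("valueOf(char[]) rejects null");
        }
    }
}
```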

(3) The native String intern() method

This method involves the constant pool and is a native method. Its Javadoc reads as follows.

/**
 * <p>
 * A pool of strings, initially empty, is maintained privately by the
 * class {@code String}.
 * <p>
 * When the intern method is invoked, if the pool already contains a
 * string equal to this {@code String} object as determined by
 * the {@link #equals(Object)} method, then the string from the pool is
 * returned. Otherwise, this {@code String} object is added to the
 * pool and a reference to this {@code String} object is returned.
 * <p>
 * It follows that for any two strings {@code s} and {@code t},
 * {@code s.intern() == t.intern()} is {@code true}
 * if and only if {@code s.equals(t)} is {@code true}.
 * <p>
 * All literal strings and string-valued constant expressions are
 * interned.
 */

I was not sure I had understood the third paragraph correctly. It says that the relationship holds in both directions: s.intern() == t.intern() is true exactly when s.equals(t) is true.

Example demonstration

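    char[] s = {'a'};
    String t = new String(s);
    String m = t.intern();
    System.out.println(t == m);
    // Result: true (on JDK 7 and later, provided no equal string is already in the pool)
    // This example shows that the object t is created on the heap, and the pool merely
    // records a reference to that heap object, so m refers to the same object as t.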

Example demonstration

    char[] t = {'a'};
    String s = new String(t);
    String a = "a";
    System.out.println(a == s);
    // Result: false
    // This example shows that the two ways of creating a string produce two different objects.

Example demonstration

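    char[] t = {'a'};
    String s = new String(t);
    s.intern();
    String a = "a";
    System.out.println(a == s);
    // Result: true (on JDK 7 and later, provided no equal string is already in the pool)
    // This example shows that intern() adds the object s points to into String's pool.
    // When a is assigned, the pool is searched first; it already contains an "a", so a
    // refers to the pooled entry, and that entry is the very object s refers to.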

Example demonstration

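    String b = "a";
    String a = "a";
    System.out.println(a == b);
    // Result: true
    // This example shows that a and b refer to the same object, and only one object was created.
    // It cannot have been created for a, so it must have been created for b;
    // a merely refers to the object that b points to.
    // From this we can conclude that a string literal is first looked up in the pool,
    // and a new object is created only if none is found.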

Example demonstration

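    String b = new String("a");
    b.intern();
    String a = "a";
    System.out.println(a == b);
    // Result: false
    // Reasoning backwards from the result: a is looked up in the pool and finds an "a",
    // but that entry is not the object b. So the pool already contained an object before
    // b.intern() was executed. Where did it come from?
    // Constructing b requires an argument, and that argument is the literal "a". Obtaining
    // it is like executing String xxx = "a", except that the runtime does this implicitly
    // and no variable xxx refers to the result.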
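
Several of the results above depend on the JDK version and on what happens to be in the pool already. The following sketch sticks to relationships that follow directly from the Javadoc quoted earlier, so it does not depend on those details; the class and variable names are chosen for the example.

```java
class InternDemo {
    public static void main(String[] args) {
        String literal = "hello";            // literals are interned
        String heap = new String("hello");   // a separate object with the same characters

        System.out.println(heap == literal);           // false: two different objects
        System.out.println(heap.equals(literal));      // true: same characters
        System.out.println(heap.intern() == literal);  // true: intern() returns the pooled string

        // s.intern() == t.intern() holds exactly when s.equals(t) holds.
        String built = new String(new char[] {'h', 'e', 'l', 'l', 'o'});
        System.out.println(built.intern() == heap.intern());     // true
        System.out.println(built.intern() == "world".intern());  // false
    }
}
```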

Origin blog.csdn.net/weixin_43901067/article/details/104574343