Interesting JDK source code: the three giants of the StringBuilder family

AbstractStringBuilder

AbstractStringBuilder is the abstract base class of the mutable character sequence classes. It first appeared in JDK 1.5.

History

As is well known, the most commonly used character data type in Java is the String object. But String is designed as an immutable class (internally it maintains a final char[]), so every change produces a new object. In scenarios where character values need to change frequently, this wastes system resources. To solve this problem, a family of mutable character classes was born (StringBuffer and StringBuilder), and AbstractStringBuilder is now their father.
Why is the word "now" emphasized?

  • Because StringBuffer appeared first. It was born in JDK 1.0 as a thread-safe class, at a time when AbstractStringBuilder did not yet exist. Later, the Java development team recognized that most application scenarios do not need thread safety.
  • So in JDK 1.5, the father AbstractStringBuilder (which encapsulates the shared methods) and the little brother StringBuilder (a non-thread-safe class) were born.
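
To make this concrete, here is a minimal sketch. The class name MutabilityDemo and its helper methods are made up for illustration. It checks that changing a String yields a new object while StringBuilder.append returns the same instance, and it uses reflection to look up the superclass that StringBuilder and StringBuffer share.

```java
final class MutabilityDemo {
    // True if appending to a StringBuilder returns the very same object.
    static boolean appendReturnsSameInstance() {
        StringBuilder sb = new StringBuilder("a");
        StringBuilder result = sb.append("b");
        return result == sb;
    }

    // True if concatenating to a String leaves the original untouched
    // and hands back a different object.
    static boolean concatLeavesOriginalUnchanged() {
        String s = "a";
        String t = s.concat("b");
        return s.equals("a") && t.equals("ab") && s != t;
    }

    // Name of the class that StringBuilder and StringBuffer both extend,
    // or null if they do not share a direct superclass.
    static String sharedParentName() {
        Class<?> builderParent = StringBuilder.class.getSuperclass();
        Class<?> bufferParent = StringBuffer.class.getSuperclass();
        return builderParent == bufferParent ? builderParent.getName() : null;
    }

    public static void main(String[] args) {
        System.out.println("append returns same instance: " + appendReturnsSameInstance());
        System.out.println("concat leaves original unchanged: " + concatLeavesOriginalUnchanged());
        System.out.println("shared parent: " + sharedParentName());
    }
}
```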

Class Diagram

abstract class AbstractStringBuilder implements Appendable, CharSequence {

    char[] value;

    int count;
}

AbstractStringBuilder implements two interfaces. Let's briefly introduce its two godfathers:

  • CharSequence is the interface for readable character sequences; String implements it too. We can think of it as specifying the basic read methods of character-type objects in Java (int length(); char charAt(int index); public String toString(); etc.).
  • Appendable
    is the interface for classes to which character sequences can be appended. If CharSequence specifies how characters are read, Appendable specifies how they are appended. As of JDK 8 it declares a single method, append, and its overloads.
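
As a small illustration of how the two interfaces cooperate, the sketch below writes any CharSequence into any Appendable. The class AppendableDemo and the helper appendAll are hypothetical names, not part of the JDK. Appendable.append declares IOException, so the helper wraps it in an unchecked exception to keep the call sites short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class AppendableDemo {
    // Appends every part to the target and returns the number of characters written.
    // Works with StringBuilder, StringBuffer, or any other Appendable.
    static int appendAll(Appendable out, CharSequence... parts) {
        int written = 0;
        try {
            for (CharSequence part : parts) {
                out.append(part);          // the write side: Appendable
                written += part.length();  // the read side: CharSequence
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        StringBuffer buffer = new StringBuffer();
        appendAll(builder, "abc", new StringBuilder("def"));
        appendAll(buffer, "abc", new StringBuilder("def"));
        System.out.println(builder + " " + buffer);
    }
}
```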

Core members

  • char[] value;
    This one is easy to understand: it is the array that stores the characters.
  • int count;
    This records how many characters of value are actually in use. The question is: why not use value.length of the char array to represent the character length? Let's read on with that question in mind.
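
The two fields are visible from outside through length() and capacity(). The following is a minimal sketch (the class name LengthVsCapacityDemo is made up for illustration) that relies only on the documented default capacity of 16 for the no-argument StringBuilder constructor.

```java
final class LengthVsCapacityDemo {
    // Returns { length, capacity } of a default-constructed StringBuilder
    // after appending the given text.
    static int[] lengthAndCapacityAfter(String text) {
        StringBuilder sb = new StringBuilder(); // documented initial capacity: 16
        sb.append(text);
        return new int[] { sb.length(), sb.capacity() };
    }

    public static void main(String[] args) {
        int[] result = lengthAndCapacityAfter("abc");
        // prints "length=3 capacity=16"
        System.out.println("length=" + result[0] + " capacity=" + result[1]);
    }
}
```

Here length() corresponds to count and capacity() to the length of the backing array, so the two values differ whenever the array has free space.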

Common methods

   /**
   * Appends the string str
   **/
   public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        ensureCapacityInternal(count + len);
        str.getChars(0, len, value, count);
        count += len;
        return this;
   }

   /**
   * Expands the char array
   * @param minimumCapacity the minimum required array length
   */
   private void ensureCapacityInternal(int minimumCapacity) {
        if (minimumCapacity - value.length > 0) {
            // The current length is smaller than the minimum required, so expand
            value = Arrays.copyOf(
                    value,
                    newCapacity(minimumCapacity) // the new array length comes from newCapacity
            ); // creates the new array
        }
    }
    
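    /**
     * @param  minCapacity the minimum required length
     */
    private int newCapacity(int minCapacity) {
        // overflow-conscious code
        // Start with the current array length times 2 plus 2. To reduce the number of expansions, each expansion at least doubles the length (otherwise, expanding on every change would be no better than String).
        // Why "times two plus two"? I don't know! Does anyone?
        // The JDK comments rarely say why, but it is presumably a considered choice. Dig deeper if you are interested.
        int newCapacity = (value.length << 1) + 2;
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity; // still below the minimum required after the step above, so use the minimum as the new array length
        }

        // When the new length is zero or negative (overflow), or greater than the maximum array length (MAX_ARRAY_SIZE), hugeCapacity chooses the new array length
        // MAX_ARRAY_SIZE is interesting: MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
        // Why is the maximum length the int maximum minus 8?
        // In Java the array length argument is of type int, so an array longer than Integer.MAX_VALUE cannot be defined (it fails at compile time)
        // So why the - 8?
        // The JDK comment says:
        // Some VMs reserve some header words in an array.
        // Attempts to allocate larger arrays may result in
        // OutOfMemoryError
        // That is, some VMs reserve some space in an array, and trying to allocate a larger array may cause an out-of-memory error. Hence the minus 8.
        // Note, however, that when the requested size lies between MAX_ARRAY_SIZE and Integer.MAX_VALUE, hugeCapacity still returns the requested value. Otherwise it returns MAX_ARRAY_SIZE or throws.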
        return (newCapacity <= 0 || MAX_ARRAY_SIZE - newCapacity < 0)
            ? hugeCapacity(minCapacity)
            : newCapacity;
    }
    
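    /**
     * Throws OutOfMemoryError when the minimum required length exceeds Integer.MAX_VALUE (the int has overflowed)
     * @param minCapacity the minimum required length
     * @return minCapacity if minCapacity > MAX_ARRAY_SIZE, otherwise MAX_ARRAY_SIZE (the maximum array length)
     */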
    private int hugeCapacity(int minCapacity) {
        if (Integer.MAX_VALUE - minCapacity < 0) { // overflow
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE)
            ? minCapacity : MAX_ARRAY_SIZE;
    }
    
    /**
    * This method is left for the subclasses to implement
    */
    @Override
    public abstract String toString();
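
The growth policy can also be tried in isolation. The sketch below copies the arithmetic of newCapacity and hugeCapacity from the listing above into a standalone class. The class name CapacityPolicySketch and the explicit oldLength parameter are additions for illustration; in the JDK these methods are private and read value.length directly.

```java
final class CapacityPolicySketch {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Same arithmetic as newCapacity above, with the current array
    // length passed in instead of read from a field.
    static int newCapacity(int oldLength, int minCapacity) {
        int newCapacity = (oldLength << 1) + 2;   // may overflow to a negative value
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return (newCapacity <= 0 || MAX_ARRAY_SIZE - newCapacity < 0)
            ? hugeCapacity(minCapacity)
            : newCapacity;
    }

    static int hugeCapacity(int minCapacity) {
        if (Integer.MAX_VALUE - minCapacity < 0) { // minCapacity itself overflowed
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? minCapacity : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(16, 17));   // 34: doubling plus 2 is enough
        System.out.println(newCapacity(16, 100));  // 100: doubling is not enough, use the minimum
        // Doubling overflows, the minimum is still below MAX_ARRAY_SIZE: 2147483639
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001));
        // The minimum lies between MAX_ARRAY_SIZE and Integer.MAX_VALUE: 2147483644
        System.out.println(newCapacity(MAX_ARRAY_SIZE, Integer.MAX_VALUE - 3));
    }
}
```

The comparisons are written as subtractions (for example newCapacity - minCapacity < 0) so that they still give the intended answer after one of the operands has wrapped around to a negative int.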

Most of the explanation is in the comments above; here it is organized systematically.

Summary of the questions above

  • Why not use value.length of the char array to represent the character length?
    From the source code and comments above, we know that every expansion of the char[] is sized by the newCapacity method. This mechanism keeps the array from being reallocated on every string operation: each expansion tries to open up extra free space, and later changes use that free space first and expand only when it runs out. The array length is therefore greater than or equal to the actual content length, so a separate field is needed to record the actual content length.
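
The effect can be observed from outside by watching capacity() while appending. The following is a minimal sketch; GrowthCounter and countCapacityChanges are hypothetical names. The exact number of expansions depends on the JDK version, so no specific output is claimed here, only that it is far smaller than the number of appends.

```java
final class GrowthCounter {
    // Appends `chars` characters one at a time to a default StringBuilder
    // and counts how often capacity() changes along the way.
    static int countCapacityChanges(int chars) {
        StringBuilder sb = new StringBuilder();
        int changes = 0;
        int lastCapacity = sb.capacity();
        for (int i = 0; i < chars; i++) {
            sb.append('x');
            if (sb.capacity() != lastCapacity) {
                changes++;
                lastCapacity = sb.capacity();
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        System.out.println("appends: 1000, capacity changes: " + countCapacityChanges(1000));
    }
}
```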

An extended question: what is the maximum array length in the commonly used HotSpot Java virtual machine?

Run this code:

   public static void main(String[] args) {
        int i = Integer.MAX_VALUE;
        while (true) {
            try {
                System.out.println(new char[i].length);
            } catch (OutOfMemoryError e) {
                i--;
                e.printStackTrace();
                // continue after the error
            }
            // Note: the return below ends the loop after a single attempt, so the
            // value printed after a failure is the next candidate size, not a size
            // that was actually allocated.
            System.out.println("Max array length, decimal: " + i);
            System.out.println("Max array length, binary: " + Integer.toBinaryString(i));
            System.out.println("Max array length, number of binary digits: " + Integer.toBinaryString(i).length());
            return;
        }
    }

What does it print?

java.lang.OutOfMemoryError: Requested array size exceeds VM limit
	at com.tlong.TestAbstracString.main(TestAbstracString.java:10)
Max array length, decimal: 2147483646
Max array length, binary: 1111111111111111111111111111110
Max array length, number of binary digits: 31

Theoretically, the maximum array length should be 2^31 - 1
  • This is the largest value that fits in 31 binary bits
  • Although the array object header uses 32 bits to record the array length, an int in Java is 4 bytes (32 bits), one of which is the sign bit. So the largest positive int is 2^31 - 1
In practice, the run above under HotSpot 1.8 reports 2^31 - 2 (2147483646), one less than the theoretical maximum
  • As the JDK comment quoted earlier says, some virtual machines reserve some space in the array, so trying to allocate a larger array may cause an OutOfMemoryError
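
For readers who want to probe the limit themselves, here is a hedged variant that keeps decrementing instead of stopping after the first failure. ArrayLimitProbe and firstLengthPastVmLimit are hypothetical names. The sketch assumes the HotSpot-specific message text "Requested array size exceeds VM limit" shown in the output above. It treats any other outcome, success or an out-of-memory error for lack of heap, as evidence that the length itself was accepted. Other virtual machines may word the error differently, so the result is only meaningful on HotSpot.

```java
import java.util.function.IntConsumer;

final class ArrayLimitProbe {
    static byte[] sink; // keeps the allocation observable

    // Starting at `start`, returns the first length that the allocator does not
    // reject with the "exceeds VM limit" message (assumed HotSpot wording).
    static int firstLengthPastVmLimit(int start, IntConsumer allocator) {
        int n = start;
        while (n > 0) {
            try {
                allocator.accept(n);
                return n; // allocation succeeded
            } catch (OutOfMemoryError e) {
                String message = String.valueOf(e.getMessage());
                if (!message.contains("exceeds VM limit")) {
                    return n; // rejected for lack of heap, not because of the length
                }
                n--; // length rejected by the VM, try the next smaller one
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        int max = firstLengthPastVmLimit(Integer.MAX_VALUE, n -> {
            sink = new byte[n];
            sink = null; // release the array right away
        });
        System.out.println("Largest length accepted by the VM: " + max);
    }
}
```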

Origin blog.csdn.net/qq_22956867/article/details/99477637