JDK original code research: java.lang.Integer prints binary, octal, hexadecimal strings

Experience makes all the future, and experience is the result of all the past. The jdk source code is an excellent experience.

Integer.toBinaryString is used to print the binary string of Integer, similarly, toOctalString prints octal string, and toHexString prints hexadecimal. with the conversion from base 10 to binary, octal, and hexadecimal (Divide the decimal number by two, save the remainder of each division, and divide until the final quotient is less than 1, then splice and output at the end of the code) are different. JDK uses a more intuitive and concise method to quickly obtain it.

Take Integer.toBinaryString as an example, it calls toUnsignedString0(int val, int shift), val is the integer to be converted, and shift is the weight of each bit in the base number , using bit to represent the required number of bits (Binary shift=1, octal shift=3, hexadecimal shift=4) . The process is to first determine the length of the string, and then determine the value of each position of the string.

There are 2 methods in toUnsignedString0, numberOfLeadingZeros determines the number of high-order bits of integer complement numbers that are 0. formatUnsignedInt determines the character at each position.

source code

public static String toBinaryString(int i) {
    
    
    return toUnsignedString0(i, 1);
}

private static String toUnsignedString0(int val, int shift) {
    
    
    /* 分析 code-1 */
    int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
    /* 分析 code-2 */
    int chars = Math.max(((mag + (shift - 1)) / shift), 1);
    /* 分析 code-3 */
    char[] buf = new char[chars];
    
    /* 把计算好的字符推如buf中 */
    formatUnsignedInt(val, shift, buf, 0, chars);

    /* 返回结果String */
    return new String(buf, true);
}


static int formatUnsignedInt(int val, int shift, char[] buf, int offset, int len) {
    
    
    /* len就是字符串长度 */
    int charPos = len;
    
    /* 分析 code-4 */
    int radix = 1 << shift;
    int mask = radix - 1;
    
    /* 分析 code-5 */
    do {
    
    
        buf[offset + --charPos] = Integer.digits[val & mask];
        val >>>= shift;
    } while (val != 0 && charPos > 0);

    return charPos;
}   

analyze

  • CODE-1

    int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
    All data in the computer exists in the form of binary (complement code), here is to obtain the binary code length of val in the computer, numberOfLeadingZeros indicates the number of high bits all 0, Integer.SIZE is 32, which represents 4 of Integer in java Bytes are 32 bits. mag is the binary digits with all 0's removed.insert image description here

  • CODE-2

    int chars = Math.max(((mag + (shift - 1)) / shift), 1);
    Here is to calculate the number of digits in the base system to be converted to by val, that is, the length of the result string. shift represents the number of binary digits required for one bit in the base system to be converted. (mag + (shift - 1))/shiftThat is mag / shiftrounded up. (For the proof, refer to my other blog [java interview question-algorithm] how to round up the division? For example, octal shift is 3, binary has 10 bits, chars is 10/3 rounded up to 4

  • CODE-3

    char[] buf = new char[chars];
    Initialize the result string buf. The string in java is actually a char array, and the built-in encoding method of java is UTF-16 of unicode. (If you want to know more about the encoding method, please see the new solution to the old question of character encoding, which is entangled (ASCII, GBK, GB2312, GB18030, UNICODE, UTF-8, UTF-16, UTF-32)

  • CODE-4

    int radix = 1 << shift;
    int mask = radix - 1;
    1 Shifting the shift bit to the left actually gets the binary expression of the weights to be converted into bases . The binary weight is 2 (shift=1 is represented by 10) , the octal weight is 8 (shift=3 is represented by 1000) , and the hexadecimal weight is 16 (shift=4 is represented by 10000) , respectively subtract 1 Obtain the mask, that is, the maximum number of bits in the target system, which is 1 (1) in binary, 7 (111) in octal , and 15 (1111) in hexadecimal .

  • CODE-5

    do {   buf[offset + --charPos] = Integer.digits[val & mask];   val >>>= shift; } while (val != 0 && charPos > 0);


    mask is the maximum number of digits in the target base, val & maskwhich means that val is converted to the lower value of the target base, digits is an array of 36 characters stored in [0-z], subscript 0 corresponds to '0', 1 corresponds to '1 ', 15 corresponds to 'F'.

    So it val & maskis assumed that the subscript gets the digit character as the low-order character of the target base of val. val >>>= shiftEquivalent to value=value>>>shift. From low to high, every time you get a bit, move the position to the right shift (Represents the number of binary digits required for one bit in the base system to be converted) . Judge byte by byte. Then put the result into buf.

    For example, val is 9 in decimal, its binary expression is 1001, mask is 111, digit[1001& 111] = digit[1] = '1'. Then 1001 >>>= 3 = 1. Finally execute digit[1&111] = digit[1] = '1'. So the octal representation of 9 is 11.

Guess you like

Origin blog.csdn.net/kiramario/article/details/115503141