Java Integer.toBinaryString() method source code and principle analysis (base conversion, bit operation)


title: Java Integer.toBinaryString() method source code and principle analysis (base conversion, bit operation)
date: 2022-12-27 17:31:38
tags:
  • Java
categories:
  • Java
cover: https://cover.png
feature: false

1. Overview of usage and source code

Integer.toBinaryString() converts an int to a string of binary digits. Leading zeros are omitted, and a negative argument is shown as its 32-bit two's complement bit pattern.
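As a minimal usage sketch (the class name ToBinaryStringUsage and the helper describe are made up for illustration and are not part of the JDK), the method can be tried like this:

```java
// Usage sketch; the class and helper names are illustrative, not from the JDK.
class ToBinaryStringUsage {
    // Pairs a value with its binary string for printing.
    static String describe(int value) {
        return value + " -> " + Integer.toBinaryString(value);
    }

    public static void main(String[] args) {
        System.out.println(describe(15));  // 15 -> 1111
        System.out.println(describe(16));  // 16 -> 10000
        System.out.println(describe(0));   // 0 -> 0
        // Negative input: 32 digits, 28 ones followed by 0001.
        System.out.println(describe(-15));
    }
}
```

The positive results carry no leading zeros, while the negative result always has 32 digits.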

The complete source code call is as follows:

public static String toBinaryString(int i) {
    return toUnsignedString0(i, 1);
}

private static String toUnsignedString0(int val, int shift) {
    // assert shift > 0 && shift <=5 : "Illegal shift value";
    int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
    int chars = Math.max(((mag + (shift - 1)) / shift), 1);
    char[] buf = new char[chars];

    formatUnsignedInt(val, shift, buf, 0, chars);

    // Use special constructor which takes over "buf".
    return new String(buf, true);
}

public static int numberOfLeadingZeros(int i) {
    // HD, Figure 5-6
    if (i == 0)
        return 32;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}

static int formatUnsignedInt(int val, int shift, char[] buf, int offset, int len) {
    int charPos = len;
    int radix = 1 << shift;
    int mask = radix - 1;
    do {
        buf[offset + --charPos] = Integer.digits[val & mask];
        val >>>= shift;
    } while (val != 0 && charPos > 0);

    return charPos;
}

final static char[] digits = {
    '0' , '1' , '2' , '3' , '4' , '5' ,
    '6' , '7' , '8' , '9' , 'a' , 'b' ,
    'c' , 'd' , 'e' , 'f' , 'g' , 'h' ,
    'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
    'o' , 'p' , 'q' , 'r' , 's' , 't' ,
    'u' , 'v' , 'w' , 'x' , 'y' , 'z'
};

2. Analysis

2.1 Binary conversion

First, consider how a decimal number is converted to binary by hand. There are two common methods. Only 8 bits are used here for demonstration.

1. Short division

Keep dividing by 2 until the quotient is 0, then read the remainders in reverse order (the last remainder is the highest bit). Examples: 15 and 16

2| 15                      2| 16  
  ————                       ————
  2| 7           1  ^        2| 8             0  ^
    ————            |          ————              |
    2| 3         1  |          2| 4           0  |
      ————          |            ————            |
      2| 1       1  |            2| 2         0  |
        ————        |              ————          |
           0     1  |              2| 1       0  |
                                     ————        |
                                        0     1  |

From the above, the binary representation of 15 is 0000 1111, and the binary representation of 16 is 0001 0000
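The short division procedure can be sketched in Java as follows. This is only an illustration of the hand method for non-negative values, not how the JDK does it; the class name ShortDivision and the method toBinary are assumptions made for this example.

```java
// Sketch of the short division method; names are illustrative only.
class ShortDivision {
    // Repeatedly divides by 2. Remainders appear lowest bit first,
    // so each one is inserted at the front of the result.
    static String toBinary(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder bits = new StringBuilder();
        while (value != 0) {
            bits.insert(0, value % 2); // the remainder is the next bit
            value /= 2;                // the quotient feeds the next step
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(15)); // 1111
        System.out.println(toBinary(16)); // 10000
    }
}
```

Unlike the 8-bit diagrams above, this sketch does not pad the result with leading zeros.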

2. Weighted addition method

Write the number as a sum of powers of 2 (a weighted expansion). Each power of 2 is the weight of one binary digit, counted from the right, and a digit is 1 exactly when its weight appears in the sum

2 to the power of 0 is 1 -------------- corresponds to the 1st digit
2 to the power of 1 is 2 -------------- corresponds to the 2nd digit
2 to the power of 2 is 4 -------------- corresponds to the 3rd digit
2 to the power of 3 is 8 -------------- corresponds to the 4th digit
2 to the power of 4 is 16 ------------- corresponds to the 5th digit
2 to the power of 5 is 32 ------------- corresponds to the 6th digit
2 to the power of 6 is 64 ------------- corresponds to the 7th digit
...

Example: 15 = 2^3 + 2^2 + 2^1 + 2^0, 16 = 2^4, namely:

15            16
0000 0000     0000 0000
0000 1000     0001 0000
0000 1100
0000 1110
0000 1111
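The weighted addition method can be sketched as a loop over the weights from 128 down to 1. The class WeightedExpansion and the method toBinary8 are hypothetical names for this illustration, and the 8-bit limit (0 to 255) only mirrors the demonstration above.

```java
// Sketch of the weighted addition method for 8 bits; names are illustrative.
class WeightedExpansion {
    // Walks the weights 128, 64, ..., 1 and sets a bit for every weight
    // that still fits into the remaining value.
    static String toBinary8(int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("expected 0..255");
        }
        char[] bits = "00000000".toCharArray();
        int remaining = value;
        for (int exp = 7; exp >= 0; exp--) {
            int weight = 1 << exp;        // 2 to the power of exp
            if (weight <= remaining) {
                bits[7 - exp] = '1';      // this weight is part of the sum
                remaining -= weight;
            }
        }
        return new String(bits);
    }

    public static void main(String[] args) {
        System.out.println(toBinary8(15)); // 00001111
        System.out.println(toBinary8(16)); // 00010000
    }
}
```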

2.2 Original code, inverse code, complement code

Next, some background on the three signed encodings: original code (sign-magnitude), inverse code (ones' complement), and complement code (two's complement). Again, only 8 bits are used for demonstration

1. Original code

The original code is the plain binary conversion shown in 2.1. For example, the original code of 15 is 0000 1111. Section 2.1 only used positive numbers; to tell positive and negative numbers apart, a sign bit is introduced. The first (highest) bit is the sign bit: 0 for a positive number and 1 for a negative number. The sign bit is not part of the magnitude and is left alone by the conversions below

Example: -15, the original code is 1000 1111

15 original code: 0000 1111
-15 original code: 1000 1111

2. Inverse code

The inverse code of a negative number is obtained by inverting every bit of the original code except the sign bit. The inverse code of a positive number is the same as its original code. Example: the original code of -15 is 1000 1111; keeping the sign bit and inverting the other bits gives 1111 0000

15 inverse code: 0000 1111, the same as the original code

-15 original code: 1000 1111

-15 inverse code: 1111 0000

3. Complement code

The complement code of a negative number is its inverse code plus 1; in full, invert the original code (except the sign bit) and add 1. The complement code of a positive number is the same as its original code. Example: the inverse code of -15 is 1111 0000, and adding 1 gives 1111 0001

15 complement code: 0000 1111, the same as the original code

-15 original code: 1000 1111
-15 inverse code: 1111 0000
-15 complement code: 1111 0001
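The three 8-bit encodings can be reproduced with a small sketch. The class SignedEncodings8 and all of its methods are hypothetical helpers written for this illustration; they accept values from -127 to 127, the range that all three encodings can represent in 8 bits.

```java
// Sketch of the three 8-bit encodings; all names are illustrative.
class SignedEncodings8 {
    // Formats the low 8 bits of v as two groups of four, e.g. "1000 1111".
    static String bits8(int v) {
        String s = Integer.toBinaryString((v & 0xFF) | 0x100).substring(1);
        return s.substring(0, 4) + " " + s.substring(4);
    }

    static void check(int v) {
        if (v < -127 || v > 127) {
            throw new IllegalArgumentException("expected -127..127");
        }
    }

    // Original code: sign bit followed by the 7-bit magnitude.
    static String original(int v) {
        check(v);
        int magnitude = Math.abs(v);
        return bits8(v < 0 ? 0x80 | magnitude : magnitude);
    }

    // Inverse code: for negatives, invert the magnitude bits, keep the sign bit.
    static String inverse(int v) {
        check(v);
        int magnitude = Math.abs(v);
        return bits8(v < 0 ? 0x80 | (~magnitude & 0x7F) : magnitude);
    }

    // Complement code: Java ints are already two's complement,
    // so the low 8 bits of v are the answer.
    static String complement(int v) {
        check(v);
        return bits8(v);
    }

    public static void main(String[] args) {
        System.out.println(original(-15));   // 1000 1111
        System.out.println(inverse(-15));    // 1111 0000
        System.out.println(complement(-15)); // 1111 0001
        System.out.println(complement(15));  // 0000 1111
    }
}
```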

For a positive number the original code, inverse code, and complement code are identical, so its binary representation is simply the original code (which is also its complement code, with nothing to compute). A negative number is represented by its complement code (two's complement). Since int is 32 bits, Integer.toBinaryString() prints all 32 bits for a negative number, starting with a long run of 1s, while for a positive number the leading 0s are dropped
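A quick way to check that Java stores negative ints in complement code is to negate a value by hand, inverting all bits and adding 1. The sketch below does that; the class name TwosComplementCheck and the method negateByHand are assumptions for this example.

```java
// Sketch: negation as "invert and add 1"; names are illustrative.
class TwosComplementCheck {
    static int negateByHand(int value) {
        return ~value + 1; // invert all 32 bits, then add 1
    }

    public static void main(String[] args) {
        System.out.println(negateByHand(15));  // -15
        String bits = Integer.toBinaryString(-15);
        System.out.println(bits.length());      // 32
        System.out.println(bits.substring(24)); // 11110001
    }
}
```

The last 8 digits match the 8-bit complement code 1111 0001 derived above.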

2.3 Bitwise operators

Next, the bit operations used by the source code, again demonstrated with only 8 bits

1. <<: Bitwise left shift operator

Shifts the binary representation left by the specified number of bits and fills the vacated low bits with 0. For example, 15 << 2 is 0000 1111 << 2, which gives 0011 1100, or 60 in decimal

0000 1111
0011 1100


Note: 1 << n is the nth power of 2, because each left shift by 1 bit multiplies the value by 2 (as long as the result still fits in an int)
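A short sketch of both statements, the shift itself and the power-of-two shortcut. The class LeftShiftDemo and the helper powerOfTwo are illustrative names only.

```java
// Sketch of the left shift operator; names are illustrative.
class LeftShiftDemo {
    // 2 to the power of n for 0 <= n <= 30 (larger n overflows a positive int).
    static int powerOfTwo(int n) {
        return 1 << n;
    }

    public static void main(String[] args) {
        System.out.println(15 << 2);        // 60
        System.out.println(powerOfTwo(3));  // 8
        System.out.println(powerOfTwo(10)); // 1024
    }
}
```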


2. >>: Bitwise right shift operator

Shifts the binary representation right by the specified number of bits. For example, 15 >> 2 is 0000 1111 >> 2, which gives 0000 0011, or 3 in decimal. The vacated high bits are filled with the sign bit: 0 for a positive number and 1 for a negative number

0000 1111
0000 0011


A right shift by 1 bit behaves like division by 2, rounded down. For an even number, the positive and negative results have the same magnitude (16 >> 1 is 8 and -16 >> 1 is -8). For an odd number, the magnitude of the negative result is 1 greater than that of the positive result (15 >> 1 is 7 but -15 >> 1 is -8)
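The even and odd cases can be checked directly. In this sketch the class RightShiftDemo and the helper halve are hypothetical names; the comparison with the / operator is added only to show that integer division truncates toward zero instead of rounding down.

```java
// Sketch of the signed right shift; names are illustrative.
class RightShiftDemo {
    static int halve(int value) {
        return value >> 1; // sign bit is copied into the vacated high bit
    }

    public static void main(String[] args) {
        System.out.println(15 >> 2);    // 3
        System.out.println(-15 >> 2);   // -4
        System.out.println(halve(16));  // 8
        System.out.println(halve(-16)); // -8
        System.out.println(halve(15));  // 7
        System.out.println(halve(-15)); // -8
        System.out.println(-15 / 2);    // -7, division truncates toward zero
    }
}
```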

3. >>>: Unsigned right shift (zero-fill) operator

Shifts the binary representation right by the specified number of bits and always fills the vacated high bits with 0. The sign bit gets no special treatment, hence the name unsigned right shift. For a positive number the result is the same as with >>, because the high bits are already 0; for a negative number the result becomes a positive value. For example, -15 >>> 2 in the 8-bit demonstration is 1111 0001 >>> 2, which gives 0011 1100. On a real 32-bit int the result is 1073741820, and Integer.toBinaryString() prints it with 30 digits, two fewer than for -15, because the two leading 0s are omitted:

1111 0001
0011 1100
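The difference between >> and >>> on a negative value can be seen in a few lines. The class UnsignedShiftDemo and the helper unsignedShift are names assumed for this sketch.

```java
// Sketch of the unsigned right shift; names are illustrative.
class UnsignedShiftDemo {
    static int unsignedShift(int value, int bits) {
        return value >>> bits; // vacated high bits are always 0
    }

    public static void main(String[] args) {
        System.out.println(unsignedShift(15, 2));  // 3
        System.out.println(-15 >> 2);              // -4
        System.out.println(unsignedShift(-15, 2)); // 1073741820
        // Two leading zeros are dropped, so 30 digits remain.
        System.out.println(Integer.toBinaryString(unsignedShift(-15, 2)).length()); // 30
    }
}
```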

4. &: Bitwise AND operator

A result bit is 1 only if both corresponding bits are 1; otherwise it is 0. Examples: 15 & 16 and 3 & 7

15: 0000 1111     3: 0000 0011
16: 0001 0000     7: 0000 0111
    0000 0000        0000 0011

Then the result of 15 & 16 is 0, and the result of 3 & 7 is 3
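The sketch below repeats the two examples and adds one more use of &: keeping only the lowest bits of a value with a mask of all 1s, which is how the source code analysed later extracts one digit at a time. The class AndDemo and the helper lowBits are hypothetical names.

```java
// Sketch of bitwise AND and masking; names are illustrative.
class AndDemo {
    // Keeps the lowest `count` bits of value (1 <= count <= 30).
    static int lowBits(int value, int count) {
        int mask = (1 << count) - 1; // count ones, e.g. 111 for count = 3
        return value & mask;
    }

    public static void main(String[] args) {
        System.out.println(15 & 16);        // 0
        System.out.println(3 & 7);          // 3
        System.out.println(lowBits(45, 3)); // 5, since 45 is 101 101 in binary
    }
}
```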

2.4 Source code analysis

1. Start with the top-level method. toBinaryString() simply calls toUnsignedString0(). The parameter i is the value to convert. The second argument, shift, is the number of binary bits that make up one digit of the target base: 1 for binary, 3 for octal, and 4 for hexadecimal

public static String toBinaryString(int i) {
    return toUnsignedString0(i, 1);
}

public static String toOctalString(int i) {
    return toUnsignedString0(i, 3);
}

public static String toHexString(int i) {
    return toUnsignedString0(i, 4);
}
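Since toUnsignedString0() is private, the relationship between shift and the three public methods can only be shown from the outside. The sketch below maps a shift value to the matching public method; the class RadixWrappers and the helper byShift are hypothetical and exist only for this illustration.

```java
// Sketch: the three public wrappers selected by shift; names are illustrative.
class RadixWrappers {
    // Hypothetical helper that picks the public method for a given shift.
    static String byShift(int value, int shift) {
        switch (shift) {
            case 1: return Integer.toBinaryString(value); // base 2 = 1 << 1
            case 3: return Integer.toOctalString(value);  // base 8 = 1 << 3
            case 4: return Integer.toHexString(value);    // base 16 = 1 << 4
            default:
                throw new IllegalArgumentException("unsupported shift: " + shift);
        }
    }

    public static void main(String[] args) {
        System.out.println(byShift(255, 1)); // 11111111
        System.out.println(byShift(255, 3)); // 377
        System.out.println(byShift(255, 4)); // ff
    }
}
```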

2. Now look at toUnsignedString0(). It first calls Integer.numberOfLeadingZeros(), which counts the consecutive 0 bits at the high end of the binary representation. Subtracting this count from Integer.SIZE (32) gives mag, the number of bits that actually need to be represented. For example:

15: 0000 0000 0000 0000 0000 0000 0000 1111, the full 32-bit representation
15: 1111, the representation actually returned

In other words, this step drops the leading 0s and keeps only the bits that need to be represented

private static String toUnsignedString0(int val, int shift) {
    // assert shift > 0 && shift <=5 : "Illegal shift value";
    int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
    int chars = Math.max(((mag + (shift - 1)) / shift), 1);
    char[] buf = new char[chars];

    formatUnsignedInt(val, shift, buf, 0, chars);

    // Use special constructor which takes over "buf".
    return new String(buf, true);
}

@Native public static final int SIZE = 32;
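The value of mag for a few inputs can be computed with the same expression, using only public JDK methods. The class SignificantBits and its method mag are names assumed for this sketch.

```java
// Sketch: number of significant bits, as computed for mag; names are illustrative.
class SignificantBits {
    static int mag(int val) {
        return Integer.SIZE - Integer.numberOfLeadingZeros(val);
    }

    public static void main(String[] args) {
        System.out.println(mag(15));  // 4
        System.out.println(mag(16));  // 5
        System.out.println(mag(1));   // 1
        System.out.println(mag(0));   // 0, all 32 bits are leading zeros
        System.out.println(mag(-15)); // 32, the sign bit is already 1
    }
}
```

The result 0 for the input 0 is the case that Math.max(..., 1) in the next line of the source has to handle.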

3. Next, the implementation of Integer.numberOfLeadingZeros(). It uses a simple binary search (dichotomy) with the shift amounts [16, 24, 28, 30]. Why the last interval [30, 32] gets no check of its own is explained in step 6 below. Here n is the number of consecutive high-order 0s

  1. First check whether i is 0. If so, all 32 bits are 0, so return 32 directly
  2. Then check whether i >>> 16 is 0, that is, whether the upper half is all 0s. If so, there are at least 16 leading 0s: add 16 to n, then i <<= 16 to shift those 16 zero bits out so that the later checks look at the remaining bits
  3. Check whether i >>> 24 is 0, that is, whether the top 8 bits are all 0s. If so, add 8 to n, then i <<= 8
  4. In the same way, check whether the top 4 bits are all 0s
  5. In the same way, check whether the top 2 bits are all 0s; this is the i >>> 30 check
  6. What remains is the top 2 bits, the interval [30, 32]. There are four possible values [00, 01, 10, 11]. Because i is not 0 and every all-zero prefix found so far has been shifted out, 00 is excluded, which leaves [01, 10, 11]. If the lower of the two bits is 1 (x1), only the highest bit x decides whether there is one more leading 0; if the lower bit is 0 (x0), then x must be 1 because 00 is excluded. Either way it is enough to look at the single highest bit, which is why the interval [30, 32] needs no separate check.
    This is also why n starts at 1: the highest bit is assumed to be 0, and n -= i >>> 31 then corrects the assumption. If the bit is 0, then n = n - 0 and nothing changes; if it is 1, then n = n - 1, which removes the initial 1. This trick replaces the check if (i >>> 31 == 0) { n += 1; }. The original version and the equivalent version with the explicit check are as follows

public static int numberOfLeadingZeros(int i) {
    // HD, Figure 5-6
    if (i == 0)
        return 32;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}

// Equivalent variant: n starts at 0 and the highest bit is tested explicitly
public static int numberOfLeadingZeros(int i) {
    // HD, Figure 5-6
    if (i == 0)
        return 32;
    int n = 0;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    if (i >>> 31 == 0) { n +=  1; }
    return n;
}
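To see what the binary search saves, it can be compared with a straightforward scan that tests one bit at a time from the top. The sketch below is such a reference version, checked against Integer.numberOfLeadingZeros(); the class LeadingZerosCheck and the method naive are names assumed for this example and are not part of the JDK.

```java
// Sketch: a bit-by-bit reference for the leading-zero count; names are illustrative.
class LeadingZerosCheck {
    // Tests bits 31, 30, ... until the first 1 bit; up to 32 checks,
    // where the binary search above needs at most five.
    static int naive(int i) {
        int n = 0;
        for (int bit = 31; bit >= 0 && ((i >>> bit) & 1) == 0; bit--) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 15, 16, 65535, 65536, Integer.MAX_VALUE, -1, -15};
        for (int value : samples) {
            // Both columns should agree for every sample.
            System.out.println(value + ": " + naive(value)
                    + " / " + Integer.numberOfLeadingZeros(value));
        }
    }
}
```

For 15, for example, both versions report 28 leading zeros, which matches mag = 32 - 28 = 4.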

4. Back in toUnsignedString0(), the call to numberOfLeadingZeros() gives the number of leading 0s, and subtracting it from Integer.SIZE gives mag, the number of significant bits

Then Math.max(((mag + (shift - 1)) / shift), 1) computes the length of the character array for base 2/8/16. The expression (mag + (shift - 1)) / shift is mag / shift rounded up, since each output digit covers shift bits, and Math.max(..., 1) guarantees one character for the value 0. As mentioned above, shift is the number of bits per digit: 1 for binary, 3 for octal, and 4 for hexadecimal

After the length is known, the character array is created and formatUnsignedInt() is called to fill it

private static String toUnsignedString0(int val, int shift) {
    // assert shift > 0 && shift <=5 : "Illegal shift value";
    int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
    int chars = Math.max(((mag + (shift - 1)) / shift), 1);
    char[] buf = new char[chars];

    formatUnsignedInt(val, shift, buf, 0, chars);

    // Use special constructor which takes over "buf".
    return new String(buf, true);
}
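The length calculation can be checked in isolation. This sketch copies the two lines that compute mag and chars into a helper and compares the result with the lengths of the strings the public methods return; the class DigitCount and the method chars are illustrative names.

```java
// Sketch: digit count per base, same arithmetic as in toUnsignedString0; names are illustrative.
class DigitCount {
    static int chars(int val, int shift) {
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
        return Math.max((mag + (shift - 1)) / shift, 1); // mag / shift rounded up, at least 1
    }

    public static void main(String[] args) {
        System.out.println(chars(15, 1));  // 4  -> "1111"
        System.out.println(chars(15, 3));  // 2  -> "17"
        System.out.println(chars(15, 4));  // 1  -> "f"
        System.out.println(chars(0, 1));   // 1  -> "0"
        System.out.println(chars(-15, 3)); // 11, since 32 bits need 11 octal digits
    }
}
```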

5. The formatUnsignedInt() method is shown below. The parameter val is the value to convert; shift is the number of bits per digit, here 1; buf is the character array just created; offset is the start offset within buf, here 0; len is the number of characters to write. The loop body uses Integer.digits, a predefined character array containing all digits and lowercase letters

The method fills the array one digit at a time

  1. Assign the length to charPos, which then serves as the write position

  2. radix = 1 << shift. As mentioned earlier, 1 << n is 2 to the nth power, so radix is the base: 2 to the 1st power for binary, 2 to the 3rd power for octal, and 2 to the 4th power for hexadecimal

  3. mask is the base minus one. It is ANDed with val to extract the lowest shift bits, which together form one digit of the target base. Example: shift is 3, that is, octal, and mask is 7

    radix = 1 << 3 = 8
    mask = radix - 1 = 7

    In binary the mask is 111, that is, 3 binary bits represent one octal digit

  4. val & mask is used as an index into the digits array, and the character found there is stored in buf. Note that buf is filled from right to left, because the lowest digit is extracted first

  5. val is then shifted right (unsigned) by shift bits, and the loop repeats until val is 0 or the array is full

  6. Finally the method returns charPos, the index of the first character written; the digits themselves are already in buf

static int formatUnsignedInt(int val, int shift, char[] buf, int offset, int len) {
    int charPos = len;
    int radix = 1 << shift;
    int mask = radix - 1;
    do {
        buf[offset + --charPos] = Integer.digits[val & mask];
        val >>>= shift;
    } while (val != 0 && charPos > 0);

    return charPos;
}

final static char[] digits = {
    '0' , '1' , '2' , '3' , '4' , '5' ,
    '6' , '7' , '8' , '9' , 'a' , 'b' ,
    'c' , 'd' , 'e' , 'f' , 'g' , 'h' ,
    'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
    'o' , 'p' , 'q' , 'r' , 's' , 't' ,
    'u' , 'v' , 'w' , 'x' , 'y' , 'z'
};
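Because Integer.digits and the String constructor used by the JDK are not accessible from user code, the whole process can only be replayed with stand-ins. The sketch below is such a replay under those assumptions: the class UnsignedFormatter, its own DIGITS table, and the method toUnsignedString are all made up for this example, and the public new String(char[]) constructor replaces the internal one.

```java
// Sketch: a stand-alone replay of the mask-and-shift loop; names are illustrative.
class UnsignedFormatter {
    // Local stand-in for the JDK's internal digits table.
    private static final char[] DIGITS =
            "0123456789abcdefghijklmnopqrstuvwxyz".toCharArray();

    static String toUnsignedString(int val, int shift) {
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
        int chars = Math.max((mag + (shift - 1)) / shift, 1);
        char[] buf = new char[chars];

        int charPos = chars;
        int mask = (1 << shift) - 1;             // shift ones, e.g. 111 for octal
        do {
            buf[--charPos] = DIGITS[val & mask]; // lowest digit, written right to left
            val >>>= shift;                      // drop the bits just consumed
        } while (val != 0 && charPos > 0);

        return new String(buf);                  // public constructor, copies buf
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedString(15, 1));  // 1111
        System.out.println(toUnsignedString(255, 3)); // 377
        System.out.println(toUnsignedString(255, 4)); // ff
        System.out.println(toUnsignedString(0, 1));   // 0
    }
}
```

The results can be compared with Integer.toBinaryString(), Integer.toOctalString(), and Integer.toHexString() for the same inputs.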

6. Back in toUnsignedString0(), the last step converts the character array into a String and returns it. The constructor used, new String(buf, true), is internal to the JDK and takes over buf without copying it. The whole process ends here

private static String toUnsignedString0(int val, int shift) {
    // assert shift > 0 && shift <=5 : "Illegal shift value";
    int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
    int chars = Math.max(((mag + (shift - 1)) / shift), 1);
    char[] buf = new char[chars];

    formatUnsignedInt(val, shift, buf, 0, chars);

    // Use special constructor which takes over "buf".
    return new String(buf, true);
}
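The constructor that "takes over" the array is not available outside the JDK. User code has to use the public new String(char[]) constructor, which copies the array, as the following sketch suggests. The class StringFromChars and the method fromBuffer are hypothetical names for this illustration.

```java
// Sketch: the public String(char[]) constructor copies its input; names are illustrative.
class StringFromChars {
    static String fromBuffer(char[] buf) {
        return new String(buf); // copies the characters out of buf
    }

    public static void main(String[] args) {
        char[] buf = {'1', '1', '1', '1'};
        String s = fromBuffer(buf);
        buf[0] = '0';          // changing buf afterwards...
        System.out.println(s); // 1111, ...does not affect the string
    }
}
```

Skipping that copy is presumably why the JDK uses its internal constructor here: buf is a fresh array that nothing else refers to, so handing it over directly is safe.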


Origin blog.csdn.net/ACE_U_005A/article/details/128456243