The relationship between & 0xFF and the conversion between byte and int types

Every time I encounter bitwise operations such as AND or shifts, I get confused: I look them up, understand them for a moment, and then forget again. My understanding was never deep or clear enough, and I rarely use them at work, so I am writing this down.

While browsing through some code, I stumbled on a snippet that was hard to understand.

        byte[] bs = digest.digest(origin.getBytes(Charset.forName(charsetName)));
        for (int i = 0; i < bs.length; i++) {
            int c = bs[i] & 0xFF;              // mask to get the unsigned value 0..255
            if (c < 16) {
                sb.append("0");                // pad single hex digits to two characters
            }
            sb.append(Integer.toHexString(c));
        }
        return sb.toString();

bs is the byte array produced by MD5-hashing a string. At first I could not understand why bs[i] & 0xFF is assigned to an int inside the loop.

bs[i] is an 8-bit binary value, and 0xFF written as an 8-bit binary number is 11111111. So isn't bs[i] & 0xFF just bs[i] itself? What is the point?
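To see what actually goes wrong without the mask, here is a minimal standalone sketch (the class name HexDemo is my own, not from the original code). Integer.toHexString takes an int, so a negative byte is sign-extended first and the hex string picks up 24 extra one-bits:

```java
public class HexDemo {
    public static void main(String[] args) {
        byte b = -127; // stored bit pattern: 10000001

        // Without the mask: b is sign-extended to the int -127 first
        System.out.println(Integer.toHexString(b));        // ffffff81

        // With the mask: the high 24 bits are cleared, leaving 0x81
        System.out.println(Integer.toHexString(b & 0xFF)); // 81
    }
}
```

This is why the hex-string loop above would produce eight characters per negative byte instead of two if the mask were dropped.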

Later I wrote a small demo:

package jvmProject;

public class Test {

    public static void main(String[] args) {
        byte[] a = new byte[10];
        a[0] = -127;
        System.out.println(a[0]);
        int c = a[0] & 0xff;
        System.out.println(c);
    }
}

I print a[0] first, then the value of a[0] & 0xff. I originally expected both results to be -127.

But the result was unexpected:

-127

129

Why? Could & 0xff have changed the value?

I really did not understand it, so I started thinking about it in terms of two's complement.

I remembered from computer architecture class that computers store integers in two's-complement binary.

Let's review the three representations: sign-magnitude (original code), ones' complement (inverse code), and two's complement.

For a positive number (e.g. 00000001), the first bit is the sign bit, and both the ones' complement and the two's complement are the number itself.
For a negative number, e.g. -1 with sign-magnitude form 10000001, the ones' complement inverts every bit except the sign bit, giving 11111110, and the two's complement adds 1 to the ones' complement, giving 11111111.
The concept is as simple as that.
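As a quick check of the rule above, a small sketch (class name ComplementDemo is mine) that prints the 8 bits a byte actually stores, using the very mask this article is about:

```java
public class ComplementDemo {
    public static void main(String[] args) {
        // -1:   sign-magnitude 10000001 -> ones' complement 11111110
        //       -> two's complement 11111111, which is what the byte stores
        System.out.println(Integer.toBinaryString((byte) -1 & 0xFF));   // 11111111

        // -127: sign-magnitude 11111111 -> ones' complement 10000000
        //       -> two's complement 10000001
        System.out.println(Integer.toBinaryString((byte) -127 & 0xFF)); // 10000001
    }
}
```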

When -127 is assigned to a[0], a[0] is a byte type, and its complement stored in the computer is 10000001 (8 bits).

When a[0] is printed to the console, it is first converted to an int, and the JVM performs sign extension. Because int is 32 bits, the two's complement after extension is 11111111 11111111 11111111 10000001 (32 bits). This 32-bit two's complement also represents -127.

Notice that in the byte -> int conversion, the stored two's complement went from 10000001 (8 bits) to 11111111 11111111 11111111 10000001 (32 bits), yet the decimal number represented by the two patterns is still the same.
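The sign extension can be observed directly with Integer.toBinaryString (the class name SignExtendDemo is my own):

```java
public class SignExtendDemo {
    public static void main(String[] args) {
        byte b = -127;   // stored as 10000001
        int widened = b; // implicit byte -> int conversion: sign extension

        // 24 leading ones followed by 10000001
        System.out.println(Integer.toBinaryString(widened));

        // the decimal value is unchanged by the conversion
        System.out.println(widened); // -127
    }
}
```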

But is keeping the decimal value consistent always what I want from a byte -> int conversion?

Maybe not. For example, when we read a file into a byte array, do we care about the decimal value of each byte? What we care about is the underlying binary bit pattern, right?

So now you can probably guess why a byte is ANDed with 0xff before being assigned to an int: the essential reason is to keep the binary content, the low 8 bits, consistent.
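For instance, treating bytes as raw unsigned data; this is a sketch with made-up values, and since Java 8 the same mask is also available as the library method Byte.toUnsignedInt:

```java
public class UnsignedBytesDemo {
    public static void main(String[] args) {
        // hypothetical raw bytes, e.g. read from a file
        byte[] data = { (byte) 0x00, (byte) 0x7F, (byte) 0x80, (byte) 0xFF };

        for (byte b : data) {
            int unsigned = b & 0xFF; // always in 0..255, low 8 bits preserved
            System.out.println(b + " -> " + unsigned);
        }
    }
}
```

This prints 0 -> 0, 127 -> 127, -128 -> 128, -1 -> 255: the decimal value changes for negative bytes, but the bit pattern does not.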

When a negative byte is converted to int, sign extension fills the high 24 bits with 1, so the 32-bit two's complement no longer matches the original 8 bits. & 0xff sets the high 24 bits to 0 and keeps the low 8 bits as they are. The purpose is to preserve the original binary data.

Of course, while this preserves the binary data, interpreting the same bits as a byte versus as an int can yield different decimal values, because the position of the sign bit has changed.

In example 2, int c = a[0] & 0xff computes 11111111 11111111 11111111 10000001 & 00000000 00000000 00000000 11111111 = 00000000 00000000 00000000 10000001, which is 129.

So c prints 129. Someone asked why a[0] in the formula above is 32 bits rather than 8. This is because when a byte takes part in an operation with an int, it is first widened to 32 bits by sign extension (the sign bit is copied into the high bits), and only then does the operation run. The literal 0xff is actually of type int, so a[0] & 0xff is an operation between a byte and an int.
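A sketch of that promotion rule (class name PromotionDemo is mine): because 0xFF is an int literal, the byte operand is widened before the &, and the whole expression has type int, so assigning it back to a byte would not even compile without a cast:

```java
public class PromotionDemo {
    public static void main(String[] args) {
        byte b = -127;

        // b is promoted to int (11111111 11111111 11111111 10000001),
        // then ANDed with 00000000 00000000 00000000 11111111
        int c = b & 0xFF;

        // byte c2 = b & 0xFF; // compile error: the result of & here is int
        System.out.println(c); // 129
    }
}
```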


Origin blog.csdn.net/zch981964/article/details/132792875