A detailed introduction to Java binary and bit operations

insert image description here

foreword

Off-topic: The topic of Jackson is very small, and I almost don't have the confidence to write it down because of the amount of reading. But what I said is a debt I owe , and I have to disclose the promised payment content even if I stay up late. After all, there are still a few people who are prostitutes for nothing.

Voiceover: In the future, keep your head down and do things, stop bragging

Although a small crowd, there are still friends who want to know more about the first wave, which really inspired me for three seconds. In this case, let's do it. This article will first understand the bit operations in Java . Bit operations are rarely used in Java, so why is it so popular in Jackson? Everything is two words: performance / efficiency . To deal with it in a language that the computer can directly understand, you don't need to think too much about whether it is fast or not.
insert image description here

text

Mentioning bit operations is a familiar and unfamiliar feeling to most Java programmers . Familiarity is because you must have learned it when you were learning JavaSE, and you can see it when looking at some open source frameworks (especially JDK source code); unfamiliarity is because there is a high probability that we will not use it. Of course, there are reasons why it cannot be "popular": it is difficult to understand, it does not conform to human thinking, and it is poorly readable...

Tips: Generally speaking, it is more important for people to understand the program than to be understood by the machine

Bitwise operations low-levelare used more often in languages ​​like Java, but they are rarely mentioned in high-level languages ​​like Java. Although we rarely use it, Java also supports it. After all, it is best practice to use bit operations in many cases .

Bit operations are rarely used in daily development, but clever use of bit operations can greatly reduce operating overhead and optimize algorithms. A statement may have little effect on the code, but it will save a lot of overhead in the case of high repetition and large data volume.

binary

Before understanding what bit operations are, it is very necessary to popularize the concept of binary.

Binary is a number system widely used in computing technology. Binary data is a number represented by two digits of 0 and 1. Its base is 2, the carry rule is every two into one , and the borrow rule is borrow one as two . Because it only uses two digital symbols of 0 and 1, it is very simple and convenient, and it is easy to realize electronically.

Tips: Semiconductor ON means 1, OFF means 0, this is the bottom-level principle of CPU computing??

Let's look at an example first:

1011(二进制)+ 11(二进制) 的和?
结果为:1110(二进制)
12

Binary is very, very simple to understand, much simpler than decimal. You may also think about how to convert binary and decimal? After all, 1110 can't see it either. Or go deeper and continue thinking: how to convert to octal, hexadecimal, 30-binary... Hexadecimal conversion is not what this article wants to talk about, please do it yourself if you are interested.

Binary and Encoding

Although this is not very related to the content of this article, but by the way, it is still relatively common in development.

Computers can only recognize 1 and 0, that is, binary, 1 and 0 can express all the characters and language symbols in the world. So how to express words and symbols? This involves character encoding . Character encoding forces each character to correspond to a decimal number (please pay attention to the difference between characters and numbers, for example, the 0decimal number corresponding to a character is 48), and then converts the decimal number into binary that the computer understands, and after the computer reads these 1 and 0, it will The corresponding text or symbols will be displayed.

  • Generally, for English characters , one byte represents one character, but for Chinese characters, because the low-bit encoding has been used (early computers did not support Chinese, so in order to expand support, the only way is to use more bytes number) had to expand to high
  • The range of character set encodings utf-8>gbk>iso-8859-1(latin1)>ascll. The ascll code is the English abbreviation of the American Standard Information Interchange Code, which contains 255 commonly used characters, such as Arabic numerals, English letters and some printing symbols (generally speaking, a total of 128 characters is not a big problem)

UTF-8: A set of variable-length codes with 8 bits as a coding unit , which will encode a code point (Unicode) into 1 to 4 bytes (1 byte in English, 3 bytes in most Chinese characters).

binary in java

Before the Java7 version, Java did not support direct writing of decimal literals other than decimal. But this is allowed in Java7 and later versions:

  • Binary: leading 0b/0B
  • Octal: leading 0
  • Decimal: default, no need to prepend
  • Hexadecimal: leading 0x/0X
@Test
public void test1() {
    
    
    //二进制
    int i = 0B101;
    System.out.println(i); //5
    System.out.println(Integer.toBinaryString(i));
    //八进制
    i = 0101;
    System.out.println(i); //65
    System.out.println(Integer.toBinaryString(i));
    //十进制
    i = 101;
    System.out.println(i); //101
    System.out.println(Integer.toBinaryString(i));
    //十六进制
    i = 0x101;
    System.out.println(i); //257
    System.out.println(Integer.toBinaryString(i));
}

The resulting program, outputs:

5
101
65
1000001
101
1100101
257
100000001

Description: System.out.println()It will be automatically converted to decimal before output; it means converted to binarytoBinaryString() for string output .

Convenient base conversion API

JDK 1.0has provided a very convenient hex conversion API since the beginning, which is very useful when we need it.

@Test
public void test2() {
    
    
    int i = 192;
    System.out.println("---------------------------------");
    System.out.println("十进制转二进制:" + Integer.toBinaryString(i)); //11000000
    System.out.println("十进制转八进制:" + Integer.toOctalString(i)); //300
    System.out.println("十进制转十六进制:" + Integer.toHexString(i)); //c0
    System.out.println("---------------------------------");
    // 统一利用的为Integer的valueOf()方法,parseInt方法也是ok的
    System.out.println("二进制转十进制:" + Integer.valueOf("11000000", 2).toString()); //192
    System.out.println("八进制转十进制:" + Integer.valueOf("300", 8).toString()); //192
    System.out.println("十六进制转十进制:" + Integer.valueOf("c0", 16).toString()); //192
    System.out.println("---------------------------------");
}

Run the program, output:

---------------------------------
十进制转二进制:11000000
十进制转八进制:300
十进制转十六进制:c0
---------------------------------
二进制转十进制:192
八进制转十进制:192
十六进制转十进制:192
---------------------------------

How to prove that Long is 64-bit?

I believe every Javaer knows that the Long type in Java occupies 8 bytes (64 bits), so how to prove it?

Tips: This is a classic interview question, at least I have asked it many times~

There is the simplest method: get the maximum value of the Long type , and convert it into a string in binary notation to see the length. The code is as follows:

@Test
public void test3() {
    
    
    long l = 100L;
    //如果不是最大值 前面都是0  输出的时候就不会有那么长了(所以下面使用最大/最小值示例)
    System.out.println(Long.toBinaryString(l)); //1100100
    System.out.println(Long.toBinaryString(l).length()); //7

    System.out.println("---------------------------------------");

    l = Long.MAX_VALUE; // 2的63次方 - 1
    //正数长度为63为(首位为符号位,0代表正数,省略了所以长度是63)
    //111111111111111111111111111111111111111111111111111111111111111
    System.out.println(Long.toBinaryString(l));
    System.out.println(Long.toBinaryString(l).length()); //63

    System.out.println("---------------------------------------");

    l = Long.MIN_VALUE; // -2的63次方
    //负数长度为64位(首位为符号位,1代表负数)
    //1000000000000000000000000000000000000000000000000000000000000000
    System.out.println(Long.toBinaryString(l));
    System.out.println(Long.toBinaryString(l).length()); //64
}

Run the program, output:

1100100
7
---------------------------------------
111111111111111111111111111111111111111111111111111111111111111
63
---------------------------------------
1000000000000000000000000000000000000000000000000000000000000000
64

Explanation: In computers, negative numbers are represented as the complement of their positive values . Therefore, in the same way you can prove to yourself that the Integer type is 32 bits (4 bytes).

Bitwise operations in Java

insert image description here

There are still many bitwise operators supported by the Java language, listed as follows:

  • &: bitwise AND
  • |: bitwise or
  • ~: bitwise NOT
  • ^: bitwise XOR
  • <<: left shift operator
  • >>: right shift operator
  • >>>: unsigned right shift operator

Except for , the rest are binary operators, and the data to be operated can only be of integer type (long or short) or char character type. For these operation types, examples are given below, which is clear at a glance.

Since it is an operation, it can still be classified and explained into two categories: simple operation and compound operation.

Tips: For ease of understanding, I will use binary representation for literal examples, and using decimal (any base) will not affect the operation results

simple operation

Simple operations, as the name implies, use only one operator at a time.

&: bitwise AND

Operation rules: 1 if both are 1, otherwise 0 . The output is 1 only if both operands are 1, otherwise 0.

Note: 1. All literal values ​​in this example (the same below) are expressed in decimal, please use binary thinking to understand when understanding; 2. This article does not describe the bit operations between negative numbers

@Test
public void test() {
    
    
    int i = 0B100; // 十进制为4
    int j = 0B101; // 十进制为5

    // 二进制结果:100
    // 十进制结果:4
    System.out.println("二进制结果:" + Integer.toBinaryString(i & j));
    System.out.println("十进制结果:" + (i & j));
}

|: bitwise or

Operation rules: 0 if both are 0, otherwise 1 . The output is 0 only if both operands are 0.

@Test
public void test() {
    
    
    int i = 0B100; // 十进制为4
    int j = 0B101; // 十进制为5

    // 二进制结果:101
    // 十进制结果:5
    System.out.println("二进制结果:" + Integer.toBinaryString(i | j));
    System.out.println("十进制结果:" + (i | j));
}

~: bitwise NOT

Operating rules: 0 is 1, 1 is 0 . All 0's are set to 1's and 1's are set to 0's.

Tips: Please be sure to pay attention to all, don’t ignore the 0s in front of positive numbers~

@Test
public void test() {
    
    
    int i = 0B100; // 十进制为4

    // 二进制结果:11111111111111111111111111111011
    // 十进制结果:-5
    System.out.println("二进制结果:" + Integer.toBinaryString(~i));
    System.out.println("十进制结果:" + (~i));
}

^: bitwise XOR

Operation rules: same as 0, different as 1 . When the operands are different (1 meets 0, 0 meets 1), the corresponding output result is 1, otherwise it is 0.

@Test
public void test() {
    
    
    int i = 0B100; // 十进制为4
    int j = 0B101; // 十进制为5

    // 二进制结果:1
    // 十进制结果:1
    System.out.println("二进制结果:" + Integer.toBinaryString(i ^ j));
    System.out.println("十进制结果:" + (i ^ j));
}

<<: bitwise left shift

Operating rules: Move all the digits of a number to the left by several digits.

@Test
public void test() {
    
    
    int i = 0B100; // 十进制为4

    // 二进制结果:100000
    // 十进制结果:32 = 4 * (2的3次方)
    System.out.println("二进制结果:" + Integer.toBinaryString(i << 2));
    System.out.println("十进制结果:" + (i << 3));
}

Shift left is used so much that it doesn't take much effort to understand. Shift x to the left by N bits, and the effect is the same as that in decimal, just multiply by 2 to the Nth power, but you need to pay attention to the overflow of the value, and be careful when using it.

>>: bitwise right shift

Operating rules: Move all the digits of a number to the right by several digits.

@Test
public void test() {
    
    
    int i = 0B100; // 十进制为4

    // 二进制结果:10
    // 十进制结果:2
    System.out.println("二进制结果:" + Integer.toBinaryString(i >> 1));
    System.out.println("十进制结果:" + (i >> 1));
}

Shift right with a negative number:

@Test
public void test() {
    
    
    int i = -0B100; // 十进制为-4

    // 二进制结果:11111111111111111111111111111110
    // 十进制结果:-2
    System.out.println("二进制结果:" + Integer.toBinaryString(i >> 1));
    System.out.println("十进制结果:" + (i >> 1));
}

Right shift is also used a lot, and it is more understandable: the operation is actually to directly cut off the N bits on the right side of the binary number , and then .正数右移高位补0,负数右移高位补1

>>>: unsigned right shift

Note: there is no unsigned left shift, and there is no <<<such signed

The difference between it and >>the signed right shift is: whether it is a positive or negative number, the high bits are all filled with 0 . So for positive numbers, there is no difference; then look at the performance of negative numbers:

@Test
public void test() {
    
    
    int i = -0B100; // 十进制为-4

    // 二进制结果:11111111111111111111111111111110(>>的结果)
	// 二进制结果:1111111111111111111111111111110(>>>的结果)
    // 十进制结果:2147483646
    System.out.println("二进制结果:" + Integer.toBinaryString(i >>> 1));
    System.out.println("十进制结果:" + (i >>> 1));
}

I deliberately put the results of >> on it for your convenience. Because the high bit is filled with 0, it is not displayed, but you should know what is going on in your heart.

compound operation

A compound operation in a broad sense refers to the nesting of multiple operations , usually of the same type. The compound operation referred to here refers to the use with the = sign, similar to += -=. Originally, this belongs to basic common sense and does not need to be explained separately, but who asked Brother A to manage birth, care, and death??.

Mixed operation: refers to the same calculation contains a variety of operators, such as addition, subtraction, multiplication, division, multiplication, square root, etc.

Take & and operation as an example, others are similar:

@Test
public void test() {
    
    
    int i = 0B110; // 十进制为6
    i &= 0B11; // 效果同:i = i & 3

	// 二进制结果:10
	// 十进制结果:2
    System.out.println("二进制结果:" + Integer.toBinaryString(i));
    System.out.println("十进制结果:" + (i));
}

The operation rule to review &is: if both are 1, then 1, otherwise it is 0 .

Example of Bit Operation Usage Scenario

In addition to the high-efficiency features of bit operations , there is another feature that cannot be ignored in application scenarios: the reversibility of calculations . Through this feature, we can use it to achieve the effect of concealing data , and also ensure efficiency.

In the original code of JDK. There are many initial values ​​that are calculated by bitwise operations. The most typical such as HashMap:

HashMap:
	
	static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
	static final int MAXIMUM_CAPACITY = 1 << 30;

Bit operations have many good properties and can work with linearly growing data. And for some operations, bit operations are the most direct and easiest method. Let me arrange some specific examples (usually interview questions) to get a feel for it.

Check if two numbers are the same

Both positive and negative numbers mean the same, otherwise they are different. A small case like this >/<can certainly be done with a decimal plus a comparison operator, but it is more direct (and most efficient) to use bitwise operators:

@Test
public void test4() {
    
    
    int i = 100;
    int j = -2;

    System.out.println(((i >> 31) ^ (j >> 31)) == 0);

    j = 10;
    System.out.println(((i >> 31) ^ (j >> 31)) == 0);
}

Run the program, output:

false
true

The int type has a total of 32 bits. If you move 31 bits to the right, there is only 1 sign bit left (because it is a signed right shift , so positive numbers are left with 0 and negative numbers are left with 1), and ^the result of XOR operation on the two sign bits is 0. It shows that the two are consistent.

Review ^the XOR operation rules: the same is 0, and the difference is 1 .

Check the parity of a number

In decimal numbers, it can be done by taking the modulus of 2, and there is a more efficient way for bit operations:

@Test
public void test5() {
    
    
    System.out.println(isEvenNum(1)); //false
    System.out.println(isEvenNum(2)); //true
    System.out.println(isEvenNum(3)); //false
    System.out.println(isEvenNum(4)); //true
    System.out.println(isEvenNum(5)); //false
}

/**
 * 是否为偶数
 */
private static boolean isEvenNum(int n) {
    
    
    return (n & 1) == 0;
}

Why &1can we judge the evenness? Because in binary, the last bit of an even number must be 0, and the lowest bit of an odd number must be 1 . The first 31 bits of binary 1 are all 0, so after ANDing
with the first 31 bits of other numbers, all digits must be 0 (whether it is 1&0 or 0&0, the result is 0), then the only difference is to look at the lowest The result of the AND operation of bits and 1: the result is 1, which means an odd number, otherwise, the result is 0, which means an even number.

Swap the values ​​of two numbers (without resorting to third-party variables)

This is a very old interview question, swap the values ​​of A and B. If there are no words in brackets in this question, it is a question that everyone will know. You can solve it like this:

@Test
public void test6() {
    
    
    int a = 3, b = 5;
    System.out.println(a + "-------" + b);
    a = a + b;
    b = a - b;
    a = a - b;
    System.out.println(a + "-------" + b);
}

Run the program, the output (successful exchange):

3-------5
5-------3

The biggest advantage of using this method is: easy to understand. The biggest disadvantage is: a+b may exceed the maximum range of the int type, resulting in loss of precision and errors, resulting in very hidden bugs . So if you use it in a production environment like this, there will be a relatively large security risk.

Tip: If you estimate that there is absolutely no chance that the number will exceed the maximum value, this is fine. Of course, if you are a string type, please pretend that I did not say

Because this method not only introduces third-party variables, but also has major security risks. So this article introduces a safe alternative, using the reversibility of bit operations to complete the operation:

@Test
public void test7() {
    
    
    // 这里使用最大值演示,以证明这样方式是不会溢出的
    int a = Integer.MAX_VALUE, b = Integer.MAX_VALUE - 10;
    System.out.println(a + "-------" + b);
    a = a ^ b;
    b = a ^ b;
    a = a ^ b;
    System.out.println(a + "-------" + b);
}

Run the program, the output (exchange completed successfully):

2147483647-------2147483637
2147483637-------2147483647

Since the full text does not perform addition operations on a/b, overflow cannot occur, so it is safe. The core principle of this approach is based on: the reversibility of bit operations , using XOR to achieve the goal.

Bit operations are used on database fields (important)

This use case is of great practical significance, because I have used it many times in production, and the feeling is not generally good.

Embarrassing phenomenon of database design in business systems: Usually our data tables may contain various status attributes. For example, in the blog table, we need fields to indicate whether it is public, whether it has a password, whether it is blocked by the administrator, whether it is blocked Stick to the top and so on. In the later stage of operation and maintenance, you will also need to add new fields due to planning requirements to add new functions . This will cause difficulties in later maintenance, too many fields, and increased indexes. At this time, you can use bit operations Clever solution.

For example: when we perform authentication and authorization on the website, we generally support multiple authorization methods, such as:

  • Personal authentication 0001 -> 1
  • Email authentication 0010 -> 2
  • WeChat authentication 0100 -> 4
  • Hypertube Certification 1000 -> 8

In this way, we can use 1111these four bits to express whether the respective positions are certified or not. The conditional statement to query WeChat authentication is as follows:

select * from xxx where status = status & 4;

To inquire about those who have passed both personal authentication and WeChat authentication:

select * from xxx where status = status & 5;

Of course, you may also have sorting requirements, like this:

select * from xxx order by status & 1 desc

This case is the same as the Linux permission control that everyone is familiar with. It is controlled by bit operations: permissions are divided into r for reading, w for writing, and x for execution. Their weights are 4, 2, and 1 respectively. You can Combine licenses as you like. For example chomd 7, 7=4+2+1 indicates that the user has rwx permissions,

Precautions

  1. You need your DB storage to support bit operations, such as MySql supports
  2. Please make sure your field type is not a char character type, but a numeric type
  3. In this way, it will cause the index to fail , but in general, the status value does not need to be indexed
  4. Specific analysis of specific business, do not blindly use it for the show, if you use it wrong, you will easily be criticized.

Serial number generator (order number generator)

Generate order serial number, of course, this is actually not a difficult function, the most direct way is date + host Id + random string to splice a serial number, even see a lot of places directly use UUID, of course this is very Not recommended.

UUID is a string, too long, out of order, and cannot carry effective information to provide effective help for positioning problems, so it is generally an alternative

Today I learned bit operations, and I think there is a more elegant way to achieve it. What is elegance: You can refer to the order numbers of Taobao and JD.com. It seems to be regular, but it is not regular :

  • I don't want to expose the relevant information directly.
  • Through the serial number, you can quickly get relevant business information and quickly locate problems (this is very important, which is the most important reason why UUID is not recommended).
  • Using AtomicInteger can increase concurrency and reduce conflicts (this is another important reason for not using UUID, because numbers are more efficient than strings)

insert image description here

Introduction to Implementation Principles

This serial number is composed of: a long series of numbers composed of date + Long type value, in the form of 2020010419492195304210432. Obviously, the front is the date data, and the long string behind it contains a lot of meaning: the current second, the merchant ID (or other business data of yours), the machine ID, a string of random codes, and so on.

Introduction to each part:

  1. The first part is the millisecond value of the current time. The maximum is 999, so it occupies 10 digits
  2. The second part is: serviceType indicates the business type. Such as order number, operation serial number, consumption serial number, etc. The maximum value is set to 30, which is enough . 5 places
  3. The third part is: shortParam, indicating user-defined short parameters. Category parameters such as order type, operation type, etc. can be placed. The maximum value is set at 30, which is definitely enough. 5 places
  4. The fourth part is: longParam, same as above. Users can generally place id parameters, such as user id, merchant id, etc., and the maximum support is 999.99 million. Most of them are enough, accounting for 30
  5. Part V: The remaining digits are handed over to random numbers, and a number is randomly generated to fill up the remaining digits. Generally, there are at least 15 bits remaining ( the number of bits in this part is floating ), so it is enough to support concurrency of 2 to the 15th power
  6. Finally, add the date time (year, month, day, hour, minute, second) in front of the above long value

This is a serial number generation tool based on bit operations written by Brother A, which has been used in the production environment. Considering the long source code (one file, about 200 lines in total, without any other dependencies), I will not post it. If necessary, please reply to the official account background to 流水号生成器get it for free .

Summarize

Bit operations still have quite a few disadvantages from an engineering point of view . In actual work, if it is only for numerical calculations, it is not recommended to use bit operators. Only in some special scenarios, using bit operations will give you a bright future. feelings, such as:

  • N multi-state control requires scalability. For example, the field design of the status of the database
  • There are extreme requirements for efficiency. Such as JDK
  • The scene fits perfectly. For example, Jackson's Feature special needle value

Don't use it for the sake of showing off (zhuang) skills (bi). Showing off skills for a while is fun, and you will fall into the crematorium; the guy is still young, so I hope you are cautious. In most cases, it is more important for people to understand the code easily than for machines to understand it .

insert image description here

Guess you like

Origin blog.csdn.net/qq_43842093/article/details/132529197