How to think like a computer

Since the first day of college, our teachers have taught us algorithms of every kind. From searching and sorting to recursion and greedy algorithms, I wrestled with them throughout my freshman year. Even after starting work, I still had to go back and chew through algorithm books or grind through practice problems to prepare for interviews, just to recover some of what I learned in school. Why are algorithms so hard to remember? Put another way, why can't computer algorithms be more intuitive?

Because computer algorithms are anti-human. At root, this comes from the difference between how a computer thinks and how the human brain thinks.

There is no settled theory of how the human brain thinks; for now it is attributed to the interplay of chemical substances and electrical signals. Even without a complete scientific explanation, each of us has a brain, and each of us can observe our own way of thinking.

The computer was created by humans, but it was never designed to simulate the human brain, so it has its own way of working. Only by understanding how a computer works can you learn to think the way it does and write the code that suits it best.

Finding a specific number in a sorted array: human brain vs. computer, round 1

Let us use a concrete example to show that the human brain and the computer think differently. Suppose we want to find a specific number in an array that has already been sorted.

Given the sorted array 1 2 3 5 7 13 34 67 90 127 308, we want to find out whether the number 13 is in it.

How does the human brain accomplish this task?

The human brain almost "cheats" on problems like this. We can take in ten lines at a glance, and we spot 13 the moment we look at the list. If I ask myself how I found 13, all I can say is that I "saw" it.

And how does the computer accomplish this task?

The simplest and dumbest algorithm is to read the array one element at a time, starting from the first. I believe every student who has learned the basics of programming can write code like the following.

boolean isNumInArray(int num, int[] array) {
	for (int i = 0; i < array.length; i++) {
	    if (array[i] == num) {
	    	return true;
	    }
	}
	return false;
}

The computer has to start with the first element of the array and check the elements one by one, comparing each with 13 to see whether they are equal. To find the number 13, the computer goes through the loop 6 times, while a person sees the answer almost instantly.
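The loop count above can be checked with a small sketch. The class name LinearSearchCount and the method countIterations are made up for this illustration; the scan itself is the same as in isNumInArray, except that it returns how many elements were examined.

```java
class LinearSearchCount {
    // Same linear scan as isNumInArray, but returns how many elements were examined.
    static int countIterations(int num, int[] array) {
        int iterations = 0;
        for (int i = 0; i < array.length; i++) {
            iterations++;
            if (array[i] == num) {
                break; // found: stop scanning
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 5, 7, 13, 34, 67, 90, 127, 308};
        System.out.println(countIterations(13, sorted)); // 6: 13 is the 6th element
        System.out.println(countIterations(4, sorted));  // 11: absent, so every element is examined
    }
}
```

Note that a missing number is the worst case for this method: the whole array has to be read before the computer can answer "no".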

Why are computers so "stupid" at solving problems? We have to start with how computers work.

How the CPU works

As the core component of a computer, the CPU is also where algorithms are actually carried out.

The CPU does not think like a person; it only understands some basic instructions. Every CPU has an instruction set, a built-in program fixed inside the CPU that guides and optimizes its operation. In layman's terms, the instruction set is the CPU's entire way of thinking. For example, a common instruction set has an ADD instruction, which adds the values in two registers and stores the result in another register. Correspondingly, there is a SUB instruction, which subtracts one register value from another. If you check the manuals of various CPU instruction sets, you will find that they all contain basic addition, subtraction, multiplication, and division instructions, as well as instructions for storing data to and fetching data from memory. A common CPU instruction set has a few hundred instructions at most. In other words, those few hundred instructions are everything the CPU knows how to do.
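To make the idea of an instruction set concrete, here is a minimal sketch of a toy machine that knows only two instructions. ToyCpu, its four registers, and the method names add and sub are all invented for this illustration and do not correspond to any real CPU.

```java
class ToyCpu {
    // Hypothetical register file: four integer registers r0..r3.
    final int[] reg = new int[4];

    // ADD dst, a, b : store reg[a] + reg[b] in reg[dst]
    void add(int dst, int a, int b) {
        reg[dst] = reg[a] + reg[b];
    }

    // SUB dst, a, b : store reg[a] - reg[b] in reg[dst]
    void sub(int dst, int a, int b) {
        reg[dst] = reg[a] - reg[b];
    }

    public static void main(String[] args) {
        ToyCpu cpu = new ToyCpu();
        cpu.reg[0] = 7;
        cpu.reg[1] = 5;
        cpu.add(2, 0, 1); // r2 = r0 + r1
        cpu.sub(3, 0, 1); // r3 = r0 - r1
        System.out.println(cpu.reg[2] + " " + cpu.reg[3]); // prints "12 2"
    }
}
```

Everything this toy machine can ever do has to be expressed as a sequence of these two operations on its registers, which is the sense in which the instruction set is the CPU's whole way of thinking.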

Compared with the CPU, the human brain has powerful memory and association capabilities. When you see 1+1, you think of 2. When you see a red light, you think of stopping. When you see a door, you know to turn the doorknob. These are reactions you produce immediately, without thinking.

So the CPU knows far fewer things (instructions) than a human does. Doesn't that make the CPU very dumb? Yes, the CPU is very dumb, but it also has advantages the human brain cannot match:

Although the CPU can only do simple things (a few hundred instructions), it is guaranteed to compute correctly and deliver the correct result within a fixed time (the instruction execution time). The human brain cannot guarantee that it will produce the "same" result of thinking in a fixed time.
Modern CPUs can execute far more than a million instructions per second, and the human brain thinks nowhere near as fast. Each of our "thoughts" takes a reaction time of a few tenths of a second.

In short, the CPU is a dumb but fast worker.

Computer storage

Common kinds of computer storage include registers, cache, memory, and hard disks.

Registers are the equivalent of things the human brain can recall immediately. Every operation the CPU performs is carried out on data in registers. Registers hold information such as which calculation the computer is about to do (instruction register), the data to be calculated (data registers), and which step the program has reached (program counter). Whether it is the earliest CPU or the latest and most powerful one, it has only a few dozen registers at most (a few hundred in special cases), which means the information the CPU can use immediately at any moment is just these few dozen numbers.

Memory is the computer's main storage. It holds the information of running programs. Memory is like the bookshelves of a library: when the CPU needs a piece of data in memory, it issues a LOAD instruction together with a shelf number (the memory address). The memory controller then sends the data at that address to the CPU over the bus, and the CPU puts the loaded result into a register for use. Memory access is much slower than register access, but accessing data in different parts of memory takes basically the same amount of time.

Since most of the time the CPU reads a contiguous stretch of memory to perform its operations, the CPU usually has a high-speed cache that keeps recently used regions of memory, so that it does not have to go to memory on every operation. The cache is slower than registers but much faster than memory. The cache size is generally between a few megabytes and a little over ten megabytes.

The hard disk is external storage. An old mechanical hard disk has spinning platters and a movable head. To read a file, the head has to move to the corresponding position. The disk is much slower than memory, and because the head sits at one position at a time, random access to data at different positions on the disk is limited by the physical speed of head movement, so access times are uneven. Newer solid-state drives use a storage medium similar to memory, which greatly improves random access performance.

So the computer has a tiny head (registers) that can hold only a few things at once, but it has a relatively large quick memory (cache), a store of knowledge (memory) far beyond a human's, and a huge portable library (hard disk) that it carries along. From the storage point of view, the computer is like Rain Man: born with a serious limitation, yet equipped with a prodigious memory.

So, let's analyze why the computer works the way it does in round 1.

First, we look at the definition of our function:

boolean isNumInArray(int num, int[] array)

In the underlying implementation of the function call, the parameters are assigned to two registers. When isNumInArray is called, the value of the first parameter num, which is 13, is loaded into a register (r1). The second parameter, array, is passed to the CPU only as the address of the array in memory, and that address is stored in another register (r2).

To evaluate the comparison array[i] == num, the CPU needs to do three things:

1. Use the ADD instruction to calculate the memory address to read from the address of the array (r2) and the value of i (r4).
2. Use the LOAD instruction to load the number at that memory address into a register (r3).
3. Use the CMP instruction to compare the value of num (r1) with r3, and store the result in the result register.

Depending on the result of step 3, if the two values are not equal, the CPU adds 1 to the loop counter i, stores it in register r4, and performs the steps above again. The difference is that from the second pass onward, step 2 is much faster than the first time, because the array contents have already been pulled into the cache.
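The three steps and the loop around them can be sketched in Java with local variables standing in for registers. This is only a toy model: the class RegisterLevelSearch is hypothetical, memory is a plain int array, and it is assumed to be word-addressed so that element i of the array lives at address base + i. A real CPU would also scale i by the element size, and the cache is not modeled at all.

```java
class RegisterLevelSearch {
    // Toy model: 'memory' is word-addressed, so element i of an array
    // stored at 'base' lives at address base + i (a simplifying assumption).
    static boolean search(int[] memory, int base, int length, int num) {
        int r1 = num;   // r1: the number we are looking for
        int r2 = base;  // r2: address of the array in memory
        int r4 = 0;     // r4: loop counter i
        while (r4 < length) {
            int address = r2 + r4;       // step 1, ADD: compute the address to read
            int r3 = memory[address];    // step 2, LOAD: fetch the value into r3
            boolean equal = (r1 == r3);  // step 3, CMP: compare and record the result
            if (equal) {
                return true;
            }
            r4 = r4 + 1;                 // not equal: add 1 to i and repeat
        }
        return false;
    }

    public static void main(String[] args) {
        // The array from round 1, placed at address 3 of a small memory.
        int[] memory = {0, 0, 0, 1, 2, 3, 5, 7, 13, 34, 67, 90, 127, 308};
        System.out.println(search(memory, 3, 11, 13)); // true
        System.out.println(search(memory, 3, 11, 4));  // false
    }
}
```

At any moment the loop works with only four small values (r1 to r4), which is all the "attention" this model of the CPU has.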

So, we can see why the computer is so stupid at solving this problem:

The computer's input is restricted. It can read only a single value at a time (the cache helps, so this is not too bad) and keep a limited number of values in registers, whereas humans can take in multiple values at once through vision and hold them in mind.
The computer's instructions are limited and support only basic arithmetic. The human brain has a wealth of "instructions", such as picking out the number 13 directly from the visual pattern of a bunch of numbers it has just seen.

Finding a specific number in a sorted array: human brain vs. computer, round 2

The computer lost the last round of its contest with the human brain, but that was not very fair: the array had only a handful of elements, and a computer can store far more than that. So we start a second round, and this time we expand the input:

1 2 3 5 7 13 34 67 90 127 308 502 ... 2341245 ... (1 million numbers in total)

The number to find is now 2341245.

How do the human brain and the computer perform this time?

For an ordinary person, let us assume these 1 million numbers are printed in a dictionary. How would he find one number among 1 million sorted numbers?

At this scale, the "ten lines at a glance" ability that humans are so proud of counts for very little. As the numbers gain more digits, it becomes hard to tell at a glance whether a number equals the target, and the digits in such a dictionary are printed very small as well.

So we dutifully compare the numbers from beginning to end, turning page by page to see whether the number is on the current page, and if not, moving on to the next page.

Does this idea sound familiar? That's right, this is how a computer thinks. It is almost the same as the code described in the previous section, except that a person can take in a few more values at a glance.

However, the speed at which a human can compare large numbers and flip through the dictionary is far below the speed at which a computer can read these 1 million numbers. The computer is a "dumb bird" too, but one that performs millions of operations per second, so it can complete such a task almost instantly.

In other words, with large-scale input, the human brain's way of thinking "degenerates" into something close to the computer's, and is then beaten by the computer's overwhelming performance advantage.

Finding a specific number in a sorted array: human brain vs. computer, round 3

In the second round, the human brain lost to the computer, but that contest only showed which of two dumb birds is faster. Is there a smarter way?

That's right: the binary search algorithm we learned comes in handy here.

Step 1: In front of us is a dictionary printed with 1 million numbers. We do not know where the number we are looking for will be, so we first open the dictionary at the middle (it does not have to be exact) and look at the first and last number on the current page. If the number we are looking for falls within that range, we can keep looking for it on the current page.

Step 2: If the first number on the current page is larger than the number we are looking for, we can tear off the second half of the dictionary (because the number cannot be in the second half) and go back to step 1 with what remains.

Step 3: If the last number on the current page is smaller than the number we are looking for, we can tear off the first half of the dictionary (for the same reason) and go back to step 1.

In this way, the dictionary becomes thinner and thinner. In the worst case, we tear it down to a single page, which either contains the number or does not. Either way, following these steps guarantees that we never skip a page that might contain the number.

This logic is the same as the principle of binary search in computer algorithms. Let's look at how the algorithm is implemented in code.

boolean isNumInArray(int num, int[] array, int start, int end) {
	// Check start > end first so an empty range is never indexed
	if (start > end || num < array[start] || num > array[end]) {
		return false;
	}

	int middle = start + (end - start) / 2; // find the fold point (avoids int overflow)
	if (array[middle] > num) {
		return isNumInArray(num, array, start, middle - 1); // tear off the second half
	} else if (array[middle] < num) {
		return isNumInArray(num, array, middle + 1, end); // tear off the first half
	} else {
		return true; // found the number
	}
}

We can see that, compared with the human approach, the computer does not turn to "a page"; it looks at a single number at a time, but otherwise the thinking is exactly the same. With this algorithm, humans are still slower than computers, but both sides have found the method that suits them best and improves their own efficiency the most.
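How much the halving helps can be seen by counting how many numbers are actually examined. The sketch below is an iterative variant written for this illustration, not the recursive function above: the class BinarySearchCount, the method countProbes, and the test data (the odd numbers 1, 3, 5, ...) are all assumptions.

```java
class BinarySearchCount {
    // Iterative binary search that returns the number of elements examined.
    static int countProbes(int num, int[] array) {
        int start = 0;
        int end = array.length - 1;
        int probes = 0;
        while (start <= end) {
            int middle = start + (end - start) / 2;
            probes++;
            if (array[middle] == num) {
                break;                 // found
            } else if (array[middle] > num) {
                end = middle - 1;      // discard the second half
            } else {
                start = middle + 1;    // discard the first half
            }
        }
        return probes;
    }

    public static void main(String[] args) {
        int[] sorted = new int[1_000_000];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = 2 * i + 1; // hypothetical data: 1, 3, 5, ...
        }
        // Each probe halves the remaining range, so 1 million elements
        // never need more than 20 probes (2^20 is just over 1 million).
        System.out.println(countProbes(sorted[999_999], sorted));
        System.out.println(countProbes(2, sorted)); // a value that is not present
    }
}
```

Compared with up to 1 million comparisons for the page-by-page method in round 2, at most 20 looks are enough here, for the human with the dictionary and for the computer alike.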

Finding a specific number in a sorted array: further thoughts

Looking back, why did I assume these 1 million numbers were printed in a dictionary? Because a dictionary and computer memory are very similar models.

The computer can access memory directly through a memory address, which is like turning to a given page of the dictionary by its page number.

In code, we know the length of the array and find the middle element by halving it. A dictionary has thickness, and we can find the middle page by halving the thickness. This is similar too.

Just imagine: if the 1 million numbers were not printed in a dictionary but painted along a highway, could we still use the algorithm from the previous section to search by hand? The answer is no, because running half the length of the road already costs a lot of energy. Binary search here would only cost you more energy than the dumbest method from round 1, because storage laid out along a road corresponds not to the memory model but to the tape model. For such a model, whether it is a human or a computer doing the search, the algorithm needs to be adjusted to achieve the highest efficiency.
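The highway argument can be illustrated with a rough cost model, assuming the cost of a search is simply the distance travelled between the positions that are read, starting from the beginning of the road. The class TapeTravel, its method names, and the test data are hypothetical; the sketch only compares travel distance, not the number of comparisons.

```java
class TapeTravel {
    // Distance travelled by a linear scan that starts at position 0 and
    // walks forward until it reaches the target or the place it would be.
    static long linearTravel(int num, int[] tape) {
        int position = 0;
        while (position < tape.length - 1 && tape[position] < num) {
            position++;
        }
        return position;
    }

    // Distance travelled by binary search when every jump between
    // positions costs the distance covered, as on a tape or a road.
    static long binaryTravel(int num, int[] tape) {
        int start = 0;
        int end = tape.length - 1;
        int position = 0;   // we begin at the start of the tape
        long travelled = 0;
        while (start <= end) {
            int middle = start + (end - start) / 2;
            travelled += Math.abs(middle - position); // walk to the next probe
            position = middle;
            if (tape[middle] == num) {
                break;
            } else if (tape[middle] > num) {
                end = middle - 1;
            } else {
                start = middle + 1;
            }
        }
        return travelled;
    }

    public static void main(String[] args) {
        int[] tape = new int[1_000_000];
        for (int i = 0; i < tape.length; i++) {
            tape[i] = 2 * i + 1; // hypothetical data: 1, 3, 5, ...
        }
        int target = tape[10]; // a number near the start of the road
        System.out.println(linearTravel(target, tape)); // 10
        System.out.println(binaryTravel(target, tape)); // far more: the first jump alone covers about half the road
    }
}
```

Under this model, binary search can never travel less than the linear walk for a number that is present, because it also starts at position 0 and ends at the same place but wanders back and forth on the way.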

Summary

From the examples above, we can see that computer algorithms are anti-human because the computer is not a "normal person": it has its own shortcomings, but also its own strengths. Often an algorithm feels unintuitive not because our thinking ability is worse than the computer's, but precisely because, as human beings, we take in too much information at once, and too many things get in the way of our thinking. When that happens, you might as well "degenerate" into a computer that is "short-sighted" and "knows very little". You may find that your thinking becomes clearer.