Computer organization principles in plain language: data integrity (part 2) - how to restore the crime scene (lecture 50)

First, the introduction

After the last lecture on check codes, you should now know that both parity checks and cyclic redundancy check (CRC) codes can only tell us one thing: that the data is wrong. That is why a check code is also called an error-detecting code (Error Detecting Code).

Whether we call it a check code or an error-detecting code, when the hardware makes an error it can only tell you "something is wrong." It has no answer to the next question, "where is it wrong?" That leaves us only one way to handle an error: treat everything as wrong. If we download a file and find that the check code does not match, all we can do is download it again; if the data was sitting in memory for a computation, all we can do is run the computation again.

That is far too inefficient, so we need a method that tells us not only "something is wrong" but also "what exactly is wrong." For this purpose, computer scientists invented the error correction code. An error correction code needs more redundant information; with it, we can not only find out where the data went wrong but also change the data straight back to the correct value. Doesn't that sound magical? Let's take a look.

Second, the Hamming code: how much redundant information do we need?

The best-known error correction code is the Hamming code (Hamming Code), named after its inventor, Richard Hamming. This code was invented back in the 1940s, and to this day the ECC memory we mentioned in the last lecture still uses Hamming codes for error correction.

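The most basic Hamming code we will look at is called the 7-4 Hamming code. In this lecture's naming, "7" refers to the actual valid data, seven bits in total, and "4" refers to the four extra bits we store for error correction. (Elsewhere, the name Hamming(7,4) usually denotes a 7-bit codeword carrying 4 data bits; this lecture uses its own convention throughout.)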

First, you need to be clear on one point: the error-correcting ability of an error correction code is limited. It cannot correct the data no matter how many bits go wrong. If it could, we would not need the seven data bits at all and could keep just the four check bits, which would mean transmitting information without any data bits. That is impossible. In fact, the 7-4 Hamming code can correct only a single wrong bit. How does it do that? Let's take a look.

A 4-bit check code can represent 2^4 = 16 different values. For given data bits, the computed check value is fixed. So if one data bit goes wrong, the check value computed from the corrupted data is guaranteed to differ from the original one, and it can only be one of the remaining 2^4 - 1 = 15 values.

These 15 possible check values can in turn correspond to 15 possible single-bit errors. At this point you might ask: since we only have seven data bits, why use four check bits? Wouldn't three be enough? 2^3 - 1 = 7, which exactly matches the seven data bits!

Don't forget that a single-bit flip can happen not only in the data bits but also in the check bits; the check bits themselves can go wrong. With seven data bits and three check bits, there are 10 positions where a single-bit error could occur, so the 2^3 - 1 = 7 distinguishable cases are not enough to tell us which bit went wrong.

In fact, if we have K data bits and N check bits, the following inequality must hold to guarantee that we can correct any single-bit flip:

K + N + 1 <= 2^N

The left side counts the K + N possible single-bit errors plus the error-free case, each of which needs its own distinct check result.

With seven data bits, that is, K = 7, the minimum value of N is 4. Four check bits can in fact support up to 11 data bits. Below is a simple table comparing the number of check bits with the number of data bits they can protect; you can work through the arithmetic yourself to understand the formula above.

Check bits (N)    Maximum data bits (K)
2                 1
3                 4
4                 11
5                 26
6                 57
7                 120
8                 247
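The inequality K + N + 1 <= 2^N can be checked mechanically. The following is a minimal sketch; the function names min_check_bits and max_data_bits are made up for this illustration and do not come from the lecture.

```python
def min_check_bits(k):
    """Smallest N that satisfies K + N + 1 <= 2^N for K data bits."""
    n = 1
    while k + n + 1 > 2 ** n:
        n += 1
    return n


def max_data_bits(n):
    """Largest K that N check bits can protect against one flipped bit."""
    return 2 ** n - n - 1


if __name__ == "__main__":
    print(min_check_bits(7))   # seven data bits
    print(max_data_bits(4))    # four check bits
```

Rearranging the inequality gives K <= 2^N - N - 1, which is what max_data_bits computes directly.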

Third, the principle of Hamming code error correction

By now you should understand how to work out the number of check bits needed for a given number of data bits. Next, let's look at what the Hamming encoding method actually looks like.

To keep things simple, we use fewer bits and work through a 4-3 Hamming code (four data bits and three check bits). We label the four data bits d1, d2, d3 and d4, where d is the first letter of "data bits." We label the three check bits p1, p2 and p3, where p is the first letter of "parity bits."

From the four data bits, we take three at a time and compute one check bit from them, using the parity check we covered earlier. For example, we use d1, d2, d4 to compute the check bit p1; d1, d3, d4 to compute p2; and d2, d3, d4 to compute p3. The correspondence is shown in the table below:

Check bit    Data bits covered
p1           d1, d2, d4
p2           d1, d3, d4
p3           d2, d3, d4

Now think about it: what happens if the data bit d1 goes wrong? We will find that the check results for p1 and p2 no longer match. If d2 is wrong, p1 and p3 fail; if d3 is wrong, p2 and p3 fail; if d4 is wrong, p1, p2 and p3 all fail. In other words, whenever a data bit is wrong, at least two check bits come out inconsistent.

Going one step further, what happens if the check bit p1 itself is wrong? In that case only the check for p1 fails. The same holds for p2 and p3: only that one check bit comes out inconsistent.

So the check bits can disagree in 2^3 - 1 = 7 different ways, which correspond exactly to errors in seven different bits. The correspondence is listed in the table below:

Failing checks    Wrong bit
p1                p1
p2                p2
p3                p3
p1, p2            d1
p1, p3            d2
p2, p3            d3
p1, p2, p3        d4
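The 4-3 scheme can be sketched in a few lines of code. This sketch assumes even parity (computed with XOR) and the coverage implied by the failure patterns described above: p1 covers d1, d2, d4; p2 covers d1, d3, d4; p3 covers d2, d3, d4. The function and table names are invented for the illustration.

```python
def encode_4_3(d1, d2, d3, d4):
    """Return the four data bits plus three even-parity check bits."""
    return {
        "d1": d1, "d2": d2, "d3": d3, "d4": d4,
        "p1": d1 ^ d2 ^ d4,
        "p2": d1 ^ d3 ^ d4,
        "p3": d2 ^ d3 ^ d4,
    }


# Each combination of failing checks points at exactly one bit.
FAILING_CHECKS_TO_BIT = {
    ("p1",): "p1",
    ("p2",): "p2",
    ("p3",): "p3",
    ("p1", "p2"): "d1",
    ("p1", "p3"): "d2",
    ("p2", "p3"): "d3",
    ("p1", "p2", "p3"): "d4",
}


def failing_checks(word):
    """Recompute the check bits from the data bits and list the mismatches."""
    expected = encode_4_3(word["d1"], word["d2"], word["d3"], word["d4"])
    return tuple(p for p in ("p1", "p2", "p3") if expected[p] != word[p])


def correct_4_3(word):
    """Return a copy of word with a single flipped bit (if any) repaired."""
    fixed = dict(word)
    failed = failing_checks(word)
    if failed:
        fixed[FAILING_CHECKS_TO_BIT[failed]] ^= 1
    return fixed
```

For instance, flipping d3 in an encoded word makes failing_checks return ("p2", "p3"), and correct_4_3 flips d3 back. The sketch only handles a single wrong bit, in line with the limit stated earlier.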

As you can see, the Hamming code's error correction process is a bit like watching Sherlock Holmes reason in a film: using the extra information left at the scene of the error, we work through the clues methodically to find exactly which bit went wrong and reconstruct the "crime scene" at the moment of the error. At this point you probably find the Hamming code rather magical, but a new question also arises: how can we generate a Hamming code with a fixed procedure or set of rules? The steps are actually not complicated; let's go through them together.

First, we determine how many bits the encoded data to be transmitted has. For our 7-4 Hamming code, that is 11 bits in total. We then number these 11 bits from left to right, starting at 1, and write down the binary representation of each number.

Next, we pick out the positions among these 11 whose numbers are integer powers of 2. In the 7-4 Hamming code these are 1, 2, 4 and 8. These positions hold the check bits, which we call p1 to p4. Seen in binary, they are the only four of the 11 numbers that have exactly one bit set to 1. The remaining seven positions hold the data bits d1 to d7.

For the check bits we still use parity, but each check bit is not computed over all seven data bits. p1 is computed from positions 3, 5, 7, 9 and 11, that is, the positions whose binary representation has a 1 in the first bit counting from the right.

p2 is computed from positions 3, 6, 7, 10 and 11, the positions whose binary representation has a 1 in the second bit from the right. Likewise, p3 covers the positions with a 1 in the third bit from the right (5, 6 and 7), and p4 covers the positions with a 1 in the fourth bit (9, 10 and 11).

Now you will find that if any data bit is wrong, two or three of the corresponding check bits will fail to match, and that combination lets us work back to which data bit is wrong. If a check bit itself is wrong, only that one check bit fails, so we know the error is in that check bit.

The method above can be expressed as a fixed program, which means that whatever the size of the Hamming code, we no longer need to design the coding scheme by hand.
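As a rough illustration of such a program, here is a minimal sketch of the position-based procedure. It assumes even parity and positions numbered from 1; the function names are invented for this example. It also relies on a property of this numbering that the text does not state explicitly: adding up the position numbers of the failing check bits gives the position of the flipped bit.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (the check-bit positions)."""
    return n > 0 and n & (n - 1) == 0


def hamming_encode(data_bits):
    """Encode a list of 0/1 data bits; element i of the result is position i + 1."""
    k = len(data_bits)
    n_check = 0
    while k + n_check + 1 > 2 ** n_check:  # smallest N with K + N + 1 <= 2^N
        n_check += 1
    total = k + n_check
    code = [0] * (total + 1)  # index 0 is unused so that index == position
    remaining = iter(data_bits)
    for pos in range(1, total + 1):
        if not is_power_of_two(pos):
            code[pos] = next(remaining)
    for i in range(n_check):
        check_pos = 1 << i
        parity = 0
        for pos in range(1, total + 1):
            if pos & check_pos and pos != check_pos:
                parity ^= code[pos]
        code[check_pos] = parity  # even parity over the covered positions
    return code[1:]


def hamming_correct(received):
    """Return (corrected bits, position of the flipped bit or 0 if none)."""
    total = len(received)
    error_pos = 0
    check_pos = 1
    while check_pos <= total:
        parity = 0
        for pos in range(1, total + 1):
            if pos & check_pos:
                parity ^= received[pos - 1]
        if parity:  # this check fails, so its position number is added
            error_pos += check_pos
        check_pos <<= 1
    fixed = list(received)
    if 1 <= error_pos <= total:  # only a single flipped bit can be repaired
        fixed[error_pos - 1] ^= 1
    return fixed, error_pos
```

With seven data bits, hamming_encode produces an 11-bit word whose check bits sit at positions 1, 2, 4 and 8, matching the layout described above. If more than one bit is flipped, the result of hamming_correct is not meaningful.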

Fourth, the Hamming distance: an intuitive view of what the Hamming code does

We can also look at the role of the Hamming code from another angle. For two binary values, the number of bit positions in which they differ is called the Hamming distance. For example, the Hamming distance between 1001 and 0001 is 1, because they differ only in the leftmost bit. The Hamming distance between 1001 and 0000 is 2, because they differ in both the leftmost and the rightmost bit.
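The definition translates directly into code. A minimal sketch, assuming both values are given as equal-length strings of 0s and 1s (the function name is chosen for this illustration):

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(hamming_distance("1001", "0001"))  # 1
    print(hamming_distance("1001", "0000"))  # 2
```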

So you can easily picture what 1-bit error correction means: any value whose Hamming distance from one of the valid data values we intend to transmit is 1 can be corrected back to that value.

For this to work, any two valid data values we actually transmit must be at least a Hamming distance of 3 apart. You might ask: why not 2? Because with a distance of 2 there would be an erroneous value whose Hamming distance to two different correct values is 1. When we see that erroneous value, we cannot tell which of the two it should be corrected to.

With the Hamming distance in hand, we can picture error correction codes more vividly. Without error correction, each piece of data is like a point in space. We can pack the points very close together, but if the coordinates of a point are slightly off, we may mistake it for a different point.

With error correction, it is as if each point becomes a ball of radius 1 centered on that point. As long as the received coordinates fall inside a ball, we know the actual data is the point at the ball's center. And the balls cannot be too close to each other: the centers of any two balls must be at least 3 units apart.
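The distance requirement can be checked by brute force on the small 4-3 code from earlier. The sketch below assumes even parity and the coverage p1 = d1, d2, d4; p2 = d1, d3, d4; p3 = d2, d3, d4; all names are invented for the illustration.

```python
from itertools import combinations, product


def encode_4_3(d1, d2, d3, d4):
    """Codeword as a tuple: four data bits followed by three check bits."""
    return (d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4)


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# All 16 valid codewords, one for each combination of four data bits.
codewords = [encode_4_3(*bits) for bits in product((0, 1), repeat=4)]

# Smallest distance between any two different valid codewords.
min_distance = min(
    hamming_distance(a, b) for a, b in combinations(codewords, 2)
)

if __name__ == "__main__":
    print(min_distance)  # 3
```

A minimum distance of 3 means that the radius-1 balls around the valid codewords never overlap, so every word that is one flip away from a codeword has exactly one nearest center.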

Fifth, summary and extension

That covers error correction codes. Don't underestimate this seemingly simple little Hamming code. Although it dates back to the 1940s, today's ECC memory still uses this technique, and Hamming's work on error-correcting codes was one of the contributions for which he received the Turing Award.

By adding several redundant check bits to the data, the Hamming code can not only detect errors but also correct the error when only a single bit is wrong. One very important point in understanding how the Hamming code is computed is that it is not only the original data bits that can go wrong: the check bits we add can suffer a single-bit flip too. That is why three check bits are not enough for seven data bits and four are needed.

The actual Hamming encoding process is not complicated. We match each check bit with a different set of data bits, so that an error in any data bit produces a unique combination of failing check bits. That way, when something goes wrong, we can work back to the wrong data bit and correct it. When only one check bit fails, we know the error is in that check bit itself.

 

Origin www.cnblogs.com/luoahong/p/11442996.html