How to understand the "regression" in linear regression: regression to where?

Original Address: https://blog.csdn.net/Laputa_ML/article/details/80100739

How should we understand the "regression" in linear regression, and where does it regress to? The English term gives a hint: the "regression" in linear regression comes from "regression towards the mean", where the mean is the average.
So how should we understand the mean? Personally, I find it easier to understand when considered together with two other values: the true value and the measured value.
 
True value
The true value is the actual value of some property of an object, for example the real length of a desktop. What characterizes the true value?
1. The true value definitely exists. For example, the length of the table is a fixed constant.
2. Humans can never obtain the true value. This is harder to accept: why can we never, ever get it? Because error always exists. No matter how precise the measuring instrument, how careful the measurement, or how many times we measure, some error remains, so humans never obtain the true value. (It takes a little philosophical thinking to accept this.)
 
Measured value
The measured value is the value a person obtains by measuring, for example by measuring the length of the table. As mentioned above, because of measurement error it does not equal the true value.
 
Mean
In everyday terms, the mean is the arithmetic average of multiple measurements. So what is the relationship between the mean and the true value? My personal understanding is as follows:
1. With a finite number of measurements, the mean will never exactly equal the true value.
2. As the number of measurements increases, the mean gets closer to the true value.
3. Only when the number of measurements reaches infinity (∞) does the mean equal the true value.
Points 1 and 2 are easy to understand: they are caused by error.
So why does point 3 hold? Why does the mean equal the true value when the number of measurements reaches ∞? Because with infinitely many measurements, the errors produced in the individual measurements cancel each other out. This assumes the errors are random, as likely to be positive as negative, with no systematic bias. For example, measuring the length of a desktop with a ruler is affected by temperature, because the ruler expands when hot and contracts when cold. So let us make an assumption:
    Suppose the true length of a desktop is 20 cm, that is, the true value is 200 mm.
    Suppose the first measurement is taken at a high temperature: the ruler has expanded, so the measured value is smaller than the true value. Suppose the second measurement is taken at a low temperature: the ruler has contracted, so the measured value is larger than the true value. The errors of the two measurements then offset each other.
    But they may not offset completely. For example, say the expanded ruler reads 10 mm short in the first measurement, giving 190 mm, and the contracted ruler reads 8 mm long in the second, giving 208 mm. The mean is (190 + 208) / 2 = 199 mm, which is not equal to the true value. What if we measure many more times? The more measurements we take, the more fully the errors tend to cancel, and the closer the mean gets to the true value. So when does it equal the true value? Only when the number of measurements reaches ∞ do the errors cancel completely, and then the mean must equal the true value.
    Doesn't this contradict the earlier claim that humans can never obtain the true value? No, because we mortals can never do anything ∞ times. ∞ has two properties:
    1. You can get arbitrarily close to ∞.
    2. You can never reach ∞.
    Because of the second property, humans can never take ∞ measurements, and so can never obtain the true value of an object.
    So, back to the topic: what is the regression, and where does it regress to? It is a regression to the true value, or in other words, a return to the essence of things.
    As said above, the more measurements we take, the closer the mean is to the true value. This explains why working with data calls for big data: when the amount of data is large enough, we get closer to the essence of things, the true value. Linear regression, then, is a regression to the essence of things, the true value.
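The claim that the mean approaches the true value as measurements accumulate can be tried out with a small simulation. This is only a sketch under assumptions the article does not state: each measurement is modeled as the true value plus a random, zero-mean Gaussian error, and the names `TRUE_LENGTH_MM`, `ERROR_STD_MM`, `measure`, and `mean_of_measurements` are hypothetical, chosen for illustration.

```python
import random

TRUE_LENGTH_MM = 200.0  # the desktop's true length from the example above
ERROR_STD_MM = 5.0      # assumed spread of the measurement error (hypothetical)


def measure(rng):
    """One simulated measurement: the true value plus a zero-mean random error."""
    return TRUE_LENGTH_MM + rng.gauss(0.0, ERROR_STD_MM)


def mean_of_measurements(n, rng):
    """Arithmetic mean of n simulated measurements."""
    values = [measure(rng) for _ in range(n)]
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (2, 10, 1_000, 100_000):
    mean = mean_of_measurements(n, rng)
    gap = abs(mean - TRUE_LENGTH_MM)
    print(f"n = {n:>7}: mean = {mean:.3f} mm, gap to true value = {gap:.3f} mm")
```

In a typical run the gap shrinks as n grows, although any single run can fluctuate, and for every finite n the mean still differs slightly from 200 mm, in line with points 1 and 2 above.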
 
The relationship between the mean and the regression equation
Some readers may not see how the formula for the mean relates to the regression equation, because on the surface the two formulas do not look alike.
The formula for the arithmetic mean:
$\frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
The regression equation:
$y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$
See the previous article on linear regression.
The mean is really a characteristic of the samples observed in an experiment. For example, if our results are the n values $x_1, x_2, x_3, \dots, x_n$, then the mean we compute is
$\frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
For example, suppose we roll a die six times and get 2, 2, 2, 4, 4, 4. These are our six observed samples, so the mean is (2 + 2 + 2 + 4 + 4 + 4) / 6 = 3. This formula does not look like the regression equation, but it can be rewritten step by step:
(2+2+2+4+4+4)/6 = 3
3 = (2+2+2+4+4+4)/6
3 = 1/6*2 + 1/6*2 + 1/6*2 + 1/6*4 + 1/6*4 + 1/6*4
Doesn't this last form already look a bit like the regression equation? If we treat 3 as y and treat 2, 2, 2, 4, 4, 4 as $x_1, x_2, x_3, x_4, x_5, x_6$, the formula becomes
$y = \frac{1}{6} x_1 + \frac{1}{6} x_2 + \frac{1}{6} x_3 + \frac{1}{6} x_4 + \frac{1}{6} x_5 + \frac{1}{6} x_6$
Isn't that already very much like a regression equation? The only difference is that the regression equation is a weighted sum with general weights (plus the term $w_0$), while the mean is a weighted sum whose weights are all equal. This explains why y is a mean.
$y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$
As you can see, the y of the regression equation is in effect a weighted mean, while the ordinary mean is an equal-weighted mean; the essence is the same. So the y of the regression equation is a mean.
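The dice example can be checked with a few lines of Python. This is a minimal sketch: `weighted_sum` and `squared_error` are hypothetical helper names, and the second half assumes least squares as the fitting criterion, which the article itself does not mention. Under that assumption, the best constant model $y = w_0$ for the six rolls turns out to be their mean.

```python
from fractions import Fraction


def weighted_sum(weights, xs, intercept=0):
    """Evaluate w0 + w1*x1 + ... + wn*xn."""
    return intercept + sum(w * x for w, x in zip(weights, xs))


rolls = [2, 2, 2, 4, 4, 4]                              # the six dice rolls from the text
equal_weights = [Fraction(1, len(rolls))] * len(rolls)  # every weight is 1/6

mean_direct = Fraction(sum(rolls), len(rolls))             # (2+2+2+4+4+4)/6
mean_as_weighted_sum = weighted_sum(equal_weights, rolls)  # 1/6*2 + ... + 1/6*4
print(mean_direct, mean_as_weighted_sum)


def squared_error(constant, ys):
    """Sum of squared deviations of ys from a constant prediction."""
    return sum((y - constant) ** 2 for y in ys)


# Try the constants 2.0, 2.1, ..., 4.0 and keep the one with the smallest error.
candidates = [Fraction(k, 10) for k in range(20, 41)]
best = min(candidates, key=lambda c: squared_error(c, rolls))
print(best)
```

Exact fractions are used instead of floats so that 1/6 is represented without rounding and the two ways of computing the mean can be compared with `==`.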
 
Summary
Mathematical concepts ultimately serve the needs of human production and life. In production and life, which value do people care about? Certainly not the measured value, because it always contains error. What people care about is the essence of things, which is the true value. The aim of production and life is to obtain the true value, and regression is a regression to the true value.
The measured value is not what people want, yet the true value is something people can never obtain. So, following the principles of mathematical statistics, people use measured values to estimate the true value. The method is to take as many measurements as possible and compute their mean: the more measurements, the closer the mean gets to the true value.
As mentioned above, we mortals cannot measure ∞ times. But suppose you were God, or the Tathagata Buddha, and could measure ∞ times. Then the mean y would equal the true value:
$y = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \text{true value}$ (where $n = \infty$)
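One way to make this limiting statement precise is sketched below. It rests on assumptions the article does not state explicitly: each measurement is the true value $\mu$ plus an error $\varepsilon_i$, and the errors are independent, have zero mean, and have finite variance $\sigma^2$. The symbols $\mu$, $\varepsilon_i$, $\sigma$, and $\bar{x}_n$ (the mean y of the first n measurements) are introduced here only for illustration.

```latex
x_i = \mu + \varepsilon_i, \qquad
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i
          = \mu + \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i

\mathbb{E}[\bar{x}_n] = \mu, \qquad
\operatorname{Var}(\bar{x}_n) = \frac{\sigma^2}{n} \longrightarrow 0
\quad (n \to \infty)
```

Under these assumptions the law of large numbers gives $\bar{x}_n \to \mu$ as $n \to \infty$. If the errors carried a systematic bias, the mean would instead converge to the true value plus that bias.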
 
Several concepts in linear regression
example: a single sample (one observation). sample: the set of examples, although it is more commonly called the sample space.
feature: a characteristic of an example, that is, $x_1, x_2, x_3, \dots, x_n$ in the regression equation.
$x_0 = 1$ is the constant term that goes with the intercept of the equation.
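$\beta_1, \beta_2, \beta_3, \dots, \beta_n$ are the coefficients (weights) of the features; in the equal-weight mean above, each weight corresponds to the probability with which its x occurs.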
outcome: the result. Ideally it is the true value that people hope for, but the outcome we actually obtain is a mean that approximates the true value.
ξ: the error, that is, the deviation between each example and the true value.
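To connect these terms, here is a small sketch with a single feature. Everything in it is an assumption made for illustration: the "true" coefficients `BETA_0` and `BETA_1`, the noise level, the helper name `fit_line`, and the use of ordinary least squares (closed form) to estimate the coefficients. It treats the β values as coefficients of the features.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "true" coefficients, chosen only for this illustration.
BETA_0 = 1.0  # intercept, multiplies the constant term x0 = 1
BETA_1 = 2.0  # coefficient of the single feature x1

# Each example is one (x1, y) pair; together the examples form the sample.
n = 1000
x1 = [rng.uniform(0.0, 10.0) for _ in range(n)]  # feature values
xi = [rng.gauss(0.0, 0.5) for _ in range(n)]     # error term ξ for each example
y = [BETA_0 * 1.0 + BETA_1 * x + e for x, e in zip(x1, xi)]  # observed outcome


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x, using the closed form."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0_hat, b1_hat = fit_line(x1, y)
print(f"estimated intercept = {b0_hat:.3f}, estimated coefficient = {b1_hat:.3f}")
```

The estimates land near the chosen values but not exactly on them, which mirrors the article's point that a finite number of noisy examples only approximates the true value.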
----------------
Disclaimer: This is an original article by the CSDN blogger "Laputa_ML", published under the CC 4.0 BY-SA license. When reproducing it, please attach the original source link and this statement.
Original link: https://blog.csdn.net/Laputa_ML/article/details/80100739

Origin www.cnblogs.com/lzhu/p/11745433.html