Biased estimation and unbiased estimation

Unbiased and biased

  Essentially, an estimate is unbiased if, when the statistic is computed from many samples (each time according to the estimation formula), the results fall on both sides of the true value and are centered on it. Their distribution resembles the bell-shaped graph of a normal distribution. For example, take the mean estimate:

mean = (1/n)Σxi

  Some of these estimates will be larger than μ and some will be smaller than μ.
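  The scatter of the mean estimate around μ can be checked with a small simulation. This is only a sketch: the normal distribution, the values μ = 10 and σ = 2, the sample size and the helper name sample_means are all assumptions made for illustration.

```python
import random
import statistics

def sample_means(mu=10.0, sigma=2.0, n=20, trials=2000, seed=0):
    """Draw `trials` samples of size n and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(xs) / n)
    return means

means = sample_means()
above = sum(m > 10.0 for m in means)  # estimates larger than the true mean
print("estimates above mu:", above)
print("estimates below mu:", len(means) - above)
print("average of all estimates:", statistics.fmean(means))
```

  Roughly half of the estimates should land on each side of μ, and their average should sit very close to μ.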

  With a biased estimate, when sampling is repeated many times, the estimated statistics lean toward one side of the true value (on average they are greater than it, or on average less than it). For example, take this variance formula:

S² = (1/n)Σ(xi - m)²

  For a given sample, treat m as a fixed value. S² reaches its minimum if and only if m = mean(x). So whenever the true mean μ ≠ mean(x), the S² computed with mean(x) is strictly smaller than the value computed with μ, and the latter averages out to σ². This means that S² computed with mean(x) is a biased estimate: on average it falls to the left of σ², that is, below it.
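  The minimization argument can be tried on a single sample. The sketch below assumes normally distributed data with a known true mean of 0; the helper name mean_sq_dev is made up for this example.

```python
import random

def mean_sq_dev(xs, m):
    """(1/n) * sum((x - m)^2): mean squared deviation of xs around m."""
    return sum((x - m) ** 2 for x in xs) / len(xs)

rng = random.Random(1)
mu = 0.0  # true mean, assumed known only for this comparison
xs = [rng.gauss(mu, 1.0) for _ in range(10)]
sample_mean = sum(xs) / len(xs)

around_sample_mean = mean_sq_dev(xs, sample_mean)  # what we can compute in practice
around_true_mean = mean_sq_dev(xs, mu)             # needs the unknown true mean

# The first value can never exceed the second, whatever the sample is.
print(around_sample_mean, around_true_mean)
```

  The two values differ by exactly (mean(x) - μ)², which is why the version using the sample mean comes out too small on average.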

  In addition, from a mathematical point of view, the mean stays unbiased because computing an average is essentially a linear operation, and expectation passes through linear operations unchanged. The variance formula squares the deviations, so it is not linear, and plugging in the sample mean makes it biased.
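  The linearity argument can be written out as a short calculation. This is a sketch assuming the x_i are independent draws with mean μ and variance σ²; the notation \bar{x} for mean(x) is introduced here.

```latex
% Averaging is linear, so the expectation passes straight through:
E[\bar{x}] = E\Big[\frac{1}{n}\sum_{i=1}^{n} x_i\Big]
           = \frac{1}{n}\sum_{i=1}^{n} E[x_i] = \mu

% Squaring is not linear, so the same trick fails:
E[\bar{x}^2] = \operatorname{Var}(\bar{x}) + \big(E[\bar{x}]\big)^2
             = \frac{\sigma^2}{n} + \mu^2 \neq \mu^2
```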

 

Which one is better

  Note, however, that an unbiased estimate is not necessarily more "effective" (efficient) than a biased one. Effective here means that the estimates are closer to the true value, and this closeness is reflected by the spread (variance) of the estimates. If a statistic is biased but its estimates are closer to the true value, it is the more efficient one, as shown below:

 

  The left side is unbiased: the results scatter around the "true value". The right side is biased: the results lean toward one side of the "true value". Yet in this case the biased estimate is clearly better, because its results are closer to the true value.
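  One way to make "closer to the true value" concrete is the mean squared error of the estimates. The sketch below compares the two usual variance estimators (dividing by n and by n - 1) under the assumption of normally distributed data; the function names, sample size and σ² = 4 are illustrative choices, not taken from the referenced articles.

```python
import random

def variance_estimates(xs):
    """Return (divide-by-n, divide-by-(n-1)) variance estimates for one sample."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return ss / n, ss / (n - 1)

def compare_mse(sigma2=4.0, n=5, trials=20000, seed=0):
    """Estimate the mean squared error of both estimators by simulation."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    err_biased = err_unbiased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased, unbiased = variance_estimates(xs)
        err_biased += (biased - sigma2) ** 2
        err_unbiased += (unbiased - sigma2) ** 2
    return err_biased / trials, err_unbiased / trials

mse_biased, mse_unbiased = compare_mse()
print("MSE, divide by n:    ", mse_biased)
print("MSE, divide by n - 1:", mse_unbiased)
```

  For small normal samples the biased estimator should show the smaller error, even though its average is off target. With other distributions or other estimators the ranking may differ.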

 

Consistency

  As the sample size increases, the value of S² gets closer to σ². In fact, if the sample size is very large, whether the estimate is biased no longer matters much, because with a large sample both the biased and the unbiased variance estimates end up close to the true value.
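  The shrinking bias can be observed by averaging the divide-by-n variance estimate for several sample sizes. This is a sketch assuming normal data with σ² = 1; the function name mean_biased_variance is hypothetical.

```python
import random

def mean_biased_variance(n, sigma2=1.0, trials=2000, seed=0):
    """Average the divide-by-n variance estimate over many samples of size n."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(xs) / n
        total += sum((x - mean) ** 2 for x in xs) / n
    return total / trials

# The true variance is 1.0; the average estimate should approach it as n grows.
for n in (2, 5, 20, 100):
    print(n, mean_biased_variance(n))
```

  The averages should track (n - 1)/n times σ², so the gap to the true value fades as n increases.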

 

Appendix

  Biased and unbiased estimators of the variance

  Unbiased estimate:

  S² = (1/n)Σ(x - m)² -------------------(1)   (m is the true mean μ, a fixed value)

  S² = (1/(n-1))Σ(x - mean(x))² -----(2)

  Biased estimate:

  S² = (1/n)Σ(x - mean(x))² ---------(3)

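  Compare (1) and (3): they differ only in the value that is subtracted, yet one is unbiased and the other is biased. The reason is that in (1) the subtracted value is fixed (the true mean), while in (3) it is uncertain (the sample mean is not the same for each sample).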
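  The three formulas can be compared by averaging each one over many samples. The sketch below assumes normal data with true mean 3 and true variance 4; these values and the function names are chosen only for illustration.

```python
import random

def three_estimates(xs, mu):
    """Return the variance estimates (1), (2) and (3) for one sample."""
    n = len(xs)
    mean = sum(xs) / n
    est1 = sum((x - mu) ** 2 for x in xs) / n          # (1): true mean, divide by n
    est2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # (2): sample mean, divide by n - 1
    est3 = sum((x - mean) ** 2 for x in xs) / n        # (3): sample mean, divide by n
    return est1, est2, est3

def average_estimates(mu=3.0, sigma2=4.0, n=5, trials=20000, seed=0):
    """Average each of the three estimates over many simulated samples."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        for i, est in enumerate(three_estimates(xs, mu)):
            totals[i] += est
    return [t / trials for t in totals]

avg1, avg2, avg3 = average_estimates()
print("(1):", avg1, "(2):", avg2, "(3):", avg3)
```

  Estimates (1) and (2) should average out near the true variance, while (3) should come out lower, near (n - 1)/n times the true variance.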

  Here is the derivation of the unbiased variance formula:
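  A standard version of this derivation is sketched below. It is not necessarily identical to the one in the referenced article, and it assumes the x_i are independent draws with mean μ and variance σ², with \bar{x} standing for mean(x).

```latex
% Step 1: split each deviation around the true mean.
\sum_{i=1}^{n} (x_i - \bar{x})^2
  = \sum_{i=1}^{n} (x_i - \mu)^2 - n(\bar{x} - \mu)^2

% Step 2: take expectations, using E[(x_i - \mu)^2] = \sigma^2
% and E[(\bar{x} - \mu)^2] = \operatorname{Var}(\bar{x}) = \sigma^2 / n.
E\Big[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2\Big]
  = \sigma^2 - \frac{\sigma^2}{n}
  = \frac{n-1}{n}\,\sigma^2

% Step 3: rescale by n/(n-1) to remove the bias, which gives formula (2).
E\Big[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2\Big] = \sigma^2
```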

 

 

 

 

 

References

 

https://www.matongxue.com/madocs/808.html The target figure is taken from this article.

https://www.matongxue.com/madocs/607.html An excellent article that explains what biased and unbiased mean; the derivation of the unbiased variance in the appendix comes from this article.

https://spaces.ac.cn/archives/6747 The part about linear and nonlinear comes from this article on a data science site, which also got me thinking about what a distribution is.

 

Origin www.cnblogs.com/xiashiwendao/p/12213310.html