Robust Regression - Robust Regression

Robust regression

Robust regression is a method in statistical robust estimation. Its main idea is to modify the objective function in the classical least squares regression which is very sensitive to outliers. The objective function of classical least squares regression is to minimize the sum of squared errors.

Robust regression is the use of robust estimation methods for regression models to fit the structures that exist in most data, and at the same time identify potential outliers, strong influence points, or structures that deviate from model assumptions. When the error follows a normal distribution, its estimation is almost as good as the least squares estimation, and when the least squares estimation condition is not satisfied, the result is better than the least squares estimation.
The example of applying Robust Statistics is that in some subjective evaluation competitions, such as singing competitions and rhythmic gymnastics competitions, the highest score is removed, the lowest score is removed, and the remaining scores are averaged as the player's score In this way, it is very effective to prevent some judges from deliberately scoring a particularly high or low score to affect the final score of the contestants. Such scoring statistics rules are robust.

Robust statistics : To have a good performance of a statistical method in practical application, two conditions are required:

  • One is that the conditions on which the method is based are consistent with the conditions in the actual problem;
  • Second, the sample is indeed random and does not contain negligent errors, such as recording errors. However, these conditions are difficult to meet strictly in practical applications. For example, when the method was originally proposed, it was based on the assumption that the overall distribution is a normal distribution, but the overall distribution in actual problems deviates slightly from normal; or in a large number of observations There are "abnormal data" etc. that are affected by negligent errors in the data. A statistical method is said to be robust if, in such cases, the performance of the statistical method used is only marginally affected.

Guess you like

Origin blog.csdn.net/weixin_45521594/article/details/127567772