Machine learning notes, Chapter 5: in supervised learning, what is the overfitting problem?

Overfitting

What is overfitting?

Overfitting in linear regression

In a linear regression problem, we can fit the data set with a linear equation, a quadratic equation, or a higher-order polynomial, as shown:
[Figure: linear regression fits]
Clearly the linear equation does not fit the data set well; it underfits, leaving a high error (high bias).

The quadratic model fits well.

Although the high-order polynomial passes through every training sample, the curve is too contorted, so I would not call it a good model. This is called overfitting. Another term for the same problem is high variance.

High variance: when we fit a function to the data, it may fit the training set very well, matching almost every training example, but the function itself can be too complicated, with too many variables. If we do not have enough data to constrain a model with that many variables, the result is overfitting.
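The effect above can be seen numerically. The sketch below uses synthetic, roughly linear data (my own assumed example, not from the notes) and compares a linear, a quadratic, and a degree-9 polynomial fit. The degree-9 fit drives the training error to essentially zero while doing worse on held-out points:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(-1, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.1, size=x_train.shape)  # roughly linear data
x_test = np.linspace(-0.9, 0.9, 9)                              # held-out points
y_test = 2 * x_test + rng.normal(0, 0.1, size=x_test.shape)

results = {}
for degree in (1, 2, 9):
    # Least-squares polynomial fit of the given degree
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

With 10 training points, the degree-9 polynomial interpolates them exactly (near-zero training error), which is precisely the "fits almost all of the training data" symptom described above.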

Generally speaking, overfitting occurs when there are too many variables. The trained hypothesis then fits the training data very well, and the cost function may be very close to zero, but the equation fails to generalize to new samples; that is, it cannot predict the price of a new example.

Generalization refers to a hypothesis's ability to apply to new samples.

Overfitting in logistic regression

Take the following set of data samples as an example:
Logistic regression models
Clearly, logistic regression can also underfit, just as linear regression does: using a linear function as the hypothesis gives a high-bias model.

The second figure adds just a quadratic term and fits the data set well.

After adding even more higher-order terms, the model overfits: the curve is contorted, and the function is a poor predictor for new samples. That is, it cannot generalize to new data.
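A simplified sketch of this effect is below. It is not full maximum-likelihood logistic regression; instead it least-squares fits polynomial "logits" (targeting +4 for label 1 and -4 for label 0, values I chose for illustration) and classifies with the sigmoid. The data contain two deliberately mislabeled points: a linear decision boundary gets them wrong, while a high-degree boundary fits the noise perfectly — which is exactly what overfitting means:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(-1, 1, 12)
y = (x > 0).astype(float)
y[3], y[8] = 1.0, 0.0          # two deliberately mislabeled (noisy) samples

def accuracy(degree):
    # Fit polynomial logits toward +4 / -4, then classify with the sigmoid.
    targets = np.where(y == 1, 4.0, -4.0)
    coeffs = np.polyfit(x, targets, degree)
    p = sigmoid(np.polyval(coeffs, x))
    return np.mean((p > 0.5) == y)

print("linear boundary, training accuracy:", accuracy(1))
print("degree-11 boundary, training accuracy:", accuracy(11))
```

The degree-11 boundary reaches 100% training accuracy only by memorizing the mislabeled points, so it will predict worse on new samples than the simpler boundary.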

How to address overfitting:

You can choose an appropriate polynomial order by plotting the fitted function. But when there are many variables, plotting the function is no longer a practical method.

The first approach is to reduce the number of variables. Concretely, we can inspect the variables by hand, determine which ones are more important, and then decide which variables to keep and which to discard.
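A minimal sketch of manual variable selection, on synthetic data of my own (the target depends only on the first of three features): discarding the two irrelevant features barely changes the training fit, while leaving fewer parameters free to overfit.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
# synthetic data: y depends only on the first feature; the other two are noise
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + rng.normal(0, 0.1, size=n)

def train_mse(cols):
    """Least-squares fit using only the chosen feature columns."""
    A = X[:, cols]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.mean((A @ w - y) ** 2)

print("all three features:", train_mse([0, 1, 2]))
print("first feature only:", train_mse([0]))
```

In practice this manual inspection is replaced by model-selection algorithms, but the principle is the one stated above: keep the variables that matter, drop the rest.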

The second method is regularization. Regularization keeps all of the feature variables, but shrinks the magnitude of the parameter values. This works well because each of the variables contributes a little to predicting the output.
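A minimal sketch of regularization, here as closed-form ridge regression (L2 penalty) on degree-9 polynomial features of synthetic data; the penalty weight `lam` is a hyperparameter I picked for illustration. All ten coefficients are kept, but the penalty shrinks them:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 10)
y = 2 * x + rng.normal(0, 0.1, size=x.shape)   # roughly linear data
X = np.vander(x, 10)                           # degree-9 polynomial features

def ridge(X, y, lam):
    """Closed-form ridge regression: minimizes ||Xw - y||^2 + lam * ||w||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_plain = np.linalg.lstsq(X, y, rcond=None)[0]  # unregularized least squares
w_reg = ridge(X, y, lam=1e-3)
print("unregularized ||w||:", np.linalg.norm(w_plain))
print("regularized   ||w||:", np.linalg.norm(w_reg))
```

The unregularized degree-9 fit interpolates the noise with large coefficients; the regularized fit keeps every feature but with much smaller parameter values, giving a smoother curve that generalizes better.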
--------------------


Origin blog.csdn.net/Ace_bb/article/details/104063964