Machine Learning model evaluation and diagnostics

First, how to improve a machine learning algorithm
Suppose you have trained a machine learning algorithm, but its performance is not good. There are several ways you might try to improve it:
1. Get more training data
2. Use fewer features
3. Add more features
4. Add higher-order (polynomial) terms
5. Increase or decrease the regularization parameter λ
Many people just pick one of these methods at random, which is time-consuming and often ineffective. The following sections introduce model evaluation and diagnostics, which tell us which method is actually worth trying.

Second, model evaluation (Evaluating a Hypothesis)
1. Evaluating a hypothesis:
A hypothesis that minimizes the training error is not necessarily a good hypothesis; and when there are many features, it is hard to plot the hypothesis function and inspect it visually.
The standard method is to split the data into two parts, a training set and a test set, usually in roughly a 7:3 ratio.

2. Specific steps:
The training and evaluation process then has two parts:
(1) Use the training set to learn the parameters θ (by minimizing the training cost J_train(θ))
(2) Use the test set to compute the test error J_test(θ)
For linear regression, the test error is the usual squared-error cost on the test set:

J_test(θ) = (1 / (2·m_test)) · Σ_{i=1..m_test} ( h_θ(x_test^(i)) − y_test^(i) )²

For logistic regression, a common choice is the misclassification error:

err(h_θ(x), y) = 1 if (h_θ(x) ≥ 0.5 and y = 0) or (h_θ(x) < 0.5 and y = 1); otherwise 0
Test error = (1 / m_test) · Σ_{i=1..m_test} err(h_θ(x_test^(i)), y_test^(i))
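The two-step procedure above (learn θ on the training set, then compute J_test on the held-out set) can be sketched in Python. This is a minimal illustration with NumPy; the synthetic data and variable names are invented for the example, not taken from the course:

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 * x + 1 + rng.normal(0, 0.5, 100)

# Split the data roughly 7:3 into training and test sets.
idx = rng.permutation(100)
train_idx, test_idx = idx[:70], idx[70:]

def design(v):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones_like(v), v])

# Step (1): learn theta on the training set (least squares minimizes J_train).
theta, *_ = np.linalg.lstsq(design(x[train_idx]), y[train_idx], rcond=None)

def j(theta, xs, ys):
    """Squared-error cost J(theta) = (1/2m) * sum((h(x) - y)^2)."""
    residual = design(xs) @ theta - ys
    return residual @ residual / (2 * len(xs))

# Step (2): compute the test error J_test(theta) on the held-out 30%.
j_train = j(theta, x[train_idx], y[train_idx])
j_test = j(theta, x[test_idx], y[test_idx])
```

Because the parameters were fit only on the 70% split, j_test gives an honest estimate of this model's error on unseen data.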
Third, model selection and train/validation/test sets (Model Selection and Train/Validation/Test Sets)
1. Model selection without a validation set:

(figure: candidate models of increasing polynomial degree d, each fit on the training set and compared by test error)

In general, one would choose the model with the smallest J_test. But that error has been minimized over the test data itself, so it is an optimistic estimate: the chosen model may not generalize well to other data. This method therefore does not give an honest estimate of the generalization error.


2. Using a validation set (Cross-Validation Set)
To solve this generalization problem, we introduce a third data set, the cross-validation set. It sits as an intermediate layer between the training set and the test set: the polynomial degree d is selected by its error on the cross-validation set, and the test set is used only afterwards, so the reported test error has not been "specially optimized" for the test set. A typical ratio is 60% training set, 20% cross-validation set, 20% test set.

(figure: the 60/20/20 split into training, cross-validation, and test sets)

The errors on the three data sets are defined analogously:

J_train(θ) = (1 / (2·m)) · Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) )²
J_cv(θ) = (1 / (2·m_cv)) · Σ_{i=1..m_cv} ( h_θ(x_cv^(i)) − y_cv^(i) )²
J_test(θ) = (1 / (2·m_test)) · Σ_{i=1..m_test} ( h_θ(x_test^(i)) − y_test^(i) )²
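Model selection with a cross-validation set can be sketched as follows. This is a minimal NumPy illustration; the cubic synthetic data and the degree range are assumptions made for the example. Each candidate degree d is fit on the training set, the d with the lowest cross-validation error is selected, and only then is the test error reported:

```python
import numpy as np

# Synthetic data whose true relation is cubic (an assumption for illustration).
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 200)
y = x**3 - 2 * x + rng.normal(0, 1, 200)

# 60% / 20% / 20% split into train / cross-validation / test.
idx = rng.permutation(200)
tr, cv, te = idx[:120], idx[120:160], idx[160:]

def poly(v, d):
    """Polynomial features [1, v, v^2, ..., v^d]."""
    return np.vander(v, d + 1, increasing=True)

def cost(theta, v, t, d):
    r = poly(v, d) @ theta - t
    return r @ r / (2 * len(v))

# Fit each candidate degree on the training set, score it on the CV set.
cv_errors = {}
for d in range(1, 7):
    theta, *_ = np.linalg.lstsq(poly(x[tr], d), y[tr], rcond=None)
    cv_errors[d] = cost(theta, x[cv], y[cv], d)

best_d = min(cv_errors, key=cv_errors.get)

# The test set is touched only once, to estimate the generalization error.
theta, *_ = np.linalg.lstsq(poly(x[tr], best_d), y[tr], rcond=None)
test_error = cost(theta, x[te], y[te], best_d)
```

Because best_d was chosen using the cross-validation set, test_error has not been "specially optimized" and remains an honest generalization estimate.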
Fourth, diagnosing bias and variance (Diagnosing Bias vs. Variance)
In this section, we look at the relationship between the polynomial degree d and underfitting / overfitting.
First, we need to determine whether bias or variance is the factor keeping us from good results:
High bias means underfitting: the model is too simple to capture the structure of the data.
High variance means overfitting: the model follows the training data (including its noise) too closely and generalizes poorly to new data.
We need to find a good balance between the two.

Then we can determine overfitting versus underfitting quantitatively, and find the best degree d, from the curves of J_train and J_cv against d:
High bias (underfitting): J_train is high, and J_cv ≈ J_train.
High variance (overfitting): J_train is low, and J_cv ≫ J_train.
As d increases, J_train keeps decreasing while J_cv first falls and then rises again; the best d is where J_cv is smallest.
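The two cases above can be captured in a small rule of thumb. The thresholds below are illustrative assumptions, not from the course:

```python
def diagnose(train_error, cv_error, target_error):
    """Rough bias/variance diagnostic; thresholds are illustrative."""
    if train_error > target_error:
        # J_train itself is high: the model cannot even fit the training set.
        return "high bias (underfitting)"
    if cv_error > 2 * train_error:
        # Large gap between J_train and J_cv: fits training data, not new data.
        return "high variance (overfitting)"
    return "looks fine"
```

For example, diagnose(5.0, 5.5, 1.0) flags high bias (both errors high and close together), while diagnose(0.1, 3.0, 1.0) flags high variance (low training error, large gap).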
Fifth, regularization and bias / variance (Regularization and Bias/Variance)
In this section, we look at the relationship between the regularization parameter λ and bias / variance:
Large λ: high bias (underfitting)
Moderate λ: about right
Small λ: high variance (overfitting)
A large λ heavily penalizes all the parameters θ, which over-simplifies the hypothesis curve and leads to underfitting.

Then we can determine overfitting versus underfitting quantitatively, and find the best λ, from the curves of J_train and J_cv against λ: J_train increases with λ, J_cv is high at both extremes (small λ overfits, large λ underfits), and the best λ is where J_cv is smallest.
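Selecting λ can be sketched the same way as selecting d. This is a minimal ridge-regression illustration in NumPy; the data, the feature degree, and the λ grid are assumptions made for the example. Note that J_train and J_cv are computed without the regularization term, which belongs only in the training objective:

```python
import numpy as np

# Synthetic nonlinear data (an assumption for illustration).
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.2, 60)

X = np.vander(x, 9, increasing=True)       # degree-8 polynomial features
tr, cv = np.arange(40), np.arange(40, 60)  # train / cross-validation split

def fit_ridge(X, y, lam):
    """Regularized normal equation; by convention theta_0 is not penalized."""
    reg = lam * np.eye(X.shape[1])
    reg[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + reg, X.T @ y)

def cost(theta, X, y):
    """Unregularized squared-error cost, used for both J_train and J_cv."""
    r = X @ theta - y
    return r @ r / (2 * len(y))

# Train with each candidate lambda, compare the cross-validation errors.
lambdas = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]
cv_err = {lam: cost(fit_ridge(X[tr], y[tr], lam), X[cv], y[cv])
          for lam in lambdas}
best_lam = min(cv_err, key=cv_err.get)
```

Increasing λ always raises the training error (the fit is more constrained), which is why J_train alone cannot be used to choose λ; the cross-validation error can.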
Sixth, learning curves (Learning Curves)
A learning curve is an effective way to determine whether a learning algorithm suffers from high bias, high variance, or both.

1. Drawing the learning curve:
Plot the mean squared errors of the training set and of the cross-validation set as curves against the number of training samples m.
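Computing the two curves can be sketched as follows (a minimal NumPy illustration with synthetic linear data; only the plotting itself is omitted): for each training-set size m, fit on the first m training examples, then record J_train on those m examples and J_cv on the full cross-validation set.

```python
import numpy as np

# Synthetic linear data (an assumption for illustration).
rng = np.random.default_rng(3)
x = rng.uniform(0, 5, 100)
y = 1.5 * x + 2 + rng.normal(0, 1, 100)
X = np.column_stack([np.ones_like(x), x])

# First 80 examples form the training pool, last 20 the CV set.
X_cv, y_cv = X[80:], y[80:]

train_err, cv_err = [], []
sizes = list(range(2, 81))
for m in sizes:
    theta, *_ = np.linalg.lstsq(X[:m], y[:m], rcond=None)
    r_tr = X[:m] @ theta - y[:m]
    r_cv = X_cv @ theta - y_cv
    train_err.append(r_tr @ r_tr / (2 * m))       # J_train on the m used examples
    cv_err.append(r_cv @ r_cv / (2 * len(y_cv)))  # J_cv on the full CV set
```

With m = 2, the line fits its two training points exactly, so J_train starts near zero; as m grows, J_train rises toward the noise level while J_cv falls, which is the typical shape of these curves.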


2. The learning curve in the high-bias case

(figure: J_train and J_cv both level off at a high error as m grows, with only a small gap between them)

Conclusion: if a learning algorithm has high bias, adding more training samples will not by itself improve its performance.

3. The learning curve in the high-variance case

(figure: J_train stays low while J_cv remains much higher, with a gap that narrows as m grows)

Conclusion: if a learning algorithm has high variance, adding more training samples is likely to help.

Seventh, summary (revisited)
How to debug a learning algorithm:
1. Get more training data: fixes high variance
2. Use fewer features: fixes high variance
3. Add more features: fixes high bias
4. Add higher-order terms: fixes high bias
5. Decrease the regularization parameter λ: fixes high bias
6. Increase the regularization parameter λ: fixes high variance

Eighth, diagnosing neural networks

(figure: a small, simple network on the left; a larger, deeper network on the right)

In general, choosing the simpler neural network on the left is cheap to compute but often has an underfitting problem.
Choosing the larger neural network architecture on the right is computationally expensive and sometimes has an overfitting problem.

Origin www.cnblogs.com/liunaiming/p/11967128.html