Machine Learning --- Inductive Learning

Human learning: experience → (brain reasoning) → patterns

Inductive machine learning: data → (learning algorithm) → model

So a learning algorithm is, in effect, a simulation of how the human brain reasons its way from experience to patterns.

1) Data:

Data is stored in the computer as a training set D = {x_1, x_2, ..., x_m}, where x_1 through x_m are m samples (examples).

Each sample x_i = (x_i1, x_i2, ..., x_id) is a point in the d-dimensional sample (input) space.

Supervised learning: each x_i carries a label y_i, and the set of all labels forms the label (output) space.

A sample together with its label, (x_i, y_i), is called a labeled example.
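To make the notation concrete, here is a minimal sketch in Python; the toy data and NumPy usage are an illustration of the definitions above, not part of the original text.

```python
import numpy as np

m, d = 5, 3                      # m samples in a d-dimensional input space
X = np.random.rand(m, d)         # X[i] plays the role of x_i = (x_i1, ..., x_id)
y = np.random.randint(0, 2, m)   # y[i] is the label of x_i; {0, 1} is the label space

D = list(zip(X, y))              # each (x_i, y_i) pair is one labeled example
print(D[0])
```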

2) Learning algorithm:

Objective: from the training set, learn a model that maps the input (sample) space to the output (label) space, i.e. a mapping x → y.

Process: hypothesis space → (consistency with the training set) → version space → (inductive bias) → model

Within the hypothesis space there may well be more than one hypothesis consistent with the training set; the set of all such consistent hypotheses constitutes the version space.

Among the many hypotheses in the version space, we still need to pick out the model that best matches reality, and for that we rely on inductive bias (model evaluation and selection).

So the question becomes: how do we judge how well a trained model actually fits? The answer is error analysis.

Error analysis:

  1. Hold-out method: split the sample set D into a training set S and a test set T by stratified sampling (to keep the data distribution consistent), repeat the split N times, and average the results. Note that the split ratio matters: if S is too large and T too small, the estimated error is not reliable enough; if S is too small and T too large, the trained model itself is not accurate enough. In practice S is taken as 2/3 to 4/5 of D. To remove the dependence on the S:T split altogether, one can use leave-one-out: given k samples in total, train on k-1 of them and test on the remaining one, repeating k times and averaging. Leave-one-out has problems of its own, though: with many samples it is so time-consuming as to be impractical, and it is not always more accurate than the other methods. (A runnable sketch appears first after this list.)

  2. Cross-validation: split the sample set D into k subsets by stratified sampling, each time use k-1 of them as the training set and the remaining one as the test set, and average the results; repeating the whole procedure N times is called N times k-fold cross-validation. Leave-one-out is in fact the special case of m-fold cross-validation where every subset contains a single sample. (See the second sketch after this list.)

  3. Bootstrapping: given a sample set D containing m samples, randomly pick one sample from D, copy it into D', and put the original back into D; repeat this m times to obtain a data set D' that also contains m samples. Some samples of D will appear in D' multiple times, while others will not appear at all. (See the third sketch after this list.)

The probability that a particular sample is never selected is (1 - 1/m)^m, which tends to 1/e ≈ 0.368 as m → ∞.

The never-selected samples (about 1/3 of D) are used as the test set, and the resulting estimate is also called the out-of-bag estimate. Bootstrapping changes the distribution of the data set, however, and thereby introduces bias, so when data is plentiful the hold-out method and cross-validation are used more often.
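First, a sketch of the hold-out method from item 1, assuming scikit-learn and its built-in iris data purely for illustration: a stratified split into S and T, repeated N times, with the test errors averaged.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
N = 10
errors = []
for seed in range(N):
    # stratify=y keeps the class distribution consistent between S and T;
    # train_size=2/3 follows the 2/3 ~ 4/5 rule of thumb above
    X_S, X_T, y_S, y_T = train_test_split(
        X, y, train_size=2 / 3, stratify=y, random_state=seed)
    model = LogisticRegression(max_iter=1000).fit(X_S, y_S)
    errors.append(1 - model.score(X_T, y_T))  # test error = 1 - accuracy

print(f"hold-out test error averaged over {N} splits: {np.mean(errors):.3f}")
```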
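Second, a sketch of item 2, N times k-fold cross-validation with stratified folds (same illustrative assumptions):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)
k, N = 10, 5
errors = []
for seed in range(N):
    # each of the k stratified folds serves once as the test set
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
    errors.append(1 - acc.mean())

print(f"{N} times {k}-fold CV, mean test error: {np.mean(errors):.3f}")
# leave-one-out is the special case k = m (each fold holds one sample)
```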
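Third, a sketch of item 3, bootstrapping with an out-of-bag error estimate; the numeric check of (1 - 1/m)^m is included, and the dataset and model are again illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
m = len(X)
print("P(never picked) = (1 - 1/m)^m =", (1 - 1 / m) ** m)  # ~0.368

rng = np.random.default_rng(0)
idx = rng.integers(0, m, size=m)       # draw m samples with replacement -> D'
oob = np.setdiff1d(np.arange(m), idx)  # samples never drawn: the test set

model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
print(f"out-of-bag samples: {len(oob)} of {m} (about 1/3 expected)")
print(f"out-of-bag error estimate: {1 - model.score(X[oob], y[oob]):.3f}")
```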

The model's error on the training set is called the training error, its error on the test set is called the test error, and the error it produces when applied to new objects is called the generalization error. We cannot measure the generalization error directly; we can only approximate it with the test error.

Note: the training set has both general properties shared by the data as a whole and special properties peculiar to the training set itself. If the model fails to learn the general properties, it underfits; conversely, if training goes too far and even the special properties are learned, it overfits.

For example, suppose the training set consists of red apples. Shown an apple tree, an underfitted model may mistakenly identify the whole tree as an apple; shown a green apple, an overfitted model may mistakenly fail to recognize it as an apple at all. Clearly, neither an underfitted nor an overfitted model induces properly. (A small numeric sketch follows.)
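To put the apple story in numbers, here is a sketch (my own construction, assuming scikit-learn) that fits polynomials of increasing degree to noisy cubic data: degree 1 underfits, degree 15 memorizes the training set's peculiarities, and only the test error exposes the overfit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-1, 1, 40)).reshape(-1, 1)
y = X.ravel() ** 3 - X.ravel() + rng.normal(0, 0.1, 40)   # noisy cubic
X_S, y_S, X_T, y_T = X[::2], y[::2], X[1::2], y[1::2]     # train/test split

for degree in (1, 3, 15):  # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_S, y_S)
    tr = mean_squared_error(y_S, model.predict(X_S))
    te = mean_squared_error(y_T, model.predict(X_T))
    print(f"degree {degree:2d}: training error {tr:.4f}, test error {te:.4f}")
```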

TBD


Source: www.cnblogs.com/shenminghe/p/MachineLearning_IndustiveLearning.html