Ensembles of machine learning models

Reference blogs: https://blog.csdn.net/qq_31342997/article/details/88078213

     https://blog.csdn.net/u012969412/article/details/76636336

     https://blog.csdn.net/maqunfi/article/details/82220115

     https://www.jianshu.com/p/fc4de92a9486

First, the concept of model ensembling

  Produce a set of "individual learners", then combine them with some strategy to strengthen the model's results.
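
  As a minimal sketch of this idea (scikit-learn, the synthetic dataset, and the choice of learners below are illustrative assumptions, not part of the original text), several individual learners can be combined by majority voting:

      # Ensembling sketch: train several individual learners, combine them
      # with a simple strategy (here: majority voting over their predictions).
      from sklearn.datasets import make_classification
      from sklearn.ensemble import VotingClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split
      from sklearn.naive_bayes import GaussianNB
      from sklearn.tree import DecisionTreeClassifier

      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

      # Three individual learners of different types.
      ensemble = VotingClassifier(
          estimators=[
              ("lr", LogisticRegression(max_iter=1000)),
              ("dt", DecisionTreeClassifier(max_depth=5)),
              ("nb", GaussianNB()),
          ],
          voting="hard",  # majority vote over the individual predictions
      )
      ensemble.fit(X_train, y_train)
      print("ensemble accuracy:", ensemble.score(X_test, y_test))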

Second, common model ensembling methods

  boosting method: base models (base model) are trained iteratively, and in each round the training-sample weights are modified according to the previous round's prediction errors; it is prone to over-fitting.
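
  A hedged boosting sketch using scikit-learn's AdaBoostClassifier as one representative implementation (the dataset and hyper-parameters are illustrative only); the per-round re-weighting of mis-classified samples happens inside fit:

      # Boosting sketch: base models are fitted one after another, and the
      # sample weights are adjusted after each round based on that round's
      # errors (AdaBoost handles the re-weighting internally).
      from sklearn.datasets import make_classification
      from sklearn.ensemble import AdaBoostClassifier
      from sklearn.model_selection import train_test_split

      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

      # The default base model is a depth-1 decision tree (a "stump").
      booster = AdaBoostClassifier(
          n_estimators=200,   # too many rounds can over-fit, as noted above
          learning_rate=0.5,
          random_state=0,
      )
      booster.fit(X_train, y_train)
      print("boosting accuracy:", booster.score(X_test, y_test))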

  bagging method: a subset of the training set is drawn to train each different base model (base model), and the final prediction is a vote over the base models (majority rule).
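
  A possible bagging sketch with scikit-learn's BaggingClassifier (again on a synthetic dataset chosen only for illustration); each base model sees its own bootstrap sample and the predictions are combined by voting:

      # Bagging sketch: each base model is trained on a bootstrap sample
      # (drawn with replacement) of the training set; predictions are
      # combined by voting. The base models are independent, so they can
      # be trained in parallel.
      from sklearn.datasets import make_classification
      from sklearn.ensemble import BaggingClassifier
      from sklearn.model_selection import train_test_split

      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

      # The default base model is a decision tree.
      bagger = BaggingClassifier(
          n_estimators=50,
          bootstrap=True,   # sample with replacement
          n_jobs=-1,        # train the independent base models in parallel
          random_state=0,
      )
      bagger.fit(X_train, y_train)
      print("bagging accuracy:", bagger.score(X_test, y_test))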

  stacking method: a layered ensembling idea. In the first round several base models are trained (several of one kind, or of several kinds), and their outputs are then used to train an LR model (which learns the weight to give each base model).
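
  A sketch of stacking with scikit-learn's StackingClassifier, assuming two arbitrary base models and an LR second layer (the particular base models here are illustrative):

      # Stacking sketch: first-layer base models are trained, and a logistic
      # regression (LR) model is trained on their predictions to learn how
      # much weight to give each base model.
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier, StackingClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split
      from sklearn.svm import SVC

      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

      stack = StackingClassifier(
          estimators=[
              ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
              ("svc", SVC(probability=True, random_state=0)),
          ],
          final_estimator=LogisticRegression(max_iter=1000),  # second-layer LR
          cv=5,  # base-model outputs for the LR are produced out-of-fold
      )
      stack.fit(X_train, y_train)
      print("stacking accuracy:", stack.score(X_test, y_test))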

  blending method: a layered ensembling idea in which the data are disjoint. The designated training data (train) are split into two parts (d1, d2); the models are trained on d1 and used to predict d2 and the test set.

           The predictions on d2 together with its labels are used to train a new classifier, and the test-set predictions are then fed into it as new features to obtain the final test prediction.
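
  A hand-rolled blending sketch following the split described above; the names X_d1/X_d2 and the choice of base models are illustrative assumptions, not part of the original recipe:

      # Blending sketch: split the training data into d1 and d2, train the
      # base models on d1, predict d2 and the test set, then train a new
      # classifier on d2's predictions (the "new features") plus d2's labels,
      # and apply it to the test-set predictions for the final result.
      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split
      from sklearn.naive_bayes import GaussianNB

      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
      # d1 trains the base models, d2 is the disjoint hold-out set.
      X_d1, X_d2, y_d1, y_d2 = train_test_split(X_train, y_train, test_size=0.3, random_state=0)

      base_models = [RandomForestClassifier(n_estimators=100, random_state=0), GaussianNB()]
      d2_features, test_features = [], []
      for model in base_models:
          model.fit(X_d1, y_d1)
          d2_features.append(model.predict_proba(X_d2)[:, 1])      # new features for d2
          test_features.append(model.predict_proba(X_test)[:, 1])  # new features for test

      blender = LogisticRegression()
      blender.fit(np.column_stack(d2_features), y_d2)              # second-layer model on d2
      final_pred = blender.predict(np.column_stack(test_features))
      print("blending accuracy:", (final_pred == y_test).mean())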

Third, differences between the ensembling methods

  Differences between Bagging and Boosting
    (1) Sample selection:
    Bagging: each round's training set is drawn from the original training set with replacement, and the rounds' selections are independent of each other.
    Boosting: the training set is unchanged in every round, but the weight of each sample in the training set changes; the weights are adjusted according to the previous round's classification results.
    (2) Sample weights:
    Bagging: uniform sampling is used, so every sample has equal weight.
    Boosting: sample weights are continuously adjusted according to the error rate; the larger the error, the larger the weight.
    (3) Prediction functions:
    Bagging: all prediction functions have equal weight.
    Boosting: each weak classifier has a corresponding weight, and classifiers with smaller classification error get larger weights.
    (4) Parallel computation (see the sketch after this list):
    Bagging: the prediction functions can be generated in parallel.
    Boosting: the prediction functions can only be generated sequentially, because each model's parameters depend on the previous model's results.
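
    To make point (4) concrete, the sketch below (with arbitrary, illustrative model choices) contrasts a bagging-style model that fits its trees in parallel with a boosting model that must fit them one after another:

        # (4) in practice: bagging's base models are independent, so they can
        # be fitted in parallel (n_jobs=-1); gradient boosting fits its trees
        # sequentially, each one correcting the previous ensemble's errors,
        # so the rounds themselves cannot be parallelised.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

        X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

        # Bagging-style: independent trees, parallel fit.
        RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0).fit(X, y)

        # Boosting-style: sequential fit; there is no n_jobs option for the rounds.
        GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X, y)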

  Differences between Stacking and Blending

    1. Blending is very similar to Stacking, only simpler; the differences between the two are:

        Blending directly prepares a hold-out set (for example 10% of the data) and only ever predicts on that hold-out set; the different Base Models are trained on disjoint data, and their outputs are combined as a (weighted) average.

     It is simple, but uses less of the training data.

    2. Blending's advantages: it is simpler than stacking and does not cause data leakage (so-called leakage, e.g. when part of the training data uses statistical features computed over the whole dataset, making the model's performance look deceptively good),

     because the generalizers and stackers use different data; and you can always add other models to the blender.

    3. Drawback: blending uses only part of the data as the hold-out set for validation, whereas stacking uses multi-fold cross validation, which is more robust than a single hold-out set (see the sketch after this list).

    4. Both methods work well; it comes down to preference. You can also do Blending on one part of the data and Stacking on another.
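
    To make point 3 concrete, the sketch below (model choices are illustrative) builds the second-layer training features from out-of-fold predictions with 5-fold cross validation, as stacking does, instead of relying on a single hold-out split as blending does:

        # Point 3 in practice: every training sample gets an out-of-fold
        # prediction, so the second-layer model is trained on all of the
        # training data without leakage from a model that saw those rows.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_predict
        from sklearn.naive_bayes import GaussianNB

        X_train, y_train = make_classification(n_samples=1000, n_features=20, random_state=0)

        base_models = [RandomForestClassifier(n_estimators=100, random_state=0), GaussianNB()]
        # Out-of-fold probabilities: each row is predicted by a model fitted
        # on the other folds only.
        oof = np.column_stack([
            cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
            for m in base_models
        ])
        meta = LogisticRegression().fit(oof, y_train)  # second-layer model uses all training data
        # (For test-time use, the base models would then be refitted on the full training set.)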
