Sorting Out Machine Learning Algorithms: Ensemble Learning

Ensemble learning completes a learning task by building and combining multiple learners. It is also known as a multi-classifier system or committee-based learning. Its general structure is to first generate a set of "individual learners" and then combine them with some strategy.

 

If the "individual learner" are of the same type, for the homogeneous integration (homogeneous ensemble);

If the "individual learner" is not the same type, for the heterogeneous integration (heterogenous).

In a homogeneous ensemble, the individual learners are called "base learners" (base learner), and the corresponding learning algorithm is called the "base learning algorithm" (base learning algorithm); in a heterogeneous ensemble, the individual learners are called "component learners" (component learner).

To obtain a good ensemble, the individual learners should be "good but different": each individual learner must reach a certain level of accuracy (it cannot be too weak), and there must be diversity among the learners (their outputs should differ).

How to generate individual learners?

By how the individual learners are generated, ensemble methods fall into two categories:

* When there are strong dependencies among the individual learners, they must be generated sequentially by a serial method; Boosting is the representative.

* When there are no strong dependencies among the individual learners, they can be generated simultaneously by parallel methods; Bagging and Random Forest are the representatives.

 

Boosting is a family of algorithms that can boost weak learners into a strong learner. A weak learner is one whose generalization performance is only slightly better than random guessing, for example a classifier with accuracy slightly above 50% on a binary classification problem.

Boosting first trains a base learner from the initial training set, then adjusts the distribution of the training samples according to that learner's performance, so that the samples the previous learner got wrong receive more attention afterwards; the next base learner is then trained on the adjusted sample distribution. This is repeated until the number of base learners reaches a pre-specified value T, and finally these T base learners are combined with weights.
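
As a concrete illustration of this procedure, below is a minimal sketch of AdaBoost, the best-known member of the Boosting family. The choice of AdaBoost, of scikit-learn decision stumps as the weak learners, and of labels in {-1, +1} are assumptions of the sketch, not something the post prescribes.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=10):
    """Minimal AdaBoost sketch; y is assumed to take values in {-1, +1}."""
    m = X.shape[0]
    w = np.full(m, 1.0 / m)                  # initial uniform sample distribution
    learners, alphas = [], []
    for _ in range(T):
        stump = DecisionTreeClassifier(max_depth=1)   # weak learner
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w * (pred != y))        # weighted training error
        if err >= 0.5:                       # no better than random guessing: stop
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))  # learner weight
        w *= np.exp(-alpha * y * pred)       # misclassified samples gain weight
        w /= w.sum()                         # renormalize to a distribution
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def adaboost_predict(learners, alphas, X):
    """Weighted combination: sign of the weighted sum of weak-learner outputs."""
    return np.sign(sum(a * h.predict(X) for a, h in zip(alphas, learners)))
```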

Given a dataset containing m samples, Bagging uses bootstrap sampling (sampling with replacement) to produce T sample sets, each containing m training samples; a base learner is trained on each sample set, and these base learners are then combined. When combining the prediction outputs, Bagging usually uses simple voting for classification tasks and simple averaging for regression tasks.
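
A minimal Bagging sketch along the same lines (numpy arrays and scikit-learn decision trees are illustrative assumptions; labels are assumed to be non-negative integers so that simple voting can use np.bincount):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, T=10, seed=0):
    """Train T base learners, each on a bootstrap sample of size m."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    learners = []
    for _ in range(T):
        idx = rng.integers(0, m, size=m)     # sampling with replacement
        learners.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return learners

def bagging_predict(learners, X):
    """Simple voting: the most frequent predicted label per sample."""
    preds = np.stack([h.predict(X) for h in learners])    # shape (T, n)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
```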

Random Forest (RF) is an extended variant of Bagging. RF builds a Bagging ensemble with decision trees as the base learners, and further introduces random attribute selection into the training of the decision trees. Specifically, for each node of a base decision tree, RF first randomly selects a subset of k attributes from that node's attribute set, and then chooses the optimal splitting attribute from this subset. The parameter k controls the degree of randomness.
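
In scikit-learn's implementation, the k described above corresponds to the max_features parameter of RandomForestClassifier; a short usage sketch (the dataset and settings are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# max_features plays the role of k: the size of the random attribute subset
# considered at each node ("sqrt" means k = sqrt(d), scikit-learn's default).
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X, y)
print(rf.score(X, y))
```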

 

What strategies can be used to combine the generated individual learners? Consider the following combination strategies:

For numeric outputs, the most common combination strategy is averaging (averaging). Specifically, there are simple averaging (simple averaging) and weighted averaging (weighted averaging).
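
In symbols (standard textbook notation, not spelled out in the post): for T individual learners h_i with combined output H,

```latex
% Simple averaging:
H(x) = \frac{1}{T} \sum_{i=1}^{T} h_i(x)

% Weighted averaging, with per-learner weights w_i:
H(x) = \sum_{i=1}^{T} w_i \, h_i(x), \qquad w_i \ge 0, \quad \sum_{i=1}^{T} w_i = 1
```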

For class-label outputs, the most common combination strategy is voting (voting). Specifically, there are majority voting (majority voting, where the winning label must receive more than half of the votes), plurality voting (plurality voting, where the label with the most votes wins), and weighted voting (weighted voting).
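
The difference between plurality and absolute-majority voting in a tiny sketch (the function names are mine):

```python
from collections import Counter

def plurality_vote(labels):
    """Plurality voting: the label with the most votes wins."""
    return Counter(labels).most_common(1)[0][0]

def majority_vote(labels):
    """Absolute-majority voting: a label must get more than half the votes,
    otherwise the ensemble refuses to predict (returns None)."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None

print(plurality_vote(["a", "b", "a", "c"]))  # 'a' (2 of 4 votes suffice)
print(majority_vote(["a", "b", "a", "c"]))   # None (2 of 4 is not a majority)
```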

When there is plenty of training data, a more powerful combination strategy is "learning to combine", i.e., performing the combination with another learner. The typical representative is Stacking.

Stacking first trains the primary learners from the initial dataset, and then "generates" a new dataset that is used to train the secondary learner. In this new dataset, the outputs of the primary learners are used as the input features of each sample, while the label of the original sample is still used as the sample's label.
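
A minimal Stacking sketch, assuming scikit-learn models at both levels (the specific learners are illustrative; out-of-fold predictions are used for the meta-features, a standard precaution against label leakage that the post does not spell out):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

primaries = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

# New dataset: primary-learner outputs become the input features,
# while the original labels y_tr remain the sample labels.
meta_X = np.column_stack([cross_val_predict(h, X_tr, y_tr, cv=5) for h in primaries])
secondary = LogisticRegression().fit(meta_X, y_tr)

# At prediction time, the primaries are refit on the full training set.
fitted = [h.fit(X_tr, y_tr) for h in primaries]
meta_te = np.column_stack([h.predict(X_te) for h in fitted])
print(secondary.score(meta_te, y_te))
```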

 

Diversity

How should the individual learners be chosen? The theory behind "good but different" is the error-ambiguity decomposition (error-ambiguity decomposition).
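
For regression ensembles the decomposition, due to Krogh and Vedelsby, can be written as follows (the notation is mine, not the post's): E is the ensemble's generalization error, E_i the generalization error of individual learner h_i, and A_i its ambiguity, i.e., its expected squared deviation from the ensemble output.

```latex
E = \bar{E} - \bar{A}, \qquad
\bar{E} = \sum_{i=1}^{T} w_i E_i, \qquad
\bar{A} = \sum_{i=1}^{T} w_i A_i
```

Since the average ambiguity is non-negative, the ensemble error never exceeds the average individual error; it shrinks as the individuals become more accurate (smaller average error) and more diverse (larger average ambiguity), which is exactly "good but different".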

A diversity measure (diversity measure) quantifies the diversity of the individual classifiers in an ensemble, i.e., it estimates the degree to which the individual learners differ. A typical approach is to consider the pairwise similarity or dissimilarity of the individual classifiers. Common diversity measures include the following (a small sketch of the first one follows the list):

* Disagreement measure

* Correlation coefficient

* Q-statistic

* κ-statistic
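
A minimal sketch of the disagreement measure, the simplest of the four: for two classifiers, it is the fraction of samples on which their predictions differ.

```python
import numpy as np

def disagreement(pred_i, pred_j):
    """Disagreement measure between two classifiers' predictions:
    the fraction of samples where they output different labels.
    Ranges over [0, 1]; larger values mean more diversity."""
    return np.mean(np.asarray(pred_i) != np.asarray(pred_j))

print(disagreement([1, 0, 1, 1], [1, 1, 0, 1]))  # 0.5
```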

 

How can diversity be enhanced?

Data sample perturbation. This is usually based on sampling: different data subsets are generated from the initial training set, and a different individual learner is then trained on each subset.

Input attribute perturbation. Attribute subsets are drawn from the initial attribute set, and a base learner is trained on each attribute subset.
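
A minimal sketch of input attribute perturbation, in the spirit of the random subspace method (the scikit-learn trees and all parameter names are illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_subspace_fit(X, y, T=10, k=5, seed=0):
    """Train T base learners, each on a random subset of k attributes."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    ensemble = []
    for _ in range(T):
        feats = rng.choice(d, size=k, replace=False)   # attribute subset
        h = DecisionTreeClassifier().fit(X[:, feats], y)
        ensemble.append((feats, h))   # remember which attributes each learner saw
    return ensemble
```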

Output representation perturbation.

Algorithm parameter perturbation.


Origin: www.cnblogs.com/klchang/p/11332837.html