Machine Learning: Decision Trees (1)


 

The decision tree chapters cover three main topics:

I. Understanding decision trees

II. The algorithms in detail

III. Evaluation metrics

 

Directory contents:

Machine Learning: Decision Trees (1) - a brief overview of decision trees

Machine Learning: Decision Trees (2) - Hunt's algorithm in detail

Machine Learning: Decision Trees (3) - the ID3 and C4.5 algorithms in detail

Machine Learning: Decision Trees (4) - the CART algorithm in detail

Machine Learning: Decision Trees (5) - evaluation metrics: the ROC, KS, and PR curves

Machine Learning: Decision Trees (6) - evaluation metrics: cross-validation

 

This section gives a brief overview of decision trees; the decision tree algorithms themselves are explained in detail starting in the next section.

 

I. Understanding Decision Trees

        1. Definition

            (1) Each non-leaf node represents a split of the samples: one feature is generally selected, and the samples are divided among the child nodes according to it.

            (2) Each child node continues to split the samples assigned to it.

            (3) Leaf nodes represent the output: samples that end up in the same leaf node belong to the same class (or have similar regression values). A minimal sketch of this structure follows the list below.
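        The following sketch makes the definition concrete. It is a minimal illustration, not code from the original post: the Node class, the classify function, the feature index, and the threshold values are all hypothetical names chosen here.

            # A minimal sketch of the definition above: each internal node tests one
            # feature and routes a sample to a child node; each leaf stores the
            # output class. All names and values here are illustrative.

            class Node:
                def __init__(self, feature=None, threshold=None,
                             left=None, right=None, label=None):
                    self.feature = feature      # index of the feature used for the split
                    self.threshold = threshold  # go left if x[feature] <= threshold
                    self.left = left            # child for samples that satisfy the test
                    self.right = right          # child for the remaining samples
                    self.label = label          # output class (set only on leaf nodes)

            def classify(node, x):
                # Walk from the root down to a leaf and return its class label.
                if node.label is not None:      # leaf: all samples here share a class
                    return node.label
                if x[node.feature] <= node.threshold:
                    return classify(node.left, x)
                return classify(node.right, x)

            # A two-layer tree (a "stump"): one split on feature 0 at threshold 2.5.
            stump = Node(feature=0, threshold=2.5,
                         left=Node(label="class A"),
                         right=Node(label="class B"))

            print(classify(stump, [1.0, 7.0]))  # -> class A
            print(classify(stump, [4.2, 0.3]))  # -> class B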

 

        2. Architecture

            (1) Learning a decision tree

                  ①. An inductive learning process based on samples.

                  ②. Uses a top-down, recursive approach:

                        initially all the data sits at the root node; the data is then split recursively.

                  ③. Pruning is applied to prevent overfitting:

                        it comes in two kinds, pre-pruning and post-pruning.

            (2) Using a decision tree

                  ①. Classifies unknown data.

                  ②. An unknown sample follows the splits of the generated tree layer by layer downwards until it reaches a leaf node; the entropy at a leaf node is zero. An end-to-end sketch of learning and using a tree follows the terminology list below.

            (3) Diagram of a decision tree (the figure is not reproduced here; the circled numbers ① through ⑧ below refer to nodes in that figure).

                 Root node: the node at the very beginning, e.g. ①

                 Parent node: the upper node of a split

                 Child node: a lower node produced by a split

                 Leaf node: a node where the splitting ends, e.g. ③, ⑤, ⑥, ⑦, ⑧

                 Split: the condition by which the data is divided

                 Attribute: the feature associated with a node

                 Label: the final prediction target

                 Stump: a tree with only two layers is called a stump, e.g. ①, ②, ③
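        The end-to-end sketch referenced above, using scikit-learn (an assumption of mine; the original post does not name a library, and the dataset and parameter values are purely illustrative):

            # Learning and using a decision tree with scikit-learn.
            from sklearn.datasets import load_iris
            from sklearn.tree import DecisionTreeClassifier

            X, y = load_iris(return_X_y=True)

            # Learning: top-down recursive partitioning, starting with all data at
            # the root. max_depth is a pre-pruning control (stop growing early);
            # ccp_alpha turns on post-pruning (cost-complexity pruning).
            clf = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                                         ccp_alpha=0.01)
            clf.fit(X, y)

            # Using: an unknown sample is routed down the splits until it reaches
            # a leaf, whose majority class becomes the prediction.
            print(clf.predict([[5.1, 3.5, 1.4, 0.2]]))  # class of one new sample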

 

        3. Characteristics of Decision Trees

            (1). The biggest advantage of decision tree learning algorithms is that they can learn on their own.

                  During learning, the user does not need much background knowledge; as long as the training examples are labeled well, the algorithm can learn from them.

            (2) Decision trees belong to supervised learning.

                  From a set of unordered, irregular instances (concepts), they infer classification rules represented in decision tree form.

 

        4. Algorithms for Generating Decision Trees

            The key to building a decision tree is choosing which attribute to split on in the current state. Depending on the objective function used, decision trees are built mainly with the following algorithms (a small sketch of the entropy and Gini criteria follows this list):

            (1). Hunt's algorithm (chi-square test)

            (2) Information gain (ID3)

            (3) Information gain ratio (C4.5)

            (4) Gini index (CART: Classification And Regression Trees)
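        A minimal sketch of the entropy and Gini split criteria named above (the helper names and the toy labels are hypothetical; C4.5's gain ratio additionally divides the gain by the entropy of the split itself):

            # Impurity measures behind ID3/C4.5 (entropy) and CART (Gini).
            import numpy as np

            def entropy(labels):
                # H = -sum(p_k * log2(p_k)) over the class proportions p_k.
                _, counts = np.unique(labels, return_counts=True)
                p = counts / counts.sum()
                return -np.sum(p * np.log2(p))

            def gini(labels):
                # Gini = 1 - sum(p_k^2) over the class proportions p_k.
                _, counts = np.unique(labels, return_counts=True)
                p = counts / counts.sum()
                return 1.0 - np.sum(p ** 2)

            def information_gain(parent, children):
                # Entropy of the parent minus the weighted entropy of its children.
                n = len(parent)
                weighted = sum(len(c) / n * entropy(c) for c in children)
                return entropy(parent) - weighted

            parent = ["yes"] * 5 + ["no"] * 5
            left, right = ["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4

            print(entropy(parent))                          # 1.0 bit for a 50/50 split
            print(gini(parent))                             # 0.5
            print(information_gain(parent, [left, right]))  # ~0.278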

 

These algorithms are elaborated on in the later parts. The next section explains Hunt's algorithm.

 

Back to the main directory

Back to the decision tree directory

Previous chapter: Machine Learning: Logistic Regression

Next chapter: Machine Learning: Decision Trees (2)
