Implementing the loan decision tree example from Statistical Learning Methods (Li Hang)

This example is based on the book Statistical Learning Methods (Li Hang).
Notes:

  1. I use PyCharm.
  2. The Python version is 3.7.
  3. Graphviz is a standalone program, not part of PyCharm. Download it from the
    official website and add it to the PATH environment variable; you may need to
    restart the computer after installing it.
  4. Install whichever libraries turn out to be missing.
  5. The dataset is my own; I typed it in by hand.

Loan application sample data table

ID  Age          Has a job  Owns a house  Credit rating  Category
1   youth        no         no            general        no
2   youth        no         no            good           no
3   youth        yes        no            good           yes
4   youth        yes        yes           general        yes
5   youth        no         no            general        no
6   middle aged  no         no            general        no
7   middle aged  no         no            good           no
8   middle aged  yes        yes           good           yes
9   middle aged  no         yes           very good      yes
10  middle aged  no         yes           very good      yes
11  elderly      no         yes           very good      yes
12  elderly      no         yes           good           yes
13  elderly      yes        no            good           yes
14  elderly      yes        no            very good      yes
15  elderly      no         no            general        no

Dataset

Feature        Encoding
Age            youth: 1, middle aged: 2, elderly: 3
Has a job      yes: 1, no: 0
Owns a house   yes: 1, no: 0
Credit rating  general: 1, good: 2, very good: 3
Category       yes: 1, no: 0
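
The encoding can also be applied in code rather than by hand, which avoids typing mistakes. The following is a minimal sketch; the dictionary and function names are hypothetical and are not part of the original post, and only three rows of the sample table are shown.

```python
# Hypothetical helper names; the mappings follow the encoding table above.
AGE = {"youth": 1, "middle aged": 2, "elderly": 3}
YES_NO = {"no": 0, "yes": 1}
CREDIT = {"general": 1, "good": 2, "very good": 3}


def encode_row(age, has_job, owns_house, credit, category):
    """Turn one row of the sample table into a list of integers."""
    return [AGE[age], YES_NO[has_job], YES_NO[owns_house],
            CREDIT[credit], YES_NO[category]]


# Rows 1, 9 and 15 of the sample table
rows = [
    ("youth", "no", "no", "general", "no"),
    ("middle aged", "no", "yes", "very good", "yes"),
    ("elderly", "no", "no", "general", "no"),
]
encoded = [encode_row(*r) for r in rows]
print(encoded)
```

A misspelled value raises a KeyError instead of silently producing a wrong number.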

# Columns: age, has a job, owns a house, credit rating, category
dataset = [
    [1,0,0,1,0],
    [1,0,0,2,0],
    [1,1,0,2,1],
    [1,1,1,1,1],
    [1,0,0,1,0],
    [2,0,0,1,0],
    [2,0,0,2,0],
    [2,1,1,2,1],
    [2,0,1,3,1],
    [2,0,1,3,1],
    [3,0,1,3,1],
    [3,0,1,2,1],
    [3,1,0,2,1],
    [3,1,0,3,1],
    [3,0,0,1,0]
]

X = [x[0:4] for x in dataset]  # extract the feature values
print(X)
Y = [y[-1] for y in dataset]  # extract the labels
print(Y)
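
Before handing the data to sklearn, it can help to see which feature separates the classes best. The sketch below computes the information gain of each feature, the criterion used by the ID3 algorithm in the book. It is an illustration only, not the author's code: the function names are hypothetical, the rows are encoded from the sample table above, and sklearn's DecisionTreeClassifier defaults to the Gini criterion with binary splits, so its internals differ.

```python
from collections import Counter
from math import log2

# Rows encoded from the sample table: age, job, house, credit, category
dataset = [
    [1, 0, 0, 1, 0], [1, 0, 0, 2, 0], [1, 1, 0, 2, 1], [1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0], [2, 0, 0, 1, 0], [2, 0, 0, 2, 0], [2, 1, 1, 2, 1],
    [2, 0, 1, 3, 1], [2, 0, 1, 3, 1], [3, 0, 1, 3, 1], [3, 0, 1, 2, 1],
    [3, 1, 0, 2, 1], [3, 1, 0, 3, 1], [3, 0, 0, 1, 0],
]


def entropy(labels):
    """Empirical entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, feature_index):
    """Entropy of the labels minus the entropy conditioned on one feature."""
    labels = [row[-1] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[feature_index], []).append(row[-1])
    conditional = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


names = ["age", "job", "house", "credit"]
gains = [information_gain(dataset, i) for i in range(4)]
for name, gain in zip(names, gains):
    print(f"{name}: {gain:.3f}")
best = names[gains.index(max(gains))]
print("largest information gain:", best)
```

Owning a house has the largest gain on this data, so an ID3-style tree would split on it first.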

The decision tree is fitted with sklearn's DecisionTreeClassifier and then visualized with Graphviz.

from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
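# Columns: age, has a job, owns a house, credit rating, category
dataset=[
    [1,0,0,1,0],
    [1,0,0,2,0],
    [1,1,0,2,1],
    [1,1,1,1,1],
    [1,0,0,1,0],
    [2,0,0,1,0],
    [2,0,0,2,0],
    [2,1,1,2,1],
    [2,0,1,3,1],
    [2,0,1,3,1],
    [3,0,1,3,1],
    [3,0,1,2,1],
    [3,1,0,2,1],
    [3,1,0,3,1],
    [3,0,0,1,0]
]
# Chinese labels shown in the rendered tree:
# age, has a job, owns a house, credit rating
feature =['年龄','有工作','有自己的房子','信贷情况']
# Class labels in ascending order of the class value: 0 = do not lend, 1 = lend
classname =['不借','借']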

X = [x[0:4] for x in dataset]
print(X)
Y = [y[-1] for y in dataset]
print(Y)
tree_clf = DecisionTreeClassifier(max_depth=4)
tree_clf.fit(X, Y)
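
Once the tree is fitted, it can be checked against the training data and used to classify new applicants. The following sketch assumes scikit-learn is installed; the rows are encoded from the sample table, and the two applicants and the random_state=0 argument are illustrative additions that are not in the original post.

```python
from sklearn.tree import DecisionTreeClassifier

# Rows encoded from the sample table: age, job, house, credit, category
dataset = [
    [1, 0, 0, 1, 0], [1, 0, 0, 2, 0], [1, 1, 0, 2, 1], [1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0], [2, 0, 0, 1, 0], [2, 0, 0, 2, 0], [2, 1, 1, 2, 1],
    [2, 0, 1, 3, 1], [2, 0, 1, 3, 1], [3, 0, 1, 3, 1], [3, 0, 1, 2, 1],
    [3, 1, 0, 2, 1], [3, 1, 0, 3, 1], [3, 0, 0, 1, 0],
]
X = [row[0:4] for row in dataset]
Y = [row[-1] for row in dataset]

tree_clf = DecisionTreeClassifier(max_depth=4, random_state=0)
tree_clf.fit(X, Y)

# Fraction of training rows the tree classifies correctly
train_accuracy = tree_clf.score(X, Y)

# Hypothetical applicants: [age, job, house, credit]
applicants = [
    [1, 0, 1, 2],  # youth, no job, owns a house, good credit
    [2, 0, 0, 3],  # middle aged, no job, no house, very good credit
]
predictions = tree_clf.predict(applicants).tolist()
print(train_accuracy, predictions)
```

On this data the first applicant is classified as 1 (lend) and the second as 0 (do not lend), because every training row with a house or a job was approved and every row with neither was rejected.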

The code above fits the tree but does not visualize it. To visualize the tree, add the following code:

export_graphviz(
    tree_clf,
    out_file="loan.dot",
    feature_names=feature,
    class_names=classname,
    rounded=True,
    filled=True,
)
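
If Graphviz is not installed yet, the fitted tree can still be inspected as plain text. This is a sketch using sklearn.tree.export_text, which is not used in the original post; the English feature names and random_state=0 are assumptions made for the illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Rows encoded from the sample table: age, job, house, credit, category
dataset = [
    [1, 0, 0, 1, 0], [1, 0, 0, 2, 0], [1, 1, 0, 2, 1], [1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0], [2, 0, 0, 1, 0], [2, 0, 0, 2, 0], [2, 1, 1, 2, 1],
    [2, 0, 1, 3, 1], [2, 0, 1, 3, 1], [3, 0, 1, 3, 1], [3, 0, 1, 2, 1],
    [3, 1, 0, 2, 1], [3, 1, 0, 3, 1], [3, 0, 0, 1, 0],
]
X = [row[0:4] for row in dataset]
Y = [row[-1] for row in dataset]

tree_clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, Y)

# Print the learned rules as indented text, one line per node
text = export_text(tree_clf, feature_names=["age", "job", "house", "credit"])
print(text)
```

The text output needs no fonts, so it also sidesteps the garbled-character problem that affects the rendered image.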

Running the code generates a loan.dot file in the current directory.
Then, from that directory, run the following command in PyCharm's local terminal:

dot -Tpng loan.dot -o loan.png
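
The same command can be run from Python, so that the whole pipeline lives in one script. This is a minimal sketch with hypothetical function names; it assumes the dot executable is on the PATH and skips rendering when it is not found or when the .dot file does not exist.

```python
import os
import shutil
import subprocess


def build_dot_command(dot_path, png_path, dot_exe="dot"):
    """Return the argument list for: dot -Tpng <dot_path> -o <png_path>."""
    return [dot_exe, "-Tpng", dot_path, "-o", png_path]


def render_dot(dot_path="loan.dot", png_path="loan.png"):
    """Render a .dot file to PNG; return False if nothing could be rendered."""
    dot_exe = shutil.which("dot")  # None when Graphviz is not on the PATH
    if dot_exe is None or not os.path.exists(dot_path):
        return False
    subprocess.run(build_dot_command(dot_path, png_path, dot_exe), check=True)
    return True


print(build_dot_command("loan.dot", "loan.png"))
print("rendered:", render_dot())
```

If render_dot returns False even though loan.dot exists, the usual cause is that Graphviz has not been added to the PATH environment variable.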

This generates the image loan.png.
My directory then looks as follows: [screenshot of the project directory]

However, you will find that the Chinese text in the image is garbled.
To fix this, add the following code:

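import re
# Read loan.dot and swap the default font for one that supports Chinese.
# "Microsoft YaHei" ships with Windows; substitute any installed CJK font.
with open("./loan.dot", "r", encoding="utf-8") as f:
    dot_text = f.read()
# The optional quotes match both fontname=helvetica and fontname="helvetica".
dot_text = re.sub(r'fontname="?helvetica"?', 'fontname="Microsoft YaHei"', dot_text)
with open("./Tree_utf8.dot", "w", encoding="utf-8") as f:
    f.write(dot_text)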
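
The font substitution can be tried on short sample strings before it is applied to the real file. The sketch below uses a hypothetical helper name and hand-written sample lines; it assumes that, depending on the scikit-learn version, the font name in the generated DOT file may appear with or without quotes, so the pattern accepts both.

```python
import re


def use_chinese_font(dot_text, font="Microsoft YaHei"):
    """Replace the default helvetica font with one that has Chinese glyphs."""
    return re.sub(r'fontname="?helvetica"?', f'fontname="{font}"', dot_text)


# Hand-written sample lines in the two styles, not real exporter output
unquoted = 'node [shape=box, fontname=helvetica] ;'
quoted = 'node [shape=box, fontname="helvetica"] ;'

fixed_unquoted = use_chinese_font(unquoted)
fixed_quoted = use_chinese_font(quoted)
print(fixed_unquoted)
print(fixed_quoted)
```

If the rendered image is still garbled, check that the substitution actually changed the file and that the chosen font is installed on the machine.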

Render the rewritten file (dot -Tpng Tree_utf8.dot -o loan.png) and take a look at the result.

The complete code is as follows:

from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
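# Columns: age, has a job, owns a house, credit rating, category
dataset=[
    [1,0,0,1,0],
    [1,0,0,2,0],
    [1,1,0,2,1],
    [1,1,1,1,1],
    [1,0,0,1,0],
    [2,0,0,1,0],
    [2,0,0,2,0],
    [2,1,1,2,1],
    [2,0,1,3,1],
    [2,0,1,3,1],
    [3,0,1,3,1],
    [3,0,1,2,1],
    [3,1,0,2,1],
    [3,1,0,3,1],
    [3,0,0,1,0]
]
# Chinese labels shown in the rendered tree:
# age, has a job, owns a house, credit rating
feature =['年龄','有工作','有自己的房子','信贷情况']
# Class labels in ascending order of the class value: 0 = do not lend, 1 = lend
classname =['不借','借']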

X = [x[0:4] for x in dataset]
print(X)
Y = [y[-1] for y in dataset]
print(Y)
tree_clf = DecisionTreeClassifier(max_depth=4)
tree_clf.fit(X, Y)

export_graphviz(
    tree_clf,
    out_file="loan.dot",
    feature_names=feature,
    class_names=classname,
    rounded=True,
    filled=True,
)

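import re
# Read loan.dot and swap the default font for one that supports Chinese.
# "Microsoft YaHei" ships with Windows; substitute any installed CJK font.
with open("./loan.dot", "r", encoding="utf-8") as f:
    dot_text = f.read()
# The optional quotes match both fontname=helvetica and fontname="helvetica".
dot_text = re.sub(r'fontname="?helvetica"?', 'fontname="Microsoft YaHei"', dot_text)
with open("./Tree_utf8.dot", "w", encoding="utf-8") as f:
    f.write(dot_text)

# To generate the image, run in the terminal:
#   dot -Tpng Tree_utf8.dot -o loan.png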

Origin www.cnblogs.com/realwuxiong/p/11962006.html