Using Decision Trees and Visualizing Them

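from sklearn import tree
# Assumption: the post never shows how `housing` is loaded; the California housing dataset is used here.
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
dtr = tree.DecisionTreeRegressor(max_depth=2)  # instantiate a decision tree regressor
dtr.fit(housing.data[:, [6, 7]], housing.target)  # train on feature columns 6 and 7
# Convert the fitted tree to DOT format
dot_data = tree.export_graphviz(
    dtr,
    out_file=None,
    feature_names=housing.feature_names[6:8],
    filled=True,
    impurity=False,
    rounded=True,
)
import pydotplus  # package for rendering DOT data
graph = pydotplus.graph_from_dot_data(dot_data)  # build a graph from the DOT data
graph.get_nodes()[7].set_fillcolor("#FFF2DD")  # set the fill color of one node
from IPython.display import Image
Image(graph.create_png())  # display the graph as a PNG image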

sklearn.tree.export_graphviz(decision_tree, out_file=None, max_depth=None,
feature_names=None,
class_names=None,
label='all',
filled=False,
leaves_parallel=False,
impurity=True,
node_ids=False,
proportion=False,
rotate=False,
rounded=False,
special_characters=False,
precision=3)

Purpose: Export a decision tree in DOT format.
Parameters:
decision_tree : decision tree regressor or classifier
The decision tree to be exported to GraphViz.
out_file : file object or string, optional (default=None)
Handle or name of the output file. If None, the result is returned as a string.
Changed in version 0.20: Default of out_file changed from “tree.dot” to None.
max_depth : int, optional (default=None)
The maximum depth of the representation. If None, the tree is fully generated.
feature_names : list of strings, optional (default=None)
Names of each of the features.
class_names : list of strings, bool or None, optional (default=None)
Names of each of the target classes in ascending numerical order. Only relevant for classification and not supported for multi-output. If True, shows a symbolic representation of the class name.
label : {'all', 'root', 'none'}, optional (default='all')
Whether to show informative labels for impurity, etc. Options include 'all' to show at every node, 'root' to show only at the top root node, or 'none' to not show at any node.
filled : bool, optional (default=False)
When set to True, paint nodes to indicate majority class for classification, extremity of values for regression, or purity of node for multi-output.
leaves_parallel : bool, optional (default=False)
When set to True, draw all leaf nodes at the bottom of the tree.
impurity : bool, optional (default=True)
When set to True, show the impurity at each node.
node_ids : bool, optional (default=False)
When set to True, show the ID number on each node.
proportion : bool, optional (default=False)
When set to True, change the display of ‘values’ and/or ‘samples’ to be proportions and percentages respectively.
rotate : bool, optional (default=False)
When set to True, orient tree left to right rather than top-down.
rounded : bool, optional (default=False)
When set to True, draw node boxes with rounded corners and use Helvetica fonts instead of Times-Roman.
special_characters : bool, optional (default=False)
When set to False, ignore special characters for PostScript compatibility.
precision : int, optional (default=3)
Number of digits of precision for floating point in the values of impurity, threshold and value attributes of each node.

Returns:
dot_data : string
String representation of the input tree in GraphViz dot format. Only returned if out_file is None.
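
To see the out_file=None behaviour described above without installing Graphviz, here is a minimal sketch. It assumes scikit-learn and NumPy are installed; the synthetic data and the feature names feat_a and feat_b are made up for illustration and are not from the original post.

```python
import numpy as np
from sklearn import tree

# Synthetic stand-in data (an assumption; the post itself uses the housing dataset).
rng = np.random.RandomState(0)
X = rng.rand(200, 2)
y = 3.0 * X[:, 0] + X[:, 1]

model = tree.DecisionTreeRegressor(max_depth=2, random_state=0)
model.fit(X, y)

# With out_file=None the DOT source is returned as a Python string
# instead of being written to a file.
dot_text = tree.export_graphviz(
    model,
    out_file=None,
    feature_names=["feat_a", "feat_b"],  # hypothetical names for illustration
    filled=True,
    impurity=False,
    rounded=True,
)

print(type(dot_text).__name__)
print(dot_text.splitlines()[0])  # first line of the DOT source
```

Printing the string this way is a quick check that the export step works before bringing pydotplus and the Graphviz executables into the picture.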

graph.get_nodes()

help(graph.get_nodes)
out:Help on method get_nodes in module pydotplus.graphviz:
get_nodes() method of pydotplus.graphviz.Dot instance
    Get the list of Node instances.

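Displaying the image requires two extra components: Graphviz, the graph visualization toolkit, and
pydotplus, a Python package for working with DOT data. Both must be installed separately; Anaconda does not ship them by default.
Installing the two took me about two hours:
First attempt:
Downloaded the MSI installer from http://www.graphviz.org/Download..php and installed it directly.
Then installed pydotplus with pip install pydotplus, which produced a
"GraphViz's executables not found" error.
Second attempt:
Removed both packages and reinstalled them through Anaconda. Still failed.
Third attempt:
Added the Graphviz bin directory to the PATH environment variable. Still failed.
Fourth attempt:
Swapped the installation order of Graphviz and pydotplus. Still failed.
Fifth attempt:
Added the Graphviz directory to PATH from within the code: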

import os     
os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin/'

Success!
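
A slightly more defensive version of the same idea is sketched below: it first checks whether the dot executable is already reachable, and only appends the directory to PATH if that directory exists. The helper name ensure_on_path is hypothetical, and the install path is the one from this post; it will differ on other machines and Graphviz versions.

```python
import os
import shutil


def ensure_on_path(bin_dir, exe="dot"):
    """Return the full path to `exe`, appending `bin_dir` to PATH if needed.

    Hypothetical helper for illustration; it only changes PATH for the
    current Python process, not system-wide.
    """
    found = shutil.which(exe)
    if found:
        return found  # already reachable, nothing to change
    if os.path.isdir(bin_dir):
        os.environ["PATH"] = os.environ.get("PATH", "") + os.pathsep + bin_dir
    return shutil.which(exe)  # None if the executable is still not found


# Path taken from the post; adjust it to the local Graphviz install.
dot_path = ensure_on_path("C:/Program Files (x86)/Graphviz2.38/bin/")
print(dot_path or "dot not found; check the Graphviz install directory")
```

Since pydotplus looks for the Graphviz executables when it renders, running a check like this before graph.create_png() should make a missing or wrong install directory easier to spot.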


Reposted from blog.csdn.net/Du_Shuang/article/details/84308648