scikit-learn (imported as sklearn) is an open-source machine learning package for Python. It builds on numerical libraries such as NumPy, SciPy, and Matplotlib for efficient computation, and covers almost all of the major machine learning algorithms.
http://scikit-learn.org/stable/index.html
Install the necessary packages:
pip install numpy pandas matplotlib scikit-learn graphviz scipy jupyter
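Before going further, it can help to confirm that everything installed correctly. A minimal sanity check (note that the import names differ from the pip names: scikit-learn imports as sklearn):

```python
# Minimal sanity check that the installed packages can be imported.
import numpy
import scipy
import pandas
import matplotlib
import sklearn

# Print versions to confirm which releases are installed
print(numpy.__version__)
print(sklearn.__version__)
```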
This example is meant for Jupyter; copy it directly into a Jupyter notebook and run it.
# -*- coding: utf-8 -*-
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine = load_wine()
print(wine.data.shape)
print(wine.target)

# If wine were a single table, it would look like this:
import pandas as pd
pd.concat([pd.DataFrame(wine.data), pd.DataFrame(wine.target)], axis=1)

print(wine.feature_names)
print(wine.target_names)

Xtrain, Xtest, Ytrain, Ytest = train_test_split(wine.data, wine.target, test_size=0.3)
print(Xtrain.shape)
print(Xtest.shape)

clf = tree.DecisionTreeClassifier(criterion="entropy")
clf = clf.fit(Xtrain, Ytrain)
score = clf.score(Xtest, Ytest)  # returns the prediction accuracy
print(score)

feature_name = ['alcohol', 'malic acid', 'ash', 'alcalinity of ash', 'magnesium',
                'total phenols', 'flavanoids', 'nonflavanoid phenols', 'proanthocyanins',
                'color intensity', 'hue', 'od280/od315 of diluted wines', 'proline']

import graphviz
dot_data = tree.export_graphviz(clf,
                                feature_names=feature_name,
                                class_names=["Gin", "Sherry", "Vermouth"],
                                filled=True,
                                rounded=True)
graph = graphviz.Source(dot_data)
graph
Output:
(178, 13)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']
['class_0' 'class_1' 'class_2']
(124, 13)
(54, 13)
0.9629629629629629
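Note that the accuracy above will vary a little from run to run, because train_test_split shuffles the data randomly. A sketch of making the result reproducible with random_state (420 here is an arbitrary seed, any integer works) and of inspecting which features the tree actually used:

```python
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine = load_wine()
# random_state fixes the shuffle, so the split and the score
# are identical every time the cell is re-run
Xtrain, Xtest, Ytrain, Ytest = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=420)

clf = tree.DecisionTreeClassifier(criterion="entropy", random_state=420)
clf = clf.fit(Xtrain, Ytrain)

# feature_importances_ shows how much each feature contributed
# to the splits; features with 0.000 were never used by the tree
for name, importance in zip(wine.feature_names, clf.feature_importances_):
    print(f"{name:30s} {importance:.3f}")
print("accuracy:", clf.score(Xtest, Ytest))
```

The importances sum to 1, so they can be read as each feature's share of the tree's total information gain.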
Readers without Jupyter can look here: https://www.cnblogs.com/v5captain/p/6688494.html
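Alternatively, if graphviz or Jupyter is unavailable, recent scikit-learn versions (0.21+) include tree.plot_tree, which draws the same diagram using matplotlib alone. A sketch (the figure size and the output filename are arbitrary choices):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, works without a display
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine = load_wine()
Xtrain, Xtest, Ytrain, Ytest = train_test_split(
    wine.data, wine.target, test_size=0.3)
clf = tree.DecisionTreeClassifier(criterion="entropy").fit(Xtrain, Ytrain)

# Draw the fitted tree and save it to an image file instead of
# rendering it inline in a notebook
fig, ax = plt.subplots(figsize=(16, 10))
tree.plot_tree(clf, feature_names=wine.feature_names,
               class_names=wine.target_names,
               filled=True, rounded=True, ax=ax)
fig.savefig("wine_tree.png")
```

This avoids the separate Graphviz system install entirely, at the cost of a somewhat plainer-looking diagram.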
Machine learning just can't do without it!