This example uses logistic regression to classify the iris dataset.
The iris dataset has 4 features and 3 classes. It is described as follows:
Data Set Characteristics:
    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
            - Iris-Setosa
            - Iris-Versicolour
            - Iris-Virginica

Test code:
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model, model_selection

# Load the dataset
iris = datasets.load_iris()

# Multi-class strategy: one-vs-rest
# (note: newer scikit-learn releases deprecate the multi_class argument)
lg = linear_model.LogisticRegression(multi_class='ovr')

# Predict every sample with 5-fold cross-validation
predicted = model_selection.cross_val_predict(lg, iris.data, iris.target, cv=5)

# Compute the classification accuracy (share of correct predictions)
sums = 0
for i in range(len(predicted)):
    if predicted[i] == iris.target[i]:
        sums += 1
print(sums * 100.0 / len(predicted), "%")

# Plot the predicted class against the true class
fig, ax = plt.subplots()
ax.plot(range(len(predicted)), predicted, 'gx', label='Predicted Class')
ax.plot(range(len(iris.target)), iris.target, 'r--', label='True Class')
ax.legend()
plt.show()
# http://blog.csdn.net/shenpibaipao
In this test, the classification accuracy reaches 96.0%. The exact figure may vary with the scikit-learn version and solver defaults.
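The accuracy check can also be written with scikit-learn's own helpers instead of a manual loop. The following is only a sketch, not the code used above: it assumes a recent scikit-learn version, swaps the `multi_class='ovr'` argument for the `OneVsRestClassifier` wrapper, and sets `max_iter=1000` purely as an assumption to avoid convergence warnings. The accuracy it prints may differ slightly from the figure quoted above.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.multiclass import OneVsRestClassifier

iris = datasets.load_iris()

# One-vs-rest: one binary logistic regression per class.
# max_iter=1000 is an assumed value, not taken from the original test.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))

# Out-of-fold prediction for every sample, using 5 folds
predicted = cross_val_predict(clf, iris.data, iris.target, cv=5)

# Overall accuracy: share of samples whose prediction matches the label
accuracy = accuracy_score(iris.target, predicted)
print("Overall accuracy: %.1f %%" % (accuracy * 100.0))

# Accuracy of each of the 5 folds, to see how stable the result is
fold_scores = cross_val_score(clf, iris.data, iris.target, cv=5)
print("Per-fold accuracy:", fold_scores)
```

Looking at the per-fold scores next to the overall number shows whether the result depends heavily on how the 150 samples happen to be split.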