This walkthrough uses the k-nearest neighbors (KNN) algorithm to predict the type of wine. The data is the wine dataset bundled with scikit-learn (sklearn).
Import the dataset
from sklearn.datasets import load_wine
wine_data = load_wine()
wine_data.keys()
Datasets loaded from sklearn.datasets, such as this one, contain the following fields: data (the feature data), target (the target variable), target_names (the names of the target classes), DESCR (a description of the dataset), and feature_names (the names of the features).
Explore the data
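As a small illustration of the fields listed above, the following sketch reads the class names and feature names from the loaded object. The variable names n_classes and n_features are introduced here only for the example.

```python
from sklearn.datasets import load_wine

wine_data = load_wine()

# Names of the target classes (one entry per wine type)
print(wine_data['target_names'])

# Names of the feature columns, in the same order as the columns of 'data'
print(wine_data['feature_names'])

# Hypothetical helper variables, used only for this illustration
n_classes = len(wine_data['target_names'])
n_features = len(wine_data['feature_names'])
print(n_classes, n_features)
```

The values in target are integer codes that index into target_names, so wine_data['target_names'][wine_data['target'][0]] gives the class name of the first sample.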
wine_data['data'].shape
The output shows that the dataset has 178 samples and 13 feature variables.
print(wine_data['DESCR'])
Divide the dataset
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(wine_data['data'],wine_data['target'],random_state=0)
The first two arguments to train_test_split(), wine_data['data'] and wine_data['target'], are the feature variables and the target variable.
You can also assign X = wine_data['data'] and y = wine_data['target'] first, then call train_test_split(X, y, random_state=0).
random_state is the random seed. Any integer works, and reusing the same value reproduces the same split.
Checking the shape attribute of each array shows that the training set holds about 75% of the samples and the test set about 25%, which is the default split of train_test_split().
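The split sizes and the effect of the seed can be checked directly. This is a minimal sketch that repeats the split from above; the names X_train_again and X_test_again are introduced only to compare two runs with the same random_state.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine_data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine_data['data'], wine_data['target'], random_state=0)

# shape is an attribute of the arrays: (number of samples, number of features)
print(X_train.shape)  # (133, 13)
print(X_test.shape)   # (45, 13)

# Share of all samples that ended up in the test set
print(X_test.shape[0] / wine_data['data'].shape[0])

# Splitting again with the same seed gives exactly the same rows
X_train_again, X_test_again, _, _ = train_test_split(
    wine_data['data'], wine_data['target'], random_state=0)
print(np.array_equal(X_train, X_train_again))  # True
```

Because 178 is not divisible by 4, the proportions are only approximately 75% and 25%: 133 samples go to training and 45 to testing.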
Modeling and scoring
# KNN classification model
from sklearn.neighbors import KNeighborsClassifier
model_knn = KNeighborsClassifier()
model_knn.fit(X_train,y_train)
print('{:.2f}'.format(model_knn.score(X_test,y_test)))
The model's accuracy on the test set is about 73%, meaning it correctly predicts the class of roughly 73% of wines it has not seen before.
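To make the meaning of this score concrete, the sketch below recomputes it by hand: score() for a classifier returns the mean accuracy, which is the fraction of test samples whose predicted class matches the true class. The names y_pred, manual_accuracy, and builtin_accuracy are introduced only for this illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

wine_data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine_data['data'], wine_data['target'], random_state=0)

# Same model as above, with default settings (n_neighbors=5)
model_knn = KNeighborsClassifier()
model_knn.fit(X_train, y_train)

# Predict a class for every test sample
y_pred = model_knn.predict(X_test)

# Accuracy by hand: fraction of predictions that equal the true label
manual_accuracy = np.mean(y_pred == y_test)

# Accuracy as reported by score()
builtin_accuracy = model_knn.score(X_test, y_test)

print('{:.2f} {:.2f}'.format(manual_accuracy, builtin_accuracy))
```

Both numbers are the same, so the score can be read as the number of correctly classified test wines divided by the size of the test set.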