100 Days of Machine Learning, Day 7/11: K-Nearest Neighbors (KNN)

The nearest-neighbor method is the same technique that neighborhood-based collaborative filtering relies on.


1. Data preprocessing

Read the data, split it into training and test sets, and standardize the features.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

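# Raw string so the backslashes in the Windows path are taken literally
dataset = pd.read_csv(r'D:\100Days\datasets\Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values  # feature columns at positions 2 and 3
Y = dataset.iloc[:, 4].values       # target column at position 4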

from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=0.25,random_state=0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit the scaler on the training set only
X_test = sc.transform(X_test)        # reuse the training mean and std on the test set
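
To illustrate what StandardScaler computes, here is a minimal sketch on made-up numbers (the arrays `X_train_demo` and `X_test_demo` are hypothetical and are not taken from Social_Network_Ads.csv). It fits the scaler on the training rows, reuses those statistics for the test rows so both sets end up on the same scale, and checks the result against the formula z = (x - mean) / std computed by hand.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data with two features on very different scales.
X_train_demo = np.array([[20.0, 20000.0],
                         [30.0, 40000.0],
                         [40.0, 60000.0]])
X_test_demo = np.array([[30.0, 40000.0],
                        [50.0, 80000.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(X_train_demo)  # learns mean and std from the training rows
test_scaled = scaler.transform(X_test_demo)        # applies the training statistics unchanged

# Same computation by hand: z = (x - mean) / std, using the population std (ddof=0).
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)
manual_test_scaled = (X_test_demo - mean) / std

print(np.allclose(test_scaled, manual_test_scaled))
```

KNN is distance-based, so without this step the feature with the larger numeric range would dominate the distance.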

2. Fitting KNN to the training set

Use KNeighborsClassifier.

For details, see https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

from sklearn.neighbors import KNeighborsClassifier
# 5 neighbors; the Minkowski metric with p=2 is the Euclidean distance
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train,Y_train)
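
As a rough picture of what the classifier does at prediction time, the sketch below implements the basic idea by hand: compute the Minkowski distance from a query point to every training point, take the k closest, and return the majority label. The function `knn_predict` and the toy arrays are hypothetical names introduced for illustration. This is not how scikit-learn implements it internally, and its tie-breaking may differ.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5, p=2):
    """Predict one label by majority vote among the k nearest training points."""
    # Minkowski distance of order p; p=2 gives the Euclidean distance.
    distances = np.sum(np.abs(X_train - x_query) ** p, axis=1) ** (1.0 / p)
    nearest = np.argsort(distances)[:k]          # indices of the k closest points
    votes = Counter(y_train[nearest].tolist())   # count labels among the neighbors
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
X_demo = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_demo, y_demo, np.array([0.15, 0.15]), k=3))
print(knn_predict(X_demo, y_demo, np.array([1.0, 0.95]), k=3))
```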

3. Prediction

y_pred = classifier.predict(X_test)

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(Y_test,y_pred)

The resulting confusion matrix looks like this:

[[64  4]
 [ 3 29]]

The accuracy is 0.93: 64 + 29 = 93 of the 100 test samples are classified correctly.
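
The 0.93 figure can be recovered from the confusion matrix itself. The sketch below hard-codes the matrix reported above (the name `cm_demo` is introduced here for illustration) and assumes scikit-learn's convention that rows are true labels and columns are predicted labels. Precision and recall for the positive class are derived the same way.

```python
import numpy as np

# Confusion matrix as reported above: rows = true labels, columns = predictions.
cm_demo = np.array([[64, 4],
                    [3, 29]])

tn, fp, fn, tp = cm_demo.ravel()        # binary case: TN, FP, FN, TP

accuracy = (tn + tp) / cm_demo.sum()    # correct predictions / all predictions
precision = tp / (tp + fp)              # of predicted positives, how many are right
recall = tp / (tp + fn)                 # of actual positives, how many are found

print(accuracy)   # 0.93
print(round(precision, 3), round(recall, 3))
```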

Reposted from www.cnblogs.com/1113127139aaa/p/10276334.html