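100 Days of Machine Learning - Day 7/11 - K-Nearest Neighbors (KNN)
Neighborhood-based collaborative filtering is one well-known application of the nearest-neighbor method.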
1. Data preprocessing
Read the data, split it into training and test sets, and standardize the features.
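import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the dataset (raw string so the backslashes in the Windows path stay literal)
dataset = pd.read_csv(r'D:\100Days\datasets\Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values  # feature columns 2 and 3
Y = dataset.iloc[:, 4].values       # label column 4

# Hold out 25% of the samples as the test set
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25, random_state=0)

# Fit the scaler on the training set only, then reuse its statistics on the test set
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)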
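To see why the scaler should be fitted on the training set only, here is a minimal from-scratch sketch of standardization using just the standard library. The helper names `fit_scaler` and `apply_scaler` and the toy numbers are made up for illustration; they are not part of scikit-learn. Like `StandardScaler`, the sketch uses the population standard deviation, and it assumes no column is constant (otherwise the division would fail).

```python
from statistics import fmean, pstdev

def fit_scaler(rows):
    """Return per-column (mean, std) pairs computed from the given rows."""
    columns = list(zip(*rows))
    return [(fmean(col), pstdev(col)) for col in columns]

def apply_scaler(rows, params):
    """Standardize rows using previously fitted (mean, std) pairs."""
    return [
        [(value - mean) / std for value, (mean, std) in zip(row, params)]
        for row in rows
    ]

# Hypothetical toy data with two numeric features per sample
train = [[20, 20000], [30, 40000], [40, 60000]]
test = [[30, 50000]]

# The statistics come from the training rows only ...
params = fit_scaler(train)
train_scaled = apply_scaler(train, params)
# ... and the same statistics are reused for the test rows
test_scaled = apply_scaler(test, params)
```

Refitting on the test set would give the test rows their own mean and standard deviation, so the two sets would no longer live on the same scale.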
2. Fitting KNN to the training set
Use KNeighborsClassifier.
For details, see https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html
from sklearn.neighbors import KNeighborsClassifier

# 5 nearest neighbors; the Minkowski metric with p=2 is the Euclidean distance
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train, Y_train)
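What the classifier does at prediction time can be sketched in a few lines of plain Python: measure the Minkowski distance from the query to every training point, keep the k closest, and take a majority vote. This is a brute-force illustration, not scikit-learn's implementation; the function names and toy points are hypothetical, and the tie-breaking here (first label encountered) is an arbitrary choice of this sketch.

```python
from collections import Counter

def minkowski(a, b, p=2):
    """Minkowski distance between two points; p=2 gives the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def knn_predict(train_points, train_labels, query, k=5, p=2):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Sort training indices by their distance to the query point
    order = sorted(
        range(len(train_points)),
        key=lambda i: minkowski(train_points[i], query, p),
    )
    # Count the labels of the k closest points and return the most common one
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters
points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9], [0.9, 1.2]]
labels = [0, 0, 0, 1, 1, 1]

pred_a = knn_predict(points, labels, [0.1, 0.1], k=3)
pred_b = knn_predict(points, labels, [1.0, 1.1], k=3)
```

Because every feature contributes to the distance, features on very different scales would let the larger one dominate, which is why the inputs are standardized first.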
3. Prediction
from sklearn.metrics import confusion_matrix

# Predict labels for the test set
y_pred = classifier.predict(X_test)
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(Y_test, y_pred)
The resulting confusion matrix is:
[[64  4]
 [ 3 29]]
The accuracy score is 0.93: 64 + 29 = 93 of the 100 test samples are classified correctly.
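The score can be checked by hand from the confusion matrix. The sketch below assumes the reported score is plain accuracy and that rows are true classes and columns are predicted classes (scikit-learn's convention for `confusion_matrix`); the variable names are illustrative only.

```python
# Confusion matrix reported above: rows = true class, columns = predicted class
cm = [[64, 4],
      [3, 29]]

# Correct predictions lie on the diagonal
correct = sum(cm[i][i] for i in range(len(cm)))
total = sum(sum(row) for row in cm)
accuracy = correct / total

# Precision and recall for class 1
true_pos = cm[1][1]
false_pos = cm[0][1]  # true class 0, predicted 1
false_neg = cm[1][0]  # true class 1, predicted 0
precision = true_pos / (true_pos + false_pos)
recall = true_pos / (true_pos + false_neg)

print(accuracy)  # 0.93
```

Precision and recall are worth a glance alongside accuracy, since the 7 errors are split unevenly between the two classes.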