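This example was run in Jupyter.
Straight to the code: an entry-level handwritten digit recognition example using k-nearest neighbors: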
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.neighbors import KNeighborsClassifier
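# Only 1500 images are used, which is a small dataset; the image data can be downloaded from the link at the end of this post
data = []
for i in range(10):
    for j in range(1, 151):
        # Each file is located at './data/<digit>/<digit>_<index>.bmp'
        data.append(plt.imread('./data/%d/%d_%d.bmp' % (i, i, j)))
# Stack the list of images into one array so it can be indexed with index arrays later
x = np.array(data)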
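# Labels must follow the loading order: 150 zeros, then 150 ones, and so on up to 9
y = [0,1,2,3,4,5,6,7,8,9]*150
y = np.array(y)
# ndarray.sort() sorts in place and returns None, so its result must not be assigned back to y
y.sort()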
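# From the 1500 images, randomly pick 1000 as training samples and 500 as test samples.
# np.random.randint samples with replacement, so indices can repeat and the two sets can overlap;
# any overlap puts training images into the test set and makes the measured accuracy somewhat optimistic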
index = np.random.randint(0,1500,size = 1000)
x_train = x[index]
y_train = y[index]
index = np.random.randint(0,1500,size = 500)
x_test = x[index]
y_test = y[index]
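If overlap between the training and test sets is a concern, one alternative is to shuffle the indices once and slice them, so that every image lands in exactly one set. The following is a minimal sketch rather than the method used above; the helper name `split_indices` and the `seed` parameter are hypothetical and introduced only for illustration.

```python
import numpy as np

def split_indices(n_samples, n_train, n_test, seed=None):
    """Return disjoint train/test index arrays drawn without replacement.

    Hypothetical helper, not part of the original notebook.
    """
    if n_train + n_test > n_samples:
        raise ValueError("n_train + n_test must not exceed n_samples")
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)  # every index appears exactly once
    return order[:n_train], order[n_train:n_train + n_test]

# 1500 images in total: 1000 for training, the remaining 500 for testing
train_idx, test_idx = split_indices(1500, 1000, 500, seed=0)
```

The resulting arrays can be used in the same way as `index` above, for example `x_train = x[train_idx]` and `x_test = x[test_idx]`.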
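knn = KNeighborsClassifier(n_neighbors=5)
# Flatten each image into a 784-element vector (784 = 28*28, which assumes 28x28 single-channel images)
knn.fit(x_train.reshape(1000,784),y_train)
# Predict the labels of the test images
y_ = knn.predict(x_test.reshape(500,784))
# Compare the first 20 predictions with the first 20 true labels
y_[:20]
y_test[:20]
# Accuracy: the fraction of test images whose prediction matches the true label
(y_test == y_).mean()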
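The overall accuracy is a single number, so it can hide digits that are recognized much worse than others. As a rough sketch, the same comparison can be broken down per digit. The function name `per_digit_accuracy` and the toy label arrays below are assumptions made for illustration; in the notebook the inputs would be `y_test` and `y_`.

```python
import numpy as np

def per_digit_accuracy(y_true, y_pred, n_classes=10):
    """Fraction of correct predictions for each true label.

    Hypothetical helper; labels absent from y_true yield NaN.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(n_classes, np.nan)
    for digit in range(n_classes):
        mask = y_true == digit  # samples whose true label is this digit
        if mask.any():
            acc[digit] = (y_pred[mask] == digit).mean()
    return acc

# Toy labels standing in for y_test and y_ (not real results)
demo_true = np.array([0, 0, 1, 1, 1, 2])
demo_pred = np.array([0, 1, 1, 1, 0, 2])
demo_acc = per_digit_accuracy(demo_true, demo_pred, n_classes=3)
```

With the notebook's variables, the call would be `per_digit_accuracy(y_test, y_)`, which returns one accuracy value for each digit from 0 to 9.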
Baidu Netdisk link for the data files:
https://pan.baidu.com/s/13f9vuxWN5WqyoGZkMf1qNQ