Machine Learning - Handwritten Digit Recognition

0. Preface:

  • This article walks through the whole process, from raw data to a trained model
  • It does not cover basics such as installing third-party libraries; these are not difficult, and instructions are easy to find online
  • The experiment was run in Jupyter; PyCharm works as well

1. Data:

  • A total of 5000 images of handwritten digits, each read in as an array
  • There are 10 sets, one for each digit from 0 to 9, and each set contains 500 images of that digit
  • Data file:
    Link: https://pan.baidu.com/s/1gTi-0xjDjbVUK_p_AzkZrw
    Extraction code: 1234
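
Before going further, it can help to confirm that the extracted files match the description above. The following is a minimal sketch, assuming the layout implied by the paths used later in this article (a root folder `./手写数字识别` with one subfolder per digit and files named like `0_1.bmp`); the helper name `count_images` is hypothetical.

```python
from pathlib import Path


def count_images(root):
    """Return {digit: number of .bmp files} for the digit folders 0-9 under root."""
    root = Path(root)
    counts = {}
    for digit in range(10):
        folder = root / str(digit)
        # A missing folder counts as zero images instead of raising an error.
        if folder.is_dir():
            counts[digit] = len(list(folder.glob(f"{digit}_*.bmp")))
        else:
            counts[digit] = 0
    return counts


if __name__ == "__main__":
    # Assumed location of the extracted data, next to the code.
    counts = count_images("./手写数字识别")
    print(counts)
    print("total:", sum(counts.values()))
```

If the data is complete, every digit should report 500 files and the total should be 5000.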

2. Data preprocessing:

  • After downloading the data, decompress it into a directory at the same level as the code
  • The focus of this part is converting the image data into a two-dimensional array that can be fed to the model
  • Functions used:
    • plt.imshow() is a matplotlib function for displaying images. It accepts a two-dimensional or three-dimensional array as the image data and maps the array's values to a color space for display.
    • A grayscale image is usually represented as a two-dimensional array in which each element is one pixel, typically an integer between 0 and 255 giving the pixel's gray level. To show such an array on screen, each pixel value is mapped to a color.
    • The cmap parameter of plt.imshow() selects the colormap used for this mapping. Common choices include 'gray' (grayscale), 'hot' (black through red and yellow to white), 'cool' (cyan to magenta), 'jet' (blue to red), and 'hsv' (cycles through the hues of the HSV color space).
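  • The code
# Try reading a single image from the files
import matplotlib.pyplot as plt

img = plt.imread('./手写数字识别/0/0_1.bmp')
display(img.shape)  # img is a two-dimensional array; display() is available in Jupyter, use print() elsewhere
plt.imshow(img, cmap='gray')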
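
To make the colormap idea concrete, here is a small sketch that prints the RGB color each colormap assigns to three gray levels. The helper `colors_for` is hypothetical, and the fixed 0 to 255 scaling is an assumption made for illustration; by default plt.imshow() scales to the minimum and maximum of the array it is given.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize


def colors_for(pixels, cmap_name):
    """Map gray levels in [0, 255] to RGB triples using the named colormap."""
    norm = Normalize(vmin=0, vmax=255)   # scale 0..255 to 0.0..1.0
    cmap = plt.get_cmap(cmap_name)       # look up a built-in colormap by name
    rgba = cmap(norm(pixels))            # shape: pixels.shape + (4,)
    return rgba[..., :3]                 # drop the alpha channel


# Darkest, middle, and brightest gray levels.
pixels = np.array([0, 128, 255], dtype=np.uint8)
for name in ("gray", "hot", "cool", "jet", "hsv"):
    print(name, np.round(colors_for(pixels, name), 2).tolist())
```

With 'gray', the three values come out as black, mid-gray, and white; the other colormaps assign different colors to the same three numbers, which is all that changing cmap does.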


3. Implementation:

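  • The code
import numpy as np
import matplotlib.pyplot as plt

# Load all 5000 images in bulk
data = []  # inputs for the classification model
target = []  # outputs (labels) for the classification model

for i in range(10):
    for j in range(1, 501):
        img = plt.imread(f'./手写数字识别/{i}/{i}_{j}.bmp')
        data.append(img)
        target.append(i)
# As lists, data and target are memory-hungry to work with, so convert them to arrays first and then change the dimensions
data = np.array(data).reshape(5000, -1)
target = np.array(target).reshape(5000, -1)
print('data shape:', data.shape, 'target shape:', target.shape)

# Split the data into a training set and a test set
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.2)  # 20% test set

# Import the model
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier()

# Train the model; ravel() passes the labels as the 1-D array scikit-learn expects
knn.fit(x_train, y_train.ravel())

# Check the model's score; in PyCharm, wrap the line below in print()
knn.score(x_test, y_test.ravel())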
  • The final score of the model is about 0.93; the exact value varies from run to run because the train/test split is random
  • Visualize the results
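# Predict labels for the test set (y_pred is used in the titles below)
y_pred = knn.predict(x_test)
# Randomly pick 10 test samples and plot them to inspect the predictions
choice = np.random.randint(0, len(x_test), 10).tolist()
# Set the figure size
plt.figure(figsize=(5*10, 2*10))

for i in range(10):
    # Draw a subplot
    re = plt.subplot(2, 5, i+1)
    re.imshow(x_test[choice[i]].reshape(28, -1), cmap='gray')
    re.set_title(f'real:{y_test[choice[i]][0]},\npredict:{y_pred[choice[i]]}', fontsize=40,
                 color='k' if y_test[choice[i]][0] == y_pred[choice[i]] else 'r')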
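
For readers who cannot download the data file, the same steps (flatten, split, fit, score, predict) can be tried on a stand-in dataset. The sketch below uses scikit-learn's built-in `load_digits` data, which contains 8x8 images and is not the dataset used in this article, so its score will differ from the 0.93 reported here. The fixed `random_state` is also an addition, used only to make reruns repeatable.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 1797 digit images of 8x8 pixels (NOT the article's dataset).
digits = load_digits()
images, labels = digits.images, digits.target   # shapes (1797, 8, 8) and (1797,)

# Flatten each image into one row, as the article does with reshape(5000, -1).
data = images.reshape(len(images), -1)           # shape (1797, 64)

# random_state is fixed here only so that reruns give the same split.
x_train, x_test, y_train, y_test = train_test_split(
    data, labels, test_size=0.2, random_state=0)

knn = KNeighborsClassifier()                     # default n_neighbors=5
knn.fit(x_train, y_train)

y_pred = knn.predict(x_test)
accuracy = knn.score(x_test, y_test)
wrong = np.flatnonzero(y_pred != y_test)         # indices of misclassified samples
print(f"accuracy: {accuracy:.3f}, misclassified: {len(wrong)} of {len(y_test)}")
```

The indices in `wrong` are a convenient alternative to random sampling when the goal is to look specifically at the digits the model gets wrong.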



4. Supplement:

  • To display an image from the test set after splitting, first reshape its flattened data back to the original dimensions, and then display it
  • On changing the dimensions of an array: reshape accepts -1 for at most one dimension, and NumPy infers that dimension from the array's size, as in reshape(5000, -1) and reshape(28, -1) above
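
The two points above can be sketched with a small stand-in array. This assumes 28x28 images, as implied by the reshape(28, -1) call in the plotting code; the array `stack` is hypothetical and simply holds consecutive integers in place of real pixel data.

```python
import numpy as np

# Hypothetical stand-in for the image stack: 6 "images" of 28x28 pixels.
stack = np.arange(6 * 28 * 28).reshape(6, 28, 28)

# Flatten every image into one row; -1 is inferred as 28 * 28 = 784.
flat = stack.reshape(6, -1)
print(flat.shape)                    # (6, 784)

# Restore one row to its original dimensions; -1 is inferred as 28.
restored = flat[2].reshape(28, -1)
print(restored.shape)                # (28, 28)

# The round trip preserves the pixel values.
print(np.array_equal(restored, stack[2]))   # True
```

Because reshape only changes how the same values are indexed, flattening for the model and restoring for display do not alter the image itself.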

Origin blog.csdn.net/sz1125218970/article/details/132575920