Handwritten digit recognition in MATLAB using the KNN algorithm

I. Introduction

  • KNN stands for K-Nearest Neighbors. Put simply, K is the number of neighbors: we select the K training samples most similar to the test sample (here, the K samples with the shortest Euclidean distance) and give the test sample the label that is most common among them. For example, if most of the K neighbors are labeled 1, the test sample is most likely the digit 1.
  • The principle and the code for handwriting recognition with KNN are fairly simple, but there are not many articles about it online. This article records my own understanding as an exercise while learning MATLAB. There are bound to be omissions; corrections are welcome.
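
To make the voting idea concrete, here is a minimal Python sketch on made-up 2-D points rather than digit images. The function name `knn_predict` and the toy data are my own assumptions for illustration; they are not part of the MATLAB program in this article.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, sample, k):
    """Return the majority label among the k training points nearest to sample."""
    # Euclidean distance from the sample to every training point
    distances = np.linalg.norm(train_x - sample, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(train_y[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): two clusters labelled 0 and 1
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(train_x, train_y, np.array([0.95, 1.0]), k=3))
```

The query point sits inside the second cluster, so all three nearest neighbours carry the label 1 and the vote returns 1.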

II. Implementation process

  1. Process the MNIST data set

    • Download the MNIST data set, four files in total: test images, test labels, training images, and training labels
    • The downloaded files are in the IDX format, so use Python to convert them to PNG images (the code is given below, in the code implementation section). The steps below use 50×50 images as the example; note that the raw MNIST images are 28×28
    • Choose a suitable number of test and training samples; the training set should contain the same number of samples for each digit
  2. MATLAB implementation steps (taking an image resolution of 50×50 as the example)

  • Binarize all images: a pixel takes 1 if it carries a value (part of a stroke) and 0 otherwise

  • Flatten the training samples for the digits 0-9 so that each image becomes one row vector. A 50×50 image gives a 1×2500 row vector; with 860 images per digit, we get an 8600×2500 matrix, which is used as the training matrix

  • Append a label column to the training matrix to record which digit each row represents

  • Each image to be recognized is likewise converted into a 1×2500 row vector, called the test vector

  • Compute the Euclidean distance between the test vector and every row of the training matrix, append these distances to the training matrix as another column, and sort the rows in ascending order of distance

  • Take the mode of the labels in the first K rows; this label is the most likely recognition result according to the KNN algorithm
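
The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the images are synthetic stand-ins generated by a hypothetical `fake_digit` helper (a bright band plus noise), the sample count per digit is tiny, and K = 3 is chosen arbitrarily. It is not a port of the MATLAB code in this article.

```python
import numpy as np

SIZE = 50       # example resolution from the steps above
PER_DIGIT = 4   # hypothetical, tiny on purpose (the article uses 860 per digit)
K = 3           # arbitrary choice for this sketch
rng = np.random.default_rng(0)

def fake_digit(digit):
    """Synthetic stand-in for a digit image: a 5-row band plus a few noise pixels."""
    img = np.zeros((SIZE, SIZE), dtype=np.uint8)
    img[digit * 5:digit * 5 + 5, :] = 255
    noise = rng.integers(0, SIZE, size=(20, 2))
    img[noise[:, 0], noise[:, 1]] = 255
    return img

def to_row(img):
    """Binarize (non-zero -> 1) and flatten to a 1 x 2500 row vector."""
    return (img > 0).astype(np.float64).reshape(1, -1)

# Training matrix with a trailing label column: (10 * PER_DIGIT) x 2501
rows, labels = [], []
for digit in range(10):
    for _ in range(PER_DIGIT):
        rows.append(to_row(fake_digit(digit)))
        labels.append(digit)
train = np.hstack([np.vstack(rows), np.array(labels, dtype=np.float64)[:, None]])

# Test vector, distance column, sort by distance, mode of the first K labels
test = to_row(fake_digit(7))
dist = np.linalg.norm(train[:, :-1] - test, axis=1)
ranked = np.hstack([train, dist[:, None]])
ranked = ranked[np.argsort(ranked[:, -1])]
top_labels = ranked[:K, -2].astype(int)
result = np.bincount(top_labels).argmax()
print(train.shape, result)
```

Casting the binarized pixels to floating point before subtracting matters here: subtracting unsigned 8-bit values would wrap around instead of producing negative differences.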


III. Code implementation

  1. Python code for processing the MNIST data set (thanks to name_s_Jimmy's article on converting the MNIST data set into images with Python)

    import numpy as np
    import struct
     
    from PIL import Image
    import os
     
    data_file = 'train-images.idx3-ubyte'  # change this path: test or training images, e.g. t10k-images.idx3-ubyte or train-images.idx3-ubyte

    with open(data_file, 'rb') as f:
        data_buf = f.read()

    # The 16-byte header holds the magic number, image count, rows and columns
    magic, numImages, numRows, numColumns = struct.unpack_from(
        '>IIII', data_buf, 0)
    # Take the pixel count from the header instead of hard-coding the file size,
    # so the same code works for both the training and the test file
    datas = np.frombuffer(
        data_buf, dtype=np.uint8, count=numImages * numRows * numColumns,
        offset=struct.calcsize('>IIII'))
    datas = datas.reshape(numImages, 1, numRows, numColumns)
     
    label_file = 'train-labels.idx1-ubyte'  # change this path: test or training labels, e.g. t10k-labels.idx1-ubyte or train-labels.idx1-ubyte

    with open(label_file, 'rb') as f:
        label_buf = f.read()

    # The 8-byte header holds the magic number and the label count
    magic, numLabels = struct.unpack_from('>II', label_buf, 0)
    labels = np.frombuffer(
        label_buf, dtype=np.uint8, count=numLabels,
        offset=struct.calcsize('>II')).astype(np.int64)
     
    datas_root = r'C:\Users\TITAN\Desktop\KNN\test'  # change this path: output folder
    if not os.path.exists(datas_root):
        os.mkdir(datas_root)
     
    for i in range(10):
        file_name = datas_root + os.sep + str(i)
        if not os.path.exists(file_name):
            os.mkdir(file_name)
     
    for ii in range(10000):  # generate 10000 test or training images
        # MNIST images are 28x28, so the 0:50 slice simply returns the whole image
        img = Image.fromarray(datas[ii, 0, 0:50, 0:50])
        label = labels[ii]
        file_name = datas_root + os.sep + str(label) + os.sep + \
            'mnist_train_' + str(ii) + '.png'
        img.save(file_name)
    
    print('Finished!')
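
The file-size arithmetic in the script is easier to follow on a tiny example. The sketch below builds a fake IDX3 buffer in memory (the helper `parse_idx_images` and the 2-image payload are my own assumptions for illustration) and reads the sizes from the 16-byte header; 2051 is the magic number of IDX3 unsigned-byte image files.

```python
import struct
import numpy as np

def parse_idx_images(buf):
    """Parse an IDX3 image buffer; all sizes come from the 16-byte header."""
    magic, count, rows, cols = struct.unpack_from('>IIII', buf, 0)
    header = struct.calcsize('>IIII')  # 16 bytes
    pixels = np.frombuffer(buf, dtype=np.uint8,
                           count=count * rows * cols, offset=header)
    return magic, pixels.reshape(count, rows, cols)

# Tiny fake file (hypothetical data): 2 images of 3 x 4 pixels
body = bytes(range(24))
fake = struct.pack('>IIII', 2051, 2, 3, 4) + body
magic, images = parse_idx_images(fake)
print(magic, images.shape, len(fake) - 16)

# Same arithmetic for the real training file:
# 16-byte header + 60000 images of 28 x 28 one-byte pixels
print(16 + 60000 * 28 * 28)
```

The last line reproduces the 47040016-byte size of the training image file: 47040000 pixel bytes plus the 16-byte header.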
    

  2. MATLAB code

    clc;
    clear;
    
    matrix = [];% training matrix
    for delta = 0:9% build the training-sample matrix, one digit folder at a time
      label_path = strcat('C:\Users\ABC\Desktop\KNN\trian\',int2str(delta),'\');
      disp(length(dir([label_path '*.png'])));
      for i = 1:length(dir([label_path '*.png']))
            % files are expected to be named <digit>_<index>.png, with the index starting at 0
            im = imread(strcat(label_path,int2str(delta),'_',int2str(i-1),'.png'));
            %imshow(im);
            im = imbinarize(im);% binarize the image
            temp = [];
            for j = 1:size(im,1)% flatten the training image row by row
                temp = [temp,im(j,:)];
            end
            matrix = [matrix;temp];
      end
    end
    
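    label = [];% label column vector to append to the training matrix
    % assumes every digit folder holds as many images as the last folder read
    for i = 0:9
        tem = ones(length(dir([label_path '*.png'])),1) * i;
        label = [label;tem];
    end
    matrix = horzcat(matrix,label);% training matrix with the label column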
    
    % test vectors
    for delta = 0:9% classify the test images of each digit
        test_path = strcat('C:\Users\ABC\Desktop\KNN\test\',int2str(delta),'\');
        len = (length(dir([test_path '*.png'])));
        disp(len);
        p = 0;% number of correct recognitions
        for i = 1:len
            vec = []; % flattened test sample (row vector)
            % read the test image from the folder that was counted above
            test_im = imread(strcat(test_path,int2str(delta),'_',int2str(i-1),'.png'));
            imshow(test_im);
            test_im = imbinarize(test_im);
            for j = 1:size(test_im,1)
                vec = [vec,test_im(j,:)];
            end
    
            dis = [];
            for count = 1:size(matrix,1)
                row = matrix(count,1:end-1);% one training row without its label
                distance = norm(row(1,:)-vec(1,:));% Euclidean distance
                dis = [dis;distance(1,1)];% distance column vector
            end
            test_matrix = horzcat(matrix,dis);% append the distance column
    
    
            % sort the rows by the distance column (ascending)
            test_matrix = sortrows(test_matrix,size(test_matrix,2));
            % choose K; the mode of the labels in the first K rows is the result
            K = 5;
            result = mode(test_matrix(1:K,end-1));
            disp(['Image ',int2str(delta),'_',int2str(i-1),'.png recognized as: ',int2str(result)]);
    
            if(delta == result)
                p = p + 1;
            end
            
            
        end
        accuracy = p/len;% not named pi, which would shadow the built-in constant
        disp(['Recognition accuracy: ',num2str(accuracy)]);
        disp('Finished!'); 
    end
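
The MATLAB loop above reports one accuracy value per digit folder. As a rough Python counterpart, the sketch below computes the per-digit and overall accuracy from two label lists. The function `per_digit_accuracy` and the sample predictions are made up for illustration; they are not results from this article.

```python
import numpy as np

def per_digit_accuracy(true_labels, predicted):
    """Return ({digit: fraction correct}, overall fraction correct)."""
    true_labels = np.asarray(true_labels)
    predicted = np.asarray(predicted)
    per_digit = {}
    for digit in np.unique(true_labels):
        mask = true_labels == digit
        per_digit[int(digit)] = float(np.mean(predicted[mask] == digit))
    overall = float(np.mean(predicted == true_labels))
    return per_digit, overall

# Made-up labels and predictions, for illustration only
truth = [0, 0, 8, 8, 8, 8]
preds = [0, 0, 8, 3, 8, 8]
per_digit, overall = per_digit_accuracy(truth, preds)
print(per_digit, overall)
```

Keeping the per-digit values separate from the overall value makes it easy to spot a single weak digit that the average would hide.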
    

IV. Results

  • The KNN (K-Nearest Neighbors) algorithm was used to recognize handwritten digits. In testing, the overall accuracy was above 0.9 with K = 5 and 8600 training samples. The accuracy for some individual digits, such as 8, was only about 0.8.
  • The KNN algorithm is simple, but its shortcomings are also obvious: the running time is long because every test sample is compared with every training sample, the prediction depends only on a small local neighborhood and is therefore sensitive to noisy samples, and the accuracy is not high.
  • Increasing the number of training samples, adjusting the value of K, and preprocessing the images before running the algorithm may give better performance.
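
One way to explore the effect of K is to measure accuracy for several values on held-out data. The sketch below does this on synthetic two-class points, not on MNIST; the function `knn_accuracy`, the `blobs` generator, and the K values tried are all assumptions chosen for illustration.

```python
import numpy as np

def knn_accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test samples whose k-nearest-neighbour vote matches the label."""
    correct = 0
    for sample, label in zip(test_x, test_y):
        dist = np.linalg.norm(train_x - sample, axis=1)
        nearest = train_y[np.argsort(dist)[:k]]
        if np.bincount(nearest).argmax() == label:
            correct += 1
    return correct / len(test_y)

rng = np.random.default_rng(0)

def blobs(n):
    """Synthetic two-class data (made up): Gaussian blobs around (0, 0) and (2, 2)."""
    x = np.vstack([rng.normal(0.0, 1.0, (n, 2)), rng.normal(2.0, 1.0, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return x, y

train_x, train_y = blobs(100)
test_x, test_y = blobs(50)
for k in (1, 3, 5, 9):
    print(k, knn_accuracy(train_x, train_y, test_x, test_y, k))
```

The same loop could be wrapped around the digit classifier, with the K that scores best on a held-out set kept for the final test.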

Origin blog.csdn.net/Taplus/article/details/112996077