1. Concept introduction:
Image recognition is the use of computers to process, analyze, and understand images in order to identify targets and objects of various kinds.
The development of image recognition has gone through three stages: character recognition, digital image processing and recognition, and object recognition. In machine learning, such recognition problems are generally cast as classification problems.
Handwriting recognition is a common image recognition task in which a computer identifies the characters in images of handwritten text. Unlike printed fonts, handwriting varies in style and size from person to person, which makes it harder for a computer to recognize.
Handwritten digit recognition is a relatively simple handwriting recognition task because it has only 10 classes (the digits 0 to 9). DBRHD and MNIST are two commonly used handwritten digit recognition datasets.
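Because digit recognition is treated as a 10-class classification problem, each label can be written as a one-hot vector of length 10. The following is a minimal sketch of that encoding; the helper names `one_hot` and `decode` are made up for illustration and are not part of any library.

```python
import numpy as np

def one_hot(digits, num_classes=10):
    """Return a (len(digits), num_classes) array with a single 1.0 per row."""
    digits = np.asarray(digits, dtype=int)
    encoded = np.zeros((digits.size, num_classes))
    # Row i gets a 1.0 in the column given by digits[i].
    encoded[np.arange(digits.size), digits] = 1.0
    return encoded

def decode(encoded):
    """Map each one-hot row back to the index of its largest entry."""
    return np.argmax(encoded, axis=1)

labels = one_hot([3, 0, 9])
print(labels.shape)
print(decode(labels))
```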
2. Data introduction:
MNIST download link: http://yann.lecun.com/exdb/mnist/.
MNIST is a dataset of handwritten images of the digits 0~9. The images have been normalized to 28*28 pixels, with the handwritten digit centered.
MNIST consists of two parts, a training set and a test set, with the following sizes:
Training set: 60,000 handwritten images and corresponding labels
Test set: 10,000 handwritten images and corresponding labels
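The MNIST files are distributed in the IDX binary format: a big-endian header (magic number 2051 for image files, 2049 for label files) followed by one unsigned byte per pixel or label. The sketch below parses that layout from bytes already held in memory. The function names are hypothetical, and the input here is synthetic stand-in data rather than the real dataset; the downloaded files are gzip-compressed, so they would need to be decompressed first (for example with `gzip.open`).

```python
import struct
import numpy as np

def parse_idx_images(raw):
    """Parse uncompressed IDX image bytes into an (n, rows, cols) uint8 array."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:  # 0x00000803 marks an IDX image file
        raise ValueError("not an IDX image file")
    pixels = np.frombuffer(raw, dtype=np.uint8, count=count * rows * cols, offset=16)
    return pixels.reshape(count, rows, cols)

def parse_idx_labels(raw):
    """Parse uncompressed IDX label bytes into a 1-D uint8 array."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 2049:  # 0x00000801 marks an IDX label file
        raise ValueError("not an IDX label file")
    return np.frombuffer(raw, dtype=np.uint8, count=count, offset=8)

# Synthetic stand-in: two blank 28*28 images labelled 7 and 1.
fake_images = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
fake_labels = struct.pack(">II", 2049, 2) + bytes([7, 1])

images = parse_idx_images(fake_images)
labels = parse_idx_labels(fake_labels)
print(images.shape, labels)
```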
DBRHD (Pen-Based Recognition of Handwritten Digits Data Set) is a handwritten digit database provided by the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/PenBased+Recognition+of+Handwritten+Digits.
The DBRHD dataset contains a large number of handwritten images of the digits 0~9, collected from 44 different writers. The images have been normalized to 32*32, with the handwritten digit centered.
The training set and test set of DBRHD are composed as follows:
Training set: 7,494 handwritten images and corresponding labels, from 30 writers
Test set: 3,498 handwritten images and corresponding labels, from 14 writers
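In the code used in this post, each DBRHD sample is stored as a text file of 32 lines with 32 characters (0 or 1) per line, and the file name begins with the digit followed by an underscore. As a rough sketch under those assumptions, a sample can be flattened into a vector of length 1024 as follows; the helper names and the in-memory sample are made up for illustration.

```python
import numpy as np

def text_to_vector(lines, size=32):
    """Flatten `size` lines of '0'/'1' characters into a vector of length size*size."""
    rows = [[int(ch) for ch in line[:size]] for line in lines[:size]]
    return np.array(rows, dtype=int).reshape(size * size)

def label_from_name(file_name):
    """Read the digit from a file name such as '3_27.txt' (assumed naming scheme)."""
    return int(file_name.split("_")[0])

# Synthetic sample: a vertical stroke two pixels wide, like a thin "1".
sample_lines = ["0" * 15 + "11" + "0" * 15 + "\n" for _ in range(32)]
vector = text_to_vector(sample_lines)

print(vector.shape, vector.sum())
print(label_from_name("3_27.txt"))
```

Reshaping the vector with `vector.reshape(32, 32)` recovers the original image grid, which is handy for checking that a file was read correctly.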
3. Task process:
①Input
②Output
③The structure of MLP
④Steps
import numpy as np
from os import listdir  # listdir lists the files in a local directory
from sklearn.neural_network import MLPClassifier


def img2vector(fileName):
    """Flatten a 32*32 text image of 0/1 characters into a vector of length 1024."""
    retMat = np.zeros([1024], int)
    with open(fileName) as fr:  # each file holds 32 lines of 32 characters
        lines = fr.readlines()  # read all lines of the file
    for i in range(32):
        for j in range(32):
            retMat[i * 32 + j] = int(lines[i][j])  # store each 0/1 digit in retMat
    return retMat


def readDataSet(path):
    """Load every digit file under path and convert the labels to one-hot vectors."""
    fileList = listdir(path)  # all files in the folder
    numFiles = len(fileList)  # number of files to read
    dataSet = np.zeros([numFiles, 1024], int)  # one row per digit file
    hwLabels = np.zeros([numFiles, 10])  # one-hot label for each file
    for i in range(numFiles):
        filePath = fileList[i]  # file name
        digit = int(filePath.split('_')[0])  # the file name starts with the digit
        hwLabels[i][digit] = 1.0
        dataSet[i] = img2vector(path + '/' + filePath)  # read the file content
    return dataSet, hwLabels


train_dataSet, train_hwLabels = readDataSet('trainingDigits')

# Build the neural network: set the number of hidden layers, the number of
# neurons in each hidden layer, the activation function, the learning rate,
# the optimization method, and the maximum number of iterations.
# hidden_layer_sizes is a tuple whose i-th element is the number of neurons
# in the i-th hidden layer.
# Use the logistic activation function and the adam optimizer, with an
# initial learning rate of 0.0001.
clf = MLPClassifier(hidden_layer_sizes=(50,),
                    activation='logistic', solver='adam',
                    learning_rate_init=0.0001, max_iter=2000)

# fit sets the number of neurons in the input and output layers automatically
# from the training set and its labels. For example, train_dataSet is an
# n*1024 matrix and train_hwLabels is an n*10 matrix, so fit gives the MLP
# 1024 input neurons and 10 output neurons.
clf.fit(train_dataSet, train_hwLabels)

# Evaluate on the test set
dataSet, hwlLabels = readDataSet('testDigits')
res = clf.predict(dataSet)  # predict the test set
error_num = 0  # number of wrong predictions
num = len(dataSet)  # number of test samples
for i in range(num):
    # res[i] == hwlLabels[i] compares two arrays of length 10 element by
    # element; a sample is correct only if all 10 positions match.
    if np.sum(res[i] == hwlLabels[i]) < 10:
        error_num += 1
print("Total num:", num, "Wrong num:", error_num,
      "WrongRate:", error_num / float(num))
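The evaluation loop above counts a sample as wrong when its predicted row differs from the one-hot label in any position. A vectorized sketch of the same check is shown below on made-up 3-class data; `count_row_errors` is a hypothetical helper. It also shows why counting mismatched elements is not the same as counting mismatched samples: a wrong one-hot row usually differs in two positions, and this sketch assumes a predicted row may even be all zeros.

```python
import numpy as np

def count_row_errors(predicted, expected):
    """Count rows where the prediction differs from the label in any position."""
    predicted = np.asarray(predicted)
    expected = np.asarray(expected)
    wrong_rows = np.any(predicted != expected, axis=1)
    return int(wrong_rows.sum())

# Made-up one-hot labels and predictions for four samples, three classes.
expected = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
predicted = np.array([[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0]])

errors = count_row_errors(predicted, expected)
error_rate = errors / len(expected)
elementwise_mismatches = int(np.sum(predicted != expected))

print(errors, error_rate, elementwise_mismatches)
```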
Experimental results:
The results below are from the course; the results of my own experiments were not much different.
2. Use a KNN classifier to recognize the handwritten digits in the DBRHD dataset (the procedure is similar to the above, but the algorithm is different)
import numpy as np
from os import listdir  # listdir lists the files in a local directory
from sklearn import neighbors


def img2vector(fileName):
    """Flatten a 32*32 text image of 0/1 characters into a vector of length 1024."""
    retMat = np.zeros([1024], int)
    with open(fileName) as fr:  # open a digit file of size 32*32
        lines = fr.readlines()  # read all lines of the file
    for i in range(32):
        for j in range(32):
            retMat[i * 32 + j] = int(lines[i][j])  # store each 0/1 digit in retMat
    return retMat


def readDataSet(path):
    """Load every digit file under path; labels are kept as plain digits 0-9."""
    fileList = listdir(path)  # all files in the folder
    numFiles = len(fileList)  # number of files to read
    dataSet = np.zeros([numFiles, 1024], int)  # one row per digit file
    hwLabels = np.zeros([numFiles], int)  # one integer label per file
    for i in range(numFiles):
        filePath = fileList[i]  # file name
        digit = int(filePath.split('_')[0])  # the file name starts with the digit
        hwLabels[i] = digit
        dataSet[i] = img2vector(path + '/' + filePath)  # read the file content
    return dataSet, hwLabels


train_dataSet, train_hwLabels = readDataSet('trainingDigits')

# Build the KNN classifier: set the search algorithm and the number of
# neighbors (k).
# KNN is a lazy learner with no real training phase; it only looks up the
# nearest neighbors at prediction time, so feeding in the dataset is all
# that building the classifier involves.
knn = neighbors.KNeighborsClassifier(algorithm='kd_tree', n_neighbors=3)
knn.fit(train_dataSet, train_hwLabels)

# Evaluate on the test set
dataSet, hwlLabels = readDataSet('testDigits')
res = knn.predict(dataSet)  # predict the test set
# With integer labels, res != hwlLabels marks exactly one entry per wrong sample.
error_num = np.sum(res != hwlLabels)  # number of wrong predictions
num = len(dataSet)  # number of test samples
print("Total num:", num, "Wrong num:", error_num,
      "WrongRate:", error_num / float(num))
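To make the lazy-learning idea concrete, here is a minimal brute-force sketch of k-nearest-neighbor prediction in NumPy. It is not how scikit-learn implements `KNeighborsClassifier` (the code above asks for a kd-tree search); the function `knn_predict` and the tiny 2-D dataset are made up for illustration. All of the work happens at prediction time: measure the distance to every stored sample, take the k closest, and vote.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Classify one query vector by majority vote among its k nearest training rows."""
    distances = np.linalg.norm(train_x - query, axis=1)  # Euclidean distance to every row
    nearest = np.argsort(distances, kind="stable")[:k]   # indices of the k closest rows
    votes = Counter(train_y[nearest].tolist())
    return votes.most_common(1)[0][0]

# Made-up training data: class 0 near the origin, class 1 near (5, 5).
train_x = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
train_y = np.array([0, 0, 0, 1, 1, 1])

near_origin = knn_predict(train_x, train_y, np.array([0.2, 0.1]))
near_far = knn_predict(train_x, train_y, np.array([5.5, 5.2]))
print(near_origin, near_far)
```

The brute-force search costs one distance computation per stored sample for every query; a kd-tree is one way to reduce that cost on larger datasets.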
Experimental results (same as above)