Handwritten Digit Recognition (Part 2): Extracting the Training Images

Copyright notice: https://blog.csdn.net/zeroice7/article/details/77996748

Environment: Windows 10 64-bit + VS2015 (VS2013 is recommended, since it is more compatible with OpenCV 2.4.9) + OpenCV 2.4.9

Reference page: http://www.cnblogs.com/xuanyuyt/p/6405944.html

Feature extraction method: HOG (Histogram of Oriented Gradients)

Learning model: SVM (Support Vector Machine)


Goal of this post: use Matlab to produce the training images for digit recognition


1. Obtaining the training data

From the website:

http://yann.lecun.com/exdb/mnist/

download the MNIST handwritten digit dataset, which consists of four files:

train-images-idx3-ubyte.gz:  training set images (9912422 bytes)

train-labels-idx1-ubyte.gz:  training set labels (28881 bytes)

t10k-images-idx3-ubyte.gz:   test set images (1648877 bytes)

t10k-labels-idx1-ubyte.gz:   test set labels (4542 bytes)

These are two image sets and two label sets:

train-images-idx3-ubyte.gz: training set samples

train-labels-idx1-ubyte.gz: training set labels

t10k-images-idx3-ubyte.gz: test set samples

t10k-labels-idx1-ubyte.gz: test set labels
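The four downloads are gzip archives and must be decompressed before they can be read. A minimal sketch using Matlab's built-in gunzip, assuming the archives sit in the current folder (the variable names archives and extracted are illustrative, not from the original post). Note that gunzip simply drops the ".gz" suffix, while the program in section 2 opens 'train-images.idx3-ubyte' with a dot, so either rename the extracted files or adjust the names in fopen.

```matlab
% Illustrative sketch: decompress whichever MNIST archives are present.
archives = {'train-images-idx3-ubyte.gz', 'train-labels-idx1-ubyte.gz', ...
            't10k-images-idx3-ubyte.gz',  't10k-labels-idx1-ubyte.gz'};
extracted = {};                                   % names of the files produced
for k = 1:numel(archives)
    if exist(archives{k}, 'file') == 2
        gunzip(archives{k});                      % writes the file without ".gz"
        extracted{end+1} = archives{k}(1:end-3);  % strip the ".gz" suffix
    else
        fprintf('Missing archive: %s\n', archives{k});
    end
end
```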

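2. Extracting the images with Matlab

The files are stored in a raw binary format rather than an ordinary image format, so Matlab is used to convert them into bitmaps with the ".bmp" extension.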
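For orientation, the MNIST page describes the image file as a 16-byte header (four 32-bit integers stored most significant byte first: magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel, stored row by row. The sketch below illustrates reading such a header. So that it runs without the real dataset, it first writes a tiny synthetic file in the same layout; demoFile and the 2 images of 3x4 pixels are made up for the example.

```matlab
% Write a tiny synthetic file in the IDX3 layout (values are made up).
demoFile = [tempname(), '.idx3-ubyte'];
fid = fopen(demoFile, 'w', 'ieee-be');      % big-endian, as in MNIST
fwrite(fid, [2051 2 3 4], 'int32');         % magic, count, rows, cols
fwrite(fid, uint8(0:23), 'uint8');          % 2*3*4 pixel bytes
fclose(fid);

% Read the header back as four big-endian 32-bit integers.
fid = fopen(demoFile, 'r', 'ieee-be');
header = fread(fid, 4, 'int32');
fclose(fid);
delete(demoFile);

magic     = header(1);
numImages = header(2);
numRows   = header(3);
numCols   = header(4);
fprintf('magic=%d, images=%d, size=%dx%d\n', magic, numImages, numRows, numCols);
```

For the real training image file the same four reads should give 2051, 60000, 28 and 28.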

The program is as follows:

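% File names depend on the decompression tool; some tools produce
% 'train-images-idx3-ubyte' (hyphen instead of dot). Adjust as needed.
fid_image = fopen('train-images.idx3-ubyte', 'r', 'ieee-be');
fid_label = fopen('train-labels.idx1-ubyte', 'r', 'ieee-be');
if fid_image == -1 || fid_label == -1
    error('Could not open the MNIST files; check the file names.');
end
% Image header (16 bytes): magic number, image count, rows, columns,
% each stored as a big-endian 32-bit integer
magicNumber = fread(fid_image, 1, 'int32');
numImages   = fread(fid_image, 1, 'int32');
numRows     = fread(fid_image, 1, 'int32');
numCols     = fread(fid_image, 1, 'int32');
% Label header (8 bytes): magic number, label count (skipped)
fread(fid_label, 2, 'int32');
% The remaining bytes are the labels, one per image
labels = fread(fid_label, inf, 'uint8');
num = length(labels);
% Count how many times each digit 0 to 9 has been seen
cnt = zeros(1, 10);
for k = 1:num
    % Pixels are stored row by row but fread fills column by column,
    % so read [cols, rows] and transpose. Reading as uint8 keeps the
    % gray levels; imwrite would saturate double values above 1.
    img = fread(fid_image, [numCols, numRows], 'uint8=>uint8')';
    val = labels(k);                      % digit shown in this image
    cnt(val+1) = cnt(val+1) + 1;
    % File name: <digit>_<4-digit running index>.bmp, e.g. 7_0012.bmp
    name = sprintf('%d_%04d.bmp', val, cnt(val+1));
    imwrite(img, name);
end
fclose(fid_image);
fclose(fid_label);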
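The program transposes every image before writing it. The reason is that the file stores pixels row by row, while Matlab fills a matrix column by column. A small sketch of the effect, using reshape (which fills in the same column-major order as fread with a size argument) on six made-up bytes:

```matlab
% A 2x3 image stored row by row: [1 2 3; 4 5 6]
rowMajorBytes = uint8([1 2 3 4 5 6]);
numRows = 2;
numCols = 3;

% Filling a rows-by-cols matrix directly scrambles the pixels.
wrong = reshape(rowMajorBytes, numRows, numCols);     % [1 3 5; 2 4 6]

% Filling cols-by-rows and transposing restores the layout.
right = reshape(rowMajorBytes, numCols, numRows)';    % [1 2 3; 4 5 6]

disp(wrong);
disp(right);
```

MNIST images are square (28x28), so omitting the transpose raises no size error; the digits simply come out mirrored along the diagonal.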

[Content to be added ......]


Reposted from blog.csdn.net/zeroice7/article/details/77996748