MATLAB simulation of handwritten digit recognition based on a Bayesian recognizer and a linear recognizer

Table of contents

1. Theoretical basis

2. Core program

3. Simulation conclusion


1. Theoretical basis

Handwritten digit recognition is a classic problem in computer vision: the goal is to automatically identify the digit shown in an image of a handwritten numeral. Two common approaches are the Bayesian recognizer and the linear recognizer. This article introduces both, covering the mathematical formulation and the algorithm implementation of each.

Handwritten Digit Recognition Based on Bayesian Recognizer

Mathematical formula

The basic idea of the Bayesian recognizer is to compute the posterior probability of each digit with Bayes' theorem and to select the digit with the highest posterior probability as the recognition result. Specifically, for a handwritten digit image $x$, the posterior probability that it belongs to digit $i$ is:

$$
p(i|x)=\frac{p(x|i)p(i)}{p(x)}
$$

Here, $p(x|i)$ is the probability that digit $i$ generates the image $x$ (the likelihood), $p(i)$ is the prior probability of digit $i$, and $p(x)$ is the probability of observing the image $x$. Since $p(x)$ is the same for every digit, it can be dropped when comparing digits, giving:

$$
p(i|x)\propto p(x|i)p(i)
$$

Here, $\propto$ means "proportional to".
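As a small numeric illustration of why $p(x)$ can be dropped, the sketch below uses three classes with made-up likelihoods and priors (the numbers are assumptions chosen only for illustration). It computes both the unnormalized product $p(x|i)p(i)$ and the normalized posterior, and shows that both lead to the same decision.

```matlab
% Toy example: three classes with illustrative (made-up) values.
likelihood = [0.02; 0.10; 0.05];   % p(x|i) for i = 1..3
prior      = [0.5; 0.2; 0.3];      % p(i), sums to 1

joint     = likelihood .* prior;   % p(x|i) * p(i), the unnormalized posterior
evidence  = sum(joint);            % p(x) = sum over i of p(x|i) * p(i)
posterior = joint / evidence;      % p(i|x), sums to 1

[~, kJoint] = max(joint);          % decision from the unnormalized scores
[~, kPost]  = max(posterior);      % decision from the normalized posterior
fprintf('argmax (unnormalized) = %d, argmax (normalized) = %d\n', kJoint, kPost);
```

Dividing by the positive constant $p(x)$ rescales every score by the same factor, so the index of the maximum does not change.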

$p(x|i)$ can be estimated from the training samples of digit $i$. It is usually modeled as a multivariate Gaussian distribution:

$$
p(x|i)=\frac{1}{(2\pi)^{d/2}|\Sigma_i|^{1/2}}\exp\left(-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)\right)
$$

Here, $d$ is the dimension of the image feature vector $x$, $\mu_i$ is the mean vector of digit $i$, and $\Sigma_i$ is the covariance matrix of digit $i$.
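The density above can be evaluated directly, but for larger $d$ the value can underflow, so it is common to work with its logarithm. The following is a minimal sketch for a single class, assuming a 2-dimensional feature vector and an illustrative mean and covariance (all values are assumptions, not taken from the program in this article). The log-determinant is obtained from a Cholesky factor, which requires $\Sigma_i$ to be symmetric positive definite.

```matlab
% Evaluate the Gaussian class-conditional density for one feature vector.
x     = [1; 2];            % feature vector (d = 2, illustrative)
mu    = [0; 0];            % class mean mu_i
Sigma = [2 0; 0 0.5];      % class covariance Sigma_i (symmetric positive definite)

d  = numel(x);
dx = x - mu;
R  = chol(Sigma);                       % Sigma = R' * R
logdetSigma = 2 * sum(log(diag(R)));    % log|Sigma| without forming det()
maha = dx' * (Sigma \ dx);              % (x-mu)' * inv(Sigma) * (x-mu)

logp = -0.5 * maha - 0.5 * logdetSigma - 0.5 * d * log(2*pi);
p    = exp(logp);                       % p(x|i)
fprintf('log p(x|i) = %.4f, p(x|i) = %.6g\n', logp, p);
```

Using `Sigma \ dx` instead of `inv(Sigma) * dx` avoids forming the inverse explicitly.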

$p(i)$ can be estimated from the training data set as $p(i)=n_i/N$, where $n_i$ is the number of samples of digit $i$ and $N$ is the total number of samples.

Algorithm implementation

The following is the specific implementation process of the handwritten digit recognition algorithm based on the Bayesian recognizer:

Preprocessing
First, the input handwritten digit image is preprocessed: grayscale conversion, binarization, morphological processing, and feature extraction. Standard image processing routines cover all of these steps.
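The core program in this article calls a function `GetFeature` whose source is not listed, and prints 25 feature values in a 5-by-5 layout. One feature consistent with that layout is the ink ratio in each cell of a 5-by-5 grid over the digit's bounding box. The sketch below is a hypothetical stand-in under that assumption, not the original `GetFeature`; the synthetic image and all variable names are illustrative.

```matlab
% Hypothetical 5x5 grid feature: fraction of ink pixels in each grid cell.
% Illustrative stand-in only; NOT the GetFeature used by the program.
I = true(100, 100);            % binary image, white (1) background
I(20:80, 45:55) = false;       % vertical stroke, black (0) ink
I(70:80, 45:85) = false;       % horizontal stroke, giving an "L" shape

ink = ~I;                      % true where the pen touched
[r0, c0] = find(ink);
crop = ink(min(r0):max(r0), min(c0):max(c0));   % crop to the bounding box

n = 5;                                     % grid size
[h, w] = size(crop);                       % assumes h >= n and w >= n
rEdges = round(linspace(0, h, n + 1));     % row boundaries of the cells
cEdges = round(linspace(0, w, n + 1));     % column boundaries of the cells
feat = zeros(n * n, 1);
for r = 1:n
    for c = 1:n
        cellPix = crop(rEdges(r)+1:rEdges(r+1), cEdges(c)+1:cEdges(c+1));
        feat((r-1)*n + c) = nnz(cellPix) / numel(cellPix);   % ink ratio in [0, 1]
    end
end
disp(reshape(feat, n, n)');    % show the features as a 5x5 grid, row by row
```

Cropping to the bounding box makes the feature insensitive to where on the pad the digit was written; a bounding box smaller than 5 pixels in either direction would need separate handling.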

Train the Model
Next, the model is trained on the training data set. Training consists of the following steps:

(1) For each digit $i$ in the training data set, compute its mean vector $\mu_i$ and covariance matrix $\Sigma_i$.

(2) Compute the prior probability $p(i)$ of each digit $i$.
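The two training steps can be sketched as follows. The data here are synthetic: each class is a d-by-n matrix with one sample per column, which is the layout the core program uses for `pattern(1,k).feature`. The small ridge term `lambda` added to each covariance is an assumption of this sketch, included because a covariance estimated from few samples (or from near-constant features) is often singular.

```matlab
% Estimate mu_i, Sigma_i and p(i) for each class from column-wise samples.
classes = { [0 1 0 1; 0 0 1 1], ...                 % class 1: 4 samples, d = 2
            [4 5 4 5 4.5 4.5; 4 4 5 5 4 5] };       % class 2: 6 samples, d = 2
K = numel(classes);
d = size(classes{1}, 1);
N = sum(cellfun(@(F) size(F, 2), classes));         % total number of samples
lambda = 1e-3;                                      % assumed regularization strength

mu    = zeros(d, K);
Sigma = zeros(d, d, K);
prior = zeros(1, K);
for k = 1:K
    F   = classes{k};
    n_k = size(F, 2);
    mu(:, k)       = mean(F, 2);                    % mean vector mu_i
    Sigma(:, :, k) = cov(F') + lambda * eye(d);     % cov expects one sample per row
    prior(k)       = n_k / N;                       % p(i) = n_i / N
end
disp(mu);
disp(prior);
```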

Test recognition
Each handwritten digit image in the test data set is recognized with the trained model as follows:

(1) Preprocess the input handwritten digit image.

(2) For each digit $i$, compute $p(x|i)p(i)$ and select the digit with the largest value as the recognition result.

(3) Output the recognition result.
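Step (2) can be sketched as below, assuming the class means, covariances, and priors are already available (the values here are illustrative, not trained on real digits). The comparison is done on $\log p(x|i)+\log p(i)$, which has the same maximizer as $p(x|i)p(i)$; the term $-\frac{d}{2}\log(2\pi)$ is identical for all classes and is omitted.

```matlab
% Bayesian decision: pick the class with the largest log p(x|i) + log p(i).
mu    = [0.5 4.5; 0.5 4.5];        % column k is the mean of class k
Sigma = cat(3, eye(2), eye(2));    % one covariance matrix per class
prior = [0.4 0.6];                 % p(i)
x     = [1; 1.2];                  % preprocessed feature vector to classify

K = size(mu, 2);
score = zeros(1, K);
for k = 1:K
    dx = x - mu(:, k);
    S  = Sigma(:, :, k);
    score(k) = -0.5 * (dx' * (S \ dx)) - 0.5 * log(det(S)) + log(prior(k));
end
[~, best] = max(score);
fprintf('recognized class index: %d\n', best);
```

With the storage convention of the core program, where `pattern(1,d+1)` holds digit `d`, the recognized digit would be `best - 1`.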

Recognition of Handwritten Digits Based on Linear Recognizer

Mathematical formula

The basic idea of the linear recognizer is to classify handwritten digit images with a linear classifier. Specifically, for a handwritten digit image $x$, the probability that it is classified as digit $i$ can be expressed as:

$$
p(i|x)=\frac{\exp(w_i^Tx+b_i)}{\sum_{j=0}^9\exp(w_j^Tx+b_j)}
$$

Here, $w_i$ is the weight vector of digit $i$ and $b_i$ is the bias term of digit $i$.
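The softmax expression above can be evaluated as in the following sketch, which uses three classes and two features with made-up weights (all values are assumptions for illustration). Subtracting the largest score before exponentiating leaves the result unchanged, because the common factor cancels between numerator and denominator, and it prevents overflow in `exp`.

```matlab
% Softmax probabilities for one feature vector.
W = [ 1.0 -0.5;        % row i holds w_i' (illustrative values)
      0.2  0.3;
     -1.0  0.8];
b = [0.1; 0.0; -0.2];  % bias terms b_i
x = [2; 1];            % feature vector

z = W * x + b;                 % scores w_i' * x + b_i
z = z - max(z);                % shift for numerical stability
p = exp(z) / sum(exp(z));      % p(i|x), sums to 1
[~, best] = max(p);
fprintf('class probabilities: %s, best index: %d\n', mat2str(p', 4), best);
```

Because the exponential is monotonic, the class with the largest score $w_i^Tx+b_i$ is also the class with the largest $p(i|x)$.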

Algorithm implementation

The following is the specific implementation process of the handwritten digit recognition algorithm based on the linear recognizer:

Preprocessing
Preprocessing is the same as for the Bayesian recognizer. The input handwritten digit image first goes through grayscale conversion, binarization, morphological processing, and feature extraction. Standard image processing routines cover all of these steps.

Train the Model
Next, the model is trained on the training data set. Training consists of the following step:

(1) For each digit $i$, learn its weight vector $w_i$ and bias term $b_i$ from the training data set. An optimization algorithm such as gradient descent is usually used to minimize the loss function. For details, refer to machine learning textbooks and papers.
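As one possible realization of this step, the sketch below trains the softmax model with plain batch gradient descent on the mean cross-entropy loss. The loss choice, learning rate, iteration count, and the tiny two-class data set are all assumptions of this sketch; the article does not specify them.

```matlab
% Batch gradient descent for a softmax (multinomial logistic) classifier.
X = [-2 -1 -2 -1  1  2  1  2;      % columns are samples (d = 2)
     -2 -2 -1 -1  1  1  2  2];
y = [ 1  1  1  1  2  2  2  2];     % class labels in 1..K
[d, N] = size(X);
K = 2;
Yhot = zeros(K, N);
Yhot(sub2ind([K, N], y, 1:N)) = 1; % one-hot targets

W = zeros(K, d);                   % row i is w_i'
b = zeros(K, 1);
eta = 0.1;                         % learning rate (assumed)
for it = 1:200
    Z = W * X + repmat(b, 1, N);                    % scores for all samples
    Z = Z - repmat(max(Z, [], 1), K, 1);            % stabilize the exponentials
    P = exp(Z) ./ repmat(sum(exp(Z), 1), K, 1);     % softmax, one column per sample
    G = (P - Yhot) / N;                             % gradient of the loss w.r.t. scores
    W = W - eta * (G * X');                         % update weights
    b = b - eta * sum(G, 2);                        % update biases
end
[~, pred] = max(W * X + repmat(b, 1, N), [], 1);    % predicted class per sample
fprintf('training accuracy: %.2f\n', mean(pred == y));
```

For real digit features, the step size and stopping rule would need tuning, and accuracy should be measured on held-out samples rather than on the training set.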

Test recognition
Each handwritten digit image in the test data set is recognized with the trained model as follows:

(1) Preprocess the input handwritten digit image.

(2) For each digit $i$, compute $p(i|x)$ and select the digit with the largest value as the recognition result.

(3) Output the recognition result.

Both the Bayesian recognizer and the linear recognizer are common handwritten digit recognition algorithms. The Bayesian recognizer computes posterior probabilities with Bayes' theorem; its modeling complexity is higher (a mean vector and a full covariance matrix must be estimated for each digit), but it can achieve better recognition performance. The linear recognizer classifies digit images with a linear classifier; its modeling complexity is low, but its recognition performance depends on the data distribution, for example on how close the classes are to being linearly separable. The appropriate algorithm should be chosen according to the actual application scenario.

2. Core program


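global flag
global pos0
global x0 y0

% Writing pad: while the left button is held (flag is set), connect the
% previous point (x0,y0) to the current pointer position.
pos=get(handles.WritingAxes,'currentpoint');
x=pos(1,1);
y=pos(1,2);
% Draw only while the pointer is inside the [0,100) range of the axes
if flag && x>=0 && x<100 && y>=0 && y<100
    line(x,y,'marker','.','markerSize',18,'LineStyle','-','LineWidth',2,'Color','Black');
    if x>x0
        stepX=0.1;
    else
        stepX=-0.1;
    end
    if y>y0
        stepY=0.1;
    else
        stepY=-0.1;
    end
    if abs(x-x0)<0.01
        % Nearly vertical segment: step along y and hold x fixed so that
        % X and Y have the same length
        Y=y0:stepY:y;
        X=x0*ones(size(Y));
    else
        % Otherwise step along x and interpolate y linearly
        X=x0:stepX:x;
        Y=(y-y0)*(X-x0)/(x-x0)+y0;
    end
    line(X,Y,'marker','.','markerSize',18,'LineStyle','-','LineWidth',2,'Color','Black');
    x0=x;
    y0=y;
    pos0=pos;
else
    flag=0;
end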
 %-------------------------------------------------------------------------
 
 

 %-------------------------------------------------------------------------
function figure1_WindowButtonUpFcn(hObject, eventdata, handles)
%clc
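% Writing pad: releasing the left mouse button ends the stroke
global flag
flag=0;

%global data
data=[];
% Capture the writing axes as an image and save it to disk
Img=getframe(handles.WritingAxes);
imwrite(Img.cdata,'当前手写数字.bmp','bmp');
% Reload, convert to grayscale, then binarize
I=imread('当前手写数字.bmp');
I=rgb2gray(I);
I=im2bw(I);
% Save the binary image and extract its features
imwrite(I,'当前手写数字.bmp','bmp');
I=imread('当前手写数字.bmp');
data=GetFeature(I);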
%--------------------------------------------------------------------------





% --- Executes on selection change in popupmenuNUM.
function popupmenuNUM_Callback(hObject, eventdata, handles)
%-------------------------------------------------------------------------

% --- Executes during object creation, after setting all properties.
function popupmenuNUM_CreateFcn(hObject, eventdata, handles)
if ispc
    set(hObject,'BackgroundColor','white');
else
    set(hObject,'BackgroundColor',get(0,'defaultUicontrolBackgroundColor'));
end
%-------------------------------------------------------------------------

%-------------------------------------------------------------------------
function pushbuttonSave_Callback(hObject, eventdata, handles)

%global data
I=imread('当前手写数字.bmp');
data=GetFeature(I);

load template pattern;

% Popup entry 1 is the "select a digit" placeholder; entries 2..11 map to
% digits 0..9, which are stored in pattern(1,1)..pattern(1,10).
num=get(handles.popupmenuNUM,'value');
if num==1
    msgbox('Please select a digit class before saving','Notice');
elseif num>=2 && num<=11
    k=num-1;
    pattern(1,k).num=pattern(1,k).num+1;
    pattern(1,k).feature(:,pattern(1,k).num)=data;
end

save template pattern;
%--------------------------------------------------------------------------


%-------------------------------------------------------------------------
function pushbuttonFeature_Callback(hObject, eventdata, handles)

%global data
I=imread('当前手写数字.bmp');
data=GetFeature(I);
data=data';

fprintf('Features of the current handwritten digit:\n')
for i=1:25
    % Binarize each feature with a threshold of 0.1
    if data(i)>0.1
        data(i)=1;
    else
        data(i)=0;
    end
    fprintf('%d   ',data(i));
    % Start a new row after every 5 values (5-by-5 layout)
    if mod(i,5)==0
        fprintf('\n');
    end
end
%-------------------------------------------------------------------------

%--------------------------------------------------------------------------
function pushbuttonNUM_Callback(hObject, eventdata, handles)

load template;   % loads the variable "pattern"
% pattern(1,d+1) holds the samples of digit d
for d=0:9
    disp(['Number of samples for digit ',num2str(d),': ',num2str(pattern(1,d+1).num)])
end
%-------------------------------------------------------------------------

3. Simulation conclusion

 

 


Origin blog.csdn.net/ccsss22/article/details/131319087