OpenCV-Python: OCR of Handwritten Data Using SVM | Fifty-Six

Goal

In this chapter, we will recognize the handwritten data set again, but this time using SVM instead of kNN.

OCR of handwritten digits

In kNN, we used the pixel intensities directly as the feature vector. This time we will use the Histogram of Oriented Gradients (HOG) as the feature vector.

Before computing the HOG, we deskew the image using its second-order moments. So we first define a function deskew(), which takes a digit image and returns the deskewed image. Below is the deskew() function:

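def deskew(img):
    # SZ (cell size, 20) and affine_flags are defined in the full listing below
    m = cv.moments(img)
    if abs(m['mu02']) < 1e-2:
        # almost no vertical spread, so there is nothing to correct
        return img.copy()
    skew = m['mu11']/m['mu02']
    # shear each row horizontally in proportion to its distance from the middle row
    M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
    img = cv.warpAffine(img,M,(SZ, SZ),flags=affine_flags)
    return img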

The figure below shows the deskew function applied to an image of a zero. The left image is the original and the right image is the deskewed result.
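
To see why the ratio mu11/mu02 measures slant, here is a minimal NumPy-only sketch. It computes the two central moments by hand and then undoes the slant with a nearest-neighbour shear. The helper names skew_from_moments and deskew_numpy are hypothetical and are not part of OpenCV; the shear is a simplified stand-in for cv.warpAffine with WARP_INVERSE_MAP, assuming a square image whose side equals SZ.

```python
import numpy as np

def skew_from_moments(img):
    """Hypothetical helper: mu11/mu02 computed by hand (x = column, y = row)."""
    img = np.asarray(img, dtype=np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    x_bar = (xs * img).sum() / m00            # intensity-weighted centroid
    y_bar = (ys * img).sum() / m00
    mu11 = ((xs - x_bar) * (ys - y_bar) * img).sum()
    mu02 = ((ys - y_bar) ** 2 * img).sum()
    return mu11 / mu02

def deskew_numpy(img, skew):
    """Hypothetical nearest-neighbour version of the inverse-map shear."""
    h, w = img.shape
    ys, xs = np.mgrid[:h, :w]
    # Same mapping as M = [[1, skew, -0.5*SZ*skew], [0, 1, 0]] read as dst -> src
    src_x = np.rint(xs + skew * (ys - 0.5 * w)).astype(int)
    valid = (src_x >= 0) & (src_x < w)
    out = np.zeros_like(img)
    out[valid] = img[ys[valid], src_x[valid]]
    return out

upright = np.zeros((20, 20))
upright[:, 10] = 1            # vertical stroke in column 10
slanted = np.eye(20)          # stroke that moves one column right per row

skew_upright = skew_from_moments(upright)
skew_slanted = skew_from_moments(slanted)
restored = deskew_numpy(slanted, skew_slanted)
print(skew_upright, skew_slanted)
```

For the upright stroke the ratio is 0, and for the diagonal stroke it is 1, which is exactly the number of columns the stroke drifts per row. Applying the shear with that value turns the diagonal back into the vertical stroke.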

Next, we have to find the HOG descriptor of each cell. To do so, we compute the Sobel derivatives of each cell in the X and Y directions, and from them the magnitude and direction of the gradient at every pixel. The gradient direction is quantized to 16 integer values. The image is then divided into four sub-squares. For each sub-square, we compute a histogram of directions (16 bins) weighted by gradient magnitude, so each sub-square yields a vector of 16 values. The four vectors (one per sub-square) together form a feature vector of 64 values. This is the feature vector we use to train on our data.

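def hog(img):
    # bin_n (number of orientation bins, 16) is defined in the full listing below
    gx = cv.Sobel(img, cv.CV_32F, 1, 0)    # derivative in X
    gy = cv.Sobel(img, cv.CV_32F, 0, 1)    # derivative in Y
    mag, ang = cv.cartToPolar(gx, gy)      # gradient magnitude and angle per pixel
    bins = np.int32(bin_n*ang/(2*np.pi))    # quantize angles into bin_n (16) integer bins
    # split the 20x20 cell into four 10x10 sub-squares
    bin_cells = bins[:10,:10], bins[10:,:10], bins[:10,10:], bins[10:,10:]
    mag_cells = mag[:10,:10], mag[10:,:10], mag[:10,10:], mag[10:,10:]
    # one magnitude-weighted histogram per sub-square
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)     # hist is a 64-element vector
    return hist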
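
The magnitude-weighted histogram in hog() relies on np.bincount with its weights and minlength arguments. The sketch below isolates that step on a handful of made-up angle and magnitude values (illustrative numbers only, not taken from any digit image).

```python
import numpy as np

bin_n = 16  # number of orientation bins, as in hog()

# Made-up gradient angles (radians) and magnitudes for five pixels
ang = np.array([0.1, 0.2, 3.2, 4.8, 4.9])
mag = np.array([1.0, 2.0, 0.5, 3.0, 1.0])

# Same quantization as in hog(): map [0, 2*pi) onto integer bins
bins = np.int32(bin_n * ang / (2 * np.pi))

# Each pixel adds its magnitude (not a count of 1) to its angle bin;
# minlength keeps the histogram at bin_n entries even if high bins are empty
hist = np.bincount(bins, weights=mag, minlength=bin_n)
print(bins)
print(hist)
```

Here the first two pixels land in bin 0, the third in bin 8 and the last two in bin 12, so strong edges contribute more to the descriptor than weak ones.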

Finally, as in the previous case, we start by splitting the big data set image into individual cells. For every digit, 250 cells are reserved for training and the remaining 250 are reserved for testing. The full code is given below:

#!/usr/bin/env python
import cv2 as cv
import numpy as np

SZ = 20      # side length of one digit cell in pixels
bin_n = 16   # Number of bins
affine_flags = cv.WARP_INVERSE_MAP|cv.INTER_LINEAR

def deskew(img):
    m = cv.moments(img)
    if abs(m['mu02']) < 1e-2:
        return img.copy()
    skew = m['mu11']/m['mu02']
    M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
    img = cv.warpAffine(img,M,(SZ, SZ),flags=affine_flags)
    return img

def hog(img):
    gx = cv.Sobel(img, cv.CV_32F, 1, 0)
    gy = cv.Sobel(img, cv.CV_32F, 0, 1)
    mag, ang = cv.cartToPolar(gx, gy)
    bins = np.int32(bin_n*ang/(2*np.pi))    # quantize angles into bin_n (16) integer bins
    bin_cells = bins[:10,:10], bins[10:,:10], bins[:10,10:], bins[10:,10:]
    mag_cells = mag[:10,:10], mag[10:,:10], mag[:10,10:], mag[10:,10:]
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)     # hist is a 64-element vector
    return hist

img = cv.imread('digits.png',0)
if img is None:
    raise Exception("we need the digits.png image from samples/data here !")

# Split the image into 50 rows x 100 columns of cells
cells = [np.hsplit(row,100) for row in np.vsplit(img,50)]
# First half is trainData, remaining is testData
train_cells = [ i[:50] for i in cells ]
test_cells = [ i[50:] for i in cells]

# Training features: deskew every cell, then compute its HOG descriptor
deskewed = [list(map(deskew,row)) for row in train_cells]
hogdata = [list(map(hog,row)) for row in deskewed]
trainData = np.float32(hogdata).reshape(-1,bin_n*4)
# 250 training samples per digit, in digit order
responses = np.repeat(np.arange(10),250)[:,np.newaxis]

svm = cv.ml.SVM_create()
svm.setKernel(cv.ml.SVM_LINEAR)
svm.setType(cv.ml.SVM_C_SVC)
svm.setC(2.67)
svm.setGamma(5.383)    # gamma only affects non-linear kernels such as RBF
svm.train(trainData, cv.ml.ROW_SAMPLE, responses)
svm.save('svm_data.dat')

# Test features are built the same way as the training features
deskewed = [list(map(deskew,row)) for row in test_cells]
hogdata = [list(map(hog,row)) for row in deskewed]
testData = np.float32(hogdata).reshape(-1,bin_n*4)
result = svm.predict(testData)[1]

# The test cells have the same label layout as the training cells
mask = result==responses
correct = np.count_nonzero(mask)
print(correct*100.0/result.size)
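
The cell-splitting step and the label array in the listing above only agree if the flattened cells come out in digit order. The following sketch checks that on a synthetic stand-in for digits.png. It assumes the layout implied by the listing (50 rows by 100 columns of 20x20 cells, five rows per digit); the synthetic image simply stores each row's digit in every pixel so the ordering can be verified without the real file.

```python
import numpy as np

SZ = 20  # cell size, as in the listing

# Synthetic stand-in for digits.png (an assumption for illustration):
# every pixel holds the digit that its cell row belongs to
row_digit = np.repeat(np.arange(10), 5)                    # digit of each of the 50 cell rows
img = np.repeat(row_digit, SZ)[:, np.newaxis] * np.ones((1, 100 * SZ), dtype=np.int64)

# Same splitting as in the listing
cells = [np.hsplit(row, 100) for row in np.vsplit(img, 50)]
train_cells = [r[:50] for r in cells]   # left half of every row
test_cells = [r[50:] for r in cells]    # right half of every row

train = np.array(train_cells).reshape(-1, SZ, SZ)
test = np.array(test_cells).reshape(-1, SZ, SZ)
labels = np.repeat(np.arange(10), 250)

# Each flattened cell should carry the digit that labels assigns to it
train_ok = bool(np.all(train[:, 0, 0] == labels))
test_ok = bool(np.all(test[:, 0, 0] == labels))
print(img.shape, train.shape, test.shape, train_ok, test_ok)
```

Because both halves flatten in the same row-major order, the single responses array can serve as the ground truth for the training set and the test set alike.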

This particular technique gives nearly 94 percent accuracy. You can try different values for the various SVM parameters to check whether higher accuracy is possible. Or you can read technical papers on this area and try to implement them.
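
When experimenting with parameters, the accuracy computation itself is worth a second look, because it depends on array shapes. The sketch below uses made-up predictions (not real SVM output) and assumes, as the listing does, that the predictions arrive as an (N, 1) float32 column.

```python
import numpy as np

# Made-up predictions shaped like an (N, 1) float32 column
result = np.float32([[0], [1], [1], [3]])
# Ground-truth labels as an (N, 1) integer column, as built in the listing
responses = np.array([0, 1, 2, 3])[:, np.newaxis]

# Element-wise comparison works because both arrays are (N, 1)
mask = result == responses
accuracy = np.count_nonzero(mask) * 100.0 / result.size

# Pitfall: a flat (N,) label array broadcasts against (N, 1) into (N, N),
# which would silently inflate the count of matches
wrong_shape = (result == responses.ravel()).shape
print(accuracy, wrong_shape)
```

Three of the four made-up predictions match, so the accuracy is 75 percent. Keeping the labels as a column, as the listing does with np.newaxis, avoids the broadcasting trap.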

Additional Resources

  1. Histograms of Oriented Gradients video: https://www.youtube.com/watch?v=0Zib1YEE4LU

Exercise

  1. The OpenCV samples include digits.py, which applies some improvements to the method described above to get better results. It also contains the reference. Check it and understand it.

Origin blog.csdn.net/fendouaini/article/details/105177878