OpenCV practice small project (1): credit card number recognition

1. Foreword

Today I am writing up a small OpenCV practice project. A few days ago I put together notes on processing images with OpenCV, and I now want to apply that knowledge in a series of small projects. The aim is twofold: to deepen my understanding, and to connect the individual techniques into a coherent whole, because I have found that anything I do not use is forgotten very quickly. These small projects are also useful in their own right, since some of the code and image-processing tricks can be reused later, which is why I want to write them up.

The first practice project is credit card number recognition: given an image of a credit card, the goal is to produce the following result:

[Figure: the credit card with each digit group boxed and the recognized digits printed above it]

The techniques used here come up in many other scenarios, such as license plate recognition and digit recognition in general, so the project is quite practical. None of them is complicated: everything relies on the basic OpenCV image operations covered in the earlier notes. So how is it done?

Below, I first go through the overall implementation logic, that is, how to think about a small task like this in general, and then give the concrete steps with code and explanations.

2. Implementation logic

Given a credit card, we need to output the card number and mark its position in the original image. Essentially, this is a template matching task. For the computer to recognize digits, we need to give it a template, such as the following:

[Figure: the template image containing the reference digits 0-9]

With a template, we only need to find the digit region on the credit card and then compare each digit in that region against the template digits to see which one it is. For the credit card, we first have to locate the digit region. For the template, the digit region is already given, but we still have to split it into individual digits before matching. The task therefore breaks down into three sub-problems: processing the credit card, processing the template, and template matching.

Reminds me of a text I learned in elementary school, "One step, one more step".

How do we process the credit card and find the digit region? The general idea is as follows:

  1. Use contour detection to find the approximate contour and bounding rectangle of each object, that is, locate each object first.
  2. Using the aspect ratio of each bounding rectangle, pick out the long string of digits in the middle of the card. Because these contours are long and narrow, they are easy to find.
  3. Use morphological operations to make this string of digits stand out so that the region can be located more precisely.
  4. Run contour detection on this region again to split it into four groups, then run contour detection inside each group to get the individual digits.
  5. Match each digit against the template (OpenCV provides a function for this directly) to find out which digit it is.

What about the template? This part is simple: contour detection finds the 10 digits, each one is assigned its value, and the results are stored in a dictionary.

The code is explained step by step below.

3. Process the template image

The template image goes through three steps: read -> convert to grayscale -> binarize, because the contour detection function expects a binary image.

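import os
import pickle
import cv2
import numpy as np

# Read the template image
img = cv2.imread("images/ocr_a_reference.png")   # passing 0 as a second argument would read it as grayscale directly
# Convert to grayscale
template = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize (inverted): pixels <= 10 become 255, all others become 0
template = cv2.threshold(template, 10, 255, cv2.THRESH_BINARY_INV)[1]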

[Figure: the binarized template image]

Next, use the contour detection function of cv2 to get the contours of the 10 digits.

cv2.findContours() takes a binary image, that is, a black-and-white image rather than a grayscale one. cv2.RETR_EXTERNAL detects only the outer contours, and cv2.CHAIN_APPROX_SIMPLE keeps only the end points of each contour segment.

# OpenCV 4.x returns two values (contours, hierarchy); OpenCV 3.x also returned the image as a first value
contourss, hierarchy = cv2.findContours(template.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

len(contourss)  # 10 contours

[Figure: the template with the ten digit contours drawn on it]

This gives the outer contour of each of the 10 digits. Note, however, that the contours are not necessarily returned in the order 0-9 shown above, so to be safe we sort them in ascending order by the x coordinate of the top-left corner of each contour's bounding rectangle.

# Sort the contours so that their order is guaranteed to match the digits 0-9
def sort_contours(cnts, method='left-to-right'):
    reverse = False
    i = 0
    if method == 'right-to-left' or method == 'bottom-to-top':
        reverse = True
    if method == 'top-to-bottom' or method == 'bottom-to-top':
        i = 1

    boundingBoxes = [cv2.boundingRect(c) for c in cnts]  # smallest upright rectangle around each contour: (x, y, w, h)

    # Sort by the top-left corner of each box (x when i == 0, y when i == 1)
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes), key=lambda x: x[1][i], reverse=reverse))

    return cnts, boundingBoxes

refCnts = sort_contours(contourss, method='left-to-right')[0]
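
The zip/sorted/zip idiom in sort_contours is easier to follow on plain data. The sketch below applies the same idea to labelled boxes without OpenCV; sort_by_box and the box coordinates are made up for illustration.

```python
def sort_by_box(items, boxes, method="left-to-right"):
    """Sort items by the x or y coordinate of their (x, y, w, h) boxes."""
    reverse = method in ("right-to-left", "bottom-to-top")
    axis = 1 if method in ("top-to-bottom", "bottom-to-top") else 0
    pairs = sorted(zip(items, boxes), key=lambda pair: pair[1][axis], reverse=reverse)
    sorted_items, sorted_boxes = zip(*pairs)
    return list(sorted_items), list(sorted_boxes)


labels = ["c", "a", "b"]
boxes = [(90, 5, 20, 30), (10, 8, 20, 30), (50, 2, 20, 30)]
print(sort_by_box(labels, boxes)[0])
print(sort_by_box(labels, boxes, "top-to-bottom")[0])
```

Pairing each item with its box before sorting keeps the two sequences aligned, which is why the function can return both in the new order.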

With the contours now in the order 0-9, the next step is clear: iterate over the contours and attach the actual digit to each one, building a digit -> contour mapping.

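# Map each digit to its template image
digits2Cnt = {}
# Iterate over the sorted contours
for i, c in enumerate(refCnts):
    # Compute the bounding rectangle
    (x, y, w, h) = cv2.boundingRect(c)
    # Crop the digit; rows are indexed by y and columns by x
    roi = template[y:y+h, x:x+w]
    # Resize to a fixed size (width 57, height 88)
    roi = cv2.resize(roi, (57, 88))

    # Store the digit image under its value
    digits2Cnt[i] = roi

# Save the processed template
with open('digits2Cnt.pkl', 'wb') as f:
    pickle.dump(digits2Cnt, f)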

Two points are worth noting. First, for each contour we compute its bounding rectangle and use it to crop the digit out of the template image. Second, so that the digits can later be matched against those on the credit card, each crop is resized to a fixed size.

This completes the template processing and produces the digits2Cnt dictionary. Its keys are the digit values and its values are the resized digit images cropped from the template (not the contour objects themselves, despite the name).
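
If the template is processed in one script and the card in another, the saved dictionary has to be loaded back. The following is a minimal sketch of that round trip; it uses constant-valued arrays as a stand-in for the real digit images and writes to a temporary directory rather than the project folder.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical stand-in for digits2Cnt: digit value -> 88x57 uint8 image
digits = {d: np.full((88, 57), d, dtype=np.uint8) for d in range(10)}

path = os.path.join(tempfile.mkdtemp(), "digits2Cnt.pkl")
with open(path, "wb") as f:
    pickle.dump(digits, f)

with open(path, "rb") as f:
    loaded = pickle.load(f)

print(sorted(loaded.keys()), loaded[7].shape)
```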

4. Process Credit Cards and Match

The credit card part is a little more complicated, because we first have to locate the digit region on the card and then enhance that region with a few operations.

The first step is to read the image, resize it, and convert it to grayscale.

# Read the image
base_path = 'images'
file_name = 'credit_card_01.png'
credit_card = cv2.imread(os.path.join(base_path, file_name))
credit_card = resize(credit_card, width=300)  # resize is a helper that is not defined in this post
credit_gray = cv2.cvtColor(credit_card, cv2.COLOR_BGR2GRAY)

[Figure: the resized grayscale credit card image]

Next, apply a top-hat operation, which highlights regions that are brighter than their surroundings. (Its counterpart, the black-hat operation, highlights darker regions; it is not used here.)

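# Top-hat operation: highlights regions brighter than their surroundings

# Initialize the structuring elements (kernels)
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 3))  # custom kernel size: 9 wide, 3 high
sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

tophat = cv2.morphologyEx(credit_gray, cv2.MORPH_TOPHAT, rectKernel)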

[Figure: the result of the top-hat operation]

Next, edge detection is needed to highlight the edges of the objects. In the earlier notes on edge detection we covered the horizontal gradient, the vertical gradient, and the combination of the two, which often works well. Here, however, the gradient in the x direction alone turns out to be sufficient.

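# Gradient in the x direction; ksize=-1 selects the 3x3 Scharr kernel
gradX = cv2.Sobel(tophat, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=-1)
# gradX = cv2.convertScaleAbs(gradX)    this also picks up some background edges and adds noise

# So normalize manually to the range 0-255 instead
gradX = np.absolute(gradX)
(minVal, maxVal) = (np.min(gradX), np.max(gradX))
gradX = (255 * ((gradX-minVal) / (maxVal-minVal)))
gradX = gradX.astype('uint8')

# The usual approach (x gradient, then y gradient, then merge) also works here, but the result may be worse than x alone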

[Figure: the normalized x-gradient image]

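The manual normalization is a plain min-max rescale of the gradient magnitude. Here is a small sketch of it as a function, with a guard for a completely flat input that the code above does not have; normalize_to_uint8 is a hypothetical name.

```python
import numpy as np


def normalize_to_uint8(grad):
    """Scale |grad| linearly to the range 0-255 (hypothetical helper name)."""
    mag = np.absolute(grad).astype(np.float32)
    min_val, max_val = float(mag.min()), float(mag.max())
    if max_val == min_val:
        # Flat input: avoid dividing by zero and return all zeros
        return np.zeros(mag.shape, dtype=np.uint8)
    scaled = 255 * ((mag - min_val) / (max_val - min_val))
    return scaled.astype(np.uint8)  # fractional parts are truncated


grad = np.array([[-4.0, 0.0], [2.0, 1.0]], dtype=np.float32)
print(normalize_to_uint8(grad).tolist())
```
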
At this point the edges can be found, but to connect neighboring digits into blocks we need morphological operations.

# Closing: dilation followed by erosion; the dilation joins neighboring digits into one block
gradX = cv2.morphologyEx(gradX, cv2.MORPH_CLOSE, rectKernel)

[Figure: the result after the closing operation]

Most of the digits are now connected into blocks, but some blocks still contain black holes and the blocks do not stand out clearly. So we convert the image to a binary one to highlight the objects, then apply another closing operation to fill the holes.

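# THRESH_OTSU picks a suitable threshold automatically (it suits bimodal histograms); the threshold argument is ignored, so 0 is passed
thresh = cv2.threshold(gradX, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
cv_show('thresh', thresh)  # cv_show is a display helper that is not defined in this post
# Apply another closing operation
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, sqKernel)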

[Figure: the binary image after Otsu thresholding and the second closing operation]

Next, the contours are easy to find with contour detection, but to keep only the contours of the digit groups we still need to filter by aspect ratio.

threshCnts, hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = threshCnts
cur_img = credit_card.copy()

# Draw the contours
cv2.drawContours(cur_img, cnts, -1, (0, 0, 255), 3)
cv_show('img', cur_img)

[Figure: the contours found by the algorithm, drawn on the card]

Next, iterate over the contours and keep the four digit groups in the middle:

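# Find the four large contours that enclose the digit groups
locs = []
# Iterate over the contours
for i, c in enumerate(cnts):
    # Compute the bounding rectangle
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)

    # Keep regions of a suitable shape; each group here holds four digits
    if ar > 2.5 and ar < 4.0:
        if (w > 40 and w < 55) and (h > 10 and h < 20):
            # Match
            locs.append((x, y, w, h))

# Sort the groups from left to right
locs = sorted(locs, key=lambda x: x[0])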

As before, each object is first wrapped in its bounding rectangle and then filtered by shape. This yields the four large contours.
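
The filter can be pulled out into a small predicate, which makes the thresholds easier to adjust for other card layouts. The sketch below uses the same limits as the code above on a few made-up boxes; is_digit_group is a hypothetical name, and the pixel limits only make sense for a card resized to a width of 300.

```python
def is_digit_group(box, ar_range=(2.5, 4.0), w_range=(40, 55), h_range=(10, 20)):
    """Check a bounding box (x, y, w, h) against the thresholds used above."""
    (_, _, w, h) = box
    ar = w / float(h)
    return (ar_range[0] < ar < ar_range[1]
            and w_range[0] < w < w_range[1]
            and h_range[0] < h < h_range[1])


# Hypothetical boxes: two digit groups, a wide strip, and a small blob
boxes = [(120, 100, 48, 15), (30, 100, 50, 16), (10, 20, 200, 15), (60, 150, 12, 12)]
locs = sorted((b for b in boxes if is_digit_group(b)), key=lambda b: b[0])
print(locs)
```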

The remaining steps are straightforward:

  1. Iterate over each large contour

    1. For each contour, extract the individual digits in the same way as for the template
    2. For each digit, perform template matching
    outputs = []

    # Iterate over the digit groups
    for (i, (gX, gY, gW, gH)) in enumerate(locs):
        # Digits recognized in this group
        groupOutput = []

        # Crop the group from the grayscale image, with a 5-pixel margin on each side
        group = credit_gray[gY-5:gY+gH+5, gX-5:gX+gW+5]

        # Preprocess the group:
        # binarize with an automatically chosen threshold so that the digits stand out
        group = cv2.threshold(group, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

        # Find the contours of the digits in the group
        digitCnts, hierarchy = cv2.findContours(group.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        digitCnts = sort_contours(digitCnts, method='left-to-right')[0]

        # Take each digit in the group and match it against the templates
        for c in digitCnts:
            # Crop the current digit and resize it to the template size
            (x, y, w, h) = cv2.boundingRect(c)
            roi = group[y:y+h, x:x+w]
            roi = cv2.resize(roi, (57, 88))

            # Template matching: one score per template digit
            scores = []
            for (digit, digitROI) in digits2Cnt.items():
                result = cv2.matchTemplate(roi, digitROI, cv2.TM_CCOEFF)
                (_, score, _, _) = cv2.minMaxLoc(result)
                scores.append(score)

            # The best-scoring template gives the digit;
            # groupOutput collects the recognized digits of this group
            groupOutput.append(str(np.argmax(scores)))

        # Draw the group box and the recognized digits on the card
        cv2.rectangle(credit_card, (gX - 5, gY - 5), (gX + gW + 5, gY + gH + 5), (0, 0, 255), 1)
        cv2.putText(credit_card, "".join(groupOutput), (gX, gY - 15), cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 0, 255), 2)

        # Append to the final result
        outputs.extend(groupOutput)
    
  2. Output the result

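    # Print the result
    # FIRST_NUMBER maps the first digit to a card type; it is not defined in this post
    print("Credit Card Type: {}".format(FIRST_NUMBER[outputs[0]]))
    print("Credit Card #: {}".format("".join(outputs)))
    cv2.imshow("Image", credit_card)
    cv2.waitKey(0)  # keep the window open until a key is pressed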
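
The output step looks up FIRST_NUMBER, which is not defined anywhere in this post. The sketch below assumes it is a dictionary from the first digit of the card number to the card network, using the commonly cited leading digits; the exact contents in the project may differ, and the card number is made up. The card_type wrapper is a hypothetical addition that avoids a KeyError for an unknown or empty result.

```python
# Assumed mapping from the first digit of a card number to the card network
FIRST_NUMBER = {
    "3": "American Express",
    "4": "Visa",
    "5": "MasterCard",
    "6": "Discover Card",
}


def card_type(digits):
    """Look up the card type from a list of digit strings; 'Unknown' if absent."""
    if not digits:
        return "Unknown"
    return FIRST_NUMBER.get(digits[0], "Unknown")


outputs = list("4000123456789010")  # made-up number for illustration
print("Credit Card Type: {}".format(card_type(outputs)))
print("Credit Card #: {}".format("".join(outputs)))
```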
    

5. Summary

That wraps up the project. Overall it is fairly simple, but many of the techniques involved are commonly used. To summarize:

  1. Reading an image -> converting to grayscale -> binarizing
  2. Contour detection (cv2.findContours)
  3. Basic morphological operations (top hat, black hat, opening and closing, dilation and erosion)
  4. Edge detection (Sobel operator, Scharr operator, etc.)
  5. Contour sorting; note that the contours found may be returned out of order
  6. Drawing a bounding rectangle and then cropping out a specific object

None of this involves complicated logic. It is all basic OpenCV functions and basic Python operations, so it can serve as a small entry-level project for image processing.

The code for this project is at https://github.com/zhongqiangwu960812/OpenCVLearning if you are interested in trying it out.

Origin blog.csdn.net/wuzhongqiang/article/details/123796571