OpenCV in practice - digit recognition


Foreword

After studying python-opencv for a while, I have picked up the basic uses of OpenCV in image processing, and the best way to consolidate what I have learned is to apply it. I came across a bank card number recognition project built with OpenCV on the Internet, but its explanation was too brief, so here I follow the same basic idea and implement digit recognition step by step. Because I have no way around the network restrictions, the complete code is hosted on gitee.

1. Recognition principle

My earlier python-opencv study covered template matching, and bank card number recognition is built on the same idea. I think of the whole process as feature extraction followed by feature matching. First, a digit template image is read in and given some basic image processing to obtain the contour of each digit; the contours are then sorted and stored, one per digit, to form the template.
Next, the sample picture is read in and run through a series of image-processing steps to obtain the contour of each group of digits. Each group is then split into individual digit contours, and every digit is compared with each digit in the template; the template digit with the highest score is taken as the recognized value. The details are in the code implementation.
Since I couldn't find a suitable sample picture, I made one myself.
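
To make the "highest score wins" step concrete, here is a minimal, dependency-free sketch. The 3x5 glyphs, the `TEMPLATES` table, and the helper names are all made up for illustration; the score is a mean-subtracted correlation in the spirit of `cv2.TM_CCOEFF` for an image and a template of equal size, not OpenCV's actual implementation.

```python
# Tiny 3x5 binary glyphs standing in for the resized template digits
# (hypothetical data, only for illustration).
TEMPLATES = {
    0: ["111", "101", "101", "101", "111"],
    1: ["010", "110", "010", "010", "111"],
    7: ["111", "001", "010", "010", "010"],
}


def to_pixels(rows):
    """Flatten a list of '0'/'1' strings into a list of 0/255 pixel values."""
    return [255 if ch == "1" else 0 for row in rows for ch in row]


def ccoeff_score(image, template):
    """Mean-subtracted correlation of two equally sized pixel lists."""
    mean_i = sum(image) / len(image)
    mean_t = sum(template) / len(template)
    return sum((i - mean_i) * (t - mean_t) for i, t in zip(image, template))


def classify(rows):
    """Return the key of the template with the highest score."""
    pixels = to_pixels(rows)
    scores = {d: ccoeff_score(pixels, to_pixels(t)) for d, t in TEMPLATES.items()}
    return max(scores, key=scores.get)


# A "7" with one pixel flipped still matches the 7 template best.
noisy_seven = ["111", "001", "010", "010", "110"]
print(classify(noisy_seven))  # prints 7
```

The real script does the same thing at full resolution: it collects one score per template digit and takes the index of the largest.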

2. Code implementation

1. Making the template

Making the template is relatively simple: the digits in the template image are already arranged in order, and the image is already effectively grayscale, so little image processing is needed.

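# Import the required packages
from imutils import contours
import numpy as np
import argparse
import cv2

# Command-line arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-t", "--template", required=True,
    help="path to template OCR-A image")
args = vars(ap.parse_args())

# Helper: show an image and wait for a key press
def cv_show(name, img):
    cv2.imshow(name, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Read the template image
img = cv2.imread(args["template"])
#cv_show('img',img)
# Convert to grayscale
ref = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
#cv_show('ref',ref)
# Binarize (inverted, so the dark digits become white)
ref = cv2.threshold(ref, 10, 255, cv2.THRESH_BINARY_INV)[1]
#cv_show('ref',ref)
# Find contours
# cv2.findContours() expects a binary (black-and-white) image, not a grayscale one.
# cv2.RETR_EXTERNAL retrieves only the outer contours; cv2.CHAIN_APPROX_SIMPLE
# keeps only the end points of each contour segment.
# Each element of the returned list is one contour in the image.
# (The two-value return assumes OpenCV 4.x; OpenCV 3.x returns three values.)
refCnts, hierarchy = cv2.findContours(ref.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Draw the outer contours
cv2.drawContours(img, refCnts, -1, (0, 0, 255), 3)
#cv_show('img',img)
#print (np.array(refCnts).shape)
# Sort the contours from left to right. The original called sort_contours from a
# local myutils module that is not shown; imutils.contours.sort_contours, which the
# recognition code also uses, is substituted here on the assumption that it behaves the same.
refCnts = contours.sort_contours(refCnts, method="left-to-right")[0]
digits = {}
# Loop over the contours
for (i, c) in enumerate(refCnts):
    # Compute the bounding rectangle and resize the ROI to a fixed size
    (x, y, w, h) = cv2.boundingRect(c)
    roi = ref[y:y + h, x:x + w]
    roi = cv2.resize(roi, (57, 88))
    #cv_show('image',roi)
    # Map each digit to its template
    digits[i] = roi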
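
The contours are sorted before they are stored because contour detection makes no promise about left-to-right order, while the dictionary key is later treated as the digit value. Here is a small sketch of that indexing step, using made-up bounding boxes in place of real contours (the boxes and the `index_by_position` name are assumptions for illustration):

```python
def index_by_position(boxes):
    """Sort (x, y, w, h) boxes left to right and key them by position."""
    ordered = sorted(boxes, key=lambda box: box[0])
    return {i: box for i, box in enumerate(ordered)}


# Hypothetical bounding boxes as a detector might return them, in arbitrary order.
detected = [(130, 5, 20, 30), (10, 5, 20, 30), (70, 5, 20, 30)]
digit_boxes = index_by_position(detected)
print(digit_boxes[0])  # the leftmost box: (10, 5, 20, 30)
```

If the template digits run from 0 upward, left to right, key `i` then holds the image of digit `i`.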

2. Recognizing the sample

The sample picture shows that the groups of digits are not laid out regularly; they are scattered around the image. The image therefore has to be processed first: convert it to grayscale, then binarize it, after which only the digit strings remain in the image.
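
As a rough illustration of these two steps on single pixels, the sketch below uses the standard luma weights (0.299 R + 0.587 G + 0.114 B) for the grayscale conversion and the inverse-binary rule for thresholding. The function names are invented for this example, and the rounding is a simplification of what OpenCV does internally.

```python
def bgr_to_gray(pixel):
    """Weighted sum of the blue, green and red channels (BGR order, as in OpenCV)."""
    b, g, r = pixel
    return int(round(0.114 * b + 0.587 * g + 0.299 * r))


def threshold_binary_inv(gray, thresh, maxval=255):
    """Inverse binary threshold: values above thresh become 0, the rest maxval."""
    return 0 if gray > thresh else maxval


# Pure black, dark gray and white pixels.
pixels = [(0, 0, 0), (40, 40, 40), (255, 255, 255)]
binary = [threshold_binary_inv(bgr_to_gray(p), thresh=0) for p in pixels]
print(binary)  # [255, 0, 0]
```

With a threshold of 0, as used for the sample, only pure-black pixels end up white; a higher threshold, such as the 10 used for the template, would also keep near-black pixels.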

Next, the Sobel operator is applied to pick out the edges of the digit strings, followed by one or two closing operations (dilation first, then erosion). The purpose of closing is to join the digits of a group into a single connected region, so that the whole group can be framed later. The size of the convolution kernel defined at the start affects the result: kernels of different sizes bridge gaps of different widths. The remaining steps are similar to making the template: frame each group of digits, split it into individual digits, and compare each digit with the templates one by one; the template with the highest score gives the digit's value.
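
The effect of the kernel size can be seen in one dimension. The sketch below is plain Python, not OpenCV: `dilate`, `erode`, `close_row`, and `count_runs` are hypothetical helpers that work on a single row of 0/1 pixels, with windows simply truncated at the borders.

```python
def dilate(row, k):
    """Each pixel becomes the maximum of its k-wide neighbourhood."""
    r = k // 2
    return [max(row[max(0, i - r):i + r + 1]) for i in range(len(row))]


def erode(row, k):
    """Each pixel becomes the minimum of its k-wide neighbourhood."""
    r = k // 2
    return [min(row[max(0, i - r):i + r + 1]) for i in range(len(row))]


def close_row(row, k):
    """Closing: dilation followed by erosion with the same kernel width."""
    return erode(dilate(row, k), k)


def count_runs(row):
    """Count maximal runs of consecutive 1s."""
    return sum(1 for i, v in enumerate(row) if v == 1 and (i == 0 or row[i - 1] == 0))


# Two strokes one pixel apart, then a third stroke five pixels further on.
row = [0] * 4 + [1, 1, 0, 1, 1] + [0] * 5 + [1, 1] + [0] * 4
print(count_runs(row))                # 3 separate strokes
print(count_runs(close_row(row, 3)))  # 2: the 1-pixel gap is bridged
print(count_runs(close_row(row, 7)))  # 1: the 5-pixel gap is bridged too
```

A kernel that is too small leaves the digits of one group separate, while one that is too large may merge neighbouring groups, so the size has to be tuned to the spacing in the actual image.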

# Initialize the structuring elements (kernels) for the closing operations
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))
sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))

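# Read the input image and preprocess it
image = cv2.imread(args["image"])
#cv_show('image',image)
#image = myutils.resize(image, width=300)  # myutils is the author's local helper module
# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
#cv_show('gray',gray)
# Binarize (inverted). With a threshold of 0, only pure-black pixels become white.
# Despite its name, tophat holds this binary image; no top-hat transform is applied.
tophat = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV)[1]
#cv_show('tophat',tophat)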
# Compute the horizontal gradient to bring out the digit edges
# (ksize=-1 selects the 3x3 Scharr kernel)
gradX = cv2.Sobel(tophat, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=-1)
gradX = np.absolute(gradX)
# Min-max normalize to the range 0..255 and convert to uint8
(minVal, maxVal) = (np.min(gradX), np.max(gradX))
gradX = (255 * ((gradX - minVal) / (maxVal - minVal)))
gradX = gradX.astype("uint8")

print(gradX.shape)
#cv_show('gradX',gradX)
# Closing (dilation followed by erosion) joins the digits of a group together
gradX = cv2.morphologyEx(gradX, cv2.MORPH_CLOSE, rectKernel)
#cv_show('gradX',gradX)
# Apply a second closing operation
thresh = cv2.morphologyEx(gradX, cv2.MORPH_CLOSE, sqKernel)
#cv_show('thresh',thresh)
# Find the contours of the joined regions
threshCnts, hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
# Draw the contours
cnts = threshCnts
cur_img = image.copy()
cv2.drawContours(cur_img, cnts, -1, (0, 0, 255), 3)
#cv_show('img',cur_img)
locs = []
# Loop over the contours
for (i, c) in enumerate(cnts):
    # Compute the bounding rectangle
    (x, y, w, h) = cv2.boundingRect(c)
    # Keep only regions of a suitable size; tune these limits for the task at hand.
    # Here the digits mostly come in groups of four.
    if (w > 40) and (h > 10):
        # Keep the regions that qualify
        locs.append((x, y, w, h))

# Sort the retained regions from left to right
locs = sorted(locs, key=lambda x:x[0])
output = []
# Loop over each group of digits
for (i, (gX, gY, gW, gH)) in enumerate(locs):
    # Initialize the list of digits for this group
    groupOutput = []
    # Extract the group from the binary image using its coordinates
    group = tophat[gY:gY + gH, gX:gX + gW]
    #cv_show('group',group)
    # Preprocess: re-binarize with Otsu's method
    group = cv2.threshold(group, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    #cv_show('group',group)
    # Find the contours of the digits in this group
    digitCnts, hierarchy = cv2.findContours(group.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    digitCnts = contours.sort_contours(digitCnts,
        method="left-to-right")[0]

    # Recognize each digit in the group
    for c in digitCnts:
        # Take the bounding box of the current digit and resize it to a fixed size
        # (note: the templates were resized to (57, 88), slightly larger than this ROI)
        (x, y, w, h) = cv2.boundingRect(c)
        roi = group[y:y + h, x:x + w]
        roi = cv2.resize(roi, (55, 87))
        #cv_show('roi',roi)

        # Matching scores for this digit
        scores = []

        # Score the digit against every template
        for (digit, digitROI) in digits.items():
            # Template matching
            result = cv2.matchTemplate(roi, digitROI,
                cv2.TM_CCOEFF)
            (_, score, _, _) = cv2.minMaxLoc(result)
            scores.append(score)

        # The template with the highest score gives the digit
        groupOutput.append(str(np.argmax(scores)))

    # Draw the group box and the recognized digits
    cv2.rectangle(image, (gX - 5, gY - 5),
        (gX + gW + 5, gY + gH + 5), (0, 0, 255), 1)
    cv2.putText(image, "".join(groupOutput), (gX, gY - 15),
        cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 0, 255), 2)

    # Collect the results
    output.extend(groupOutput)

# Print and show the result
print("Recognized digits: {}".format("".join(output)))
cv2.imshow("Image", image)
cv2.waitKey(0)
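
One detail worth noting: the gradient normalization in the code above divides by `maxVal - minVal`, which is zero for a completely flat image. Here is a small stand-alone sketch of the same min-max scaling with a guard for that case; it works on a plain list rather than a NumPy array, and `min_max_to_uint8` is a name invented for this example.

```python
def min_max_to_uint8(values):
    """Scale absolute values linearly to 0..255, truncating to integers."""
    magnitudes = [abs(v) for v in values]
    lo, hi = min(magnitudes), max(magnitudes)
    if hi == lo:
        # Flat input: no edges at all, so return zeros instead of dividing by zero.
        return [0] * len(magnitudes)
    return [int(255 * (m - lo) / (hi - lo)) for m in magnitudes]


print(min_max_to_uint8([-4.0, 0.0, 2.0]))  # [255, 0, 127]
print(min_max_to_uint8([3.0, 3.0]))        # [0, 0]
```

Taking the absolute value first means that both dark-to-light and light-to-dark edges count as strong edges.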

Result:
Each recognized group of digits is boxed in red on the sample picture, with the recognized digits drawn above the box.

Summary

During debugging, I found that the digits 8 and 6 were often recognized as 0. The likely reason is that the loops in 8 and 6 match the template for 0 and produce a higher score. The fix I considered was to add a horizontal stroke to the 0 in the template; I then noticed that the sample contains no 0 at all, so I simply did not add it.

Origin blog.csdn.net/Thousand_drive/article/details/124754695