Project: Answer sheet recognition
GitHub address
The solution proceeds in the following steps.
Preprocessing
First run Canny edge detection on the image, then apply a dilation operation. If the paper's outer contour is not very distinct, Canny edge detection may leave it discontinuous, with small holes; the dilation operation fills those holes so the contour becomes continuous.
The results of the processing are as follows:
Contour detection
Extract the contour with the largest area, MaxContour, and apply adaptive contour approximation, starting from epsilon = 0.0001 * perimeter.
The specific code is as follows:
# step is set to 0.0001 × the perimeter; a common default is epsilon = 0.001 × perimeter
step = 0.0001 * cv2.arcLength(cnts[0], True)
epsilon = step
cnt = cv2.approxPolyDP(cnts[0], epsilon, True)
# keep increasing epsilon until the approximated contour has exactly four points
while len(cnt) != 4:
    epsilon += step
    cnt = cv2.approxPolyDP(cnts[0], epsilon, True)
The processing results are as follows:
Perspective transformation
Before the perspective transformation, some further preprocessing is required: the four contour points are sorted into the order top-left, top-right, bottom-right, bottom-left. The sorting code is as follows:
# sort the four contour points
pts = np.zeros((4, 2), np.float32)
res = np.sum(points, axis=1)
pts[0] = points[np.argmin(res)]  # top-left: smallest x + y
pts[2] = points[np.argmax(res)]  # bottom-right: largest x + y
res = np.diff(points, axis=1)
pts[1] = points[np.argmin(res)]  # top-right: smallest y - x
pts[3] = points[np.argmax(res)]  # bottom-left: largest y - x
Then find the maximum width and maximum height, the specific code is as follows:
# compute the side lengths
w1 = np.sqrt((pts[0][0] - pts[1][0]) ** 2 + (pts[0][1] - pts[1][1]) ** 2)
w2 = np.sqrt((pts[2][0] - pts[3][0]) ** 2 + (pts[2][1] - pts[3][1]) ** 2)
w = int(max(w1, w2))
h1 = np.sqrt((pts[1][0] - pts[2][0]) ** 2 + (pts[1][1] - pts[2][1]) ** 2)
h2 = np.sqrt((pts[0][0] - pts[3][0]) ** 2 + (pts[0][1] - pts[3][1]) ** 2)
h = int(max(h1, h2))
With all the preprocessing done, we can move to the last and most important step: the perspective transformation. The code is as follows:
# the four target points
dst = np.array([
    [0, 0],
    [w - 1, 0],
    [w - 1, h - 1],
    [0, h - 1]
], np.float32)
# perspective transformation
mat = cv2.getPerspectiveTransform(pts, dst)
paper1 = org1.copy()
paper1 = cv2.warpPerspective(paper1, mat, (w, h))
if show_process:
    imshow(paper1)
The results are as follows:
Preprocessing
After obtaining the perspective-transformed image, further preprocessing is needed. First, to eliminate the influence of different exposure levels across images, apply adaptive histogram equalization to the image.
The processing results are as follows:
The image is then binarized for contour detection. The binarized image has a problem, however: if an answer bubble was not fully filled in, detection may be inaccurate. To make detection more reliable, a morphological closing operation is also applied. The results after processing are as follows:
Contour detection + contour filtering
First extract all contours, and the results are as follows:
You can see that a lot of contours have been extracted, many of which are contours that we don't need, so we need to use some filtering algorithms to keep the contours (25 ellipses) we need.
The filtering algorithm steps here are as follows:
- First obtain the circumscribed figure of each contour; for a circle, use the contour's minimum enclosing circle
- Then filter by area: the contour passes when the ratio of the contour area to the circumscribed figure's area satisfies 0.8 < ratio < 1.2
- Then filter by perimeter: the contour passes when the ratio of the contour perimeter to the circumscribed figure's perimeter satisfies 0.8 < ratio < 1.2
The specific code is more complicated, as follows:
# contours kept after filtering
cntsex = []
# lower and upper ratio thresholds
thresh_lower = 0.8
thresh_upper = 1.2
eps = 1e-6
show = org1.copy()
for cnt in cnts:
    cntcopy = cnt.copy()
    # sort the contour points by their y coordinate to find the largest y
    cntcopy = sorted(cntcopy, key=lambda x: x[0][1], reverse=True)
    maxy = cntcopy[0][0][1]
    # sort the contour points by their x coordinate to find the largest x
    cntcopy = sorted(cntcopy, key=lambda x: x[0][0], reverse=True)
    maxx = cntcopy[0][0][0]
    # get the center of the ellipse
    (x, y), radius = cv2.minEnclosingCircle(cnt)
    center = (int(x), int(y))
    radius = int(radius)
    # semi-major and semi-minor axes of the ellipse
    a = maxx - x
    b = maxy - y
    if b == 0:
        continue
    ratio = a / b
    if ratio > 2 or ratio < 0.5:
        continue
    if radius == 0:
        continue
    # filter by area
    areaex = np.pi * a * b
    area = cv2.contourArea(cnt)
    ratio = area / areaex
    if thresh_lower < ratio < thresh_upper:
        cntsex.append(cnt)
        show = cv2.drawContours(show, [cnt], 0, (0, 255, 0), 1)
        show = cv2.ellipse(show, center, (int(a), int(b)), 0, 0, 360, (0, 0, 255), 1)
After this pass we have all the contours that look like ellipses, but this is not enough, because some ellipses used for binding have also been kept. Observe that these binding ellipses are much smaller in area than the answer bubbles, so we sort all the contours by area (key = contour area) and then filter out the smaller ones. The code is as follows:
# second filtering pass
cnts = []
maxarea = -1e6
for cnt in cntsex:
    area = cv2.contourArea(cnt)
    if area > maxarea:
        maxarea = area
maxgap = 0.5 * maxarea
# sort contours by area, largest first
cntsex = sorted(cntsex, key=lambda x: cv2.contourArea(x), reverse=True)
prvarea = cv2.contourArea(cntsex[0])
cnts.append(cntsex[0])
for i in range(1, len(cntsex)):
    # stop at the first contour whose area drops by more than half the maximum
    if abs(prvarea - cv2.contourArea(cntsex[i])) > maxgap:
        break
    cnts.append(cntsex[i])
The final processing result is as follows:
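The effect of this gap-based cutoff can be seen on a toy list of areas. The numbers below are hypothetical: five bubble-sized areas around 500 px² plus two much smaller binding marks.

```python
# Toy contour areas: answer bubbles (~500) plus small binding marks.
areas = sorted([500, 498, 502, 495, 510, 60, 55], reverse=True)

# Keep contours until the area drops by more than half the maximum.
maxgap = 0.5 * max(areas)
kept = [areas[0]]
for a in areas[1:]:
    if abs(areas[0] - a) > maxgap:
        break
    kept.append(a)
```

Only the five bubble-sized areas survive; the first small binding mark triggers the break.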
Sorting + checking
The contours then need to be sorted from top to bottom and left to right. This program checks the answers while sorting. The code is as follows:
# sort the contours from top to bottom (by the first point's y coordinate)
cnts = sorted(cnts, key=lambda x: x[0][0][1])
rows = int(len(cnts) / 5)
TAB = ['A', 'B', 'C', 'D', 'E']
ANS = []
# check the answer for each row (i.e. each question)
for i in range(rows):
    subcnts = cnts[i*5:(i+1)*5]
    # sort the five bubbles in this row from left to right
    subcnts = sorted(subcnts, key=lambda x: x[0][0][0])
    total = []
    for (j, cnt) in enumerate(subcnts):
        mask = np.zeros(paper1.shape, dtype=np.uint8)
        cv2.drawContours(mask, [cnt], -1, 255, -1)  # thickness -1 fills the contour
        mask = cv2.bitwise_and(paper1, paper1, mask=mask)
        total.append(cv2.countNonZero(mask))
    # the bubble with the most non-zero pixels is the marked answer
    idx = np.argmax(np.array(total))
    ANS.append(TAB[idx])
print(ANS)
The processing results are as follows: