HALCON study notes, OCR series: ring-shaped characters and italics

This article introduces some images I have come across in which the OCR text is hard to extract, and uses them to present a few special processing methods.
The first type: difference of Gaussians, diff_of_gauss (an approximation of the Laplacian of Gaussian)
The original image is shown below; ordinary methods essentially cannot extract the characters from it.
[Image: original image]
The difference-of-Gaussians operator gives a good result directly. The code and the resulting image are as follows:

read_image (Image, 'C:/Users/Administrator/Desktop/3.bmp')
rgb1_to_gray (Image, GrayImage)
* Difference of Gaussians
diff_of_gauss (GrayImage, DiffOfGauss, 3, 1.6)
threshold (DiffOfGauss, Regions, 2, 12)

[Image: result of diff_of_gauss followed by threshold]
This result can be processed directly with the OCR program from my previous article to obtain the characters, so I will not repeat that code here.
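To make the idea behind the difference of Gaussians concrete, here is a minimal Python sketch on a 1D gray-value profile. It is only an illustration, not HALCON's implementation: the helper names are hypothetical, and it assumes the two smoothing widths are simply `sigma` and `sigma * sig_factor`, which may differ from how diff_of_gauss derives them internally.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve a 1D signal with a Gaussian, replicating the border values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def diff_of_gauss_1d(signal, sigma, sig_factor):
    """Narrow Gaussian minus wide Gaussian (assumed widths, see lead-in)."""
    narrow = smooth(signal, sigma)
    wide = smooth(signal, sigma * sig_factor)
    return [a - b for a, b in zip(narrow, wide)]

# A dark-to-bright step edge between index 29 and index 30.
profile = [10.0] * 30 + [50.0] * 30
response = diff_of_gauss_1d(profile, 3.0, 1.6)
```

In flat areas both smoothed signals agree, so the response is close to zero; next to the step it is negative on the dark side and positive on the bright side. That is why a plain threshold on the filtered image can separate strokes from a background that defeats a threshold on the raw gray values.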

The second type: ring-shaped characters. The main idea is to straighten the ring through a polar coordinate transformation; after that, it is ordinary character reading.
Original image:
[Image: original image with ring-shaped characters]

read_image (Image, 'C:/Users/Administrator/Desktop/环形字符.png')
rgb1_to_gray (Image, GrayImage)
get_image_size (GrayImage, Width, Height)
emphasize (GrayImage, ImageEmphasize, Width, Height, 1)
threshold (ImageEmphasize, Regions, 0, 21)
connection (Regions, ConnectedRegions)
select_shape_std (ConnectedRegions, SelectedRegions, 'max_area', 70)
fill_up (SelectedRegions, RegionFillUp)
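* Dilate the region twice, with two different radii
dilation_circle (RegionFillUp, RegionDilation, 20)
dilation_circle (RegionFillUp, RegionDilation1, 70)
* Get the center and radius of the smallest enclosing circle of each dilated region, in preparation for the polar transformation
smallest_circle (RegionDilation, InnerRow, InnerCol, InnerRadius)
smallest_circle (RegionDilation1, OuterRow, OuterCol, OuterRadius)
* The difference of the two dilated regions is the ring that contains the characters
difference (RegionDilation1, RegionDilation, RegionDifference)
reduce_domain (ImageEmphasize, RegionDifference, ImageReduced)
* This is the polar transformation operator: it straightens the ring so that OCR can read it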
polar_trans_image_ext (ImageReduced, PolarTransImage, OuterRow, OuterCol, rad(-30), rad(-120), InnerRadius+16, OuterRadius, Width, Height/8, 'nearest_neighbor')
mirror_image (PolarTransImage, ImageMirror, 'row')
gray_range_rect (ImageMirror, ImageResult, 7, 7)
binary_threshold (ImageResult, Region, 'max_separability', 'light', UsedThreshold)
connection (Region, ConnectedRegions1)
select_shape (ConnectedRegions1, SelectedRegions1, ['height','area'], 'and', [54.14,912.03], [100,10000])
partition_rectangle (SelectedRegions1, Partitioned, 40, 70)
sort_region (Partitioned, SortedRegions, 'character', 'true', 'row')
invert_image (ImageResult, ImageInvert)
read_ocr_class_mlp ('Industrial_0-9A-Z_NoRej.omc', OCRHandle)
do_ocr_multi_class_mlp (SortedRegions, ImageInvert, OCRHandle, Class, Confidence)
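dev_display (Image)
* Use the active graphics window instead of a hardcoded window handle
dev_get_window (WindowHandle)
set_tposition (WindowHandle, 61, 63)
write_string (WindowHandle, Class)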

The ring region that contains the characters:
[Image: ring region containing the characters]
The ring straightened by the polar transformation:
[Image: straightened ring]
Final result:
[Image: final OCR result]
The ring that contains the characters can generally be obtained directly with two morphological operations and one difference; this routine is, in general, a way of finding the edges of the ring. The key step in reading OCR from a ring is the polar_trans_image_ext operator; everything else is ordinary preprocessing.
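The straightening step can be sketched in plain Python as follows. This is an illustration of the general polar unwrapping idea with nearest-neighbor sampling, not HALCON's code: the function name is hypothetical, and the convention that a point at angle `a` and radius `r` lies at `row = center_row - r * sin(a)`, `col = center_col + r * cos(a)` is an assumption.

```python
import math

def polar_unwrap(image, center_row, center_col, angle_start, angle_end,
                 radius_start, radius_end, width, height):
    """Unwrap an annular sector into a width x height rectangle.

    Output columns run from angle_start to angle_end, output rows from
    radius_start to radius_end. Sampling is nearest neighbor; positions
    outside the source image stay 0.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        t_r = r / (height - 1) if height > 1 else 0.0
        radius = radius_start + t_r * (radius_end - radius_start)
        for c in range(width):
            t_a = c / (width - 1) if width > 1 else 0.0
            angle = angle_start + t_a * (angle_end - angle_start)
            # Assumed convention: rows grow downward, angles counterclockwise.
            src_row = int(round(center_row - radius * math.sin(angle)))
            src_col = int(round(center_col + radius * math.cos(angle)))
            if 0 <= src_row < rows and 0 <= src_col < cols:
                out[r][c] = image[src_row][src_col]
    return out

# Synthetic test image: each pixel stores its rounded distance to the center,
# so the image consists of concentric rings.
size, center = 41, 20
rings = [[int(round(math.hypot(r - center, c - center))) for c in range(size)]
         for r in range(size)]
strip = polar_unwrap(rings, center, center, 0.0, 2 * math.pi, 5, 15, 36, 11)
```

Each row of `strip` corresponds to one radius, so the concentric rings of the test image become nearly constant horizontal lines. Text printed along a ring is laid out along a straight line in the same way. Passing a start angle larger than the end angle, as the HALCON call above does, walks around the ring in the opposite direction.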

The third type: italic (slanted) characters
Original image:
[Image: original image with italic characters]
The code is as follows:

read_image (Image, 'C:/Users/Administrator/Desktop/斜体字练习.png')
dev_close_window()
get_image_size (Image, Width, Height)
dev_open_window (0, 0, Width, Height, 'black', WindowHandle)
dev_display (Image)
fast_threshold (Image, Region, 0, 128, 20)
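* Get the slant angle of the text region
text_line_slant (Region, Image, 45, -0.523599, 0.523599, SlantAngle)
* Create an identity matrix
hom_mat2d_identity (HomMat2DIdentity)
* Build the matrix that straightens the text. SlantAngle is only how far the text was found to be slanted,
* so to straighten it we correct that angle by slanting in the opposite direction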
hom_mat2d_slant (HomMat2DIdentity, -SlantAngle, 'x', 0, 0, HomMat2DSlant)
affine_trans_image (Image, ImageAffinTrans, HomMat2DSlant, 'nearest_neighbor', 'false')
fast_threshold (ImageAffinTrans, Region1, 0, 90, 20)
connection (Region1, ConnectedRegions1)

read_ocr_class_mlp ('DotPrint_0-9A-Z.omc', OCRHandle)

dilation_rectangle1 (Region1, RegionDilation, 2, 5)
connection (RegionDilation, ConnectedRegions)
partition_rectangle (ConnectedRegions, Partitioned, 32, 45)
intersection (Partitioned, ConnectedRegions1, RegionIntersection)
sort_region (RegionIntersection, SortedRegions, 'character', 'true', 'row')
do_ocr_multi_class_mlp (SortedRegions, ImageAffinTrans, OCRHandle, Class, Confidence)

Corrected image:
[Image: image after slant correction]
Processing result:
[Image: OCR result]
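The reason for passing -SlantAngle can be illustrated with a small Python sketch. It uses a generic horizontal shear on (row, col) points and is not the exact matrix that hom_mat2d_slant builds; the function name, the baseline parameter, and the sign convention (positive angle leans the top of a character to the right) are assumptions made for this example.

```python
import math

def slant_points(points, angle, baseline_row):
    """Shear (row, col) points horizontally around a baseline row.

    For angle > 0, points above the baseline (smaller row) move to the
    right; points on the baseline do not move.
    """
    shear = math.tan(angle)
    return [(row, col + shear * (baseline_row - row)) for row, col in points]

# A vertical stroke: constant column, rows 0..10, baseline at row 10.
stroke = [(row, 5.0) for row in range(11)]

# Simulate italic text by slanting the stroke by 15 degrees ...
italic = slant_points(stroke, math.radians(15), 10)

# ... and correct it by applying the opposite angle.
upright = slant_points(italic, -math.radians(15), 10)
```

After the second call every point of `upright` is back on column 5, up to floating-point rounding. The measured slant angle describes how the text leans, so the correction has to apply the same amount in the opposite direction.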

The above are three common cases that need special processing before OCR can read the text normally. There are also rarer cases, for example characters engraved in a material of the same color as the body, where the text is recessed or raised. In that situation, first consider highlighting the edges with close-range, low-angle lighting. If the light source setup is constrained, consider photometric stereo instead (for cases where diff_of_gauss does not work well).

Origin blog.csdn.net/weixin_44506305/article/details/112371542