Correction of slanted text lines based on OpenCV

This post records my exploration, at work, of a method for correcting slanted (italic) text lines. Without further ado, let's take a look.

Table of contents

1. Algorithm flow chart

2. Algorithm implementation

2.1. Preprocessing

2.2. Horizontal blur

2.3. Vertical projection

2.4. Statistical calculation of the tilt angle of the short oblique edges

2.5. Tilt correction

3. Final result of the algorithm

Correcting a slanted text line consists of two parts: detecting the tilt angle and correcting the tilt. Detecting the tilt angle is the critical step, because the quality of the subsequent correction depends on it.

1. Algorithm flow chart

2. Algorithm implementation

2.1. Preprocessing

Original image:

Preprocessing consists mainly of grayscale conversion, filtering, and binarization. The results are as follows:
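The three steps can be sketched with NumPy alone. This is only an illustration under stated assumptions, not the code used in this post: the function names are hypothetical, a 3x3 mean filter stands in for whatever filter was actually used, and Otsu's method is assumed for the binarization. In OpenCV the same steps are usually done with cv2.cvtColor, cv2.GaussianBlur or cv2.medianBlur, and cv2.threshold with cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU.

```python
import numpy as np


def to_gray(bgr):
    """Convert an H x W x 3 BGR image to grayscale with the usual luma weights."""
    b, g, r = bgr[..., 0], bgr[..., 1], bgr[..., 2]
    return (0.114 * b + 0.587 * g + 0.299 * r).round().astype(np.uint8)


def box_filter3(gray):
    """3x3 mean filter with edge replication (assumed stand-in for the filtering step)."""
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    h, w = gray.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return (acc / 9.0).round().astype(np.uint8)


def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                  # number of pixels with value <= t
    w1 = hist.sum() - w0                  # number of pixels with value > t
    sum0 = np.cumsum(hist * levels)
    mu0 = sum0 / np.maximum(w0, 1)        # mean of the dark class
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)   # mean of the bright class
    return int(np.argmax(w0 * w1 * (mu0 - mu1) ** 2))


def preprocess(bgr):
    """Grayscale -> filter -> inverted binarization, so dark text becomes 255."""
    gray = box_filter3(to_gray(bgr))
    t = otsu_threshold(gray)
    return np.where(gray <= t, 255, 0).astype(np.uint8)


# Tiny synthetic example: a black block on a white background.
bgr = np.full((20, 30, 3), 255, dtype=np.uint8)
bgr[5:15, 10:20] = 0
binary = preprocess(bgr)
```

The output is inverted (text = 255, background = 0) because the later code in this post tests for non-zero pixels when it looks for text.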

2.2. Horizontal blur

Horizontal blur, also known as run-length smoothing, is an algorithm that converts runs of consecutive black pixels shorter than a given threshold into white pixels. After this processing, connected components that lie close together in the image merge into larger connected regions.
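To make the idea concrete, here is a minimal sketch on a single image row, assuming text pixels are 255 and background pixels are 0. The helper smooth_row and its thres parameter are hypothetical names used only for this illustration.

```python
import numpy as np


def smooth_row(row, thres):
    """Fill background gaps (0) of length <= thres lying between foreground pixels (255)."""
    out = row.copy()
    fg = np.flatnonzero(out)              # indices of foreground pixels
    # Consecutive foreground indices a < b enclose a gap of b - a - 1 zeros.
    for a, b in zip(fg[:-1], fg[1:]):
        if 0 < b - a - 1 <= thres:
            out[a + 1:b] = 255
    return out


row = np.array([0, 255, 0, 0, 255, 0, 0, 0, 0, 255, 0], dtype=np.uint8)
smoothed = smooth_row(row, thres=2)
```

With thres=2 the two-pixel gap is filled, while the four-pixel gap and the leading and trailing background are left alone.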

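import numpy as np


def horizontal_blur(image):
    '''Horizontal run-length smoothing of a binary image (text = 255, background = 0).'''
    # Work on a copy so the caller's image is not modified in place.
    dst = image.copy()

    # Estimate the text width from the first and last non-empty columns.
    hor_vec = np.sum(dst, axis=0)

    width = hor_vec.shape[0]
    left = right = 0
    for i in range(width):
        if hor_vec[i] != 0:
            left = i
            break

    for j in range(width - 1, -1, -1):
        if hor_vec[j] != 0:
            right = j
            break

    char_width = right - left + 1 if right - left + 1 > 0 else 10

    # Run-length smoothing threshold: one tenth of the text width.
    thres = char_width // 10

    h, w = dst.shape
    for r in range(h):
        # First foreground pixel in this row.
        c = 0
        while c < w and dst[r, c] == 0:
            c += 1

        # Last foreground pixel in this row.
        max_w = 0
        for i in range(w - 1, -1, -1):
            if dst[r, i] != 0:
                max_w = i
                break

        start = 0
        end = 0
        flag = True  # True while looking for the start of the next gap
        # Include max_w itself, otherwise the gap just before the last
        # foreground pixel is never closed.
        for j in range(c, max_w + 1):
            if flag and dst[r, j] == 0:
                start = j
                flag = False
            if not flag and dst[r, j] != 0:
                end = j

                # Fill the gap if it is no longer than the threshold.
                if end > start and end - start <= thres:
                    dst[r, start:end] = 255

                flag = True

    return dst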

The results after processing are as follows:

2.3. Vertical projection

The vertical projection is simple to compute (count the text pixels in each column of the binary image), so I won't go into detail here. Here is the result:
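For completeness, a minimal sketch of the projection, assuming a binary image in which text pixels are non-zero. The function name vertical_projection is chosen here for illustration.

```python
import numpy as np


def vertical_projection(binary):
    """Count the foreground (non-zero) pixels in every column of a binary image."""
    return np.count_nonzero(binary, axis=0)


img = np.zeros((4, 6), dtype=np.uint8)
img[1:4, 2] = 255      # a 3-pixel vertical stroke in column 2
img[3, 3:5] = 255      # a short horizontal stroke in the bottom row
profile = vertical_projection(img)
```

Here profile holds one count per column, so the vertical stroke shows up as a peak of height 3 and the horizontal stroke as two columns of height 1.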

2.4. Statistical calculation of the tilt angle of the short oblique edges

1. Principle of measuring the angle from the vertical projection

Abstractly, a run of italic text can be viewed as a parallelogram. For a solid parallelogram made of black pixels, the vertical projection histogram is a trapezoid, as shown below:

The tilt angle is then given by:

$\small \tan(A)=\frac{y_{1}}{x_{2}-x_{1}}$

$\small \angle A=\arctan\left(\frac{y_{1}}{x_{2}-x_{1}}\right)$

In other words, the angle of the slanted font can be computed from the second picture above, which is simply our projection curve. The angle of the oblique edge of the projection curve is therefore the tilt angle we are looking for.
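The formula can be checked on a synthetic parallelogram. The sketch below assumes that $y_1$ is the plateau height of the trapezoid and that $x_1$ and $x_2$ are the columns where the rising edge starts and ends. This reading of the symbols is an assumption, and angle_from_projection is a hypothetical helper, not the code used in this post.

```python
import math

import numpy as np


def angle_from_projection(profile):
    """Estimate the tilt angle in degrees from the rising edge of a trapezoidal projection."""
    profile = np.asarray(profile)
    y1 = profile.max()                           # height of the trapezoid
    if y1 == 0:
        raise ValueError("projection is empty")
    x1 = np.flatnonzero(profile)[0] - 1          # last empty column before the rise
    x2 = np.flatnonzero(profile == y1)[0]        # first column at full height
    return math.degrees(math.atan2(y1, x2 - x1))


# Solid parallelogram: every row is shifted one pixel relative to the row below it.
H, W, x0 = 20, 40, 5
img = np.zeros((H, x0 + W + H + 5), dtype=np.uint8)
for r in range(H):
    start = x0 + (H - 1 - r)
    img[r, start:start + W] = 255

profile = np.count_nonzero(img, axis=0)
angle = angle_from_projection(profile)
```

A shift of one pixel per row corresponds to a 45 degree slant, and the rising edge of the projection climbs by 20 over 20 columns, so the estimate agrees with the construction.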

However, as the projection in 2.3 shows, the situation is less ideal. First, the horizontally blurred italic text is not a perfect parallelogram. Second, the projection contains many short oblique edges, each of which carries tilt-angle information, but because of computation error and other factors any single one may be off. To obtain a more accurate angle, all of the tilt-angle measurements must be analyzed statistically.

Note that the shape of the projection curve at the boundary of a blurred region depends strongly on the structure of the characters. For example, the boundary of a character may not be a vertical stroke (the English letter X), or the boundary stroke may itself be oblique and interfere with the measured angle (A, W, etc.). Therefore, not every short oblique edge has the slope of the italic text. In general, though, the correct edges outnumber the incorrect ones. Based on this, the algorithm uses voting to obtain the correct angle: among all the measured angles, the one that occurs most frequently is taken as the tilt angle.

2. Code implementation

The code is not shown here because it belongs to my company and cannot be posted. If you need it, send me a private message and I will help you work it out.
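The voting step on its own can still be sketched. The following is not the original code, only one possible way to pick the most frequent angle: the candidate angles are grouped into bins and the average of the winning bin is returned. The name vote_angle, the bin_width parameter, and the sample values are all assumptions made for illustration.

```python
import numpy as np


def vote_angle(angles_deg, bin_width=1.0):
    """Return the most frequently occurring angle after grouping angles into bins."""
    angles = np.asarray(angles_deg, dtype=np.float64)
    if angles.size == 0:
        raise ValueError("no candidate angles to vote on")
    bins = np.floor(angles / bin_width).astype(np.int64)   # bin index of each angle
    values, counts = np.unique(bins, return_counts=True)
    winner = values[np.argmax(counts)]                     # bin with the most votes
    return float(angles[bins == winner].mean())            # average inside that bin


# Made-up measurements: most edges agree, a few are disturbed by letter shapes.
candidates = [74.8, 75.1, 75.3, 75.6, 62.0, 89.5, 75.2, 58.3]
tilt = vote_angle(candidates)
```

Outliers such as 62.0 or 89.5 each land in their own bin and are outvoted, which is the behavior the voting idea relies on. The bin width trades angular resolution against robustness to measurement noise.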

2.5. Tilt correction

Correcting slanted characters in an image is in fact a spatial transformation of pixel coordinates. Assume that the slanted character leans to the right, that the X axis coincides with the horizontal axis, and that the Y axis points along the direction in which the character leans. If the character image is shifted horizontally, row by row, so that the Y axis becomes perpendicular to the horizontal axis, the distortion of the image is corrected. Taking the slanted character "中" as an example, the transformation formula is derived as follows:

During this transformation, the part of the character below point A moves horizontally to the right, and the part above it moves horizontally to the left. Taking point B as an example, let the transformed coordinates be (i, j). When a point moves horizontally to the right, its ordinate does not change and its abscissa increases by BC, so the transformation formula is:

In the same way, the transformation formula for the part that shifts to the left is:
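Both cases can be written as a single expression that is consistent with the implementation further down. The symbols are introduced here for illustration and are assumptions: $(x, y)$ is a pixel of the slanted image, $(x', y')$ is its corrected position, $y_{A}$ is the row of point A, $\theta$ is the tilt angle obtained in 2.4, and image rows are assumed to grow downward.

$\small x'=x+\frac{y-y_{A}}{\tan(\theta)},\qquad y'=y$

For a point below A, $y>y_{A}$ and the shift $BC=\frac{y-y_{A}}{\tan(\theta)}$ is to the right. For a point above A the same term is negative, so the point moves to the left. The code works in the opposite direction: for every output pixel it evaluates the inverse mapping $\small x=x'-\frac{y'-y_{A}}{\tan(\theta)}$ to find where to read from in the slanted image.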

However, pixel coordinates in an image are integers, so the transformed result has to be rounded, which introduces error and inevitably distorts the image. This article uses bilinear interpolation to reduce the distortion caused by rounding, and then smooths the corrected binary image to remove the burrs introduced by interpolation.

The specific implementation is as follows:

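import math

import numpy as np


def bilinear_interpolation(image, angle, center, y_min, y_max):
    '''Resample the slanted text with bilinear interpolation (angle in radians).'''
    # center_x, y_min and y_max are part of the interface but are not used here.
    center_x, center_y = center
    h, w = image.shape
    dst = np.zeros((h, w), dtype=np.uint8)

    for r in range(h):
        for c in range(w):
            # Coordinates of this output pixel in the source image.
            i = r
            if r < center_y:
                j = c + (center_y - r) / math.tan(angle)
            elif r > center_y:
                j = c - (r - center_y) / math.tan(angle)
            else:
                j = c

            # Skip pixels whose source lies left of the image; clamping them
            # would extrapolate and can overflow uint8.
            if j < 0:
                continue

            # The four nearest neighbours in the source image.
            x_0 = max(int(np.floor(j)), 0)
            y_0 = max(int(np.floor(i)), 0)
            x_1 = min(x_0 + 1, w - 1)
            y_1 = min(y_0 + 1, h - 1)

            # Skip pixels whose source lies at or beyond the right edge.
            if x_0 >= x_1:
                continue

            # Interpolate horizontally along the upper row.
            value0 = (x_1 - j) * image[y_0, x_0] + (j - x_0) * image[y_0, x_1]
            if y_0 >= y_1:
                # Last row: there is no row below to blend with.
                dst[r, c] = int(value0)
                continue

            # Interpolate along the lower row, then blend the two rows vertically.
            value1 = (x_1 - j) * image[y_1, x_0] + (j - x_0) * image[y_1, x_1]
            dst[r, c] = int((y_1 - i) * value0 + (i - y_0) * value1)

    return dst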


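def correct_slanted_fonts(image, mask, angle):
    '''Correct slanted text in image; mask marks the text pixels with 255.'''
    h, w = mask.shape

    # Centre point (centroid) of the slanted text.
    center_x = 0
    center_y = 0
    num = 0
    for r in range(h):
        for c in range(w):
            if mask[r, c] == 255:
                center_x += c
                center_y += r
                num += 1

    # Nothing to correct if the mask contains no text pixels.
    if num == 0:
        return image.copy()

    center_x = center_x // num
    center_y = center_y // num

    # Top and bottom boundaries of the text.
    ver_vec = np.sum(mask, axis=1)
    up = 0
    down = 0
    h = ver_vec.shape[0]
    for i in range(h):
        if ver_vec[i] != 0:
            up = i
            break
    for i in range(h - 1, -1, -1):
        if ver_vec[i] != 0:
            down = i
            break

    # Resample the image with bilinear interpolation.
    dst = bilinear_interpolation(image, angle, (center_x, center_y), up, down)

    # cv2.namedWindow("test1",0)
    # cv2.imshow("test1", dst)
    # cv2.waitKey(0)
    return dst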
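The per-pixel loops above are easy to follow but slow in pure Python. As a rough sketch of an alternative, the same row-wise mapping can be vectorized with NumPy. Because the source row is always an integer, the bilinear interpolation reduces to a linear interpolation along each row. The function shear_rows and the synthetic test image are assumptions made for illustration, not part of the original code.

```python
import math

import numpy as np


def shear_rows(image, angle, center_y):
    """Shift each row horizontally so strokes tilted at `angle` (radians) become vertical."""
    h, w = image.shape
    src = image.astype(np.float64)
    dst = np.zeros((h, w), dtype=np.uint8)
    cols = np.arange(w, dtype=np.float64)
    for r in range(h):
        # Source column for every output column of this row (same mapping as above).
        j = cols + (center_y - r) / math.tan(angle)
        x0 = np.floor(j).astype(np.int64)
        valid = (x0 >= 0) & (x0 + 1 <= w - 1)      # both neighbours inside the image
        x0v = x0[valid]
        frac = j[valid] - x0v                      # weight of the right-hand neighbour
        blended = (1.0 - frac) * src[r, x0v] + frac * src[r, x0v + 1]
        dst[r, valid] = np.rint(blended).astype(np.uint8)
    return dst


# Synthetic 45 degree parallelogram: each row is shifted one pixel from the row below.
H, W, x0 = 20, 40, 5
img = np.zeros((H, x0 + W + H + 5), dtype=np.uint8)
for r in range(H):
    start = x0 + (H - 1 - r)
    img[r, start:start + W] = 255

out = shear_rows(img, math.pi / 4, center_y=10)
binary = np.where(out >= 128, 255, 0).astype(np.uint8)   # re-binarize after interpolation
```

After the shear every row of the parallelogram starts and ends at the same column, so the shape becomes a rectangle. The final thresholding is one simple way to return to a binary image after interpolation has produced intermediate gray values.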