How to detect the horizontal and vertical lines of a table and eliminate the noise?

李泓烨 :

I am trying to get the horizontal and vertical lines of the table in an image in order to extract the texts in cells. Here's a picture I use:enter image description here

I use the code below to extract the vertical and horizontal lines:

img = cv2.imread(img_for_box_extraction_path, 0)  # Read the image
(thresh, img_bin) = cv2.threshold(img, 200, 255,
                                  cv2.THRESH_BINARY | cv2.THRESH_OTSU)  # Thresholding the image
img_bin = 255-img_bin  # Invert the image
cv2.imwrite("Image_bin_2.jpg",img_bin)

# Defining a kernel length
kernel_length = np.array(img).shape[1]//140

# A verticle kernel of (1 X kernel_length), which will detect all the verticle lines from the image.
verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))

# A horizontal kernel of (kernel_length X 1), which will help to detect all the horizontal line from the image.
hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))

# A kernel of (3 X 3) ones.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

# Morphological operation to detect verticle lines from an image
img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
cv2.imwrite("verticle_lines_2.jpg",verticle_lines_img)

# Morphological operation to detect horizontal lines from an image
img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
cv2.imwrite("horizontal_lines_2.jpg",horizontal_lines_img)

The pictures below are the horizontal lines and vertical lines: horizontal lines vertical lines

I use the code below to add two image together

# Weighting parameters, this will decide the quantity of an image to be added to make a new image.
alpha = 0.5
beta = 1.0 - alpha

# This function helps to add two image with specific weight parameter to get a third image as summation of two image.
img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
(thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# For Debugging
# Enable this line to see verticle and horizontal lines in the image which is used to find boxes
cv2.imwrite("img_final_bin_2.jpg",img_final_bin)

However, I get a picture like this: Summation Result How do I remove the noise and get a better result? Thanks in advance.

nathancy :

Here's a simple method:

Binary image

enter image description here

Detected horizontal

enter image description here

Detected vertical

enter image description here

Combined masks

enter image description here

Lines to be removed in green

enter image description here

Result

enter image description here

import cv2
import numpy as np

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Detect horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,1))
horizontal_mask = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)

# Detect vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,50))
vertical_mask = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=1)

# Combine masks and remove lines
table_mask = cv2.bitwise_or(horizontal_mask, vertical_mask)
image[np.where(table_mask==255)] = [255,255,255]

cv2.imshow('thresh', thresh)
cv2.imshow('horizontal_mask', horizontal_mask)
cv2.imshow('vertical_mask', vertical_mask)
cv2.imshow('table_mask', table_mask)
cv2.imshow('image', image)
cv2.waitKey()

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=27813&siteId=1