Python OpenCV Practice: Getting Started with Tesseract to Recognize Text in Images

Before starting the license plate recognition project, I tried using Tesseract to recognize Chinese text. For installing and using Tesseract, see:

Detailed explanation of the Python OCR tool pytesseract (Zhihu): https://zhuanlan.zhihu.com/p/448253254
pytesseract is a Python OCR tool built on Google's Tesseract-OCR engine. It recognizes text in images and supports jpeg, png, gif, bmp, tiff and other image formats.

import pytesseract as tst
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

# Reference
# https://zhuanlan.zhihu.com/p/448253254

original_img = cv.imread("../../SampleImages/chineseCharacters.jpg", cv.IMREAD_COLOR)
# Convert the image to grayscale
img = cv.cvtColor(original_img, cv.COLOR_BGR2GRAY)
# Binarize with a fixed threshold of 160
_, img = cv.threshold(img, 160, 255, cv.THRESH_BINARY)
plt.imshow(img, cmap='gray')
imgH,imgW = img.shape
print(imgH)
print(imgW)

# Print the list of supported languages
print(tst.get_languages(config=''))
# Use image_to_string to extract the text from the image
print(tst.image_to_string(img, lang='chi_sim'))

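# Use image_to_boxes to get each recognized character and its bounding box
boxes = tst.image_to_boxes(img, lang='chi_sim')
print(boxes)
# Each returned line has the form:
#   character  bottom-left-X  bottom-left-Y  top-right-X  top-right-Y  page
# Example: 稳 116 616 268 690 0
# Draw the bounding boxes
# Note: OpenCV places the origin at the top-left corner, while the values in boxes use the bottom-left corner as the origin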
for box in boxes.splitlines():
    elements = box.split()
    print(elements)
    x1, y1, x2, y2 = (int(v) for v in elements[1:5])
    # Convert to the OpenCV coordinate system by flipping the y axis
    top = imgH - y2
    bottom = imgH - y1
    print(f"OpenCV character position: {x1} {top} {x2} {bottom}")
    cv.rectangle(original_img, (x1, top), (x2, bottom), (0, 255, 0), 2)
plt.imshow(original_img[:,:,::-1])

The coordinates returned by image_to_boxes use the lower-left corner of the image as the origin, which can be confirmed from the printed output.
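
The y-axis flip done in the loop above can also be checked without an image. The following is a minimal sketch using only the standard library; the helper name box_line_to_opencv and the image height of 1000 are made up for illustration and are not part of pytesseract or the original script.

```python
def box_line_to_opencv(line, img_height):
    """Convert one image_to_boxes line to (char, top_left, bottom_right).

    Hypothetical helper for illustration; not part of pytesseract.
    """
    parts = line.split()
    char = parts[0]
    left, bottom, right, top = (int(v) for v in parts[1:5])
    # Tesseract measures y upward from the bottom edge of the image.
    # OpenCV measures y downward from the top edge, so y becomes img_height - y.
    return char, (left, img_height - top), (right, img_height - bottom)


# Example line taken from the comments above; the height 1000 is an assumed value.
char, top_left, bottom_right = box_line_to_opencv("稳 116 616 268 690 0", 1000)
print(char, top_left, bottom_right)  # 稳 (116, 310) (268, 384)
```

The two returned points can be passed directly to cv.rectangle as the top-left and bottom-right corners.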

Origin blog.csdn.net/vivo01/article/details/134043654