Python OCR libraries: a handy tool for verification code recognition in automated testing

In interface automation work, we often need to handle text recognition tasks, and an OCR (Optical Character Recognition) library can help us extract text from images. Several OCR libraries are commonly used in Python, including pyocr, pytesseract, python-tesseract, and EasyOCR. This article compares them and provides sample code to demonstrate their use in real-world interface automation work.

1. pyocr

PyOCR is a Python library that provides encapsulation of multiple OCR engines. It makes it easy to use different OCR engines for text recognition in Python.

PyOCR supports the following OCR engines:

  • Tesseract: Tesseract is an open source OCR engine originally developed at Hewlett-Packard and later sponsored by Google. It supports multiple languages and performs well in terms of OCR accuracy.

  • Cuneiform: Cuneiform is an open source OCR engine that supports multiple languages and fonts.

  • GOCR: GOCR is an open source OCR engine mainly used to recognize simple text and numbers.

Applicable scenarios:

  • Text recognition and extraction: Used to extract printed text from images for text processing, search, and analysis.

  • Document Scanning and Conversion: Used to convert scanned paper documents into editable electronic documents.

  • Automated data entry: Used to convert data from images into a computer-readable format for data processing and analysis.

  • Image annotation and classification: Used to extract text information from images in order to label and classify the images.

The steps for using PyOCR for text recognition are as follows:

  • Install the PyOCR library with pip install pyocr, and install the corresponding OCR engine (for example Tesseract) separately, since pip only installs the Python wrapper.

  • Import the PyOCR library and required OCR engine.

  • Initialize the OCR engine.

  • Open an image file or convert an image to a PIL image object.

  • Use the image_to_string method of the OCR engine for text recognition.

Here is an example that uses PyOCR with the first available engine for text recognition:

import sys

import pyocr
import pyocr.builders
from PIL import Image

# Initialize the OCR engine
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
ocr_tool = tools[0]

# Open the image file
image = Image.open('image.jpg')

# Run text recognition with the OCR engine
text = ocr_tool.image_to_string(
    image,
    lang='eng',
    builder=pyocr.builders.TextBuilder()
)

# Print the recognition result
print(text)

In this example, pyocr.get_available_tools() first returns the list of available OCR engines, and the first one is selected. The image file is then opened with the PIL library and passed to the engine's image_to_string method, together with the recognition language and a text builder. Finally, the recognition result is printed.
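The example above simply takes the first engine in the list. When several engines are installed, it may be preferable to choose one by name. The sketch below assumes each tool exposes get_name(), as PyOCR's tool modules do; pick_tool and its preferred parameter are hypothetical names introduced here for illustration only.

```python
def pick_tool(tools, preferred="Tesseract"):
    """Return the first tool whose name contains `preferred`.

    Falls back to the first tool in the list when no name matches.
    `tools` is expected to be the list returned by pyocr.get_available_tools().
    """
    if not tools:
        raise RuntimeError("No OCR tool found")
    for tool in tools:
        # Case-insensitive substring match on the tool's display name
        if preferred.lower() in tool.get_name().lower():
            return tool
    return tools[0]
```

It could be called as pick_tool(pyocr.get_available_tools()), and the returned tool is then used exactly like ocr_tool in the example above.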

2. pytesseract

pytesseract is a Python library that provides a wrapper for the Tesseract OCR engine, making it easy to use Tesseract for text recognition in Python.

pytesseract has the following features:

  • Supports multiple languages: pytesseract can recognize text in multiple languages, including English, Chinese, Japanese, etc.

  • Supports multiple image formats: pytesseract can handle a variety of common image formats, such as JPEG, PNG, TIFF, etc.

  • Easy to use: pytesseract provides a simple API that can complete text recognition with just a few lines of code.

The steps for using pytesseract for text recognition are as follows:

  1. Install the pytesseract library and Tesseract OCR engine.

  2. Import the pytesseract library.

  3. Open an image file or convert an image to a PIL image object.

  4. Use the image_to_string method of the pytesseract library for text recognition.

Here is an example of using pytesseract for text recognition:

import pytesseract
from PIL import Image

# Open the image file
image = Image.open('image.jpg')

# Run text recognition with pytesseract
text = pytesseract.image_to_string(image)

# Print the recognition result
print(text)

In this example, the PIL library is first used to open the image file, then the image_to_string method of the pytesseract library converts the text in the image into a string, and finally the recognition result is printed.
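For short, single-line images such as verification codes, image_to_string also accepts lang and config arguments that are passed through to Tesseract. The sketch below builds such a config string. build_tesseract_config is a hypothetical helper, not part of pytesseract; --psm 7 asks Tesseract to treat the image as a single text line, and tessedit_char_whitelist restricts the output characters (some Tesseract 4.0 builds ignore the whitelist with the LSTM engine, so treat it as a hint rather than a guarantee).

```python
def build_tesseract_config(psm=7, whitelist=None):
    """Build a config string for pytesseract's `config` argument.

    psm: Tesseract page segmentation mode (7 = single text line).
    whitelist: optional string of characters the output may contain.
    """
    parts = [f"--psm {psm}"]
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)
```

It could then be used as pytesseract.image_to_string(image, lang='eng', config=build_tesseract_config(7, '0123456789')) for a digits-only code.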

Note that before using pytesseract for text recognition, the Tesseract OCR engine must be correctly installed and its executable must be reachable through the system PATH environment variable. This is how pytesseract finds and invokes the Tesseract engine.
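A quick way to check this requirement before running any OCR is to look for the executable on PATH. The following is a minimal sketch using only the standard library; find_tesseract is a hypothetical helper name, and the executable is assumed to be called tesseract.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; install it or add it to PATH.")
    else:
        print(f"Tesseract found at: {path}")
```

If the executable lives outside PATH, pytesseract also lets you set pytesseract.pytesseract.tesseract_cmd to its full path.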

3. python-tesseract

python-tesseract is a Python wrapper for the Tesseract OCR engine that makes it easy to use Tesseract for text recognition in Python. Note that the examples in this section import it as pytesseract, the same module used in the previous section.

python-tesseract has the following characteristics:

  • Supports multiple languages: python-tesseract can recognize text in multiple languages, including English, Chinese, Japanese, etc.

  • Supports multiple image formats: python-tesseract can handle a variety of common image formats, such as JPEG, PNG, TIFF, etc.

  • Easy to use: python-tesseract provides a simple API that can complete text recognition with just a few lines of code.

The steps for using python-tesseract for text recognition are as follows:

  1. Install the python-tesseract library and the Tesseract OCR engine.

  2. Import the python-tesseract library.

  3. Open an image file or convert an image to a PIL image object.

  4. Use the image_to_string method of the python-tesseract library for text recognition.

Here is an example of using python-tesseract for text recognition:

import pytesseract
from PIL import Image

# Open the image file
image = Image.open('image.jpg')

# Run text recognition with python-tesseract
text = pytesseract.image_to_string(image)

# Print the recognition result
print(text)

In this example, the PIL library is first used to open the image file, then the image_to_string method of the python-tesseract library converts the text in the image into a string, and finally the recognition result is printed.

Note that before using python-tesseract for text recognition, the Tesseract OCR engine must be correctly installed and its executable must be reachable through the system PATH environment variable. This is how python-tesseract finds and invokes the Tesseract engine.

The following is a more complex example that shows how to use python-tesseract for text recognition and then post-process the recognition results:

import pytesseract
from PIL import Image
import re

# Open the image file
image = Image.open('image.jpg')

# Run text recognition with python-tesseract
text = pytesseract.image_to_string(image)

# Remove every character that is not a letter, digit, or whitespace
cleaned_text = re.sub(r'[^a-zA-Z0-9\s]', '', text)

# Split the recognition result into a list of lines
lines = cleaned_text.split('\n')

# Drop empty lines and strip surrounding whitespace
lines = [line.strip() for line in lines if line.strip()]

# Print the recognition result
for line in lines:
    print(line)

In this example, we first use the PIL library to open the image file, and then use the image_to_string method of the python-tesseract library to convert the text in the image into a string. Next, we use a regular expression to remove unwanted characters from the recognition result, leaving only letters, digits, and whitespace. Then, we split the result into a list of lines and remove empty lines. Finally, we print the result line by line.

This example shows how to post-process the recognition results to obtain cleaner and more readable text. Depending on your needs, you can add further post-processing steps, such as removing specific characters or extracting key information.
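For the verification codes mentioned in the title, the same idea can be narrowed down: keep only letters and digits, and reject results of the wrong length so the caller can retry with a fresh image. This is a minimal sketch, assuming a 4-character alphanumeric code; normalize_code and expected_length are hypothetical names, not part of any library.

```python
import re


def normalize_code(raw_text, expected_length=4):
    """Clean raw OCR output into a verification code.

    Removes everything except ASCII letters and digits (including spaces
    and newlines). Returns the cleaned code, or None when its length does
    not match expected_length, which usually signals a misread.
    """
    code = re.sub(r"[^A-Za-z0-9]", "", raw_text)
    return code if len(code) == expected_length else None
```

The raw string returned by image_to_string can be passed straight in; a None result is a reasonable signal to refresh the code image and try again.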

4. EasyOCR

EasyOCR is a powerful, open source, easy-to-use OCR library suitable for various text recognition tasks, including document scanning, image processing, natural language processing, etc. It can help developers quickly implement text recognition functions and apply it to various application fields. Compared with other OCR libraries, EasyOCR has the following features:

  1. Multi-language support: EasyOCR supports text recognition in more than 80 languages, including Chinese, English, Japanese, Korean, etc. It can handle mixed text in multiple languages and is suitable for global applications.

  2. High-precision recognition: EasyOCR uses a deep learning model and advanced OCR technology to provide high-precision text recognition results. It is trained and tested on multiple public datasets and has high accuracy and robustness.

  3. Easy to use: EasyOCR provides a simple API that makes text recognition easy. With just a few lines of code, you can convert text from images into usable text.

  4. Cross-platform support: EasyOCR can run on multiple platforms, including Windows, Linux, and macOS. It supports Python and command line interfaces and can be integrated with other programming languages and tools.

The steps to use EasyOCR for text recognition are as follows:

  1. Install the EasyOCR library: You can install the EasyOCR library using the pip command, for example pip install easyocr.

  2. Import EasyOCR library: Import EasyOCR library in Python code, for example import easyocr.

  3. Create OCR object: Create an OCR object and specify the languages to be recognized, for example reader = easyocr.Reader(['ch_sim', 'en']). EasyOCR uses its own language codes, so Simplified Chinese is ch_sim rather than zh.

  4. Recognize text: Use the readtext method of the OCR object to recognize text in the image, for example result = reader.readtext('image.jpg').

  5. Process the recognition results: Process the recognition results as needed, such as extracting text content, location and confidence, etc.

The following is a simple example of using EasyOCR for text recognition:

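import easyocr

# Create the OCR object (Simplified Chinese and English)
reader = easyocr.Reader(['ch_sim', 'en'])

# Recognize text in the image
result = reader.readtext('image.jpg')

# Process the results; each item is (bounding box, text, confidence)
for (bbox, text, confidence) in result:
    print(f'Text: {text}, Bbox: {bbox}, Confidence: {confidence}')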

In this example, we first create an OCR object and specify the languages to be recognized as Simplified Chinese and English. Then, we use the readtext method of the OCR object to perform text recognition on the image file, which returns a list of recognition results. Finally, we loop through the results and print the content, location, and confidence of each piece of text.
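Because each result carries a confidence score, low-quality detections can be dropped before the text is used in an assertion or typed into a form. The sketch below assumes each item is a (bbox, text, confidence) tuple, which is the default shape returned by readtext; confident_text and the 0.5 threshold are hypothetical choices for illustration.

```python
def confident_text(results, min_confidence=0.5):
    """Join the text of detections whose confidence is at least min_confidence.

    `results` is a list of (bbox, text, confidence) tuples; the bounding
    box is ignored here and the kept texts are joined with single spaces.
    """
    return " ".join(
        text for _bbox, text, confidence in results
        if confidence >= min_confidence
    )
```

With the example above it could be called as confident_text(result); the threshold is worth tuning against a few real images from the interface under test.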

5. Summary

This article introduces several commonly used OCR libraries in Python and provides corresponding code examples. These libraries can help us perform text recognition in interface automation work, thereby achieving more automated functions and tasks. Based on actual needs, you can choose an OCR library that suits you and combine it with other tools and technologies to complete more complex interface automation work.

Origin blog.csdn.net/davice_li/article/details/132553458