Recognizing text in screenshots with Python and Tesseract OCR

1. Download and install tesseract-ocr

1. Download

Commonly used Tesseract URLs:
Download address: https://digi.bib.uni-mannheim.de/tesseract/
Official website: https://github.com/tesseract-ocr/tesseract
Official documentation: https://github.com/tesseract-ocr/tessdoc
Language pack address: https://github.com/tesseract-ocr/tessdata

2. Install tesseract-ocr

(1) Select language

(2) Start installation

(3) Agree to the license

(4) Choose which users to install for

(5) Select the language pack to be installed

The installer downloads any language packs selected here from the server during installation. Selecting additional language packs at this step is not recommended, because the download can be very slow. This tutorial shows how to add language packs later. If you have a fast connection to the download server (for example, through a proxy), you can ignore this suggestion.

Otherwise, keep the default selection.

(6) Installation location

(7) Start installation

(8) Installation completed

3. Install language pack

(1) Download and install

https://github.com/tesseract-ocr/tessdata

The repository is large, so download only the files you need. For Simplified Chinese, that is chi_sim.traineddata:

Store the downloaded files in the tessdata directory of the installation: D:\Program Files\Tesseract-OCR\tessdata

Note: If you cannot reach GitHub, you can download the Simplified Chinese language pack from here: https://download.csdn.net/download/A_art_xiang/88334913

(2) Test

Open a terminal in the Tesseract OCR installation directory and run:

# Check the version
PS D:\Program Files\Tesseract-OCR> .\tesseract.exe -v
tesseract v5.3.0.20221214
 leptonica-1.78.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libarchive 3.5.0 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5 libzstd/1.4.5
 Found libcurl/7.77.0-DEV Schannel zlib/1.2.11 zstd/1.4.5 libidn2/2.0.4 nghttp2/1.31.0
# List the installed language packs
PS D:\Program Files\Tesseract-OCR> .\tesseract.exe --list-langs
List of available languages in "D:\Program Files\Tesseract-OCR/tessdata/" (4):
chi_sim
chi_sim_vert
eng
osd
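
The same check can be scripted from Python. The following is a minimal sketch, assuming the installation path used in this tutorial. `parse_list_langs` and `installed_languages` are hypothetical helper names, not part of Tesseract or pytesseract. It runs `tesseract --list-langs` and returns the language codes.

```python
import subprocess

# Assumed install location from this tutorial; adjust it for your machine.
TESSERACT_EXE = r"D:\Program Files\Tesseract-OCR\tesseract.exe"


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is the "List of available languages ..." header.
    if lines and lines[0].startswith("List of available languages"):
        lines = lines[1:]
    return lines


def installed_languages(exe=TESSERACT_EXE):
    """Run `tesseract --list-langs` and return the installed language codes."""
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=True
    )
    # Assumption: the list is on stdout; fall back to stderr if stdout is empty.
    return parse_list_langs(result.stdout or result.stderr)
```

With the installation shown above, `installed_languages()` should return the same four codes as the console output, and a missing `chi_sim` entry means the language file is not in the tessdata directory.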

2. Recognizing text in screenshots with Python

1. Install necessary packages

pip install pyautogui
pip install pytesseract

2. Screen capture text recognition

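import pyautogui
import pytesseract

# Set the path to the Tesseract executable (needed if it is not on the system PATH)
pytesseract.pytesseract.tesseract_cmd = 'D:/Program Files/Tesseract-OCR/tesseract.exe'

# Capture a screenshot of the whole screen
screenshot = pyautogui.screenshot()

# Define the region (top-left x, top-left y, bottom-right x, bottom-right y)
region = (100, 100, 300, 200)

# Crop the screenshot to the region to create a new image object
custom_screenshot = screenshot.crop(region)

# Convert the image to grayscale, which can help recognition accuracy
custom_screenshot = custom_screenshot.convert('L')

# Recognize the text with pytesseract (English by default;
# pass lang='chi_sim' to use the Simplified Chinese language pack)
text = pytesseract.image_to_string(custom_screenshot)

# Print the recognized text
print(text)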
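
`pyautogui.screenshot()` also accepts a `region` argument, so only the needed area is captured. Its tuple is (left, top, width, height), not the (left, upper, right, lower) box that `crop()` expects, which is easy to mix up. Below is a minimal sketch of the conversion. `box_to_region` and `ocr_screen_box` are hypothetical helper names, and the sketch assumes Tesseract is on the PATH or `tesseract_cmd` has been set as above.

```python
def box_to_region(box):
    """Convert a crop box (left, upper, right, lower) to (left, top, width, height)."""
    left, upper, right, lower = box
    if right <= left or lower <= upper:
        raise ValueError("box must satisfy right > left and lower > upper")
    return (left, upper, right - left, lower - upper)


def ocr_screen_box(box, lang="eng"):
    """Capture only the given box of the screen and return the recognized text."""
    # Imported here so the conversion helper works without a display attached.
    import pyautogui
    import pytesseract

    image = pyautogui.screenshot(region=box_to_region(box)).convert("L")
    return pytesseract.image_to_string(image, lang=lang)
```

For the box used above, `box_to_region((100, 100, 300, 200))` gives `(100, 100, 200, 100)`, a 200 by 100 pixel area.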

3. Accuracy

English accuracy is acceptable, but Chinese accuracy is poor. It should be possible to improve it by training Tesseract on your own data.
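
Before investing in training, two cheaper settings may be worth trying, because `image_to_string` uses the English model unless a language is given: pass the language codes explicitly and choose a page segmentation mode. The following is a minimal sketch. `build_ocr_options` and `recognize` are hypothetical helper names, and whether these settings help depends on the screenshot.

```python
def build_ocr_options(langs=("chi_sim", "eng"), psm=6):
    """Build keyword arguments for pytesseract.image_to_string.

    langs: installed Tesseract language codes, joined with "+".
    psm:   page segmentation mode (6 treats the image as one uniform text block).
    """
    if not langs:
        raise ValueError("at least one language code is required")
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    return {"lang": "+".join(langs), "config": f"--psm {psm}"}


def recognize(image, langs=("chi_sim", "eng"), psm=6):
    """Run OCR on a PIL image with explicit language and segmentation settings."""
    import pytesseract  # imported lazily; requires Tesseract to be installed

    return pytesseract.image_to_string(image, **build_ocr_options(langs, psm))
```

Calling `recognize(custom_screenshot)` on the cropped image from the previous step would run the Simplified Chinese and English models together. Enlarging small screenshots before recognition is another common adjustment, not shown here.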

Origin blog.csdn.net/A_art_xiang/article/details/132848802