Tip: This article contains many images, so mind your mobile data usage if you are reading on a phone.
Foreword
There are many ways to recognize and extract the text in an image with Python. If you want something simple, the tesseract recognition engine is enough: a single line of code can extract the text from an image.
1. Configure the environment
1. Install python dependencies
This program uses two Python libraries, pytesseract and PIL (installed through the Pillow package), so install them first.
Run the following commands:
pip install Pillow
pip install pytesseract
If pip reports no errors, the libraries were installed successfully.
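The installation can also be checked from Python itself. The following is a minimal sketch that uses only the standard library; the helper name `check_dependencies` is made up for this illustration and is not part of either library. Note that Pillow is imported under the name `PIL`.

```python
import importlib.util


def check_dependencies(modules=("PIL", "pytesseract")):
    """Return a dict mapping each module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one line per library so a missing package is easy to spot.
    for name, installed in check_dependencies().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
```

If either library is reported as missing, run the corresponding pip command again.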
2. Install the recognition engine
After installing the two dependencies above, you also need the recognition engine itself, because pytesseract only calls the engine and does not include it. Click to download.
We use the latest version directly, the one built on May 10.
Install the tesseract recognition engine (can be skipped)
After the download is complete, run the installer. First select the language (English is chosen here), then click OK.
On the next screen, click Next.
Click I Agree to accept the license agreement.
Choose to install for all users, then click Next, as shown in the figure.
Then install the Chinese language pack, which is used to recognize Chinese. Scroll to the bottom of the list and select Chinese. I selected both horizontal Simplified Chinese and vertical Simplified Chinese. Click Next when you are done.
Select the installation path (installing to a drive other than C is recommended), then click Next.
Click Install to start the installation.
Wait for the installation to complete.
After the installation is complete, click Next, then click Finish.
Verify that the installation was successful
Add the folder you installed tesseract to directly to the Path environment variable,
then run tesseract -v on the command line.
If the output matches the figure below, the installation succeeded.
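The same verification can be sketched in Python with the standard library only. The helper name `tesseract_info` is made up for this illustration. It looks for the `tesseract` executable on the Path and, if it is found, asks it for its version.

```python
import shutil
import subprocess


def tesseract_info(executable="tesseract"):
    """Return (path, version_line) for the engine, or None if it is not on the Path."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some releases print the version to stderr and others to stdout, so read both.
    output = (result.stdout + result.stderr).strip()
    version_line = output.splitlines()[0] if output else ""
    return path, version_line


if __name__ == "__main__":
    info = tesseract_info()
    if info is None:
        print("tesseract was not found on the Path")
    else:
        print(f"found {info[0]}: {info[1]}")
```

If the engine is installed but you prefer not to change the Path, pytesseract can be pointed at the executable by assigning its full path to `pytesseract.pytesseract.tesseract_cmd` before calling any recognition function.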
2. Usage steps
1. Import the libraries
from PIL import Image
import pytesseract
2. Extract image text
Wrap the one line of code that reads the image in a function:
def read_image(name):
    print(pytesseract.image_to_string(Image.open(name), lang='chi_sim'))
Then just call it directly in the main function:
def main():
    read_image('1657158527412.jpg')
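The `lang` argument above selects the Tesseract language data to use; `chi_sim` is Simplified Chinese. Tesseract also accepts several language codes joined with `+`, which can help when an image mixes Chinese and English. The sketch below builds such a string; the helper name `build_lang` is made up for this illustration, and every code passed to it is assumed to have its language data installed with the engine.

```python
def build_lang(*codes):
    """Join Tesseract language codes into one lang string such as 'chi_sim+eng'."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [code.strip() for code in codes if code and code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


print(build_lang("chi_sim", "eng"))  # prints chi_sim+eng
```

The result can be passed straight to the call shown earlier, for example `pytesseract.image_to_string(Image.open(name), lang=build_lang('chi_sim', 'eng'))`.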
3. Result
Take the following image as an example:
The result is as follows:
Summary
This article introduced pytesseract, the Python library for calling tesseract. It covers only extracting text from images and leaves the library's other features aside. If you are interested, you can explore them in depth, and you are welcome to discuss them with me.
Full code
from PIL import Image
import pytesseract


def read_image(name):
    # Recognize the text in the image as Simplified Chinese and print it.
    print(pytesseract.image_to_string(Image.open(name), lang='chi_sim'))


def main():
    read_image('img.png')


if __name__ == '__main__':
    main()
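As a possible variation on the full code above, the function can return the text instead of printing it and check that the file exists first. This is only a sketch, not the original program: the name `extract_text` and the default file name are assumptions for illustration. The libraries are imported inside the function so that a wrong file name is reported before anything else.

```python
from pathlib import Path


def extract_text(image_path, lang="chi_sim"):
    """Return the text recognized in the image instead of printing it."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such image: {path}")
    # Imported here so the file check above runs before the libraries are loaded.
    from PIL import Image
    import pytesseract
    # The with block closes the image file once recognition is done.
    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang=lang)


if __name__ == "__main__":
    try:
        print(extract_text("img.png"))
    except FileNotFoundError as err:
        print(err)
```

Returning the string makes it easy to save the result to a file or process it further.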