hello
First, install PyCharm. The language I use is Python, together with Tesseract. For the Tesseract installation, see this tutorial: Tesseract OCR installation process (Qingdu Xianke's blog, CSDN), https://blog.csdn.net/qq_41059950/article/details/122890276. In short: go to the Tesseract documentation site (Tesseract User Manual | tessdoc, https://tesseract-ocr.github.io/tessdoc/Home.html) and download an installer. I am on Windows, so I chose the Windows build; pick whatever fits your own system, select a version, and install it after downloading.
Next, create a project folder in PyCharm and open Settings from the File menu.
There, install opencv-python and pytesseract. With that, the preparation is basically complete.
Then create a .py file and start our project.
I assume that you have a little basic knowledge of opencv and python.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
# Use the absolute path to your Tesseract installation, including the tesseract.exe executable
Next, read in an image. Any image will do as long as it contains English letters and digits. This is mine.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('Rescources/textone.png')
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
print(pytesseract.image_to_string(img))
cv2.imshow('img',img)
cv2.waitKey(0)
First, the pytesseract.image_to_string() function returns the English letters and digits recognized in the picture as a string.
Second, the pytesseract.image_to_boxes() function returns the coordinates of every recognized letter or digit, one character per line. We print them here to prepare for the following steps.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('Rescources/textone.png')
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
#print(pytesseract.image_to_string(img))
print(pytesseract.image_to_boxes(img))
cv2.imshow('img',img)
cv2.waitKey(0)
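To make the printed output easier to work with, here is a minimal sketch that turns image_to_boxes()-style text into records. It assumes each line has the form "char left bottom right top page", separated by single spaces. The names CharBox and parse_boxes are my own helpers, not part of pytesseract, and the sample string is made up for illustration rather than real Tesseract output.

```python
from typing import List, NamedTuple


class CharBox(NamedTuple):
    """One recognized character and its box (origin at the bottom-left)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_boxes(raw: str) -> List[CharBox]:
    """Parse image_to_boxes()-style text into CharBox records."""
    result = []
    for line in raw.splitlines():
        parts = line.split(' ')
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(v) for v in parts[1:5])
        result.append(CharBox(parts[0], left, bottom, right, top))
    return result


# Hypothetical sample text, not real Tesseract output
sample = "H 10 40 22 60 0\ni 25 40 29 60 0"
for box in parse_boxes(sample):
    print(box)
```

In the real program you would pass pytesseract.image_to_boxes(img) instead of the sample string.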
Next, we do character detection: draw a box around each recognized letter and digit.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('Rescources/textone.png')
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
#print(pytesseract.image_to_string(img))
#print(pytesseract.image_to_boxes(img))
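### Detecting Characters
Himg, Wimg, _ = img.shape
boxes = pytesseract.image_to_boxes(img)
for box in boxes.splitlines():
    # Each line looks like: "<char> <left> <bottom> <right> <top> <page>"
    box = box.split(' ')
    x1, y1, x2, y2 = int(box[1]), int(box[2]), int(box[3]), int(box[4])
    # Tesseract's origin is the bottom-left corner, OpenCV's is the top-left,
    # so the y coordinates are flipped using the image height
    cv2.rectangle(img, (x1, Himg - y1), (x2, Himg - y2), (0, 0, 255), 2)
    cv2.putText(img, box[0], (x1, Himg - y1 + 20), cv2.FONT_HERSHEY_DUPLEX, 1, (0, 50, 255), 2)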
cv2.imshow('img',img)
cv2.waitKey(0)
This step is not difficult; it mostly comes down to converting the coordinates we got back.
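The coordinate conversion on its own can be sketched as follows. The function name to_top_left is hypothetical, and the numbers are made up for illustration. It assumes the box is given as left, bottom, right, top measured from the bottom-left corner of the image, and returns the two corners in OpenCV's top-left convention.

```python
def to_top_left(left, bottom, right, top, img_height):
    """Convert a bottom-left-origin box to top-left-origin corner points.

    Returns ((x1, y1), (x2, y2)), where (x1, y1) is the upper-left corner
    and (x2, y2) is the lower-right corner.
    """
    return (left, img_height - top), (right, img_height - bottom)


# A box 20 pixels tall in an image that is 100 pixels high
pt1, pt2 = to_top_left(10, 30, 22, 50, img_height=100)
print(pt1, pt2)  # (10, 50) (22, 70)
```

The program above passes the lower-left and upper-right corners instead; cv2.rectangle accepts any two opposite corners, so both give the same rectangle.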
Of course, boxing single characters is not enough. We also want to recognize whole words.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('Rescources/textone.png')
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
#print(pytesseract.image_to_string(img))
#print(pytesseract.image_to_boxes(img))
### Detecting Words
Himg, Wimg, _ = img.shape
boxes = pytesseract.image_to_data(img)
print(boxes)
for i, b in enumerate(boxes.splitlines()):
    if i != 0:  # skip the header row
        b = b.split()
        # A row that contains a recognized word splits into 12 fields
        if len(b) == 12:
            x, y, w, h = int(b[6]), int(b[7]), int(b[8]), int(b[9])
            cv2.rectangle(img, (x, y), (w + x, h + y), (0, 0, 255), 2)
            cv2.putText(img, b[11], (x, y), cv2.FONT_HERSHEY_DUPLEX, 1, (0, 50, 255), 2)
cv2.imshow('img',img)
cv2.waitKey(0)
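The 12-field check works because rows without a word lose their empty last column when split on whitespace. As an alternative, here is a rough sketch that reads the column names from the header row instead of relying on positions. It assumes the output is tab-separated with the columns left, top, width, height and text, which is what I see printed. The function name parse_tsv_words and the sample text are made up for illustration.

```python
def parse_tsv_words(tsv):
    """Return (text, left, top, width, height) for each row that has text."""
    lines = tsv.splitlines()
    if not lines:
        return []
    header = lines[0].split('\t')
    words = []
    for line in lines[1:]:
        row = dict(zip(header, line.split('\t')))
        text = row.get('text', '').strip()
        if not text:
            continue  # block, paragraph, and line rows carry no word
        words.append((text, int(row['left']), int(row['top']),
                      int(row['width']), int(row['height'])))
    return words


# Hypothetical sample in the same layout, not real Tesseract output
sample = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext\n"
    "4\t1\t1\t1\t1\t0\t10\t12\t200\t30\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t10\t12\t90\t30\t96\tHello\n"
    "5\t1\t1\t1\t1\t2\t110\t12\t100\t30\t95\tWorld\n"
)
print(parse_tsv_words(sample))
```

Each returned tuple can be passed straight to cv2.rectangle and cv2.putText in the same way as in the loop above.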
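You can also change the configuration to control what gets recognized, for example digits only. The --oem option selects the OCR engine mode and --psm selects the page segmentation mode. Here --oem 3 means the default engine, and --psm 6 treats the image as a single uniform block of text.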
cong = r'--oem 3 --psm 6 outputbase digits'  # extra options passed to Tesseract
boxes = pytesseract.image_to_data(img,config=cong)
Just add these two lines to the program above, replacing the original image_to_data call:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('Rescources/textone.png')
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
#print(pytesseract.image_to_string(img))
#print(pytesseract.image_to_boxes(img))
### Detecting Words
Himg, Wimg, _ = img.shape
cong = r'--oem 3 --psm 6 outputbase digits'  # extra options passed to Tesseract
boxes = pytesseract.image_to_data(img, config=cong)
print(boxes)
for i, b in enumerate(boxes.splitlines()):
    if i != 0:  # skip the header row
        b = b.split()
        # A row that contains a recognized word splits into 12 fields
        if len(b) == 12:
            x, y, w, h = int(b[6]), int(b[7]), int(b[8]), int(b[9])
            cv2.rectangle(img, (x, y), (w + x, h + y), (0, 0, 255), 2)
            cv2.putText(img, b[11], (x, y), cv2.FONT_HERSHEY_DUPLEX, 1, (0, 50, 255), 2)
cv2.imshow('img',img)
cv2.waitKey(0)
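If you switch between settings often, the config string can be assembled by a small helper. This is only a sketch: build_config and its digits_only parameter are names I made up, not part of pytesseract. It assumes the usual value ranges of 0 to 3 for --oem and 0 to 13 for --psm, and it appends the same "outputbase digits" suffix used above.

```python
def build_config(oem=3, psm=6, digits_only=False):
    """Build a Tesseract config string such as '--oem 3 --psm 6'."""
    if oem not in range(4):
        raise ValueError("oem must be between 0 and 3")
    if psm not in range(14):
        raise ValueError("psm must be between 0 and 13")
    parts = ["--oem {}".format(oem), "--psm {}".format(psm)]
    if digits_only:
        parts.append("outputbase digits")  # same suffix as in the program above
    return " ".join(parts)


print(build_config(digits_only=True))  # --oem 3 --psm 6 outputbase digits
```

The result would then be passed as config=build_config(digits_only=True) to pytesseract.image_to_data().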
OK, that's it for this small project. See you next time.