Offline video OCR

Linux (Debian/Ubuntu) installation method:

sudo apt-get install libleptonica-dev libtesseract-dev
sudo apt-get install tesseract-ocr-chi-sim
python -m pip install video-ocr

Windows installation method:
Download and install Tesseract:
https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-5.3.3.20231005.exe

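Download and install the prebuilt tesserocr wheel (the cp311 build below requires 64-bit Python 3.11), then clone video-ocr and open its setup.py: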

wget https://github.com/simonflueckiger/tesserocr-windows_build/releases/download/tesserocr-v2.6.0-tesseract-5.3.1/tesserocr-2.6.0-cp311-cp311-win_amd64.whl
pip install tesserocr-2.6.0-cp311-cp311-win_amd64.whl
git clone https://github.com/PinkFloyded/video-ocr.git
cd video-ocr
notepad setup.py

Remove the version pins from install_requires so that it reads as follows:

    install_requires=[
        "tesserocr",
        "scipy",
        "opencv-python",
        "numpy",
        "tqdm",
        "click",
        "Pillow",
    ],
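Removing the pins by hand is easy for seven entries, but the same transformation can be sketched in Python. This is only an illustration: the helper name strip_version_pin and the pinned versions in the example are made up for this sketch and are not taken from the project's setup.py.

```python
import re


def strip_version_pin(requirement: str) -> str:
    """Return only the package name from a requirement string.

    Everything from the first version operator (==, >=, ~=, ...),
    environment marker (;) or space onwards is dropped.
    """
    return re.split(r"[<>=!~; ]", requirement.strip(), maxsplit=1)[0]


# Hypothetical pinned entries, used only to show the effect.
pinned = ["tesserocr==2.5.1", "numpy>=1.19", "opencv-python ~= 4.5", "Pillow"]
unpinned = [strip_version_pin(r) for r in pinned]
```

The resulting list contains bare package names, which is the form install_requires should have after the edit.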

Then install:

python setup.py install

If you encounter the following error:

RuntimeError: Failed to init API, possibly an invalid tessdata path: ./

You need to set the environment variable TESSDATA_PREFIX to the tessdata directory of the Tesseract installation, for example C:\Program Files\Tesseract-OCR\tessdata\
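If you would rather set the variable from Python than in the system settings, a minimal sketch could look like this. The function names are invented for this example. It assumes the variable is set before tesserocr is imported, which is the safe order, and it also reports which languages the directory provides.

```python
import os
from pathlib import Path


def available_languages(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def ensure_tessdata_prefix(tessdata_dir):
    """Point TESSDATA_PREFIX at tessdata_dir for the current process.

    Raises FileNotFoundError if the directory holds no language data,
    which is the situation behind the "invalid tessdata path" error.
    """
    langs = available_languages(tessdata_dir)
    if not langs:
        raise FileNotFoundError(f"no .traineddata files in {tessdata_dir}")
    os.environ["TESSDATA_PREFIX"] = str(tessdata_dir)
    return langs


# Example call (path taken from the article; adjust to your installation):
# ensure_tessdata_prefix(r"C:\Program Files\Tesseract-OCR\tessdata")
```

If chi_sim is missing from the returned list, the Chinese language data has not been installed in that directory.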

By default, video-ocr only recognizes English, so the installed module has to be edited to recognize Chinese.

Find where the module is installed:

Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import video_ocr
>>> video_ocr.__file__
'/home/catcatyu/.local/lib/python3.10/site-packages/video_ocr.py'
>>>
nano /home/catcatyu/.local/lib/python3.10/site-packages/video_ocr.py
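The same path can be found without an interactive session. The sketch below uses importlib from the standard library. The helper name module_path is made up for this example.

```python
import importlib.util


def module_path(name):
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


# For the package in this article you would call:
# print(module_path("video_ocr"))
```

Unlike importing the module and reading __file__, find_spec only locates the file and does not execute it.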

Modify line 124 (the image_to_text call inside _ocr) to add the lang="chi_sim" parameter:

def _ocr(frame):
    pil_image = Image.fromarray(frame.image)
    text = tesserocr.image_to_text(pil_image, lang="chi_sim")  # this line
    frame.text = text
    pbar.update()
    return frame
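Hard-coding chi_sim means editing the file again for every other language. One possible variation, shown here only as a sketch, reads the language from an environment variable. VIDEO_OCR_LANG is a name invented for this example; video-ocr itself does not read it.

```python
import os


def ocr_language(default="eng"):
    """Return the Tesseract language code to use for OCR.

    VIDEO_OCR_LANG is a hypothetical variable chosen for this sketch.
    Tesseract accepts combined codes such as "chi_sim+eng".
    """
    return os.environ.get("VIDEO_OCR_LANG", default)


# Inside video_ocr.py the patched call would then read:
#     text = tesserocr.image_to_text(pil_image, lang=ocr_language())
```

With this change, setting VIDEO_OCR_LANG to chi_sim+eng before running video-ocr would recognize mixed Chinese and English text, provided both language packs are installed.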

Then run:

video-ocr --sample_rate 10 1.mp4

After this, the Chinese text in the video is recognized.

Use the --sample_rate parameter to improve accuracy. The larger the value, the better the result, though processing takes longer.
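To see why a larger value helps, here is a rough illustration. It assumes that --sample_rate means the number of frames examined per second of video (check video-ocr --help to confirm). The function is a made-up model of frame sampling, not the code video-ocr actually uses.

```python
def sampled_frame_indices(total_frames, fps, sample_rate):
    """Return the indices of frames that would be examined.

    Assumes sample_rate is "frames examined per second"; the step
    between examined frames is never smaller than one frame.
    """
    step = max(1, round(fps / sample_rate))
    return list(range(0, total_frames, step))


# A 3-second clip at 30 fps:
few = sampled_frame_indices(90, 30, 1)    # one frame per second
many = sampled_frame_indices(90, 30, 10)  # ten frames per second
```

Under this model, a rate of 1 examines 3 frames of the clip while a rate of 10 examines 30, so text that is on screen only briefly is much less likely to be skipped.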


Origin blog.csdn.net/fjh1997/article/details/134336025