rapid_latex_ocr: A faster, easier-to-use tool for converting formula images to LaTeX

Rapid Latex OCR


  • rapid_latex_ocr is a tool for converting formula images to LaTeX.
  • The inference code in this repository is adapted from LaTeX-OCR. All models have been converted to ONNX format and the inference code has been simplified, which makes inference faster and deployment easier.
  • The repository contains only inference code for the ONNX models, running on ONNXRuntime or OpenVINO; it does not include training code. To train your own model, see LaTeX-OCR.
  • If you find it helpful, please give it a star ⭐ or sponsor a cup of coffee (via the Sponsor link at the top of the page).
  • Contributions that make this tool better are welcome.

Usage

  1. Install

    1. Install the rapid_latex_ocr library with pip. Because bundling the models into the whl package would exceed the PyPI size limit (100M), the models must be downloaded separately.

      pip install rapid_latex_ocr
      
    2. Download the models (Google Drive | Baidu Netdisk). When initializing, specify the model paths; see the next part for details.

      model name            size
      image_resizer.onnx    37.1M
      encoder.onnx          84.8M
      decoder.onnx          48.5M
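
Because the models ship separately from the wheel, it can be worth checking that they are all in place before initializing. The sketch below is one possible way to do that with the standard library only. The helper name `missing_model_files` and the single `models/` directory layout are assumptions made for illustration (the file names come from the table above and from the usage example); none of this is part of rapid_latex_ocr itself.

```python
from pathlib import Path
from typing import List

# File names taken from the download table and the usage example;
# keeping them all in one directory is an assumption.
REQUIRED_FILES = (
    "image_resizer.onnx",
    "encoder.onnx",
    "decoder.onnx",
    "tokenizer.json",
)


def missing_model_files(model_dir: str) -> List[str]:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("models")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```

An empty list means every expected file was found, so the paths can be passed on as-is.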
  2. Use

    • In a script:
      from rapid_latex_ocr import LatexOCR
      
      image_resizer_path = 'models/image_resizer.onnx'
      encoder_path = 'models/encoder.onnx'
      decoder_path = 'models/decoder.onnx'
      tokenizer_json = 'models/tokenizer.json'
      model = LatexOCR(image_resizer_path=image_resizer_path,
                      encoder_path=encoder_path,
                      decoder_path=decoder_path,
                      tokenizer_json=tokenizer_json)
      
      img_path = "tests/test_files/6.png"
      with open(img_path, "rb") as f:
          data = f.read()
      
      result, elapse = model(data)
      
      print(result)
      # {\frac{x^{2}}{a^{2}}}-{\frac{y^{2}}{b^{2}}}=1
      
      print(elapse)
      # 0.4131628000000003
      
    • On the command line:
      $ rapid_latex_ocr -h
      usage: rapid_latex_ocr [-h] [-img_resizer IMAGE_RESIZER_PATH]
                          [-encdoer ENCODER_PATH] [-decoder DECODER_PATH]
                          [-tokenizer TOKENIZER_JSON]
                          img_path
      
      positional arguments:
      img_path              Only img path of the formula.
      
      optional arguments:
      -h, --help            show this help message and exit
      -img_resizer IMAGE_RESIZER_PATH, --image_resizer_path IMAGE_RESIZER_PATH
      -encdoer ENCODER_PATH, --encoder_path ENCODER_PATH
      -decoder DECODER_PATH, --decoder_path DECODER_PATH
      -tokenizer TOKENIZER_JSON, --tokenizer_json TOKENIZER_JSON
      
      $ rapid_latex_ocr tests/test_files/6.png \
          -img_resizer models/image_resizer.onnx \
          -encoder models/encoder.onnx \
          -decoder models/decoder.onnx \
          -tokenizer models/tokenizer.json
      # ('{\\frac{x^{2}}{a^{2}}}-{\\frac{y^{2}}{b^{2}}}=1', 0.47902780000000034)
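
The command-line output shown above looks like the printed form of a Python tuple, so a calling script could turn a captured line back into values. The following is a minimal sketch under that assumption: `parse_cli_output` is a hypothetical helper, and it assumes the tool prints exactly one `(latex, elapse)` tuple per run, as in the sample.

```python
import ast
from typing import Tuple


def parse_cli_output(line: str) -> Tuple[str, float]:
    """Parse one printed "(latex, elapse)" tuple back into Python values."""
    # literal_eval only accepts Python literals, so it never executes code.
    value = ast.literal_eval(line.strip())
    if not (isinstance(value, tuple) and len(value) == 2):
        raise ValueError(f"unexpected CLI output: {line!r}")
    latex, elapse = value
    return str(latex), float(elapse)


if __name__ == "__main__":
    # Sample line copied from the command-line example above.
    sample = r"('{\\frac{x^{2}}{a^{2}}}-{\\frac{y^{2}}{b^{2}}}=1', 0.47902780000000034)"
    latex, elapse = parse_cli_output(sample)
    print(latex)   # LaTeX with single backslashes
    print(elapse)
```

`ast.literal_eval` is used instead of `eval` so that unexpected output raises an error rather than being executed.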
      
  3. Input and output description

    • Input (Union[str, Path, bytes]): an image that contains only a formula.
    • Output (Tuple[str, float]): (recognition result, elapsed time), as in the following example:
      (
         '{\\frac{x^{2}}{a^{2}}}-{\\frac{y^{2}}{b^{2}}}=1',
         0.47902780000000034
      )
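
Since the call returns a plain `(str, float)` tuple, downstream code can unpack it directly. The sketch below loops over a folder of formula images and collects the recognized LaTeX. `recognize_folder` and its `ocr` parameter are hypothetical names; `ocr` is assumed to be any callable that behaves like the `LatexOCR` instance in the script example (image bytes in, `(latex, elapse)` out).

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

# Assumed call shape, matching the example output: bytes in, (latex, elapse) out.
OcrFn = Callable[[bytes], Tuple[str, float]]


def recognize_folder(ocr: OcrFn, folder: str, pattern: str = "*.png") -> Dict[str, str]:
    """Run ocr on every matching image and return {file name: LaTeX string}."""
    results: Dict[str, str] = {}
    # Sort so that the processing order is deterministic.
    for img_path in sorted(Path(folder).glob(pattern)):
        latex, elapse = ocr(img_path.read_bytes())
        print(f"{img_path.name}: {elapse:.3f}")
        results[img_path.name] = latex
    return results
```

With the models downloaded, something like `recognize_folder(model, "tests/test_files")` would pass the `model` object from the script example as `ocr`. Taking the recognizer as a parameter also makes it easy to substitute a stub when the models are not available.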
      

For details, see RapidLatexOCR.


Origin blog.csdn.net/shiwanghualuo/article/details/131745242