A guide to Pix2Text, a powerful formula recognition tool for Linux systems

Installing and configuring Pix2Text (formula recognition) on Ubuntu

1. Introducing Pix2Text

  Pix2Text is a free, open-source formula recognition tool written in Python. It has a built-in text recognition module and a formula recognition module, so it can be used both as a formula recognition tool and as a general text recognition tool.

  Pix2Text’s official code base:

2. Main problems solved in this tutorial

Build an environment on a Linux system in which formula recognition and text recognition can be triggered easily with a keyboard shortcut plus the mouse.

3. Install Pix2Text and pyperclip

  Installing Pix2Text on a Linux system is very simple. First, open a terminal and activate your Conda Python virtual environment (skip this step if you do not use Conda).

conda activate your_env_name

  Then install the pix2text package. Douban's mirror is used here; other domestic open-source mirrors also work (Tsinghua University's mirror was slower in my test).

pip install pix2text -i https://pypi.doubanio.com/simple

  The steps above are usually enough to install pix2text. The exception is a system without the gcc compiler, in which case the pix2text installation fails. The fix is simple: install gcc with the command below.

sudo apt install gcc

  Finally, re-run the pix2text installation command to install the package successfully.

pip install pix2text -i https://pypi.doubanio.com/simple

  The pyperclip package can be installed directly in the Conda virtual environment (the same command applies if you do not use Conda).

pip install pyperclip -i https://pypi.doubanio.com/simple
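
  To confirm that both packages are visible in the active environment, you can ask Python whether it can locate them. This is only a minimal sketch and not part of the original workflow; the helper name missing_packages is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Check the two packages installed in this section.
    missing = missing_packages(["pix2text", "pyperclip"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("pix2text and pyperclip are both installed.")
```

  Run it inside the same Conda environment you activated above; a package listed as not installed usually means pip ran in a different environment.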

4. Model download

  The first time pix2text is called, it downloads the model files automatically. However, the model files are hosted on GitHub, so the download can be very slow (the speed fluctuated between 0 and 8k in my case). The workaround is to download the model files from the Baidu cloud disk link provided by the pix2text developer: https://pan.baidu.com/s/1kubZF4JGE19d98NDoPHJzQ?pwd=p2t0#list/path=%2F , extraction code: p2t0.

  There are 3 model files in total: mobilenet_v2.zip, weights.pth and image_resizer.pth. According to official guidance:

  • mobilenet_v2.zip: place the folder obtained by decompressing mobilenet_v2.zip in the ~/.pix2text directory;

  • weights.pth and image_resizer.pth: place both files in the ~/.pix2text/formula directory.
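
  The placement rules above can also be scripted. The sketch below is an illustration under stated assumptions: the three files have already been downloaded into one directory, and mobilenet_v2.zip contains a single top-level folder. The function name place_model_files and its parameters are hypothetical, not part of Pix2Text.

```python
import shutil
import zipfile
from pathlib import Path


def place_model_files(download_dir, model_root=None):
    """Unpack and copy the three downloaded files into the layout described above."""
    download_dir = Path(download_dir)
    # Default to ~/.pix2text, the directory named in the official guidance.
    model_root = Path(model_root) if model_root else Path.home() / ".pix2text"
    formula_dir = model_root / "formula"
    formula_dir.mkdir(parents=True, exist_ok=True)

    # mobilenet_v2.zip: extract it so the resulting folder sits inside model_root.
    with zipfile.ZipFile(download_dir / "mobilenet_v2.zip") as archive:
        archive.extractall(model_root)

    # weights.pth and image_resizer.pth go into the formula subdirectory.
    for name in ("weights.pth", "image_resizer.pth"):
        shutil.copy2(download_dir / name, formula_dir / name)

    return model_root
```

  For example, place_model_files("/path/to/downloads") would populate ~/.pix2text; pass a second argument to try it against a scratch directory first.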

5. Workflow built on Screenshot, Shell, pyperclip and notify-send: screenshot -> formula recognition -> recognition result written to the clipboard -> desktop pop-up announcing task completion

  1. Save the code below as main.py

    # coding: utf-8

    import os
    import pyperclip as pc
    from pix2text import Pix2Text

    # Path and file name of the image to recognize
    img_fp = '/home/haijian/Python项目/Pix2Text/formula.png'
    # Initialize Pix2Text
    p2t = Pix2Text(analyzer_config=dict(model_name='mfd'))
    # Recognize the image
    outs = p2t.recognize(img_fp)
    # If only the recognized text and LaTeX are needed, the line below merges all results
    only_text = '\n'.join([out['text'] for out in outs])
    # Copy the recognition result to the system clipboard
    pc.copy(only_text)
    # Announce completion through a Linux desktop notification
    os.system('notify-send "Pix2Text" "Formula recognition finished; the result has been copied to the clipboard." -i /home/haijian/Python项目/Pix2Text/clipboard.svg -t 1000 ')
    
    
  2. Save the code below as run.sh

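    # Take a screenshot of a selected area
    gnome-screenshot -abpf /home/haijian/Python项目/Pix2Text/formula.png # path where the screenshot is saved
    source $HOME/miniconda3/etc/profile.d/conda.sh # location of Conda's conda.sh file
    # Activate the Python environment in Conda
    conda activate dailyuse
    # Change the working directory to the Python project directory
    cd /home/haijian/Python项目/Pix2Text
    # Run the Python script
    python main.py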
    
  3. Download an icon from https://feathericons.com/ : select the icon named clipboard and download it.

  4. Place main.py, run.sh and clipboard.svg in the same directory, and change the absolute paths in the two scripts to match the absolute path of that directory.

  5. Open System Settings, go to Keyboard Shortcuts, click the + button shown in the picture below, and enter the following in the command field: bash /home/haijian/Python项目/Pix2Text/run.sh
    You can choose any name you like and set the shortcut keys to suit your own habits. You can now press your shortcut to run formula recognition or text recognition.
    [Image] [Image]
    Pix2Text
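
  The project directory in this tutorial contains non-ASCII characters, and such paths (or paths with spaces) are easy to misquote when a command is passed to the shell as one string. As a hedged alternative to the os.system call in main.py, the sketch below passes notify-send its arguments as a list. It assumes, as main.py does, that each recognition result is a dict with a 'text' key; the helper names join_texts, build_notify_command and notify are hypothetical.

```python
import shutil
import subprocess


def join_texts(outs):
    """Merge recognition results, assuming each item is a dict with a 'text' key."""
    return "\n".join(out["text"] for out in outs)


def build_notify_command(title, message, icon=None, timeout_ms=1000):
    """Build the notify-send argument list (-t: expire time in ms, -i: icon path)."""
    command = ["notify-send", "-t", str(timeout_ms)]
    if icon:
        command += ["-i", icon]
    return command + [title, message]


def notify(title, message, icon=None, timeout_ms=1000):
    """Send a desktop notification; return False when notify-send is not installed."""
    if shutil.which("notify-send") is None:
        return False
    # Passing a list avoids shell quoting problems with unusual paths.
    subprocess.run(build_notify_command(title, message, icon, timeout_ms), check=False)
    return True
```

  In main.py, pc.copy(join_texts(outs)) followed by notify("Pix2Text", "Done", icon="/path/to/clipboard.svg") would then replace the last two statements; the icon path here is a placeholder.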

6. Final results

  1. Formula-only recognition
    (1) Original picture:
    Insert image description here
    (2) Recognition result:

    \bar{n}(\varepsilon)=\frac{1}{\Omega_{R}}\sum_{n=0}^{\infty}n x^{n}=(1-x)\,x\frac{d}{d x}(1-x)^{-1}=\frac{x}{1-x}
    

    (3) Rendered result:
    $$
    \bar{n}(\varepsilon)=\frac{1}{\Omega_{R}}\sum_{n=0}^{\infty}n x^{n}=(1-x)\,x\frac{d}{d x}(1-x)^{-1}=\frac{x}{1-x}
    $$

  2. Plain-text recognition
    Original picture:
    [Image]
    Recognition result:

    82.3 
    完全电离混合气体
    在恒星内部绝大部分地方,温度非常高而压强非常大,造成物质发生电离。这
    会使得轻的元素失去全部电子,而重的元素失去大部分电子。电离出来的电子造成
    当地的自由粒子数日大大增加,将对恒星物质的热力学性质产生显著影响
    

    Rendered result (shown in English translation):
    82.3
    Completely ionized mixed gas
    In most places inside the star, the temperature is very high and the pressure is very high, causing ionization of matter. This
    causes light elements to lose all their electrons and heavy elements to lose most of their electrons. The ionized electrons cause
    a significant increase in local free particles, which will have a significant impact on the thermodynamic properties of stellar matter.

  3. Mixed formula and text recognition
    Original picture:
    [Image]
    Recognition result:

    其中c是真空中的光速,将方程(2.40)用原来的自变量写出,就给出了光子气体的
    普朗克(Planck)分布函数
    $$
    \bar{n}=\frac{1}{e^{\varepsilon/k T}-1}=\frac{1}{e^{h\nu/k T}-1}
    $$
    (2.42)
    

    Rendered result (text shown in English translation):
    where c is the speed of light in vacuum, writing equation (2.40) with the original independent variables gives the
    Planck distribution function of the photon gas
    $$
    \bar{n}=\frac{1}{e^{\varepsilon/k T}-1}=\frac{1}{e^{h\nu/k T}-1}
    $$
    (2.42)
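
    In the mixed case the recognized string interleaves plain text with formulas wrapped in $$ delimiters. If you want to handle the two kinds of content separately after pasting, a rough sketch like the following can split them. It assumes display formulas are always delimited by a pair of $$ as in the sample above and does not handle inline $...$ formulas; the function name split_mixed_output is hypothetical.

```python
import re

# Non-greedy match of everything between a pair of $$ delimiters, across lines.
FORMULA_BLOCK = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)


def split_mixed_output(text):
    """Split recognized text into ('text', ...) and ('formula', ...) segments."""
    segments = []
    position = 0
    for match in FORMULA_BLOCK.finditer(text):
        # Plain text that precedes this formula block.
        before = text[position:match.start()].strip()
        if before:
            segments.append(("text", before))
        segments.append(("formula", match.group(1).strip()))
        position = match.end()
    # Plain text after the last formula block.
    tail = text[position:].strip()
    if tail:
        segments.append(("text", tail))
    return segments
```

    Applied to the recognition result above, this yields a text segment, one formula segment holding the LaTeX source, and a final text segment for the equation number.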

end!


Origin blog.csdn.net/Hello_Haiwan/article/details/129415142