Installing and configuring Pix2Text (formula recognition) on Ubuntu
1. Introduction to Pix2Text
Pix2Text is a free, open-source formula recognition tool written in Python. It has a built-in text recognition module and a formula recognition module, so it can serve both as a formula recognition tool and as an ordinary text recognition tool.
Pix2Text’s official code base:
2. Main problems solved in this tutorial
Build an environment on Linux in which formula recognition and text recognition can be triggered easily with a keyboard shortcut plus the mouse.
3. Install Pix2Text and pyperclip
Installing Pix2Text on a Linux system is very simple. First, open a terminal and activate your Conda Python virtual environment (skip this step if you do not use Conda).
conda activate <environment-name>
Then install the pix2text package. Douban's mirror site is used here; other domestic open-source mirror sites also work (Tsinghua University's mirror was slower in testing).
pip install pix2text -i https://pypi.doubanio.com/simple
In most cases this installs pix2text successfully. The exception is a system on which the gcc compiler is not installed, where the pix2text installation fails. The fix is simple: install gcc first, as follows.
sudo apt install gcc
Finally, re-run the pix2text installation command to install the package successfully.
pip install pix2text -i https://pypi.doubanio.com/simple
The pyperclip package can be installed directly in the Conda virtual environment (the same command applies if you do not use Conda).
pip install pyperclip -i https://pypi.doubanio.com/simple
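As a quick sanity check after the steps above, the following minimal Python sketch reports which of the required pieces cannot be found. The function name `missing_requirements` is hypothetical, and the list of tools (gcc, gnome-screenshot, notify-send) is simply taken from the commands used in this tutorial; adjust it to your own setup.

```python
import importlib.util
import shutil


def missing_requirements(packages, tools):
    """Return the names of Python packages and command-line tools that cannot be found."""
    missing = []
    for name in packages:
        # find_spec returns None when a top-level package is not importable
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for name in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_requirements(
        packages=["pix2text", "pyperclip"],
        tools=["gcc", "gnome-screenshot", "notify-send"],
    )
    print("Missing:", ", ".join(missing) if missing else "nothing")
```

Run it inside the same environment in which you installed the packages; an empty result only means the names were found, not that the installation is fully functional.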
4. Model download
The first time pix2text is called, it downloads its model files automatically. However, the model files are hosted on GitHub, so the download can be extremely slow (hovering between 0 and 8 KB/s). To work around this, download the model files from the Baidu cloud disk link provided by the pix2text developer: https://pan.baidu.com/s/1kubZF4JGE19d98NDoPHJzQ?pwd=p2t0#list/path=%2F (extraction code: p2t0).
There are 3 model files in total: mobilenet_v2.zip, weights.pth and image_resizer.pth. According to the official guidance:
- mobilenet_v2.zip: place the folder obtained by decompressing this file in the ~/.pix2text directory;
- weights.pth and image_resizer.pth: place both files in the ~/.pix2text/formula directory.
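The placement described above can also be scripted. The sketch below is only an illustration: the function name `place_model_files` and the download directory are hypothetical, and it assumes that mobilenet_v2.zip contains a single top-level folder, so that extracting it directly into ~/.pix2text produces the expected layout. Check the result against the official guidance.

```python
import shutil
import zipfile
from pathlib import Path


def place_model_files(download_dir, target_root=None):
    """Copy the three downloaded model files into the layout described above."""
    download_dir = Path(download_dir)
    # Default target is ~/.pix2text, as stated in the guidance quoted above
    target_root = Path(target_root) if target_root else Path.home() / ".pix2text"
    formula_dir = target_root / "formula"
    formula_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the archive holds one top-level folder, so extract it as-is
    with zipfile.ZipFile(download_dir / "mobilenet_v2.zip") as archive:
        archive.extractall(target_root)

    # The two .pth files go into the formula subdirectory
    for name in ("weights.pth", "image_resizer.pth"):
        shutil.copy2(download_dir / name, formula_dir / name)
    return target_root
```

For example, `place_model_files(Path.home() / "Downloads")` would be the call if the three files were saved to a Downloads folder in your home directory.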
5. Workflow built on gnome-screenshot, shell, pyperclip and notify-send: screenshot -> formula recognition -> recognition result written to clipboard -> desktop notification that the task is complete
- Save the code below as the file main.py:

    # coding: utf-8
    import os
    import pyperclip as pc
    from pix2text import Pix2Text

    # Path and file name of the image to recognize
    img_fp = '/home/haijian/Python项目/Pix2Text/formula.png'
    # Initialize Pix2Text
    p2t = Pix2Text(analyzer_config=dict(model_name='mfd'))
    # Recognize the image
    outs = p2t.recognize(img_fp)
    # If you only need the recognized text and LaTeX, the line below merges all results
    only_text = '\n'.join([out['text'] for out in outs])
    # Copy the recognition result to the system clipboard
    pc.copy(only_text)
    # Announce completion with a Linux desktop notification
    os.system('notify-send "Pix2Text" "Formula recognition complete; the result has been written to the clipboard." -i /home/haijian/Python项目/Pix2Text/clipboard.svg -t 1000 ')
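- Save the code below as the file run.sh:

    # Take an area screenshot; the path is where the screenshot is saved
    gnome-screenshot -abpf /home/haijian/Python项目/Pix2Text/formula.png
    # Load Conda's conda.sh file (adjust the path to your Conda installation)
    source $HOME/miniconda3/etc/profile.d/conda.sh
    # Activate the Python environment in conda
    conda activate dailyuse
    # Change the working directory to the Python project path
    cd /home/haijian/Python项目/Pix2Text
    # Run the Python script
    python main.py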
- Download the icon from https://feathericons.com/ : select the icon named clipboard and download it.
- Place the main.py, run.sh and clipboard.svg files in the same directory, and modify the absolute paths in the two scripts to match the absolute path of that directory.
- In System Settings, find Keyboard Shortcuts, click the + sign shown in the picture below, and enter the following in the command field:

    bash /home/haijian/Python项目/Pix2Text/run.sh

  You can choose the name according to your own preference, and the shortcut key can also be set according to your own habits. Now you can press your shortcut key to use Pix2Text for formula recognition or text recognition.
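Two small hardening ideas for the workflow above, sketched in Python. First, if the area selection is cancelled, gnome-screenshot may leave the previous formula.png in place, so checking the file's modification time before recognition avoids re-recognizing an old image. Second, passing notify-send its arguments as a list avoids shell quoting problems with paths or messages. The helper names (`screenshot_is_fresh`, `build_notify_command`, `notify`) and the 30-second threshold are hypothetical choices, not part of Pix2Text or of the original scripts.

```python
import os
import subprocess
import time


def screenshot_is_fresh(path, max_age_seconds=30.0):
    """Return True if the file exists and was modified within the last max_age_seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        # Missing or unreadable file: treat it as not fresh
        return False
    return age <= max_age_seconds


def build_notify_command(title, body, icon=None, timeout_ms=1000):
    """Build the notify-send call as an argument list, so no shell quoting is needed."""
    command = ["notify-send", title, body, "-t", str(timeout_ms)]
    if icon:
        command += ["-i", icon]
    return command


def notify(title, body, icon=None, timeout_ms=1000):
    """Show a desktop notification and return the exit code of notify-send."""
    return subprocess.run(build_notify_command(title, body, icon, timeout_ms)).returncode
```

In main.py, one could call `screenshot_is_fresh(img_fp)` before `p2t.recognize(img_fp)` and exit early when it returns False, then replace the `os.system(...)` line with a call to `notify(...)`.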
6. Final effect
- Pure formula scene recognition
(1) Original picture:
(2) Recognition result:
\bar{n}(\varepsilon)=\frac{1}{\Omega_{R}}\sum_{n=0}^{\infty}n x^{n}=(1-x)\,x\frac{d}{d x}(1-x)^{-1}=\frac{x}{1-x}
(3) Rendered result:
$$ \bar{n}(\varepsilon)=\frac{1}{\Omega_{R}}\sum_{n=0}^{\infty}n x^{n}=(1-x)\,x\frac{d}{d x}(1-x)^{-1}=\frac{x}{1-x} $$
- Plain text scene recognition
(1) Original picture:
(2) Recognition result:
82.3 完全电离混合气体 在恒星内部绝大部分地方,温度非常高而压强非常大,造成物质发生电离。这 会使得轻的元素失去全部电子,而重的元素失去大部分电子。电离出来的电子造成 当地的自由粒子数日大大增加,将对恒星物质的热力学性质产生显著影响
(3) Rendered result (English translation):
82.3 Completely ionized mixed gas
In most places inside the star, the temperature is very high and the pressure is very high, causing ionization of matter. This causes light elements to lose all their electrons and heavy elements to lose most of their electrons. The ionized electrons cause a significant increase in the local number of free particles, which will have a significant impact on the thermodynamic properties of stellar matter.
- Formula and text mixed scene recognition
(1) Original picture:
(2) Recognition result:
其中c是真空中的光速,将方程(2.40)用原来的自变量写出,就给出了光子气体的 普朗克(Planck)分布函数 $$ \bar{n}=\frac{1}{e^{\varepsilon/k T}-1}=\frac{1}{e^{h\nu/k T}-1} $$ (2.42)
(3) Rendered result (English translation):
where c is the speed of light in vacuum; writing equation (2.40) in terms of the original independent variables gives the Planck distribution function of the photon gas
$$ \bar{n}=\frac{1}{e^{\varepsilon/k T}-1}=\frac{1}{e^{h\nu/k T}-1} $$
(2.42)
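In the mixed scene above, display formulas appear in the recognized string wrapped in $$ ... $$. If only the formulas are wanted from such a result, a small post-processing step could pull them out. The following is a sketch under that assumption; the helper name `extract_display_formulas` is hypothetical and is not part of Pix2Text.

```python
import re

# Matches display formulas delimited by $$ ... $$, including across line breaks
DISPLAY_MATH = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)


def extract_display_formulas(recognized_text):
    """Return the LaTeX bodies of all $$...$$ blocks, with surrounding whitespace removed."""
    return [body.strip() for body in DISPLAY_MATH.findall(recognized_text)]
```

Applied to the string built in main.py, `extract_display_formulas(only_text)` returns a list of formula strings, which could be joined and copied to the clipboard in place of the full text.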