Current Docker container configuration:
- CentOS 6.8
- Python 2.6.6
- OpenSSL 1.0.1
Target Docker container configuration:
- CentOS 6.8
- Python 3.7.4
- OpenSSL 1.1.1
- selenium 3.141.0
- geckodriver 0.15.0
- firefox 52
- Pillow 6.1.0
- pytesseract 0.2.7
Install the build dependencies
yum install -y zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel libffi-devel gcc gcc-c++ make wget git unzip libjpeg-devel libpng-devel giflib-devel
Create a directory to hold the downloaded packages
mkdir /usr/local/download
cd /usr/local/download
Install Python 3.7.4
1. Install a newer version of OpenSSL
A problem you may run into: CentOS 6.8 ships OpenSSL 1.0.1 by default, but Python 3.7 requires at least OpenSSL 1.0.2 for its ssl module, so install a newer OpenSSL first.
cd /usr/local/download
wget http://www.openssl.org/source/openssl-1.1.1.tar.gz
tar -zxvf openssl-1.1.1.tar.gz
cd openssl-1.1.1
./config --prefix=/usr/local/openssl shared zlib
make && make install
Set the environment variable LD_LIBRARY_PATH
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openssl/lib' >> $HOME/.bash_profile
source $HOME/.bash_profile
This step is required! The LD_LIBRARY_PATH environment variable lists extra directories, beyond the default ones, in which to search for shared libraries. When a program loads a dynamically linked .so file that is not in the default directories /lib and /usr/lib, the directory containing it must be listed in LD_LIBRARY_PATH.
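To confirm that the setting took effect in the current shell, a small check along these lines may help. It is only a sketch: the helper name in_library_path is made up for illustration, and it inspects the value of the variable rather than asking the dynamic linker itself.

```python
import os


def in_library_path(directory, value=None):
    """Return True if `directory` is listed in an LD_LIBRARY_PATH-style string.

    `value` defaults to the current LD_LIBRARY_PATH; entries are separated by ':'.
    """
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    wanted = os.path.normpath(directory)
    # Skip empty entries, which appear when the variable was previously unset.
    entries = [os.path.normpath(p) for p in value.split(":") if p]
    return wanted in entries


# Prints True once the export line above has been sourced.
print(in_library_path("/usr/local/openssl/lib"))
```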
2. Build and install Python from source
cd /usr/local/download
wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz
tar -xvf Python-3.7.4.tgz
cd Python-3.7.4
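# Configure: be sure to point at the OpenSSL 1.1.1 installed above!
./configure --prefix=/usr/local/python-3.7 --with-openssl=/usr/local/openssl
# Optimized build (this would build and install under /usr/local/bin/, with no need for symlinks or environment variables)
#./configure --enable-optimizations (do NOT run this; afterwards Python raises errors when importing ssl and related packages)
# Build and install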
make && make install
# Back up the original python
mv /usr/bin/python /usr/bin/python.bak
# Create symlinks
ln -s /usr/local/python-3.7/bin/python3 /usr/bin/python
ln -s /usr/local/python-3.7/bin/pip3 /usr/bin/pip
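Once the new interpreter is reachable as python, it is worth confirming that its ssl module imports and is linked against the new OpenSSL. The sketch below uses only the standard library; the helper name openssl_ok is illustrative. If the build did not pick up OpenSSL, the import on the first line fails with "No module named _ssl".

```python
import ssl
import sys


def openssl_ok(minimum=(1, 0, 2)):
    """Return True if this interpreter's ssl module is linked against
    an OpenSSL release at least as new as `minimum`."""
    # OPENSSL_VERSION_INFO is a 5-tuple; the first three fields are compared.
    return ssl.OPENSSL_VERSION_INFO[:3] >= tuple(minimum)


print(sys.version.split()[0])  # interpreter version
print(ssl.OPENSSL_VERSION)     # OpenSSL version string reported by the ssl module
print(openssl_ok())
```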
# Edit the yum script (yum is written in Python 2)
vi /usr/bin/yum
Change python on the first line to python2.6
If /usr/libexec/urlgrabber-ext-down exists, change python on its first line to python2.6 as well
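The same first-line change can also be scripted instead of done by hand in vi. The following is a minimal sketch, assuming the interpreter lives at /usr/bin/python2.6 and that the script is run as root; pin_shebang and pin_file are hypothetical helper names, and it is sensible to back up the files first.

```python
def pin_shebang(script_text, interpreter="/usr/bin/python2.6"):
    """Return `script_text` with its first-line shebang pointing at `interpreter`.

    Text that does not start with '#!' is returned unchanged.
    """
    first, sep, rest = script_text.partition("\n")
    if not first.startswith("#!"):
        return script_text
    return "#!" + interpreter + sep + rest


def pin_file(path, interpreter="/usr/bin/python2.6"):
    """Rewrite the shebang of the script at `path` in place."""
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(pin_shebang(text, interpreter))
```

Calling pin_file("/usr/bin/yum"), and pin_file("/usr/libexec/urlgrabber-ext-down") if that file exists, applies the edit described above.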
# Configure the pip index (Douban mirror)
[root@localhost ~]# cd
[root@localhost ~]# mkdir .pip
[root@localhost ~]# cd .pip
[root@localhost .pip]# vi pip.conf
# Write the following content:
[global]
index-url=http://pypi.douban.com/simple
trusted-host = pypi.douban.com
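When building a container image, editing the file in vi is not practical, so the same pip.conf can be generated from Python. This is a sketch using the standard configparser module; write_pip_conf is a name chosen for illustration.

```python
import configparser
import os


def write_pip_conf(path, index_url, trusted_host):
    """Write a minimal pip.conf with a [global] section to `path`."""
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url, "trusted-host": trusted_host}
    directory = os.path.dirname(path)
    if directory:
        # Create ~/.pip (or whichever parent directory) if it is missing.
        os.makedirs(directory, exist_ok=True)
    with open(path, "w") as handle:
        config.write(handle)
```

For the settings shown above, the call would be write_pip_conf(os.path.expanduser("~/.pip/pip.conf"), "http://pypi.douban.com/simple", "pypi.douban.com").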
Install tesseract
# Install leptonica first
cd /usr/local/download
wget http://www.leptonica.org/source/leptonica-1.72.tar.gz
tar xvzf leptonica-1.72.tar.gz
cd leptonica-1.72/
./configure
make && make install
# Install tesseract
cd /usr/local/download
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.zip
unzip 3.04.zip
cd tesseract-3.04/
./configure
make && make install
ldconfig
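# Install pytesseract with pip
pip install pytesseract
# Install the language data
Download the model file for the language you need from https://github.com/tesseract-ocr/tessdata
Since only phone numbers and English need to be recognized for now, downloading the single file eng.traineddata is enough.
Move the model file to /usr/local/share/tessdata
Recognition then works.
# Example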
import pytesseract
from PIL import Image
image = Image.open('bb.png')
code = pytesseract.image_to_string(image)
print(code)
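Since the goal here is to read phone numbers, the raw OCR text usually needs a little cleanup. The sketch below post-processes the string returned by pytesseract.image_to_string. It assumes 11-digit mainland China mobile numbers starting with 1, which is an assumption of this example and not something the OCR engine enforces; extract_phone_numbers is a hypothetical helper.

```python
import re

# Assumption: mobile numbers are exactly 11 digits and start with 1.
PHONE_PATTERN = re.compile(r"(?<!\d)1\d{10}(?!\d)")


def extract_phone_numbers(ocr_text):
    """Return the 11-digit phone numbers found in OCR output.

    Spaces and hyphens that sit between two digits are dropped before matching,
    because OCR often splits a number into groups.
    """
    compact = re.sub(r"(?<=\d)[ \-]+(?=\d)", "", ocr_text)
    return PHONE_PATTERN.findall(compact)


print(extract_phone_numbers("Tel: 138 0013 8000\nname: Alice"))  # ['13800138000']
```

Note that joining digit groups also merges two unrelated numbers separated only by a space, so this works best on images that contain a single number.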
Install selenium + firefox + geckodriver
Install selenium
pip install selenium
# Check the version
pip show selenium
Install geckodriver
cd /usr/local/download
wget https://github.com/mozilla/geckodriver/releases/download/v0.15.0/geckodriver-v0.15.0-linux64.tar.gz
tar xvzf geckodriver-*.tar.gz
rm -f /usr/bin/geckodriver
# The symlink must use an absolute path
ln -s /usr/local/download/geckodriver /usr/bin/geckodriver
Install firefox
cd /usr/local/download
wget http://www.rpmfind.net/linux/centos/6.10/os/x86_64/Packages/firefox-52.8.0-1.el6.centos.x86_64.rpm
yum install -y firefox-52.8.0-1.el6.centos.x86_64.rpm
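Before wiring up selenium, it can save time to confirm that both executables are actually reachable on PATH. A minimal sketch using only the standard library follows; locate_tools is an illustrative name.

```python
import shutil


def locate_tools(names=("firefox", "geckodriver")):
    """Map each executable name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_tools().items():
    print(name, "->", path or "NOT FOUND")
```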
Install Chinese fonts
# Create a font directory named chinese:
mkdir /usr/share/fonts/chinese
# Upload the fonts from C:\windows\fonts\ on a Windows system drive directly into /usr/share/fonts/chinese on CentOS
chmod -R 755 /usr/share/fonts/chinese
yum -y install ttmkfdir
ttmkfdir -e /usr/share/X11/fonts/encodings/encodings.dir
# Edit the font directory list in fonts.conf and add the location of the new Chinese fonts:
vi /etc/fonts/fonts.conf
<dir>/usr/share/fonts/chinese</dir>
# Refresh the in-memory font cache so that no reboot is needed:
fc-cache
# Finally, check the font list again with fc-list:
fc-list
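The full fc-list output can be long, so a script may prefer to ask fontconfig only for fonts that declare Chinese coverage. The sketch below assumes the command fc-list :lang=zh family is available and prints one comma-separated list of family names per font; parse_families and chinese_font_families are hypothetical helpers.

```python
import subprocess


def parse_families(fc_list_output):
    """Collect the distinct family names from `fc-list <pattern> family` output.

    Each output line lists one font's family names separated by commas.
    """
    families = set()
    for line in fc_list_output.splitlines():
        for name in line.split(","):
            name = name.strip()
            if name:
                families.add(name)
    return sorted(families)


def chinese_font_families():
    """Ask fontconfig for fonts that declare Chinese language coverage."""
    output = subprocess.check_output(["fc-list", ":lang=zh", "family"])
    return parse_families(output.decode("utf-8", "replace"))
```

An empty list from chinese_font_families() suggests the new font directory was not picked up by fc-cache.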
Install xvfb
Xvfb is a handy tool on Linux: an X server that can run without physical display hardware or input devices.
a. Install the required packages
[cat@localhost ~]# sudo yum install xdg-utils xorg-x11-server-Xvfb xorg-x11-xkb-utils
b. Install the Python bindings for xvfb
[cat@localhost ~]# pip install xvfbwrapper pyvirtualdisplay
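With Xvfb and its bindings in place, Firefox can be driven inside the container without a real screen. The following is a sketch under some assumptions: it uses the Selenium 3 style executable_path argument, takes /usr/bin/geckodriver from the symlink created earlier, and wraps everything in a made-up helper called page_title.

```python
def page_title(url, display_size=(1280, 800)):
    """Open `url` in Firefox on a virtual X display and return the page title."""
    # Imported here so the function can be defined before the packages are installed.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    display = Display(visible=0, size=display_size)
    display.start()
    try:
        driver = webdriver.Firefox(executable_path="/usr/bin/geckodriver")
        try:
            driver.get(url)
            return driver.title
        finally:
            # Always close the browser, even if loading the page failed.
            driver.quit()
    finally:
        # Shut down the Xvfb process started above.
        display.stop()
```

A call such as page_title("https://www.python.org") then runs the whole round trip; the nested try/finally blocks make sure neither Firefox nor Xvfb is left running.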
Install the remaining required packages with pip
# Install packages
pip install requests
pip install Pillow
pip install httplib2
pip install excel
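After the installs finish, a quick way to see whether anything is missing is to ask the import system directly. This sketch uses only the standard library; missing_modules is an illustrative name, and the list holds import names, which can differ from the pip package names.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Pillow is imported as PIL; the other import names match their pip names.
wanted = ["requests", "PIL", "httplib2", "selenium", "pytesseract"]
print(missing_modules(wanted))
```

An empty list means every module in wanted can be imported by the new interpreter.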
Reference:
Fixing "No module named _ssl" when installing Python 3.7 on CentOS