In Linux write reptiles (a) in Python

 Reference books: "Python3 web crawler developed combat" in April 2018 the first edition

System: Ubuntu 18.04.2 LTS

Background: The Tesseract has been installed as well as multi-language pack tessdata

Installation command: pip3 install tesserocr pillow

Error:

Collecting tesserocr
Using cached https://files.pythonhosted.org/packages/92/2d/05a7f8387e93c192919b508e4f4936f232bd3d2ca388b9130ae538a9f9ad/tesserocr-2.4.0.tar.gz
Collecting pillow
Using cached https://files.pythonhosted.org/packages/d2/c2/f84b1e57416755e967236468dcfb0fad7fd911f707185efc4ba8834a1a94/Pillow-6.0.0-cp36-cp36m-manylinux1_x86_64.whl
Building wheels for collected packages: tesserocr
Running setup.py bdist_wheel for tesserocr ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-n7t6st2b/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpn73hfamcpip-wheel- --python-tag cp36:
Supporting tesseract v4.0.0-beta.1
Configs from pkg-config: {'include_dirs': ['/usr/include'], 'libraries': ['lept', 'tesseract'], 'cython_compile_time_env': {'TESSERACT_VERSION': 60397825}}
/usr/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'tesserocr' extension
creating build
creating build/temp.linux-x86_64-3.6
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include -I/usr/include/python3.6m -c tesserocr.cpp -o build/temp.linux-x86_64-3.6/tesserocr.o -std=c++11 -DUSE_STD_NAMESPACE
tesserocr.cpp:42:10: fatal error: Python.h: No such file or directory
#include "Python.h"
^~~~~~~~~~
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

----------------------------------------
Failed building wheel for tesserocr
Running setup.py clean for tesserocr
Failed to build tesserocr
Installing collected packages: tesserocr, pillow
Running setup.py install for tesserocr ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-n7t6st2b/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-7bsa_hbd-record/install-record.txt --single-version-externally-managed --compile --user --prefix=:
Supporting tesseract v4.0.0-beta.1
Configs from pkg-config: {'include_dirs': ['/usr/include'], 'libraries': ['lept', 'tesseract'], 'cython_compile_time_env': {'TESSERACT_VERSION': 60397825}}
/usr/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'tesserocr' extension
creating build
creating build/temp.linux-x86_64-3.6
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include -I/usr/include/python3.6m -c tesserocr.cpp -o build/temp.linux-x86_64-3.6/tesserocr.o -std=c++11 -DUSE_STD_NAMESPACE
tesserocr.cpp:42:10: fatal error: Python.h: No such file or directory
#include "Python.h"
^~~~~~~~~~
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

 

Solution: Replace the new installation command  sudo apt install tesseract-ocr

(PS: this difference with the original version of the book version of the command might be, this version is not a pillow friendly version.)

(PPS: PillowPillow is the friendly PIL fork by Alex Clark and Contributors. PIL is thePython Imaging Library by Fredrik Lundh and Contributors.)

 

It reads as follows:

Linux
To install Tesseract 4.x you can simply run the following command on your Ubuntu 18.xx bionic:

sudo apt install tesseract-ocr
If you wish to install the Developer Tools which can be used for training, run the following command:

sudo apt install libtesseract-dev
The following instructions are for building on Linux, which also can be applied to other UNIX like operating systems.

Dependencies
A compiler for C and C++: GCC or Clang
GNU Autotools: autoconf, automake, libtool
pkg-config
Leptonica
libpng, libjpeg, libtiff

Ubuntu
If they are not already installed, you need the following libraries (Ubuntu 16.04/14.04):

sudo apt-get install g++ # or clang++ (presumably)
sudo apt-get install autoconf automake libtool
sudo apt-get install pkg-config
sudo apt-get install libpng-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev
if you plan to install the training tools, you also need the following libraries:

sudo apt-get install libicu-dev
sudo apt-get install libpango1.0-dev
sudo apt-get install libcairo2-dev

 

Original Address:  https://github.com/tesseract-ocr/tesseract/wiki/Compiling

Guess you like

Origin www.cnblogs.com/chowkaiyat/p/10958834.html