EasyOCR is an optical character recognition (OCR) tool implemented with PyTorch.
Install EasyOCR
In a command window, use pip to install the EasyOCR stable release.
pip install easyocr
Use EasyOCR
import easyocr
reader = easyocr.Reader(
['ch_sim', 'en'],
gpu=False,
model_storage_directory='model/.',
user_network_directory='model/.',
)
result = reader.readtext('examples/chinese.jpg')
When the code above runs, the detection and recognition models are downloaded automatically over the network to the specified directory.
- ['ch_sim', 'en']: Specifies the languages to recognize.
- gpu=False: Sets whether to use the GPU. EasyOCR runs more efficiently on a GPU; set this to False when there is no GPU or GPU memory is insufficient.
- model_storage_directory='model/.': The storage path of the detection and recognition models. If it is not set, they are stored in the ~/.EasyOCR/model directory by default.
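The gpu flag can also be chosen at runtime instead of being hard-coded. The following is a minimal sketch, assuming PyTorch is importable as torch; the helper names gpu_available and build_reader are hypothetical and are not part of EasyOCR.

```python
def gpu_available():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def build_reader(languages=("ch_sim", "en"), model_dir="model/."):
    """Create an easyocr.Reader, falling back to CPU when no GPU is found.

    The import is deferred so this module still loads where EasyOCR is not installed.
    """
    import easyocr

    return easyocr.Reader(
        list(languages),
        gpu=gpu_available(),
        model_storage_directory=model_dir,
    )
```

Calling build_reader() downloads the models on first use, in the same way as the Reader call shown earlier.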
The recognition result, result, is a list. Each item in the list is a recognition result of length 3, for example ([[189, 75], [469, 75], [469, 165], [189, 165]], '愚园路', 0.3754989504814148). Its elements are the bounding box, the detected text, and the confidence value, respectively.
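As an illustration of working with this structure, the sketch below filters results by confidence and measures a bounding box. The helper names confident_texts and bbox_size and the 0.5 threshold are assumptions made for this example, and the second sample item is invented for illustration.

```python
def confident_texts(results, min_confidence=0.5):
    """Keep the text of every detection whose confidence is at least min_confidence."""
    return [text for _bbox, text, confidence in results if confidence >= min_confidence]


def bbox_size(bbox):
    """Return (width, height) of the axis-aligned rectangle enclosing the four corners."""
    xs = [point[0] for point in bbox]
    ys = [point[1] for point in bbox]
    return max(xs) - min(xs), max(ys) - min(ys)


# The first item is the example from the text; the second is hypothetical.
sample = [
    ([[189, 75], [469, 75], [469, 165], [189, 165]], "愚园路", 0.3754989504814148),
    ([[10, 10], [110, 10], [110, 40], [10, 40]], "hello", 0.92),
]

print(confident_texts(sample))   # ['hello']
print(bbox_size(sample[0][0]))   # (280, 90)
```

Lowering min_confidence keeps more detections at the cost of more misreads; a suitable value depends on the images being processed.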
easyocr-server
EasyOCR Server is a tool for extracting text from images. It is a general-purpose OCR that can read both natural scene text and dense text in documents. More than 80 languages are currently supported, and the list is growing.
Installation steps
Step 0. Download easyocr-server from GitHub and install it.
git clone https://github.com/hekaiyou/easyocr-server.git
Step 1. Install the easyocr, bottle, and gevent modules from PyPI.
cd easyocr-server
pip install -r requirements.txt
Verify installation
python main.py
- Browser: http://localhost:8080/ocr/
- CMD:
curl http://localhost:8080/ocr/ -F "language=en" -F "img_file=@examples/english.png"
After successful verification, you should be able to see the inference results printed in your browser.
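The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl command: the URL and the language and img_file field names come from that command, while the helper names build_multipart and post_image are hypothetical. The response format is not documented here, so the body is returned as raw text, and decoding it as UTF-8 is an assumption.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
            f"Content-Type: {file_type}\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def post_image(image_path, language="en", url="http://localhost:8080/ocr/"):
    """Send an image to the OCR endpoint and return the raw response text."""
    path = Path(image_path)
    body, content_type = build_multipart(
        {"language": language}, "img_file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the server answers with UTF-8 text.
        return response.read().decode("utf-8")
```

With the server running, post_image("examples/english.png", language="en") should return the same output as the curl command.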
Deploy the service via Docker
We provide a Dockerfile to build the image.
docker build -t easyocr-server:latest .
Run it.
docker run -it -v {DATA_DIR}:/workspace/model -p 8083:8080 easyocr-server:latest
Language support
Language | Code Name |
---|---|
Abaza | abq |
Adyghe | ady |
Afrikaans | af |
Angika | ang |
Arabic | ar |
Assamese | as |
Avar | ava |
Azerbaijani | az |
Belarusian | be |
Bulgarian | bg |
Bihari | bh |
Bhojpuri | bho |
Bengali | bn |
Bosnian | bs |
Simplified Chinese | ch_sim |
Traditional Chinese | ch_tra |
Chechen | che |
Czech | cs |
Welsh | cy |
Danish | da |
Dargwa | dar |
German | de |
English | en |
Spanish | es |
Estonian | et |
Persian (Farsi) | fa |
French | fr |
Irish | ga |
Goan Konkani | gom |
Hindi | hi |
Croatian | hr |
Hungarian | hu |
Indonesian | id |
Ingush | inh |
Icelandic | is |
Italian | it |
Japanese | ja |
Kabardian | kbd |
Kannada | kn |
Korean | ko |
Kurdish | ku |
Latin | la |
Lak | lbe |
Lezghian | lez |
Lithuanian | lt |
Latvian | lv |
Magahi | mah |
Maithili | mai |
Maori | mi |
Mongolian | mn |
Marathi | mr |
Malay | ms |
Maltese | mt |
Nepali | ne |
Newari | new |
Dutch | nl |
Norwegian | no |
Occitan | oc |
Pali | pi |
Polish | pl |
Portuguese | pt |
Romanian | ro |
Russian | ru |
Serbian (cyrillic) | rs_cyrillic |
Serbian (latin) | rs_latin |
Nagpuri | sck |
Slovak | sk |
Slovenian | sl |
Albanian | sq |
Swedish | sv |
Swahili | sw |
Tamil | ta |
Tabassaran | tab |
Telugu | te |
Thai | th |
Tajik | tjk |
Tagalog | tl |
Turkish | tr |
Uyghur | ug |
Ukrainian | uk |
Urdu | ur |
Uzbek | uz |
Vietnamese | vi |
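A client can check requested codes against this table before building a reader or calling the server. The sketch below holds only a small excerpt of the table; the names SUPPORTED_LANGUAGES and unsupported_codes are hypothetical and are not part of EasyOCR or easyocr-server.

```python
# Excerpt of the language table; extend it with the remaining rows as needed.
SUPPORTED_LANGUAGES = {
    "ch_sim": "Simplified Chinese",
    "ch_tra": "Traditional Chinese",
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "ru": "Russian",
    "th": "Thai",
    "vi": "Vietnamese",
}


def unsupported_codes(codes):
    """Return the requested codes that are missing from SUPPORTED_LANGUAGES, in order."""
    return [code for code in codes if code not in SUPPORTED_LANGUAGES]


print(unsupported_codes(["ch_sim", "en"]))   # []
print(unsupported_codes(["en", "xx"]))       # ['xx']
```

Rejecting unknown codes up front gives a clearer error than a failure during model loading.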
Modify easyocr-server
The core code of the easyocr-server project above is in easyocr-server/ocr.py on GitHub, and it can be modified to suit your needs.