[Open source] An OCR annotation tool with pre-annotation from pre-trained models

OCR annotation tool

Pre-annotation with pre-trained model results
This is an image annotation tool built with the Python Flask framework. The main idea is that, during labeling, the tool first calls a free API from Baidu or Alibaba to pre-annotate the image, and a person then verifies or corrects the target rectangles. It also supports annotating multiple targets in a single image. The idea is simple, but the implementation was still quite troublesome: it took about two weeks, even though it was built by modifying an existing open source project.

Original Features

  • Browser/server (B/S) interaction
  • Several people can label at the same time (different annotators can be assigned different sample ranges or different label categories)
  • Categories are picked from a selection list, so they do not have to be typed in manually
  • Annotated regions can be corrected by dragging
  • The keyboard arrow keys switch between samples

Added features

  • Added an interface for calling the Baidu API or a personal API
  • The detection results returned by the Baidu API can be adjusted
  • Annotations are stored in an LMDB database, using the image name as the key
  • Multiple targets can be annotated in a single image

Instructions

  1. Install the dependencies listed in requirements.txt. Some of them may not actually be used, so you can also install only the Flask version that matches your Python version and add the remaining dependencies as you are prompted for them.

$ cd od-annotation
$ pip3 install -r requirements.txt       # use pip
# or create a new conda environment
$ conda env create -f enviroment.yaml

  2. Put the images you want to annotate in 'dataset/images/'.

  3. Start / stop / restart the annotation tool:

$ cd od-annotation
$ python3 app.py --start|stop|restart  # run as a foreground process
$ python3 app.py --start|restart --daemon  # (re)start as a background process

  4. Open http://localhost:5000 to start annotating. The overall workflow is: (1) call the Baidu API; (2) draw the Baidu results and adjust them; (3) save the results and move to the next sample. A rectangle can be moved by dragging from its left side and resized by dragging its lower-right corner. Right-click a rectangle to delete that annotation. The "Current sample annotation status" (当前样本标注状态) panel shows the updated label information. The "All samples annotation status" (所有样本标注状态) panel is unchanged from the original author's version; its display has not been adapted, so it may show errors, which can be ignored.

  5. Click the left and right direction buttons or use the keyboard arrow keys to switch between samples. The annotation results are submitted automatically when you switch; you can also click the "Save" (保存) button to submit them manually.

  6. Annotation files are stored in dataset/labels/, in an LMDB database. The label format is {'img_name': 'x1,y1,x2,y2,label\nx12,y12,x22,y22,label2\n'}: each line marks one rectangle using the coordinates of its upper-left and lower-right corners.
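
As a rough illustration of the label format, the Python sketch below parses one stored value into boxes and serializes it back. The names Box, parse_label and format_label are hypothetical and not part of the project. Integer pixel coordinates are an assumption, and a plain dict stands in for the LMDB database.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One annotated rectangle (hypothetical helper type, not from the project)."""
    x1: int  # upper-left x
    y1: int  # upper-left y
    x2: int  # lower-right x
    y2: int  # lower-right y
    label: str


def parse_label(value: str) -> List[Box]:
    """Parse 'x1,y1,x2,y2,label\\n...' into a list of boxes."""
    boxes = []
    for line in value.split("\n"):
        if not line.strip():
            continue  # skip the empty piece after the trailing newline
        # Split on the first four commas only, so a label containing commas survives.
        x1, y1, x2, y2, label = line.split(",", 4)
        # Assumption: coordinates are stored as integer pixel values.
        boxes.append(Box(int(x1), int(y1), int(x2), int(y2), label))
    return boxes


def format_label(boxes: List[Box]) -> str:
    """Serialize boxes back to the newline-terminated text format."""
    return "".join(f"{b.x1},{b.y1},{b.x2},{b.y2},{b.label}\n" for b in boxes)


# A plain dict stands in for the LMDB database (key: image name).
record = {"demo.jpg": "10,20,110,60,hello\n15,80,200,120,world\n"}
boxes = parse_label(record["demo.jpg"])
for box in boxes:
    print(box)
```

Splitting on only the first four commas is a deliberate choice here: OCR labels are free text and may themselves contain commas.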

Changing the pre-trained model API

To use a different API, make it return results in the same format and modify lines 81 to 102 of app.py.
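
A minimal sketch of what such an adapter might look like, in Python. Everything here is an assumption for illustration: the function name response_to_label is hypothetical, the response layout (a words_result list whose items carry words and a location with left, top, width, height) is modelled on a typical location-aware OCR response and should be checked against the API you actually call, and the output is the 'x1,y1,x2,y2,label\n' text described in the instructions rather than whatever app.py expects internally.

```python
from typing import Any, Dict


def response_to_label(response: Dict[str, Any]) -> str:
    """Convert an OCR response (assumed layout) to 'x1,y1,x2,y2,label\\n' lines."""
    lines = []
    for item in response.get("words_result", []):
        loc = item["location"]  # assumed keys: left, top, width, height
        x1, y1 = loc["left"], loc["top"]
        # The label format stores the lower-right corner, not width and height.
        x2, y2 = x1 + loc["width"], y1 + loc["height"]
        lines.append(f"{x1},{y1},{x2},{y2},{item['words']}\n")
    return "".join(lines)


# Hand-written sample response, not real API output.
sample = {
    "words_result": [
        {"words": "hello", "location": {"left": 10, "top": 20, "width": 100, "height": 40}},
    ]
}
print(response_to_label(sample), end="")
```

With a different provider, only the field lookups inside the loop would need to change, as long as the function still returns the same text format.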

Reference

The code is borrowed from od-annotation. Thanks!

Code

GitHub: https://github.com/chenjun2hao/ocr_annotation
CSDN: https://download.csdn.net/download/u011622208/12086826


Origin blog.csdn.net/u011622208/article/details/103873152