CTPN: a plain-language record of training on my own dataset

I. Understanding the algorithm

  10,000 words omitted here...

II. Training and understanding the source code

  Configuration takes the following three steps:

  Create an __init__.py file in the utils folder and in the utils\bbox folder

  Run python setup.py install in the utils\bbox folder

  Copy the generated .pyd file into the utils\bbox folder
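The first of the three steps can be scripted. Below is a minimal sketch, assuming the project root contains the utils and utils\bbox folders named above; the helper name ensure_packages is hypothetical and is not part of the CTPN code. It only creates the empty __init__.py files. Building the extension and copying the .pyd file still have to be done as described.

```python
from pathlib import Path


def ensure_packages(project_root):
    """Create empty __init__.py files in utils/ and utils/bbox/ if they are missing.

    Hypothetical helper for illustration; returns the list of files it created.
    """
    root = Path(project_root)
    created = []
    for folder in (root / "utils", root / "utils" / "bbox"):
        # No-op when the folder already exists, as it does in a real checkout.
        folder.mkdir(parents=True, exist_ok=True)
        init_file = folder / "__init__.py"
        if not init_file.exists():
            init_file.touch()
            created.append(init_file)
    return created
```

Calling ensure_packages(".") from the project root is enough; a second call creates nothing and returns an empty list.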

  1. Data description: the task is handwriting region detection in images. Only 385 images were annotated, which is not many, but together they contain several thousand handwriting regions. The same data is also used later for Faster R-CNN training.
  2. Data format: first I marked the handwriting regions with the "wizard" annotation tool, which saves them as json files. I then parsed the json to extract the image name, label, and coordinates and saved them in txt format: 768,1622,1124,1622,1124,1750,768,1750,chinese,###. That is eight values for the four corner coordinates plus a label, and every label is chinese. Next, split_label.py splits each annotation into small boxes in the format 188,399,191,430, which keeps only the top-left and bottom-right corner coordinates. Finally, put the image folder and the label folder (the labels after splitting) inside the mlt folder.
  3. main/train.py is the training script. The core code:
    bbox_pred, cls_pred, cls_prob = model.model(input_image)  # generates the predicted boxes, class scores, and class probabilities
    lstm_output = Bilstm(rpn_conv, 512, 128, 512, scope_name='BiLSTM')  # bidirectional LSTM, which takes the context on both sides into account
    bbox_pred = lstm_fc(lstm_output, 512, 10 * 4, scope_name="bbox_pred")
    cls_pred = lstm_fc(lstm_output, 512, 10 * 2, scope_name="cls_pred")  # fully connected layers that predict the candidate boxes and the classification scores
    total_loss, model_loss, rpn_cross_entropy, rpn_loss_box = model.loss(bbox_pred, cls_pred, input_bbox, input_im_info)  # the various losses: model loss, cross-entropy loss, and RPN box loss
  4. demo.py runs prediction, which includes merging the small boxes back into text regions.
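To make the label format in item 2 and the box merging in item 4 more concrete, here is a simplified sketch. It is not the actual code of split_label.py or demo.py. It assumes the small boxes are vertical strips aligned to a 16-pixel grid (the fixed proposal width used by CTPN), and the merge thresholds max_gap=50 and min_overlap=0.7 are illustrative assumptions. All three function names are made up for this example.

```python
ANCHOR_WIDTH = 16  # assumption: CTPN-style fixed strip width in pixels


def parse_label_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,language,text' into (corners, language, text)."""
    parts = line.strip().split(",")
    coords = [int(float(v)) for v in parts[:8]]
    corners = list(zip(coords[0::2], coords[1::2]))
    language = parts[8].strip() if len(parts) > 8 else ""
    text = ",".join(parts[9:]).strip() if len(parts) > 9 else ""
    return corners, language, text


def split_into_strips(corners, width=ANCHOR_WIDTH):
    """Cut the bounding box of a quadrilateral into grid-aligned vertical strips.

    Each strip is (x_min, y_min, x_max, y_max): top-left and bottom-right corners.
    """
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    strips = []
    start = (x_min // width) * width  # snap to the grid column containing x_min
    for left in range(start, x_max + 1, width):
        strip_left = max(left, x_min)
        strip_right = min(left + width - 1, x_max)
        if strip_right > strip_left:  # skip degenerate one-pixel slivers
            strips.append((strip_left, y_min, strip_right, y_max))
    return strips


def vertical_overlap(a, b):
    """Overlap of the y-ranges of two boxes, relative to the shorter box."""
    top = max(a[1], b[1])
    bottom = min(a[3], b[3])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(0, bottom - top) / shorter if shorter > 0 else 0.0


def merge_strips(strips, max_gap=50, min_overlap=0.7):
    """Chain strips from left to right into text-line boxes (simplified merge)."""
    lines = []
    for box in sorted(strips):
        for i, line in enumerate(lines):
            close = box[0] - line[2] <= max_gap
            if close and vertical_overlap(line, box) >= min_overlap:
                # Grow the existing line so that it also covers this strip.
                lines[i] = (min(line[0], box[0]), min(line[1], box[1]),
                            max(line[2], box[2]), max(line[3], box[3]))
                break
        else:
            lines.append(tuple(box))
    return lines


if __name__ == "__main__":
    sample = "768,1622,1124,1622,1124,1750,768,1750,chinese,###"
    corners, language, text = parse_label_line(sample)
    strips = split_into_strips(corners)
    for x_min, y_min, x_max, y_max in strips[:3]:
        print(f"{x_min},{y_min},{x_max},{y_max}")
    print(merge_strips(strips))
```

For the sample label line from item 2, this produces 23 strips, and merging them gives back the single box 768,1622,1124,1750. A real pipeline also has to handle image resizing and tilted boxes, which this sketch ignores.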

Origin www.cnblogs.com/lzq116/p/12106925.html