I. Understanding the algorithm
(10,000 words omitted here ...)
II. Training and understanding the source code
Configuration takes the following three steps:
Create an __init__.py file in the utils folder and in the utils\bbox folder
Run python setup.py install in the utils\bbox folder
Copy the generated .pyd file into the utils\bbox folder
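The first and third steps can be scripted. The following is only a minimal sketch: the function name prepare_bbox_package is made up for illustration, and it assumes the compiled .pyd files end up somewhere under utils\bbox\build after the build, which may differ on your machine. Step two (python setup.py install) is still run by hand.

```python
import shutil
from pathlib import Path


def prepare_bbox_package(project_root, build_dir=None):
    """Create the __init__.py markers and copy compiled .pyd files into utils/bbox.

    Hypothetical helper, not part of the project. `build_dir` defaults to
    utils/bbox/build, which is an assumption about where the build output lands.
    Returns the names of the .pyd files that were copied.
    """
    root = Path(project_root)
    bbox_dir = root / "utils" / "bbox"
    bbox_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: make both folders importable as packages.
    for folder in (root / "utils", bbox_dir):
        (folder / "__init__.py").touch(exist_ok=True)

    # Step 3: copy every compiled extension found in the build output.
    search_dir = Path(build_dir) if build_dir else bbox_dir / "build"
    copied = []
    if search_dir.is_dir():
        for pyd in sorted(search_dir.rglob("*.pyd")):
            shutil.copy2(pyd, bbox_dir / pyd.name)
            copied.append(pyd.name)
    return copied
```

Calling prepare_bbox_package(".") from the project root after the build returns the list of copied file names; an empty list means no .pyd was found in the assumed build folder.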
- Data description: the task is detecting handwritten regions in images. Only 385 images were annotated, but together they contain several thousand handwriting regions. The same data is also used later for Faster R-CNN training.
- Data format: I first annotated the handwritten regions with the Wizard annotation tool, which saves them as JSON files. I then parsed the JSON to extract the image name, label, and coordinates, and saved them in txt format: 768,1622,1124,1622,1124,1750,768,1750,chinese,### (eight values for the four corner coordinates, followed by a label, which is always chinese). Next, split_label.py splits each region into small boxes in the format 188,399,191,430, which keeps only the top-left and bottom-right corner coordinates. The image folder and the label folder (the labels after splitting) are stored together in the mlt folder.
- main/train.py, the core training code:
bbox_pred, cls_pred, cls_prob = model.model(input_image)  # produces the predicted boxes, class scores, and probabilities
lstm_output = Bilstm(rpn_conv, 512, 128, 512, scope_name='BiLSTM')  # bidirectional LSTM, takes the context on both sides into account
bbox_pred = lstm_fc(lstm_output, 512, 10 * 4, scope_name="bbox_pred")
cls_pred = lstm_fc(lstm_output, 512, 10 * 2, scope_name="cls_pred")  # fully connected layers: candidate box prediction and classification scores
total_loss, model_loss, rpn_cross_entropy, rpn_loss_box = model.loss(bbox_pred, cls_pred, input_bbox, input_im_info)  # the model's various losses: cross-entropy loss and RPN loss
- demo.py: prediction, including merging the small boxes.
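To make the label format and the small-box merge more concrete, here is a simplified sketch of both directions: cutting an annotated region into narrow strips, and chaining strips back into a text line. It is not the code from split_label.py or demo.py. The function names, the 16-pixel strip width (taken from the fixed anchor width CTPN uses), and the gap and overlap thresholds are all assumptions for illustration.

```python
STRIDE = 16  # assumed strip width in pixels (CTPN uses fixed-width 16 px anchors)


def parse_quad_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,label,###' into (xmin, ymin, xmax, ymax, label)."""
    parts = [p.strip() for p in line.strip().split(",")]
    coords = [int(v) for v in parts[:8]]
    xs, ys = coords[0::2], coords[1::2]
    label = parts[8] if len(parts) > 8 else ""
    return min(xs), min(ys), max(xs), max(ys), label


def split_into_strips(xmin, ymin, xmax, ymax, stride=STRIDE):
    """Cut a box into vertical strips aligned to a stride-pixel grid, clipped to the box."""
    strips = []
    grid_x = (xmin // stride) * stride
    while grid_x <= xmax:
        left = max(grid_x, xmin)
        right = min(grid_x + stride - 1, xmax)
        strips.append((left, ymin, right, ymax))
        grid_x += stride
    return strips


def merge_strips(strips, max_gap=STRIDE, min_v_overlap=0.7):
    """Greedily chain strips into text-line boxes.

    A strip joins an existing line when the horizontal gap is at most max_gap
    and the vertical overlap covers at least min_v_overlap of the shorter box.
    """
    lines = []
    for box in sorted(strips):
        for i, line in enumerate(lines):
            gap = box[0] - line[2]
            overlap = min(box[3], line[3]) - max(box[1], line[1])
            shorter = min(box[3] - box[1], line[3] - line[1])
            if gap <= max_gap and shorter > 0 and overlap / shorter >= min_v_overlap:
                lines[i] = (line[0], min(line[1], box[1]),
                            max(line[2], box[2]), max(line[3], box[3]))
                break
        else:
            lines.append(tuple(box))
    return lines
```

With the sample label line from the data format above, parse_quad_line gives the box (768, 1622, 1124, 1750), split_into_strips cuts it into 23 strips, and merge_strips joins them back into that single box. A box that starts at x = 188 gets a first strip of 188,399,191,430 for y from 399 to 430, which has the same shape as the split label shown earlier. The real pipeline also has to deal with image resizing and prediction scores, which this sketch leaves out.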