Segmenting barcodes and text with DBNet (code + model) + knowledge distillation + TensorRT inference + barcode decoding with pyzbar and zxing

1. DBNet

1. Code link

Code for segmenting barcodes and text, GitHub link: https://github.com/zonghaofan/dbnet_torch (model provided)

2. Paper reading

Model:

Model diagram

Differentiable binarization

A typical segmentation model binarizes its final output with a fixed threshold. The innovation of this paper is to learn the binarization threshold, as shown in (a) in the figure above.

Adding the differentiable binarization module makes the threshold trainable, which separates the background from touching text instances more reliably.

P: probability map

T: threshold map

B^: approximate binary map
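To make the relationship between P, T and B^ concrete, here is a minimal sketch of differentiable binarization next to a fixed-threshold step function. It assumes the form given in the DB paper, B^ = 1 / (1 + exp(-k (P - T))) with amplifying factor k = 50; the function names, the toy values and the fixed threshold of 0.3 are made up for illustration and are not taken from the repository.

```python
import math

def differentiable_binarization(p, t, k=50.0):
    """Approximate binary value for one pixel: 1 / (1 + exp(-k * (p - t)))."""
    return 1.0 / (1.0 + math.exp(-k * (p - t)))

def hard_binarization(p, t=0.3):
    """Standard binarization with one fixed threshold (a step function, not differentiable)."""
    return 1.0 if p >= t else 0.0

# Toy 1-D "probability map" and per-pixel "threshold map" (illustrative values only).
prob_map = [0.10, 0.45, 0.55, 0.90]
thresh_map = [0.30, 0.50, 0.50, 0.30]

approx_binary = [differentiable_binarization(p, t) for p, t in zip(prob_map, thresh_map)]
hard_binary = [hard_binarization(p) for p in prob_map]

print([round(b, 3) for b in approx_binary])
print(hard_binary)
```

Because the steep sigmoid is smooth, gradients flow to both P and T during training, while a pixel just below its learned threshold (0.45 against 0.50) still ends up close to 0.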

Loss function:
 

The loss has three parts: Ls is the loss on the probability map, whose label is the shrunk text instance; Lb is the loss on the binarized map, which uses the same shrunk label; Lt is the loss on the threshold map. Ls and Lb both use BCE loss with OHEM, and Lt uses L1 loss.

Note that the speed reported in the paper covers only forward propagation and post-processing. Once pre-processing is included, the actual speed is noticeably lower.
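For reference, the three terms can be written out as follows. This is a reconstruction from the DB paper rather than from this post: the weights (alpha = 1.0, beta = 10), the sampled set S_l (positives to negatives at 1:3, which is the OHEM step) and the region R_d (pixels inside the dilated text polygon) are the paper's notation and settings, and the repository may use different values.

```latex
L = L_s + \alpha \, L_b + \beta \, L_t, \qquad \alpha = 1.0,\ \beta = 10

L_s = L_b = -\sum_{i \in S_l} \bigl[ y_i \log x_i + (1 - y_i) \log (1 - x_i) \bigr]

L_t = \sum_{i \in R_d} \lvert y_i^{*} - x_i^{*} \rvert
```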

Some results:

   

2. Knowledge distillation

Softened softmax: q_i = exp(z_i / T) / sum_j exp(z_j / T)

Here T is the temperature and z_i are the logits. With T = 1, the output of the softmax layer is used directly as the soft target. When the entropy of that softmax distribution is low, the values for the negative labels are very close to 0 and contribute so little to the loss that they can be ignored. This is where the temperature comes in. A large T softens the softmax output: the smoother the distribution, the higher its entropy, the more the information carried by the negative labels is amplified, and the more attention training pays to them. In short:

To learn from negative labels that carry some information --> use a higher temperature.

To limit the influence of noise in the negative labels --> use a lower temperature.

Idea: first train resnet50 as the teacher, then use the trained resnet50 (teacher) to jointly train the small resnet18 (student) model. Experiments show that the F1 score is one point higher than when resnet18 is trained alone.
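The effect of the temperature is easy to check numerically. The sketch below assumes the standard softened softmax q_i = exp(z_i / T) / sum_j exp(z_j / T); the logits are made-up values and the helper names are hypothetical.

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Softmax over logits divided by the temperature T."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [8.0, 2.0, 1.0]  # made-up teacher logits; class 0 is the positive label

for T in (1.0, 4.0, 10.0):
    probs = softmax_with_temperature(logits, T)
    print(T, [round(p, 4) for p in probs], round(entropy(probs), 4))
```

As T grows, the probabilities of the two negative classes move away from 0 and the entropy of the distribution rises, so the negative labels contribute more to the loss.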
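A minimal sketch of how a teacher can guide a student during joint training, using the generic classification-style distillation loss: a weighted sum of the usual hard-label loss and a temperature-softened term against the teacher's output. This is an assumption for illustration, not necessarily what the repository does for DBNet's segmentation maps; the function names, alpha and T values are all hypothetical.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(q) for t, q in zip(target_probs, pred_probs) if t > 0)

def distillation_loss(student_logits, teacher_logits, true_index, T=4.0, alpha=0.5):
    """alpha * hard-label loss + (1 - alpha) * T^2 * soft-target loss."""
    # Hard term: ordinary cross-entropy against the ground-truth class (T = 1).
    hard = -math.log(softmax(student_logits)[true_index])
    # Soft term: match the teacher's softened distribution. Cross-entropy equals
    # KL divergence plus a constant that does not depend on the student.
    soft = cross_entropy(softmax(teacher_logits, T), softmax(student_logits, T))
    # T^2 keeps the soft-term gradients on a comparable scale across temperatures.
    return alpha * hard + (1 - alpha) * T * T * soft

teacher = [5.0, 1.0, 0.0]           # made-up teacher logits
student_close = [4.0, 1.5, 0.0]     # student that roughly agrees with the teacher
student_far = [0.0, 1.0, 5.0]       # student that disagrees

print(distillation_loss(student_close, teacher, true_index=0))
print(distillation_loss(student_far, teacher, true_index=0))
```

The student that agrees with the teacher gets the smaller loss, which is the signal that pulls the small model toward the large one during joint training.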

See the code on github.

python train_word_industry_res50.py trains the teacher model;

python train_word_industry_res18_kd.py trains the student model.

3. Torch model -> ONNX -> TensorRT

Idea: use torch.onnx to convert the .pth model to .onnx format, then run inference with TensorRT. For the code, see model_to_onnx.py on GitHub.

4. Barcode decoding: C++ version and Python version

1. C++ version of zxing: see this link

It is called from Python as follows:

#coding:utf-8
"""Decode barcodes with the zxing binary compiled from C++."""
import subprocess
import os
import sys
sys.path.append(os.path.join(os.path.dirname(__file__)))


def zxing_parse_code(imgpath):
    zxing_bin_path = os.path.join(os.path.dirname(__file__), "zxing")
    assert os.path.exists(zxing_bin_path), "zxing bin file not exist!"

    # Pass the arguments as a list so that paths containing spaces are safe.
    command = [zxing_bin_path, '--test-mode', imgpath]
    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    # communicate() waits for the process to exit and avoids a pipe deadlock.
    output = process.communicate()[0].decode("utf-8").replace(' ', '').split('\n')
    # print(output)
    # The second output line is expected to look like "Detected:<payload>".
    if len(output) > 1 and 'Detected:' in output[1]:
        return output[1][9:]
    return None

2. Installation environment:

Ubuntu:

apt-get install zbar-tools

apt-get install python-jpype

CentOS:

yum install zbar-devel

pip install pyzbar

pip install zxing

3. Code example

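#coding:utf-8
import pyzbar.pyzbar as pyzbar
import numpy as np
import zxing
import cv2

def parse_code(codeimg, reader):
    """
    Decode a rectified barcode image and return the result.
    :param codeimg: rectified barcode image (BGR, as loaded by cv2.imread)
    :param reader: zxing.BarCodeReader used when pyzbar finds nothing
    :return: decoded barcode string, or None
    """
    gray = cv2.cvtColor(codeimg, cv2.COLOR_BGR2GRAY)
    gray_h, gray_w = gray.shape
    barcodes1 = pyzbar.decode(gray)
    # barcodes2 = pyzbar.decode(np.rot90(np.rot90(gray)))
    # print('==barcodes2:', barcodes2)
    def parse_results(barcode):
        # barcode.rect would give the barcode position as (x, y, w, h)
        # convert the decoded bytes to a string
        barcodeData = barcode.data.decode("utf-8")
        return barcodeData

    if len(barcodes1):
        barcodeData = parse_results(barcodes1[0])
        if len(barcodeData) >= 10:  # keep only codes with at least 10 characters
            return barcodeData
        return None  # pyzbar result is too short to be trusted
    else:
        # pyzbar found nothing: fall back to zxing, which reads from a file.
        # Rotate portrait crops so the barcode is written in landscape.
        # codeimg is already BGR, the channel order cv2.imwrite expects.
        if gray_h > gray_w:
            cv2.imwrite('./out_clip.jpg', np.ascontiguousarray(np.rot90(codeimg)))
        else:
            cv2.imwrite('./out_clip.jpg', codeimg)
        barcode = reader.decode('./out_clip.jpg')
        # print('==barcode:', barcode)
        # decode() may return None, or a result whose raw field is None
        return getattr(barcode, 'raw', None)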

def debug_parse_code():
    reader = zxing.BarCodeReader()
    path = './5.png'
    img = cv2.imread(path)
    code_res = parse_code(img, reader)
    print('==code_res:', code_res)

if __name__ == '__main__':
    debug_parse_code()


Origin blog.csdn.net/fanzonghao/article/details/107199538