Converting yolov3-tiny darknet weights to ONNX

Foreword

I had been trying for some time to repair the ONNX model of yolov3-tiny, and today I finally fixed the last bug. If you just want the finished result, you can download it directly from my GitHub repository; the instructions are all written there. This article mainly shares my ideas and process, in the hope of inspiring others.

Necessary background

This article converts the darknet weights directly into an ONNX model.
1. I did not use the pytorch model from ultralytics, because it follows the same style as yolov5, and the five-dimensional tensor involved in the middle is awkward for deployment.
2. I did not download the yolov3-tiny model from the officially maintained ONNX model zoo, because that model uses a dynamic input size rather than a static one.
3. The model in this article changes the padding of the conv and maxpool layers, because the auto_pad attribute is not supported during deployment, so the pads need to be fixed explicitly.
4. The yolov3-tiny ONNX files that can be downloaded directly from the Internet have problems and cannot run inference. They are too old and depend on the caffe framework; for details, see the earlier post on the onnx that needs to be repaired.
5. The conversion code in this article is modified from the yolo-to-onnx script provided in tensorrt_demos. If you don't have special requirements like mine, you can simply download that and replace the yolo_to_onnx.py in my repository, because my modification costs some model accuracy; see the figures below for a detailed comparison:
After my modification: [figure: detection result]
Before modification: [figure: detection result]

Modifying the code

1. First, I modified the maxpool implementation, which originally used auto_pad='SAME_UPPER'. I introduced a global count variable to track which maxpool node is being built, because different nodes need different pads, so an if statement is required: the last node has to be handled separately.

count = 0  # modify: global counter for maxpool nodes (initialized once at module level)

def _make_maxpool_node(self, layer_name, layer_dict):
    """Create an ONNX Maxpool node with the properties from
    the DarkNet-based graph.

    Keyword arguments:
    layer_name -- the layer's name (also the corresponding key in layer_configs)
    layer_dict -- a layer parameter dictionary (one element of layer_configs)
    """
    global count  # modify
    count += 1    # modify
    stride = layer_dict['stride']
    kernel_size = layer_dict['size']
    previous_node_specs = self._get_previous_node_specs()
    inputs = [previous_node_specs.name]
    channels = previous_node_specs.channels
    kernel_shape = [kernel_size, kernel_size]
    strides = [stride, stride]
    assert channels > 0
    # modify: explicit pads instead of auto_pad='SAME_UPPER'
    if count != 6:
        # first five maxpool nodes: 2x2 kernel, stride 2, no padding
        maxpool_node = helper.make_node(
            'MaxPool',
            inputs=inputs,
            outputs=[layer_name],
            ceil_mode=0,
            kernel_shape=kernel_shape,
            strides=strides,
            pads=[0, 0, 0, 0],
            name=layer_name,
        )
    else:
        # sixth (last) maxpool node: 3x3 kernel, stride 1, pad 1 on every side
        maxpool_node = helper.make_node(
            'MaxPool',
            inputs=inputs,
            outputs=[layer_name],
            ceil_mode=0,
            kernel_shape=[3, 3],
            strides=strides,
            pads=[1, 1, 1, 1],
            name=layer_name,
        )
    # modify end
    self._nodes.append(maxpool_node)
    return layer_name, channels

For the first five nodes, kernel_shape is [2, 2], the pads are four zeros, and the strides are [2, 2]. Some readers may wonder whether other combinations could be used. In fact they can, but the results will be worse and inference time may increase.
The sixth node, i.e. the last maxpool node on the right side of the graph, looks like this: [figure]
Frankly speaking, I tried the alternatives myself. First, strides cannot be larger than kernel_shape. Second, pads of [1, 1, 1, 1] versus [0, 0, 0, 0] add or remove rows and columns of output, so the exact combination has to be found by trial. I also tried a [3, 3] kernel for the first five nodes with [1, 1] for the last; the shapes can be made to match, but the results were very poor. My guess is that the receptive field becomes too small.
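Instead of rebuilding the model to test each combination, the MaxPool output size can be computed by hand from the ONNX formula (with ceil_mode=0 the rounding is floor). The helper below is purely illustrative, reproducing the 13×13 map that yolov3-tiny produces from a 416×416 input:

```python
import math

def maxpool_out(size, kernel, stride, pad_begin, pad_end, ceil_mode=False):
    # ONNX MaxPool spatial output size; ceil_mode=0 means floor rounding
    rnd = math.ceil if ceil_mode else math.floor
    return int(rnd((size + pad_begin + pad_end - kernel) / stride)) + 1

# First five maxpool layers: 2x2 kernel, stride 2, no pads
size = 416
for _ in range(5):
    size = maxpool_out(size, kernel=2, stride=2, pad_begin=0, pad_end=0)
print(size)  # -> 13

# Last maxpool: stride 1; a 3x3 kernel with pads [1, 1, 1, 1] keeps 13x13
print(maxpool_out(13, kernel=3, stride=1, pad_begin=1, pad_end=1))  # -> 13

# Without padding the same layer would shrink the map and break the graph
print(maxpool_out(13, kernel=3, stride=1, pad_begin=0, pad_end=0))  # -> 11
```

This makes the trial-and-error above mechanical: any pads/kernel/stride combination that does not return 13 for the last node cannot match the darknet graph.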

2. Modify the conv layers. Strictly speaking, modification 1 already completes the repair, but I was not satisfied, because when generating the model with the chip vendor's tool I found that the feature map size suddenly shrank, and nothing I did could make it match. After ruling out maxpool and leakyrelu, the remaining suspects were the resize and conv layers. The vendor's manual claims that auto_pad=SAME_LOWER is supported, but my intuition said something was wrong with it, so I changed the code as follows and reran it:

def _make_conv_node(self, layer_name, layer_dict):
    """Create an ONNX Conv node with optional batch normalization and
    activation nodes.

    Keyword arguments:
    layer_name -- the layer's name (also the corresponding key in layer_configs)
    layer_dict -- a layer parameter dictionary (one element of layer_configs)
    """
    previous_node_specs = self._get_previous_node_specs()
    inputs = [previous_node_specs.name]
    previous_channels = previous_node_specs.channels
    kernel_size = layer_dict['size']
    stride = layer_dict['stride']
    filters = layer_dict['filters']
    batch_normalize = False
    if layer_dict.get('batch_normalize', 0) > 0:
        batch_normalize = True

    kernel_shape = [kernel_size, kernel_size]
    weights_shape = [filters, previous_channels] + kernel_shape
    conv_params = ConvParams(layer_name, batch_normalize, weights_shape)

    strides = [stride, stride]
    dilations = [1, 1]
    weights_name = conv_params.generate_param_name('conv', 'weights')
    inputs.append(weights_name)
    if not batch_normalize:
        bias_name = conv_params.generate_param_name('conv', 'bias')
        inputs.append(bias_name)
    # modify: explicit pads instead of auto_pad
    if kernel_shape == [3, 3]:
        pads = [1, 1, 1, 1]
    else:
        pads = [0, 0, 0, 0]
    # modify end
    conv_node = helper.make_node(
        'Conv',
        inputs=inputs,
        outputs=[layer_name],
        kernel_shape=kernel_shape,
        strides=strides,
        pads=pads,  # modify
        dilations=dilations,
        name=layer_name
    )
    self._nodes.append(conv_node)
    inputs = [layer_name]
    layer_name_output = layer_name

The rest of the function is too long to copy here; I changed the pads there in the same way. This time the judgment is very short, because I noticed that the yolov4-tiny model I converted earlier has the same structure on the right as yolov3-tiny: whenever kernel_shape is [3, 3] the pads are always [1, 1, 1, 1], and for any other kernel (1×1 here) they are always [0, 0, 0, 0].
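The rule hard-coded above is just the usual "SAME" padding for odd kernels at stride 1: pad each side by (k−1)//2, so the output size equals the input size (out = in + 2p − k + 1 = in). A hypothetical helper form of it, written only to make the rule explicit:

```python
def same_pads(kernel_size):
    """Symmetric SAME padding for an odd kernel at stride 1:
    padding each side by (k - 1) // 2 keeps the spatial size unchanged."""
    p = (kernel_size - 1) // 2
    return [p, p, p, p]  # ONNX order: [pad_top, pad_left, pad_bottom, pad_right]

print(same_pads(3))  # -> [1, 1, 1, 1]
print(same_pads(1))  # -> [0, 0, 0, 0]
```

For yolov3-tiny this covers every conv layer, since the network only uses 3×3 and 1×1 kernels; a network with even-sized kernels would need asymmetric pads, which is exactly what auto_pad's SAME_UPPER/SAME_LOWER variants disagree about.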

Summary

When converting yolov4-tiny earlier, I used onnxsim to simplify the model and thereby avoid the operators that could not be converted, but this time onnxsim had no effect: it only simplified the BN layers. In such cases it is essential to understand how each operator actually behaves; you need to re-export and debug the model yourself, and you cannot expect to just patch an ONNX model generated by someone else.

Origin blog.csdn.net/weixin_43945848/article/details/128628835