Visual study notes 8: using darknet and analyzing cfg files

Series Article Directory



Foreword

darknet is a lightweight deep learning training framework written in C and CUDA that supports GPU acceleration. Think of darknet as belonging to the same category as TensorFlow, PyTorch, Caffe, and MXNet: the underlying layer that runs models. ResNet and YOLO, by contrast, are model structures (networks to be trained), not frameworks.

darknet is the deep learning framework written by the author of YOLO (see section 2.2 of the original YOLO paper). YOLO9000 later introduced Darknet-19, a 19-layer convolutional network loosely adapted from ResNet, and YOLOv3 introduced the deeper Darknet-53; both serve as backbone networks for feature extraction. YOLOv2, v3, v4, and v7 can all be implemented on the darknet framework (YOLOv5 uses PyTorch). Its main selling points are easy installation, minimal dependencies (OpenCV is optional), very good portability, and support for both CPU and GPU computation.

Being written in plain C helps runtime efficiency, and depending on few algorithm libraries keeps the footprint small. darknet is portable and self-contained enough to run on embedded and other low-compute, low-cost boards (for inference, not training).

Compared with big frameworks such as TensorFlow and PyTorch, darknet is not as powerful, but it has its own advantages:
1. Easy to install: select the optional components you need (CUDA, cuDNN, OpenCV, etc.) in the Makefile and run make; installation takes only a few minutes.
2. No dependencies: the entire framework is written in C and does not depend on any library; the author even wrote replacement functions for the OpenCV features it can use.
3. Clear structure, source code easy to read and modify: the framework's core files live in the src folder, and the detection and classification entry points live in the examples folder, so you can view and modify the source directly as needed.
4. Friendly Python interface: although darknet is written in C, it also provides a Python interface, so you can call a trained .weights model directly from Python (see the sketch after this list).
5. Easy to deploy: the framework is very simple to deploy on a local machine and can use CPU or GPU depending on the hardware; for locally deployed detection and recognition tasks, darknet is extremely convenient.
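As a taste of point 4, here is a minimal sketch of running a trained model from Python via the darknet.py wrapper shipped with AlexeyAB's repository. Function names and signatures vary between versions of darknet.py, and the cfg/weights paths are only examples, so check the copy in your own checkout before running:

import darknet  # the darknet.py wrapper from the repository root

# load_network reads the cfg, the .data file and the trained weights, and
# returns the network handle plus the class names
network, class_names, _colors = darknet.load_network(
    "cfg/yolov4_blind.cfg",
    "cfg/voc_blind.data",
    "backup/yolov4_blind_best.weights",
)

# load an image into darknet's own IMAGE format and run detection on it
image = darknet.load_image(b"data/test.jpg", 0, 0)
detections = darknet.detect_image(network, class_names, image, thresh=0.25)
for label, confidence, bbox in detections:
    print(label, confidence, bbox)  # bbox is (cx, cy, w, h) in pixels
darknet.free_image(image)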


1. NoMachine, FileZilla and servers

Some tools that are not the focus of this article but will be used:

NoMachine installation and use
NoMachine is used to connect to the server remotely, provided that your server allows external network connections.

FileZilla installation and use
FileZilla is used for local and server connections for file transfer, provided that your server allows external network connections.

On the server, work under a separate user account to avoid affecting the main user.


2. Install and configure darknet

1. Download

GitHub repository (AlexeyAB's fork, the "AB version")

darknet official website (its "magic circle" logo and overall design have a deliberately dark-net, black-tech feel)

2. Compile

  1. First of all, you need a working deep learning environment; you can refer to my previous blog for the specific installation and configuration. Then modify the Makefile values according to your own environment.
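For reference, the build switches near the top of the Makefile look roughly like this (the exact list differs between pjreddie's and AlexeyAB's versions); set to 1 only the options your environment actually supports:

GPU=1        # requires CUDA
CUDNN=1      # requires cuDNN
OPENCV=1     # requires OpenCV
OPENMP=0     # CPU multithreading via OpenMP
DEBUG=0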

Problem 1: NVCC problem

nvcc -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=[sm_50,compute_50] -gencode arch=compute_52,code=[sm_52,compute_52] -Iinclude/ -Isrc/ -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DOPENCV -DGPU -DCUDNN" -c ./src/convolutional_kernels.cu -o obj/convolutional_kernels.o
/bin/sh: 1: nvcc: not found
Makefile:89: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 127

Solution:

Enter the darknet directory, edit the Makefile, and modify the NVCC path.
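The variable to change is NVCC; point it at the full path of your CUDA installation (the path below is an example, adjust it to where CUDA actually lives):

NVCC=/usr/local/cuda/bin/nvcc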

  2. Execute the command under the darknet folder:
make
  3. Compilation succeeds.

Problem 2: yolo_console_dll.cpp:(.text.startup+0x2ec) problem

When compiling darknet with make, an error is reported:

In function 'main': yolo_console_dll.cpp:(.text.startup+0x2ec): undefined reference to 'Detector::Detector(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, int, int)' collect2: error: ld returned 1 exit status Makefile:173: recipe for target 'uselib' failed make: *** [uselib] Error 1

Solution:

Refer to issue #7654: in the Makefile, comment out the line

#$(CPP) -std=c++11 $(COMMON) $(CFLAGS) -o $@ src/yolo_console_dll.cpp $(LDFLAGS) -L ./ -l:$(LIBNAMESO)

and change it to

$(CPP) -std=c++11 $(COMMON) $(CFLAGS) -o $@ src/yolo_console_dll.cpp $(LDFLAGS) -L ./ $(LIBNAMESO)

3. Dataset and model preparation

1. Dataset download

Source: Baseline System of the Smart Blind Guide Group of the 15th China Computer Design Competition

Structure:

WisdomGuide
├── annotations
│   ├── train_list.txt
│   ├── val_list.txt
│   ├── instance_train.json
│   └── instance_val.json
├── train
│   ├── JPEGImages    (image files)
│   └── labels        (label files)
└── val
    ├── JPEGImages    (image files)
    └── labels        (label files)

Reorganize the dataset into this layout: create train, val, JPEGImages, and labels folders, where train holds the training set, val holds the validation set, JPEGImages holds the images, and labels holds the label files; train_list.txt lists the paths of all training images and val_list.txt lists all validation image paths. This layout is not arbitrary: darknet's code expects it, so unless you want to modify the code, use this dataset format.

Note the folder names: if you keep images and labels in separate folders, those folders must be named JPEGImages and labels, because darknet derives each label path from the image path by substituting the folder name (see the sketch below). If you would rather mix images and labels in one folder, then do not use these two names.

The train_list.txt, val_list.txt, and per-image label txt files are generated by the scripts below.
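For the curious, the folder-name requirement comes from darknet deriving each label path from the image path by plain string substitution. A rough Python rendering of that behavior (a sketch, not the actual C code in src/data.c):

import os

def label_path_for(image_path):
    # darknet swaps the images folder name for "labels" ...
    label_path = image_path.replace("JPEGImages", "labels")
    # ... and the image extension for .txt
    root, _ext = os.path.splitext(label_path)
    return root + ".txt"

print(label_path_for("WisdomGuide/train/JPEGImages/0001.jpg"))
# -> WisdomGuide/train/labels/0001.txt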

2. JSON-to-txt script and path aggregation script

json_txt.py (change the paths to your own):


import os
import json
from tqdm import tqdm
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--json_path', default='/home/nh666/llw/ORB_FAR/WisdomGuide/annotations/instance_val.json', type=str, help="input: coco format(json)")
parser.add_argument('--save_path', default='/home/nh666/llw/ORB_FAR/WisdomGuide/annotations/val_txt', type=str, help="specify where to save the output dir of labels")
arg = parser.parse_args()


def convert(size, box):
    # Convert a COCO bbox (x_min, y_min, width, height) in pixels into the
    # YOLO format (x_center, y_center, width, height) normalized to [0, 1].
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = box[0] + box[2] / 2.0  # box center x, in pixels
    y = box[1] + box[3] / 2.0  # box center y, in pixels
    w = box[2]
    h = box[3]

    x = x * dw  # normalize everything by image width/height
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


if __name__ == '__main__':
    json_file = arg.json_path  # COCO object-instance style annotation file
    ana_txt_save_path = arg.save_path  # directory where the txt labels are saved

    data = json.load(open(json_file, 'r'))
    if not os.path.exists(ana_txt_save_path):
        os.makedirs(ana_txt_save_path)

    id_map = {}  # COCO category ids are not contiguous; remap them to 0..N-1 before writing
    for i, category in enumerate(data['categories']):
        id_map[category['id']] = i

    # build an image-id -> annotation-index lookup table up front to reduce time complexity
    max_id = 0
    for img in data['images']:
        max_id = max(max_id, img['id'])
    # note: do not write [[]] * (max_id + 1), or the inner empty lists would all share the same object
    img_ann_dict = [[] for i in range(max_id + 1)]
    for i, ann in enumerate(data['annotations']):
        img_ann_dict[ann['image_id']].append(i)

    for img in tqdm(data['images']):
        filename = img["file_name"]
        img_width = img["width"]
        img_height = img["height"]
        img_id = img["id"]
        head, tail = os.path.splitext(filename)
        ana_txt_name = head + ".txt"  # matching txt filename, same stem as the image
        f_txt = open(os.path.join(ana_txt_save_path, ana_txt_name), 'w')
        '''for ann in data['annotations']:
            if ann['image_id'] == img_id:
                box = convert((img_width, img_height), ann["bbox"])
                f_txt.write("%s %s %s %s %s\n" % (id_map[ann["category_id"]], box[0], box[1], box[2], box[3]))'''
        # look up the prebuilt table instead of re-scanning all annotations for every image
        for ann_id in img_ann_dict[img_id]:
            ann = data['annotations'][ann_id]
            box = convert((img_width, img_height), ann["bbox"])
            f_txt.write("%s %s %s %s %s\n" % (id_map[ann["category_id"]], box[0], box[1], box[2], box[3]))
        f_txt.close()
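Run the script once per split, pointing --json_path at the matching annotation file (the paths below follow this dataset's layout and are only examples):

python json_txt.py --json_path WisdomGuide/annotations/instance_train.json --save_path WisdomGuide/train/labels
python json_txt.py --json_path WisdomGuide/annotations/instance_val.json --save_path WisdomGuide/val/labels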

path_synthesis.py (change the paths to your own):

# -*- coding: utf-8 -*-
import os

# Collect the paths of all files under a directory, recursively.

def readFilename(path, allfile):
    filelist = os.listdir(path)

    for filename in filelist:
        filepath = os.path.join(path, filename)
        if os.path.isdir(filepath):
            readFilename(filepath, allfile)
        else:
            allfile.append(filepath)
    return allfile


if __name__ == '__main__':
    # folder to scan for images
    path1 = "/home/nh666/llw/ORB_FAR/WisdomGuide/val"
    allfile1 = []
    allfile1 = readFilename(path1, allfile1)
    # output list file
    txtpath = "/home/nh666/llw/ORB_FAR/WisdomGuide/annotations/val_list.txt"
    for name in allfile1:
        print(name)
        file_cls = name.split("/")[-1].split(".")[-1]  # file extension
        # only .png images are collected; change this if your images are .jpg
        if file_cls == 'png':
            print(name.split("/")[-1])
            with open(txtpath, 'a+') as fp:
                fp.write(name + "\n")

3. File modification

1. The .names file

Copy darknet/data/voc.names and rename it voc_blind.names. This file stores the training class names, but during training darknet does not use the names themselves; each class is identified by its line number (0, 1, 2, 3, 4). The order of the names in this file therefore matters: it must match the class ids in your label txt files.
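The file is simply one class name per line, and the line order defines the ids 0..N-1 used in the label txt files. A hypothetical 5-class voc_blind.names (the real names depend on your dataset):

crosswalk
stairs
red_light
green_light
obstacle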


2. .data file

Copy darknet/cfg/voc.data and rename it voc_blind.data. This file stores the number of classes, the train/valid dataset list paths, the .names file path, and the backup directory for saved models; modify each entry to match your own locations.
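For reference, a voc_blind.data for this 5-class setup might look like the following (the paths are examples matching the layout above; adjust them to your own):

classes = 5
train = /home/nh666/llw/ORB_FAR/WisdomGuide/annotations/train_list.txt
valid = /home/nh666/llw/ORB_FAR/WisdomGuide/annotations/val_list.txt
names = data/voc_blind.names
backup = backup/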

3. .cfg file

YOLOv7 had been released a few days earlier, and a darknet implementation of it already existed. I originally planned to use v7, but training would not run and reported an error:

 cuDNN status Error in: file: ./src/convolutional_kernels.cu : () : line: 555 : build time: Oct 26 2022 - 16:19:38 
 cuDNN Error: CUDNN_STATUS_BAD_PARAM
Darknet error location: ./src/dark_cuda.c, cudnn_check_error, line #204
cuDNN Error: CUDNN_STATUS_BAD_PARAM: Resource temporarily unavailable

Solution ideas:
There are three main approaches:

  1. Change the CUDA/cuDNN version (exactly which version works is unclear)
  2. Train without the -map flag, i.e. skip computing mAP during training
  3. Adjust the batch and subdivisions values (the right values are unclear)

I solved it by setting both batch and subdivisions to 64 in the cfg. The right values appear to depend on your training hardware: single vs. multiple GPUs, CPU vs. GPU, and so on.

batch=64
subdivisions=64  # changed from 16 to 64
width=416
height=416

As for CUDA and cuDNN, I checked the discussion area on GitHub; the suggestion seems to be to change the cuDNN version, but I am training on a shared server and cannot change it casually, so I used YOLOv4 instead.

4. yolov4-tiny.cfg

Copy darknet/cfg/yolov4-tiny.cfg, rename it yolov4_blind.cfg, and change the following settings:
subdivisions should be a multiple of 8: if GPU memory is plentiful, use 8; if memory is small, use 32 or more.
max_batches is the maximum number of training iterations and is not directly tied to the sample count; the common guideline is number of classes × 2000 (here 5 × 2000 = 10000) to avoid over- and underfitting.
steps should be 80% and 90% of max_batches.

subdivisions=8
max_batches = 10000
steps=8000,9000

Ctrl+F for the "yolo" keyword; there are several [yolo] sections to change. For each one, also update the [convolutional] layer directly above it:

filters=30   # computed as (classes + 5) × 3
activation=linear
[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=5
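As a quick sanity check, here is the arithmetic behind the values above for this 5-class dataset, following the usual darknet guidelines (a standalone sketch, not part of the training code):

classes = 5
filters = (classes + 5) * 3              # 30: set in the [convolutional] above each [yolo]
max_batches = max(6000, classes * 2000)  # 10000
steps = int(max_batches * 0.8), int(max_batches * 0.9)
print(filters, max_batches, steps)       # 30 10000 (8000, 9000)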

4. Training and testing

1. Terminal training

Single GPU

Enter the darknet folder, open a terminal, and execute the following command:

 sudo ./darknet detector train cfg/voc_blind.data cfg/yolov4_blind.cfg -map

Training result mAP chart:

Multi-GPU

First check whether the machine has multiple GPUs:

# static query
nvidia-smi

# dynamic query: re-check GPU usage every 10 seconds
watch -n 10 nvidia-smi

Check which cards are free: if a card is occupied, someone else is probably training on it; especially on a shared server, do not interfere with other people's jobs.

For example, my machine has four cards, and GPU-Util on cards 0 and 1 is 0%, so those two idle cards are the ones to use:

sudo ./darknet detector train cfg/voc_blind.data cfg/yolov4_blind.cfg  -gpus 0,1 -map

Multiple cards train noticeably faster than a single card.

(nvidia-smi output table)
Parameter explanation:

nvidia-smi prints a table of information about the graphics cards: the first line is version information, the second line is the title bar, and the following lines hold the per-card details; with multiple cards there is one block per card, and each value lines up with the corresponding title-bar column.

* GPU: card index; here 0, 1, 2, 3

* Fan: fan speed, from 0 to 100%; the first card reads 32%

* Name: GPU model name; here all GeForce

* Temp: GPU temperature; the first card is at 55°C

* Perf: performance state, from P0 to P12, where P0 is maximum performance and P12 minimum

* Persistence-M: persistence mode switch; this mode uses more power but launches new GPU applications faster; here it is off

* Pwr: power consumption

* Bus-Id: GPU bus id

* Disp.A: whether the GPU's display output is initialized

* Memory-Usage: GPU memory usage

* GPU-Util: GPU utilization; the first card reads 22%, the second 6%

* Compute M.: compute mode

One thing to note is that memory usage and GPU utilization are two different things, much like memory vs. CPU usage; the two indicators do not necessarily move together.

Error 1

CUDA Error: out of memory
Darknet error location: ./src/dark_cuda.c, check_error, line #69
CUDA Error: out of memory: No such file or directory

This means memory is insufficient. There are two possibilities: either the GPU memory is genuinely full (check with nvidia-smi), or the batch and subdivisions values in the .cfg file are too demanding; step batch down through 64 > 32 > 16 > 8 > 1 (or increase subdivisions) until training fits.

2. Save the model

The best model is the one that achieved the best performance during training, measured on the validation set.

The final model is the one saved when the training process completes.

The last model is the most recent periodic checkpoint saved during training; it is not necessarily the best-performing one.

These three models are therefore snapshots from different points in training; you can compare their performance and select the best one for inference.

They are all saved under backup/; the one darknet judged best gets the _best suffix (e.g. yolov4_blind_best.weights).

3. Test

1. Picture test

sudo ./darknet detector test cfg/voc_blind.data cfg/yolov4_blind.cfg backup/yolov4_blind_best.weights

After it starts, you will be prompted to enter the path of an image.

2. val validation

sudo ./darknet detector valid cfg/voc_blind.data cfg/yolov4_blind.cfg backup/yolov4_blind_best.weights

After running, detection result files for the validation set val are generated.

3. Real-time detection

source code


5. cfg file interpretation

Take the resnet50.cfg model file as an example:

# [net] is the section of a darknet cfg that defines the global network settings:
# input size, batch size, learning rate, augmentation, and so on.
[net]
# Training
# each training iteration uses a batch of 128 samples from the training set
batch=128
# each batch is split into 4 mini-batches of batch/4 images that are pushed through
# the GPU one at a time; larger values reduce GPU memory use at the cost of speed
subdivisions=4

# Testing
# batch=1
# subdivisions=1

# input data settings
# input image height and width
height=256
width=256
# upper bound on the random crop size used for training augmentation (up to 448x448 pixels)
max_crop=448
# number of image channels (3 for RGB)
channels=3
# momentum term of SGD: 90% of the previous weight update is carried over into
# the current update, which helps the model converge faster
momentum=0.9
# weight decay (L2 regularization) coefficient; penalizes large weights to reduce overfitting
decay=0.0005

# training schedule settings
# for the first 1000 iterations the learning rate is ramped up gradually from 0
# to learning_rate, which stabilizes the start of training
burn_in=1000
# base learning rate: the step size for each weight update
learning_rate=0.1
# learning-rate policy: poly decays the learning rate polynomially over the
# course of training, which helps the model converge
policy=poly
# exponent of the poly policy: lr = learning_rate * (1 - iteration/max_batches)^power
power=4
# maximum number of training iterations; training stops after 1600000 batches
max_batches=1600000

# data augmentation settings
# random rotation of up to ±7 degrees
angle=7
# random hue shift of up to 0.1
hue=.1
# random saturation jitter (scaling factor drawn around 0.75 to 1/0.75)
saturation=.75
# random exposure (brightness) jitter, same convention as saturation
exposure=.75
# random aspect-ratio jitter applied when cropping
aspect=.75

# convolution parameter settings
# [convolutional] is darknet's convolution layer; it extracts image features and
# passes them on to the next layer
[convolutional]
# enable batch normalization after this convolution, which stabilizes and speeds up training
batch_normalize=1
# number of convolution kernels = number of output channels; 64 means this layer
# has 64 kernels and produces 64 output channels
filters=64
# kernel size (7x7)
size=7
# stride 2: the convolution halves the feature map's width and height
stride=2
# pad=1 enables padding (darknet pads by size/2 pixels), so edge information is not lost
pad=1
# Leaky ReLU activation: keeps a small gradient for negative inputs, which helps
# avoid the vanishing-gradient / dying-ReLU problem
activation=leaky

# max-pooling layer: keeps the maximum value in each window, shrinking the feature
# map while preserving the strongest activations; this reduces computation and helps
# against overfitting
[maxpool]
# pooling window size 2x2
size=2
# stride 2: the pooling halves the feature map's width and height
stride=2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
# linear activation (identity, no nonlinearity); here the nonlinearity is applied
# only after the residual addition in the [shortcut] layer that follows
activation=linear

# residual (shortcut) connection: adds the output of an earlier layer to the current
# one, giving the signal a direct path and making deep networks easier to train
[shortcut]
# relative index: add the output of the layer 4 layers back, closing one residual block
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky


# Conv 4
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

#Conv 5
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=2048
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=2048
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=2048
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
filters=1000
size=1
stride=1
pad=1
activation=linear

# global average pooling layer: averages each channel of the input feature map,
# producing a 1x1xC output
[avgpool]

# softmax maps the raw class scores to a probability distribution over the classes,
# used for classification
[softmax]
# number of softmax groups; 1 means a single softmax over all 1000 outputs
groups=1

# cost (loss) layer
[cost]
# type=sse: sum-of-squared-errors loss, measuring the gap between predictions and targets
type=sse
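One subtlety when reading these blocks: in a darknet cfg, pad=1 does not mean "pad by one pixel" but "enable automatic padding of size/2". A small Python sketch of the feature-map arithmetic for the first convolution and maxpool above, assuming darknet's usual output-size formula:

def conv_out(in_size, size, stride, pad_flag):
    # in darknet cfg files, pad=1 means padding = size // 2, not one pixel
    padding = size // 2 if pad_flag else 0
    return (in_size + 2 * padding - size) // stride + 1

x = 256                   # input width/height from [net]
x = conv_out(x, 7, 2, 1)  # first [convolutional]: 7x7, stride 2 -> 128
x //= 2                   # the 2x2 [maxpool] with stride 2 halves it -> 64
print(x)                  # 64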
