Introduction to the Darknet Framework

Retraining an end-to-end system for facial expression recognition

 

 

 

 

Appendix 1:

Source code analysis of the darknet deep learning framework: detailed Chinese annotations covering the framework's principles and an analysis of its implementation.

https://github.com/hgpvision/darknet

darknet is a relatively lightweight, open-source deep learning framework written entirely in C and CUDA. Its main strengths are easy installation, no mandatory dependencies (OpenCV is optional), very good portability, and support for both CPU and GPU computation.

 

Compared to TensorFlow, darknet is not as powerful, but that simplicity has become one of darknet's advantages:

  1. darknet is implemented entirely in C and has no mandatory dependencies; OpenCV can be used, but only to display images for better visualization;

  2. darknet supports both CPU (so it still works without a GPU) and GPU (via CUDA/cuDNN; a GPU is of course preferable);

  3. precisely because it is lighter and lacks a powerful high-level API like TensorFlow's, it has a different kind of flexibility: it is well suited to studying the internals, and it is easier to improve and extend from the bottom layer up;

  4. the implementation of darknet is similar to that of Caffe, so familiarity with darknet should also make it easier to get started with Caffe.

 

 

Appendix 2:

Author: Zhihu user
Link: https://www.zhihu.com/question/51747665/answer/145607615
Source: Zhihu
The copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please indicate the source.

The three most important struct definitions in darknet are network_state, network, and layer; in newer versions, network_state has been merged into network.

On a first reading of the code you can ignore the GPU parts. Every kind of network layer defines its own execution rules through the function pointers forward, backward, and update inside layer. For example, the connected layer has the three functions forward_connected_layer, backward_connected_layer, and update_connected_layer; the gru layer and the others follow the same pattern.
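
To make the pattern concrete, here is a minimal standalone sketch of that function-pointer dispatch; the struct fields, signatures, and the make_connected_layer constructor are simplified stand-ins, not darknet's actual definitions:

    #include <stdio.h>

    /* Minimal sketch of darknet's dispatch pattern: every layer type fills in
     * its own forward/backward/update function pointers, and the framework
     * calls through them without knowing the concrete layer type. */
    struct layer;
    typedef struct layer layer;

    struct layer {
        int inputs, outputs;
        void (*forward)(layer *l, const float *input);
        void (*backward)(layer *l, const float *delta);
        void (*update)(layer *l, float rate, float momentum, float decay);
    };

    /* the connected layer supplies its three methods ... */
    static void forward_connected_layer(layer *l, const float *input)
    {
        (void)l; (void)input;
        printf("forward_connected_layer\n");
    }
    static void backward_connected_layer(layer *l, const float *delta)
    {
        (void)l; (void)delta;
        printf("backward_connected_layer\n");
    }
    static void update_connected_layer(layer *l, float rate, float momentum, float decay)
    {
        (void)l;
        printf("update_connected_layer rate=%g momentum=%g decay=%g\n", rate, momentum, decay);
    }

    /* ... and its constructor wires them into the struct */
    static layer make_connected_layer(int inputs, int outputs)
    {
        layer l = {0};
        l.inputs = inputs;
        l.outputs = outputs;
        l.forward  = forward_connected_layer;
        l.backward = backward_connected_layer;
        l.update   = update_connected_layer;
        return l;
    }

    int main(void)
    {
        layer l = make_connected_layer(4, 2);
        l.forward(&l, NULL);                  /* the framework only sees the pointers */
        l.backward(&l, NULL);
        l.update(&l, 0.001f, 0.9f, 0.0005f);
        return 0;
    }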

The atomic operations (the low-level numeric kernels) live only in blas.c and gemm.c; the network-level computation is in network.c. The most important functions there are train_network_datum, train_networks, train_network_batch, and network_predict.
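
As a rough idea of what those atomic routines do, the sketch below reimplements two of them in simplified form: an AXPY (y += alpha*x) and a plain C += alpha*A*B matrix multiply. The signatures loosely follow darknet's axpy_cpu and gemm_nn, but they are written from scratch here and omit strides, transposed variants, and the GPU paths:

    /* Simplified stand-ins for the low-level kernels in blas.c and gemm.c:
     * plain C loops over float arrays. */
    void axpy_cpu(int n, float alpha, const float *x, float *y)
    {
        /* y <- alpha * x + y */
        for (int i = 0; i < n; ++i) y[i] += alpha * x[i];
    }

    void gemm_nn(int M, int N, int K, float alpha,
                 const float *A, int lda,
                 const float *B, int ldb,
                 float *C, int ldc)
    {
        /* C <- alpha * A(MxK) * B(KxN) + C(MxN), row-major with leading dimensions */
        for (int i = 0; i < M; ++i) {
            for (int k = 0; k < K; ++k) {
                float a = alpha * A[i*lda + k];
                for (int j = 0; j < N; ++j) {
                    C[i*ldc + j] += a * B[k*ldb + j];
                }
            }
        }
    }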

train_network_datum takes its input data as a float_pair, i.e., a float *x paired with a float *y.

train_networks lives in network_kernels.cu and runs the training in concurrent threads; its argument is of type data.

One more point: in CPU mode darknet is single-threaded. With multiple GPUs, train_networks supports multi-GPU training, and it is also the natural entry point for adapting darknet into a distributed, multi-host setup; this is where you can see the trained weights being merged and scaled.
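
Conceptually, that merge-and-scale step is just averaging every parameter across the per-GPU replicas and copying the average back; the sketch below illustrates the idea only and is not the CUDA code from network_kernels.cu:

    /* Conceptual sketch of multi-GPU weight averaging: each GPU trains its own
     * copy of the weights, then the copies are summed, scaled by 1/ngpus, and
     * redistributed. */
    void merge_and_scale_weights(float **weights, int ngpus, int nweights)
    {
        /* accumulate every replica into replica 0 */
        for (int g = 1; g < ngpus; ++g)
            for (int i = 0; i < nweights; ++i)
                weights[0][i] += weights[g][i];

        /* scale back to an average ... */
        float scale = 1.0f / ngpus;
        for (int i = 0; i < nweights; ++i) weights[0][i] *= scale;

        /* ... and distribute the averaged weights back to every replica */
        for (int g = 1; g < ngpus; ++g)
            for (int i = 0; i < nweights; ++i)
                weights[g][i] = weights[0][i];
    }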

train_network_datum executes forward_network (the layer-by-layer forward pass) and then backward_network (the layer-by-layer backward pass); once the counter condition based on *net.seen and subdivisions is satisfied, it calls update_network(..., rate, momentum, decay) once.
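
Roughly, the control flow just described looks like the following paraphrase; the network struct and the forward/backward/update helpers here are stubs that only mirror the sequence (forward, backward, conditional update every subdivisions calls), not darknet's real types:

    #include <stdio.h>
    #include <stddef.h>

    typedef struct {
        int batch, subdivisions;
        size_t seen;                      /* number of training images seen so far */
        float learning_rate, momentum, decay;
        float cost;
    } network;

    /* stubs standing in for the layer-by-layer passes and the weight update */
    static void forward_network(network *net, const float *x)  { (void)net; (void)x; }
    static void backward_network(network *net, const float *y) { (void)y; net->cost = 0.5f; }
    static void update_network(network *net, float rate, float momentum, float decay)
    {
        (void)net;
        printf("update_network rate=%g momentum=%g decay=%g\n", rate, momentum, decay);
    }

    static float train_network_datum(network *net, const float *x, const float *y)
    {
        net->seen += net->batch;
        forward_network(net, x);          /* layer-by-layer forward pass  */
        backward_network(net, y);         /* layer-by-layer backward pass */

        /* gradients accumulate; weights move only once per `subdivisions` calls */
        if ((net->seen / net->batch) % net->subdivisions == 0)
            update_network(net, net->learning_rate, net->momentum, net->decay);
        return net->cost;
    }

    int main(void)
    {
        network net = { .batch = 4, .subdivisions = 2, .seen = 0,
                        .learning_rate = 0.001f, .momentum = 0.9f, .decay = 0.0005f, .cost = 0 };
        float x[4] = {0}, y[4] = {0};
        for (int i = 0; i < 4; ++i)
            train_network_datum(&net, x, y);   /* updates fire on the 2nd and 4th calls */
        return 0;
    }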

The user-defined network configuration file is handled in parse_network_cfg, and previously trained weights are read back in through load_weights.
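
A hedged usage sketch of that loading path: the calls (parse_network_cfg, load_weights, set_batch_network, network_predict) follow the older darknet C API, and the exact signatures and header names vary between darknet versions, so check parser.h / network.h in the checkout you build before relying on this:

    #include <stdlib.h>
    #include "parser.h"    /* parse_network_cfg, load_weights  (header names vary by version) */
    #include "network.h"   /* network, set_batch_network, network_predict */

    int main(int argc, char **argv)
    {
        if (argc < 3) return 1;                     /* usage: ./predict net.cfg net.weights */

        network net = parse_network_cfg(argv[1]);   /* build the net from the user-defined cfg */
        load_weights(&net, argv[2]);                /* read the trained weights back in */
        set_batch_network(&net, 1);                 /* predict one image at a time */

        /* in a real program this buffer would hold a preprocessed input image */
        float *input  = calloc(net.w * net.h * net.c, sizeof(float));
        float *output = network_predict(net, input);
        (void)output;                               /* ... consume the predictions ... */

        free(input);
        return 0;
    }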

That is the main skeleton.

If you need to handle a data source with special requirements, data.c is the place to start.

As for the cfg configuration file, the key things to tune during training (of course all of the parameters matter and may need adjusting) are the global parameters decay, momentum, and learning_rate, the three that govern convergence speed. policy controls the learning-rate schedule used to update the weights; inputs, batch (and the related subdivisions), and outputs concern the data-throughput dimensions. The latest version appears to have a correction around outputs.
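
For orientation, those global parameters live in the [net] section at the top of the cfg file; a typical excerpt looks like the following (the values are common illustrative defaults, not tuned recommendations):

    [net]
    # data-throughput dimensions
    batch=64
    subdivisions=16
    height=416
    width=416
    channels=3
    # convergence-related global parameters
    momentum=0.9
    decay=0.0005
    learning_rate=0.001
    # learning-rate schedule
    policy=steps
    steps=40000,45000
    scales=.1,.1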
