darknet source code reading notes (1)

I have been reading YOLO lately, so I am taking notes along the way for future reference.

The first thing I traced was where the images and labels are read in.

In data.c, the function

int fill_truth_detection(const char *path, int num_boxes, float *truth, int classes, int flip, float dx, float dy, float sx, float sy,
    int net_w, int net_h)

converts the image path into the label path and then reads the labels:

char labelpath[4096];
replace_image_to_label(path, labelpath);

int count = 0;
int i;
box_label *boxes = read_boxes(labelpath, &count);
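
The path substitution can be pictured with a minimal sketch. image_to_label_path below is a hypothetical, simplified stand-in for replace_image_to_label, assuming a dataset layout where images live under /images/ and labels under /labels/; the actual substitution rules differ between darknet forks and should be checked in the source.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for replace_image_to_label():
   swaps the first "/images/" for "/labels/" and the file extension for ".txt".
   The directory names are an assumption, not darknet's exact rules. */
static void image_to_label_path(const char *image_path, char *label_path, size_t size)
{
    const char *from = "/images/";
    const char *to = "/labels/";
    const char *hit = strstr(image_path, from);

    if (hit) {
        /* copy the prefix, the replacement directory, then the remainder */
        snprintf(label_path, size, "%.*s%s%s",
                 (int)(hit - image_path), image_path, to, hit + strlen(from));
    } else {
        snprintf(label_path, size, "%s", image_path);
    }

    /* drop the extension after the last '.' of the file name, if there is one */
    char *slash = strrchr(label_path, '/');
    char *dot = strrchr(label_path, '.');
    if (dot && (!slash || dot > slash)) *dot = '\0';

    /* append ".txt" (truncated safely if the buffer is too small) */
    size_t len = strlen(label_path);
    snprintf(label_path + len, size - len, ".txt");
}
```

With this sketch, data/images/dog.jpg maps to data/labels/dog.txt, which is the file that read_boxes would then open.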

box_label is defined in darknet.h:

// data.h
typedef struct box_label {
    int id;
    float x, y, w, h;
    float left, right, top, bottom;
} box_label;

Stepping into read_boxes in the debugger shows how the label file is parsed:

box_label *read_boxes(char *filename, int *n)
{
    box_label* boxes = (box_label*)xcalloc(1, sizeof(box_label));
    FILE *file = fopen(filename, "r");
    if (!file) {
        printf("Can't open label file. (This can be normal only if you use MSCOCO): %s \n", filename);
        //file_error(filename);
        FILE* fw = fopen("bad.list", "a");
        fwrite(filename, sizeof(char), strlen(filename), fw);
        char *new_line = "\n";
        fwrite(new_line, sizeof(char), strlen(new_line), fw);
        fclose(fw);
        if (check_mistakes) getchar();

        *n = 0;
        return boxes;
    }
    float x, y, h, w;
    int id;
    int count = 0;
    while(fscanf(file, "%d %f %f %f %f", &id, &x, &y, &w, &h) == 5){
        boxes = (box_label*)xrealloc(boxes, (count + 1) * sizeof(box_label));
        boxes[count].id = id;
        boxes[count].x = x;
        boxes[count].y = y;
        boxes[count].h = h;
        boxes[count].w = w;
        boxes[count].left   = x - w/2;
        boxes[count].right  = x + w/2;
        boxes[count].top    = y - h/2;
        boxes[count].bottom = y + h/2;
        ++count;
    }
    fclose(file);
    *n = count;
    return boxes;
}

As the code above shows, the four edges left, right, top, and bottom are computed from the center coordinates x, y and the width and height w, h.
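
A small worked example may make the arithmetic concrete. The sketch below parses a single label line in the same "id x y w h" format and derives the edges the same way read_boxes does; demo_box, parse_label_line, and nearly_equal are illustrative names, not darknet functions.

```c
#include <stdio.h>

/* Illustrative copy of the fields in box_label (name is hypothetical). */
typedef struct {
    int id;
    float x, y, w, h;
    float left, right, top, bottom;
} demo_box;

/* Parse one "id x y w h" line and derive the four edges.
   Returns 1 on success, 0 if the line does not contain five values. */
static int parse_label_line(const char *line, demo_box *b)
{
    if (sscanf(line, "%d %f %f %f %f", &b->id, &b->x, &b->y, &b->w, &b->h) != 5)
        return 0;
    b->left   = b->x - b->w / 2;
    b->right  = b->x + b->w / 2;
    b->top    = b->y - b->h / 2;
    b->bottom = b->y + b->h / 2;
    return 1;
}

/* Float comparison with a small tolerance, for checking results. */
static int nearly_equal(float a, float b)
{
    float d = a - b;
    return d < 1e-5f && d > -1e-5f;
}
```

For the line "0 0.5 0.5 0.2 0.4" this gives left = 0.4, right = 0.6, top = 0.3, and bottom = 0.7, all in coordinates normalized to the image size.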

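It also follows that changing YOLO's localization from center plus width and height to four corner coordinates would require adding a new label struct and modifying all the code that uses it, so the workload is expected to be fairly large.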
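
To give a rough idea of what such a change might involve, here is a sketch of a possible four-corner label. quad_label, parse_quad_line, and quad_bounds are entirely hypothetical and do not exist in darknet; the label format "id x1 y1 x2 y2 x3 y3 x4 y4" is likewise an assumption made only for illustration.

```c
#include <stdio.h>

/* Hypothetical label holding four corner points, normalized to [0, 1].
   Not part of darknet. */
typedef struct {
    int id;
    float px[4];
    float py[4];
} quad_label;

/* Parse "id x1 y1 x2 y2 x3 y3 x4 y4"; returns 1 on success, 0 otherwise. */
static int parse_quad_line(const char *line, quad_label *q)
{
    int n = sscanf(line, "%d %f %f %f %f %f %f %f %f", &q->id,
                   &q->px[0], &q->py[0], &q->px[1], &q->py[1],
                   &q->px[2], &q->py[2], &q->px[3], &q->py[3]);
    return n == 9;
}

/* Axis-aligned bounds enclosing the four corners, e.g. to fill the
   left/right/top/bottom fields that the existing code expects. */
static void quad_bounds(const quad_label *q,
                        float *left, float *right, float *top, float *bottom)
{
    *left = *right = q->px[0];
    *top = *bottom = q->py[0];
    for (int i = 1; i < 4; ++i) {
        if (q->px[i] < *left)   *left   = q->px[i];
        if (q->px[i] > *right)  *right  = q->px[i];
        if (q->py[i] < *top)    *top    = q->py[i];
        if (q->py[i] > *bottom) *bottom = q->py[i];
    }
}
```

Parsing is the easy part. The augmentation code in fill_truth_detection (flip, dx, dy, sx, sy) and the loss layers all assume x, y, w, h, which is where most of the estimated effort would go.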

Now that it is clear how the data comes in, the next step is to see how the code uses it, with the focus on gradient back-propagation.

YOLOv2 source code analysis

Reference: YOLO v2 loss function source code analysis

Reference: YOLO v2 loss function source code walkthrough
