Darknet source code reading notes (1)

I have been reading the YOLO code recently, so I am taking some notes along the way.

First, trace where the images and labels are read.

In data.c:

int fill_truth_detection(const char *path, int num_boxes, float *truth, int classes, int flip, float dx, float dy, float sx, float sy,
    int net_w, int net_h)

This function derives the label path from the image path and then reads the labels from that file:

char labelpath[4096];
replace_image_to_label(path, labelpath);

int count = 0;
int i;
box_label *boxes = read_boxes(labelpath, &count);
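
To make the path substitution concrete, here is a minimal sketch of the idea. It is not darknet's replace_image_to_label, which handles more directory names and file extensions. The helper name image_to_label_path and the single "/images/" to "/labels/" rule are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (NOT darknet's replace_image_to_label):
 * swap the first "/images/" for "/labels/" and the file extension for ".txt". */
static void image_to_label_path(const char *image_path, char *out, size_t out_size)
{
    assert(image_path && out && out_size > 0);

    const char *dir = strstr(image_path, "/images/");
    if (dir) {
        /* prefix before "/images/", then "/labels/", then the rest of the path */
        snprintf(out, out_size, "%.*s/labels/%s",
                 (int)(dir - image_path), image_path, dir + strlen("/images/"));
    } else {
        /* no "/images/" component: keep the directory unchanged */
        snprintf(out, out_size, "%s", image_path);
    }

    /* cut the extension of the file name (if any) and append ".txt" */
    char *slash = strrchr(out, '/');
    char *dot = strrchr(out, '.');
    if (dot && (!slash || dot > slash)) *dot = '\0';
    size_t len = strlen(out);
    snprintf(out + len, out_size - len, ".txt");
}
```

With this sketch, "data/images/dog.jpg" becomes "data/labels/dog.txt", which is the kind of path that is then handed to read_boxes.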

box_label is defined in darknet.h:

// data.h
typedef struct box_label {
    int id;
    float x, y, w, h;
    float left, right, top, bottom;
} box_label;

The labels themselves are parsed by read_boxes:

box_label *read_boxes(char *filename, int *n)
{
    box_label* boxes = (box_label*)xcalloc(1, sizeof(box_label));
    FILE *file = fopen(filename, "r");
    if (!file) {
        printf("Can't open label file. (This can be normal only if you use MSCOCO): %s \n", filename);
        //file_error(filename);
        FILE* fw = fopen("bad.list", "a");
        fwrite(filename, sizeof(char), strlen(filename), fw);
        char *new_line = "\n";
        fwrite(new_line, sizeof(char), strlen(new_line), fw);
        fclose(fw);
        if (check_mistakes) getchar();

        *n = 0;
        return boxes;
    }
    float x, y, h, w;
    int id;
    int count = 0;
    while(fscanf(file, "%d %f %f %f %f", &id, &x, &y, &w, &h) == 5){
        boxes = (box_label*)xrealloc(boxes, (count + 1) * sizeof(box_label));
        boxes[count].id = id;
        boxes[count].x = x;
        boxes[count].y = y;
        boxes[count].h = h;
        boxes[count].w = w;
        boxes[count].left   = x - w/2;
        boxes[count].right  = x + w/2;
        boxes[count].top    = y - h/2;
        boxes[count].bottom = y + h/2;
        ++count;
    }
    fclose(file);
    *n = count;
    return boxes;
}

The code above shows that each label line holds a class id followed by the box center x, y and its width and height w, h. The four boundaries left, right, top, and bottom are not stored in the file; they are computed from the center and the size.
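
As a small worked example of that calculation, the sketch below parses one label line with the same format string as read_boxes and computes the four edges. The edges struct and the function label_line_to_edges are hypothetical names used only here, and the sample values are assumed to be normalized to the image size, as is usual for YOLO label files.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the derived edges (darknet stores them in box_label). */
typedef struct { float left, right, top, bottom; } edges;

/* Parse "id x y w h" and derive the edges the same way read_boxes does.
 * Returns 1 on success, 0 if the line does not contain five fields. */
static int label_line_to_edges(const char *line, int *id, edges *e)
{
    assert(line && id && e);

    float x, y, w, h;
    if (sscanf(line, "%d %f %f %f %f", id, &x, &y, &w, &h) != 5) return 0;

    e->left   = x - w / 2;  /* center minus half the width  */
    e->right  = x + w / 2;
    e->top    = y - h / 2;  /* image y axis points downward */
    e->bottom = y + h / 2;
    return 1;
}
```

For the line "0 0.5 0.5 0.25 0.5" this gives left = 0.375, right = 0.625, top = 0.25, and bottom = 0.75.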

It also shows that the label format is built around a center point plus width and height. To change YOLO's localization to a four-vertex (corner coordinate) representation, you would need to add a new label structure and modify every piece of code that reads or consumes box_label. I expect the workload to be fairly large.
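
To give a rough idea of what such a label structure might look like, here is a sketch. Everything in it is an assumption: quad_label, parse_quad_line, and the line format "id x1 y1 x2 y2 x3 y3 x4 y4" do not exist in darknet. It covers only the parsing side; the truth vector layout and the loss code would also have to change.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical label type: a class id plus four corner points. */
typedef struct quad_label {
    int id;
    float x[4], y[4];
} quad_label;

/* Parse one line in the assumed format "id x1 y1 x2 y2 x3 y3 x4 y4".
 * Returns 1 on success, 0 if the line does not contain nine fields. */
static int parse_quad_line(const char *line, quad_label *q)
{
    assert(line && q);

    int n = sscanf(line, "%d %f %f %f %f %f %f %f %f",
                   &q->id,
                   &q->x[0], &q->y[0], &q->x[1], &q->y[1],
                   &q->x[2], &q->y[2], &q->x[3], &q->y[3]);
    return n == 9;
}
```

A reader built on this would replace the fscanf loop in read_boxes, but each caller that currently expects x, y, w, h would need its own update.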

 

Now that we know how the data comes in, the next step is to look at how it is used in the code. The key point is the backpropagation of the gradient.

 

YOLOv2 source code analysis

 

Reference: YOLO v2 loss function source code analysis

Reference: Interpretation of yolo v2 loss function source code

 

 

 

 


Origin blog.csdn.net/juluwangriyue/article/details/109047636