Practical CenterNet: training and testing on a cat face keypoint detection dataset


This post mainly records the process of debugging CenterNet to train a cat face keypoint detector. Most of what you find online uses CenterNet for object detection, but for plain object detection CenterNet is clearly not the strongest choice; for that, take a look at the mmdetection framework and my earlier blog:
mmdetection in practice: training on a playing card dataset (VOC format), testing, and computing mAP
If you follow that post, everything should run successfully!
Paper: Objects as Points [domestic mirror]
Code: https://github.com/xingyizhou/CenterNet
Blog reference: Throw away the anchors! The real CenterNet - an interpretation of the Objects as Points paper

One, local configuration

ubuntu18.04.3 + cuda10.0 + cudnn7.4.2 + PyTorch1.2 + torchvision0.4 + python3.6
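
Before building anything, it is worth confirming that these versions are what Python actually sees. A minimal check (not from the original post, plain PyTorch/torchvision API calls):

# quick environment sanity check for the versions listed above
import torch
import torchvision

print(torch.__version__)                # expect 1.2.x
print(torchvision.__version__)          # expect 0.4.x
print(torch.version.cuda)               # expect 10.0
print(torch.backends.cudnn.version())   # cudnn build number, e.g. 7402
print(torch.cuda.is_available())        # must be True before compiling DCNv2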

Two, environment construction

  1. The official INSTALL.md already covers the setup in detail. Although the author uses pytorch 0.4, the code also runs on pytorch 1.x.
  2. issues/7 is also very detailed; I basically followed that issue for the installation. After setting up the environment, you can follow the test section there to run the demo.
  3. One thing to watch out for: when compiling DCNv2 you have to download it from this tree. Do not just copy it over CenterNet/src/lib/models/networks to overwrite files; instead delete the whole DCNv2 folder (rm -rf DCNv2) and then copy the downloaded one in, otherwise the compilation will fail. A quick post-compile sanity check is sketched after this list.
  4. The modified code files used below and the official pre-trained models are packaged here; there is no need to download the competition winner's code.
  5. If you have any questions, feel free to ask in the comments; I won't go into more detail here.
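
A quick way to confirm that the DCNv2 extension compiled and runs on the GPU, roughly following the example in the DCNv2 README (run it from inside the compiled DCNv2 folder; the layer sizes here are arbitrary):

# minimal DCNv2 smoke test, adapted from the example in the DCNv2 README
import torch
from dcn_v2 import DCN   # importable after building inside the DCNv2 folder

x = torch.randn(2, 64, 128, 128).cuda()
dcn = DCN(64, 64, kernel_size=(3, 3), stride=1, padding=1, deformable_groups=2).cuda()
out = dcn(x)
print(out.shape)          # expect torch.Size([2, 64, 128, 128])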

Three, data preparation

This is actually an AI Yanxishe (AI研习社) competition; the introduction and dataset can be downloaded there. The winning entry's code is used here, but its author did not write any introduction, provide a trained model, or even leave a training log (only a bit of console output he copied at the time and the json-format files remained), so I had to bite the bullet and read the code for several days. I am recording the process here; just follow the steps below:

  1. The dataset unpacks to just train, test and train.csv. Put train.csv under CenterNet/data in the environment you just set up, then run generate_train_val_txt.py (also placed under data) to generate train.txt and val.txt (a quick check of the resulting split is sketched right after the script). Here is the code: generate_train_val_txt.py
import csv
import random

# read the image ids from the first column of train.csv (skip the header row)
with open('train.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    column = [row[0] for row in reader]

total_file = column[1:]

train_percent = 0.85
num = len(total_file)
print("total images numbers:", num)  # 10548

indices = range(num)
tr = int(num * train_percent)
train_indices = set(random.sample(indices, tr))  # random 85% of indices go to training
print("train size:", tr)

ftrain = open('train.txt', 'w')
fval = open('val.txt', 'w')

for i in indices:
    name = total_file[i] + '\n'
    if i in train_indices:
        ftrain.write(name)
    else:
        fval.write(name)

ftrain.close()
fval.close()
print("write finished!")
  2. Then run generate_coco_json.py on the training data to generate the COCO-format files train.json and val.json. The code refers to "how to convert your own dataset to COCO format for object detection and keypoint detection tasks". If you want to understand the COCO format, have a look at "the annotation format of the COCO dataset". The code is also included here: generate_coco_json.py
# *_* : coding: utf-8 *_*

'''
Dataset processing for an object detection project:
convert a custom dataset format to the COCO data format.
'''

import traceback
import argparse
import json
import cv2
import csv
import os

__CLASS__ = ['__background__', 'CatFace']   # class list, background must be at index 0.

def argparser():
    parser = argparse.ArgumentParser("define argument parser for pycococreator!")
    parser.add_argument("-r", "--image_root", default='D:\\catface\\train\\', help="path of root directory")
    parser.add_argument("-p", "--phase_folder", default=["train", "val"], help="datasets split")

    return parser.parse_args()

def MainProcessing(args):
    '''main process source code.'''
    annotations = {}    # annotations dictionary, which will dump to json format file.
    image_root = args.image_root
    phase_folder = args.phase_folder
    with open('train.csv', 'r') as f:
        reader = csv.reader(f)
        result = list(reader)
        result = result[1:]

    # coco annotations info.
    annotations["info"] = {
    
    
        "description": "customer dataset format convert to COCO format",
        "url": "http://cocodataset.org",
        "version": "1.0",
        "year": 2020,
        "contributor": "ezra",
        "date_created": "2020/03/15"
    }
    # coco annotations licenses.
    annotations["licenses"] = [{
    
    
        "url": "https://www.apache.org/licenses/LICENSE-2.0.html",
        "id": 1,
        "name": "Apache License 2.0"
    }]
    # coco annotations categories.
    annotations["categories"] = []
    for cls, clsname in enumerate(__CLASS__):
        if clsname == '__background__':
            continue
        annotations["categories"].append(
            {
    
    
                "id": cls,
                "name": clsname,
                "supercategory": "Cat",
            }
        )
        for catdict in annotations["categories"]:
            if "CatFace" == catdict["name"]:
                catdict["keypoints"] = [0, 1, 2, 3, 4, 5, 6, 7, 8]
                catdict["skeleton"] = [[0,1],[1,2],[0,2],[3,4],[4,5],[5,6],[6,7],[7,8],[8,3]]

    for phase in phase_folder:
        annotations["images"] = []
        annotations["annotations"] = []
        fphase = open(phase + '.txt', 'r')
        step = 0
        for id, line in enumerate(fphase.readlines()):
            line = line.strip("\n")
            file_name = line + '.jpg'
            images_id = int(line)
            height, width, _ = cv2.imread(image_root + file_name).shape
            v = [2, 2, 2, 2, 2, 2, 2, 2, 2]
            point_str = result[images_id][1:]
            point = [int(k) for k in point_str]
            for j in range(9):
                if min(point[2*j:2*j+2]) < 0:
                    v[j] = 1
            keypoint = [point[0],  point[1],  v[0], point[2],  point[3],  v[1],
                        point[4],  point[5],  v[2], point[6],  point[7],  v[3],
                        point[8],  point[9],  v[4], point[10], point[11], v[5],
                        point[12], point[13], v[6], point[14], point[15], v[7],
                        point[16], point[17], v[8]]
            bw = max(point[0::2]) - min(point[0::2]) + 10
            bh = max(point[1::2]) - min(point[1::2]) + 10
            if (min(point[0::2]) - 5) < 0:
                x1 = 0
            else:
                x1 = (min(point[0::2]) - 5)
            if (min(point[1::2]) - 5) < 0:
                y1 = 0
            else:
                y1 = (min(point[1::2]) - 5)
            annotations["images"].append(
                {
    
    
                    "file_name": file_name,
                    "height": height,
                    "width": width,
                    "id": images_id
                }
            )
            # coco annotations annotations.
            annotations["annotations"].append(
                {
    
    
                    "id": id + 1,
                    "num_keypoints": 9,
                    "keypoints": keypoint,
                    "area": bw * bh,
                    "iscrowd": 0,
                    "image_id": images_id,
                    "bbox": [x1, y1, bw, bh],
                    "category_id": 1,
                    "segmentation": [],
                }
            )
            step += 1
            if step % 100 == 0:
                print("processing {} ...".format(step))

        json_path = phase+".json"
        with open(json_path, "w") as f:
            json.dump(annotations, f)


if __name__ == "__main__":
    print("begining to convert customer format to coco format!")
    args = argparser()
    try:
        MainProcessing(args)
    except Exception as e:
        traceback.print_exc()
    print("successful to convert customer format to coco format")

Note: don't ask me why the bbox and area are computed this way; I simply inferred the pattern from his data. The accurate way to compute them is with the cocoapi, but since the tasks below do not need these two attributes, it doesn't matter.
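
A minimal sanity check that the generated json actually loads with the COCO API (pycocotools), run from the directory containing train.json; it only reads the file back, it does not validate bbox or area:

# load the generated annotations back with pycocotools as a sanity check
from pycocotools.coco import COCO

coco = COCO('train.json')
img_ids = coco.getImgIds()
print('images:', len(img_ids))
ann = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[0]))[0]
print('bbox:', ann['bbox'], 'num_keypoints:', ann['num_keypoints'])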
3. Then create a coco folder under data, and inside it create two folders: annotations and images. Put the generated train.json and val.json under annotations, and put the train and test folders extracted from the dataset under images. The directory should end up like this:

├── CenterNet/data
│   ├── coco
│   │   ├── annotations
│   │   │   ├── train.json
│   │   │   ├── val.json
│   │   ├── images
│   │   │   ├── train
│   │   │   ├── test

Four, start training

Here, just replace CenterNet/src/lib/datasets/dataset/coco_hp.py and CenterNet/src/lib/opts.py with the coco_hp.py and opts.py from the package, and then you can run the following command [on Windows you may also need to set --num_workers 0; batch_size depends on your own hardware]. A rough sketch of what the dataset file changes is given after the command:

python main.py multi_pose --exp_id dla_1x_catface --dataset coco_hp --lr 5e-4 --lr_step '17,27' --num_epochs 37 --batch_size 16 --gpus 0  --load_model ../models/multi_pose_dla_3x.pth
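
The packaged coco_hp.py is authoritative; purely to illustrate, relative to CenterNet's stock COCO person-keypoint dataset class the edits are roughly of the following shape. The flip_idx pairs below are hypothetical and depend on the keypoint order in train.csv, and opts.py likewise needs the keypoint head sizes to follow num_joints (2 × 9 channels for the hps head and 9 for hm_hp):

# illustrative sketch only, not the packaged file
import os
import torch.utils.data as data

class COCOHP(data.Dataset):
    num_classes = 1
    num_joints = 9                    # 9 cat-face keypoints instead of COCO's 17
    default_resolution = [512, 512]
    # left/right keypoint pairs swapped on horizontal flip (hypothetical pairing)
    flip_idx = [[0, 1], [3, 8], [4, 7], [5, 6]]

    def __init__(self, opt, split):
        super(COCOHP, self).__init__()
        self.data_dir = os.path.join(opt.data_dir, 'coco')
        # matches the data/coco layout above: all labeled images sit in images/train,
        # and the split is driven by annotations/train.json vs annotations/val.json
        self.img_dir = os.path.join(self.data_dir, 'images/train')
        self.annot_path = os.path.join(
            self.data_dir, 'annotations', '{}.json'.format(split))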

A screenshot of the training run:

Five, test

  1. Here, just replace CenterNet/src/lib/utils/post_process.py, CenterNet/src/lib/detectors/multi_pose.py and CenterNet/src/demo.py with the post_process.py, multi_pose.py and demo.py from the package.
  2. Change line 31 of coco_hp.py to:
self.img_dir = os.path.join(self.data_dir, 'images/test')
  3. Run the following command:
python demo.py multi_pose --demo '/home/lsm/文档/CenterNet/data/coco/images/test' --exp_id dla_1x_catfacetest --dataset coco_hp --load_model /home/lsm/文档/CenterNet/exp/multi_pose/dla_1x_catface/model_best.pth

This will generate a result.csv file under /home/lsm/文档/CenterNet/exp/multi_pose/dla_1x_catfacetest. After submitting it to the practice competition you can see the results:
AI Yanxishe cat face keypoint detection practice leaderboard result
This score is even a little higher than the first place of the earlier prize competition (who is also the provider of this code), so the model I trained actually came out better:
Top three of the AI Yanxishe cat face keypoint detection prize competition
4. Change the following two attributes in CenterNet/src/lib/utils/debugger.py (if you want to check the skeleton definition without going through CenterNet's debugger, see the stand-alone sketch at the end of this section):

self.num_joints = 9
self.edges = [[0, 1], [1, 2], [0, 2], [3, 4],
              [4, 5], [5, 6], [6, 7],[7, 8], [8, 3]]

Then run the following command:

python demo.py multi_pose --demo /home/lsm/文档/CenterNet/data/coco/images/test/54.jpg --load_model /home/lsm/文档/CenterNet/exp/multi_pose/dla_1x_catface/model_best.pth --debug 4

You will get pictures like these:
Cat face 1
Cat face 2
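
As a side note, if you just want to eyeball the 9-point skeleton on a raw training image without going through CenterNet at all, here is a minimal stand-alone sketch assuming the train.csv layout used above (one row per image id followed by 18 interleaved x,y values, negative coordinates meaning the point is not visible); image id 54 is only an example:

# stand-alone skeleton check on a raw training image (not part of CenterNet)
import csv
import cv2

EDGES = [[0, 1], [1, 2], [0, 2], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8], [8, 3]]

with open('train.csv') as f:
    rows = list(csv.reader(f))[1:]          # skip header, index rows by image id

img_id = 54
xs = [int(v) for v in rows[img_id][1::2]]   # x coordinates of the 9 keypoints
ys = [int(v) for v in rows[img_id][2::2]]   # y coordinates of the 9 keypoints

img = cv2.imread('train/{}.jpg'.format(img_id))
for x, y in zip(xs, ys):
    if x >= 0 and y >= 0:                   # negative values mean "not visible"
        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
for a, b in EDGES:
    if min(xs[a], ys[a], xs[b], ys[b]) >= 0:
        cv2.line(img, (xs[a], ys[a]), (xs[b], ys[b]), (0, 255, 0), 1)
cv2.imwrite('catface_skeleton_check.jpg', img)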

Six, references

Looking back, I really have to give myself credit for grinding through the code the hard way. I had never debugged the CenterNet code before, but I got this working on top of the competition winner's code!
I have only read the paper roughly myself, and the code just runs through; I have not gone particularly deep. Thanks to the blogs below for their guidance; you can also check them for extras such as plotting loss curves:
How CenterNet trains its own dataset
(Latest version) How to train your own dataset with CenterNet
Detailed CenterNet training on your own dataset (win10 + cuda10 + pytorch1.0.1)
