[Jetson object detection SSD-MobileNet application example] (3) Training your own detection model and running inference tests


The last step before starting training is to create a label map and edit the training configuration file.

Related configuration file preparation

First, create a new folder named train_xxxxx in the \research\object_detection directory to store the configuration files; xxxxx is any name you choose, to make it easy to distinguish later. Also create a new folder named training to store the training output files.
Then copy the configuration file ssd_mobilenet_v3_small_coco_2020_01_14.config from the downloaded and decompressed pre-trained model into the train_xxxxx folder. Move the train.record and test.record files generated earlier to the \object_detection\data directory. In the \object_detection\data directory, create a new label map file (named sposemap.pbtxt here, matching the label_map_path used in the config below) and, in a text editor, enter the label mapping in the following format. This step defines the annotation map, which tells the trainer what each object is by mapping each class name to a class ID. The specific format is (continuing with the example from the previous post):

sposemap.pbtxt

item {
  id: 1
  name: 'a1'
}

item {
  id: 2
  name: 'b2'
}

item {
  id: 3
  name: 'c3'
}

item {
  id: 4
  name: 'd4'
}

item {
  id: 5
  name: 'e5'
}

The ID number in the annotation map needs to be consistent with the one defined in generate_tfrecord.py. It also needs to be consistent with the order in predefined_classes.txt when labeling pictures.
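For reference, the mapping in generate_tfrecord.py from the previous post typically looks like the sketch below (the class names and IDs here are the example ones used in this post; they must line up exactly with the label map above and with predefined_classes.txt):

# Sketch of the class_text_to_int mapping inside generate_tfrecord.py
# (example classes a1..e5; keep these identical to the label map above)
def class_text_to_int(row_label):
    if row_label == 'a1':
        return 1
    elif row_label == 'b2':
        return 2
    elif row_label == 'c3':
        return 3
    elif row_label == 'd4':
        return 4
    elif row_label == 'e5':
        return 5
    else:
        return None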

Configure training parameters

Finally, the object recognition training pipeline must be configured. It defines which models and parameters will be used for training. This is the last step before starting training.
Open ssd_mobilenet_v3_small_coco_2020_01_14.config in a text editor; the main changes are the number of categories, the number of evaluation samples, and the file paths to the training data.
1. Modify the path
Find and modify the following key items:

fine_tune_checkpoint: "ssd_mobilenet_v3_small_coco_2020_01_14/model.ckpt"
train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/sposemap.pbtxt"
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "data/sposemap.pbtxt"
  shuffle: false
  num_readers: 1
}

2. Modify parameters
Modify num_classes to the number of object categories you want to recognize (5 in this example):

num_classes: 5

Modify num_examples to the number of images in your \images\test folder:

eval_config: {
  num_examples: 173
  # Note: The below line limits the evaluation process to 8 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 8
}

You can also modify batch_size, and num_steps, which is the maximum number of training iterations.
To use the pre-trained model, the two lines fine_tune_checkpoint and from_detection_checkpoint tell the trainer where to find the pre-trained checkpoint. If the local paths cause problems, you can delete these two lines, which is equivalent to training from scratch.
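As a rough sketch (field names and defaults vary between pipeline config versions), the relevant part of the train_config block looks something like this:

train_config: {
  batch_size: 24                    # reduce if you run out of memory
  num_steps: 80000                  # maximum number of training iterations
  fine_tune_checkpoint: "ssd_mobilenet_v3_small_coco_2020_01_14/model.ckpt"
  from_detection_checkpoint: true   # some config versions use fine_tune_checkpoint_type: "detection" instead
  ...
}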

Start training

Execute under the research/object_detection path:

python model_main.py --pipeline_config_path=train_xxxxx/ssd_mobilenet_v3_small_coco_2020_01_14.config --model_dir=training --num_train_steps=80000 --num_eval_steps=800 --alsologtostderr

--pipeline_config_path: path to the configuration file

--model_dir: directory where the generated files are written

--num_train_steps: number of training steps

--num_eval_steps: number of evaluation steps

If everything is set up correctly, TensorFlow will start training. Initialization takes a while before training officially begins; once it does, loss values start printing to the console.
A loss value is reported at each training step and decreases as training progresses. I recommend letting the model train until the loss stays below a low value, which takes about 40,000 steps, or roughly 2 hours (depending on how powerful your CPU and GPU are). Note: if you use a different model, the loss values will differ. The loss for MobileNet-SSD starts around 20 and the model should be trained until the loss stays consistently below 2.

Training visualization

You can observe the training process through TensorBoard. Open a new Anaconda Prompt window, activate the py37 virtual environment, change the directory to ...\research\object_detection, and run the following command:

tensorboard --logdir=training

TensorBoard prints a local URL (typically http://localhost:6006) that you can open in a browser to watch the loss curves. The training process saves a checkpoint periodically, roughly every five minutes. You can terminate training with Ctrl+C in the Command Prompt window; I generally wait until a checkpoint has just been saved before terminating. You can stop training and restart it later, and it will continue from the last saved checkpoint. The highest-numbered checkpoint will be used to generate the frozen inference graph.

Export trained model

In the \object_detection path, run the following command, where "XXXX" in "model.ckpt-XXXX" must be replaced with the step number of the highest-numbered .ckpt file in the training folder:

python export_tflite_ssd_graph.py --pipeline_config_path train_xxxxx/ssd_mobilenet_v3_small_coco_2020_01_14.config --trained_checkpoint_prefix training/model.ckpt-XXXX --output_directory inference_graph
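If you are not sure which checkpoint number is the highest, a minimal sketch like the following will find it (assuming the default training output folder used above):

# Small helper sketch; assumes checkpoints are under the "training" folder
import glob
import re

steps = [int(re.search(r"model\.ckpt-(\d+)", p).group(1))
         for p in glob.glob("training/model.ckpt-*.index")]
print("latest checkpoint step:", max(steps))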

At this point we get the final usable .pb frozen graph file (tflite_graph.pb in the inference_graph folder).

Adapting the model to the OpenCV dnn module

To use the model with the cv2.dnn module, the exported graph needs to be converted into a .pbtxt description that OpenCV can read. First, make sure OpenCV is correctly installed on your computer.
Run the following in the OpenCV source path \opencv\sources\samples\dnn to convert the TensorFlow model structure into one OpenCV understands:

python tf_text_graph_ssd.py  --input  yourfilepath....models/research/object_detection/inference_graph/tflite_graph.pb --config yourfilepath......models/research/object_detection/train_xxxxx/ssd_mobilenet_v3_small_coco_2020_01_14.config --output yourfilepath.....models/research/object_detection/inference_graph/ssdmobilenetv3large.pbtxt

Here we have two files:
ssdmobilenetv3large.pbtxt and tflite_graph_large.pb (the tflite_graph.pb exported above; rename it if your file has a different name). Using them, we can easily run detection with OpenCV's dnn module.
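Before writing the full test script, you can quickly verify that OpenCV can load the converted files with a minimal sketch like this (assuming both files are in the current directory):

# Quick sanity check: load the converted graph with OpenCV's dnn module
import cv2

net = cv2.dnn.readNetFromTensorflow("tflite_graph_large.pb", "ssdmobilenetv3large.pbtxt")
print("model loaded, number of layers:", len(net.getLayerNames()))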

Inference detection

Here we copy the two files ssdmobilenetv3large.pbtxt and tflite_graph_large.pb obtained above into a new folder, and create two new files in the same folder: test.py and my.names.
In the my.names file, enter the names of your classes, one per line, in the same order as the label map:

a1
b2
c3
d4
e5
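A quick way to check that the names file matches num_classes is a sketch like this (run from the folder containing my.names):

# Sketch: check that my.names lines up with num_classes (5 in this example)
with open('my.names', 'rt') as f:
    names = f.read().rstrip('\n').split('\n')
print(len(names), names)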

test.py

import cv2
import numpy as np
import time

thres = 0.6             # confidence threshold
nms_threshold = 0.4     # NMS (non-maximum suppression) threshold
outputstate = True
output = False
calssid = 0


def waitcapture(capture):   # wait until the camera starts returning valid frames
    while True:
        ret, imag = capture.read()
        if ret and imag is not None and imag.size > 0:
            break

def putconindence(imag, ind, con, boxs):    # draw the confidence value on the original image
    for x in ind:
        boxx = boxs[x]
        print("confidence num {} confs is {}".format(x,round(con[x] * 100, 1)))
        cv2.putText(imag, str(round(con[x] * 100, 1)), (boxx[0]+5, boxx[1] + 40), cv2.FONT_HERSHEY_COMPLEX, 0.6,
                (0, 0, 255), 1)

def readclassname(path):    # read the class label names from file
    classNames = []
    classFile = path
    with open(classFile, 'rt') as f:
        classNames = f.read().rstrip('\n').split('\n')
    return classNames

def classidoutput(id):
    global output
    global calssid
    global outputstate
    while outputstate :
        if output == True :
            time.sleep(0.5)
            print("image classid: ",calssid)
            time.sleep(1)
            print("IO output is down! ")
            output = False


def main() :
    global outputstate
    global calssname
    global output
    cap = cv2.VideoCapture(0)
    configPath = 'ssdmobilenetv3large.pbtxt'
    weightsPath = 'tflite_graph_large.pb'
    calssnamePath = 'my.names'
    calssname = readclassname(calssnamePath)

    net = cv2.dnn_DetectionModel(weightsPath, configPath)
    net.setInputSize(320, 320)
    net.setInputScale(1.0 / 127.5)
    net.setInputMean((127.5, 127.5, 127.5))
    net.setInputSwapRB(True)

    waitcapture(cap)
    while True:
        ret, img = cap.read()
        if not ret:
            continue
        classIds, confs, bbox = net.detect(img, confThreshold=thres)
        bbox = list(bbox)
        confs = list(np.array(confs).reshape(1, -1)[0])
        confs = list(map(float, confs))
        # print(classIds, bbox)

        if len(bbox) > 0:
            indices = cv2.dnn.NMSBoxes(bbox, confs, thres, nms_threshold)
            putconindence(img, indices, confs, bbox)
            for i in indices:
                # i = i[0]   # needed on older OpenCV versions, where indices are nested
                box = bbox[i]
                x, y, w, h = box[0], box[1], box[2], box[3]
                cv2.rectangle(img, (x, y), (x + w, y + h), color=(0, 0, 255), thickness=1)
                cv2.putText(img, calssname[int(classIds[i]) - 1].upper(), (box[0] + 5, box[1] + 20),
                            cv2.FONT_HERSHEY_COMPLEX, 0.6, (255, 0, 0), 1)
                # output = True
        cv2.imshow("Output", img)
        k = cv2.waitKey(10) & 0xFF
        if k == 27:    # ESC exits
            break
    cv2.destroyAllWindows()
    #outputstate = False

if __name__=='__main__':
    main()
#    out  = multiprocessing.Process(target= classidoutput, name= "output")
#    out.start()
#    out.join( )

If everything works, you should see the camera turn on and the objects you trained on being recognized.

Code and life:
The beginning is full of hope, but passion can never overcome the long years. Only those who stick to it can reach the distance of time.
