Foreword
My project needs a head detection data set. Most of the heads in the open image data set from the previous article are not photographed from overhead, which does not suit my application scenario. By chance, I found the brainwash data set through the GitHub repository aditya-vora/FCHD-Fully-Convolutional-Head-Detector. Its annotation file format is very different from the yolo annotation format, so this article shows how to convert the former to the latter.
Preparation
- python3
- cmd
- Brainwash data set network disk link: https://pan.baidu.com/s/1Vgr6jZByU41TPd2tkiPMwA extraction code: rnk3
brainwash
The brainwash data set is a dense head detection data set, obtained by labeling the heads of people appearing in a cafe. It has three parts: a training set of 10769 images with 81975 heads, a validation set of 500 images with 3318 heads, and a test set of 500 images with 5007 heads. This article only discusses the training set.
Its annotations are stored in several idl files; the training set, for example, is in brainwash_train.idl. Each picture takes one line in the form "picture path": annotation. Each box is enclosed in parentheses, the coordinates inside a box are separated by commas, the boxes are separated by commas, and the line ends with a semicolon.
"brainwash_11_13_2014_images/04231500_640x480.png": (316.0, 132.0, 332.0, 150.0), (201.0, 163.0, 221.0, 185.0), (136.0, 167.0, 156.0, 186.0), (349.0, 144.0, 372.0, 166.0), (606.0, 249.0, 639.0, 290.0);
"brainwash_11_13_2014_images/04232000_640x480.png": (485.0, 233.0, 527.0, 273.0), (483.0, 198.0, 513.0, 230.0), (291.0, 199.0, 326.0, 239.0), (208.0, 168.0, 242.0, 202.0), (137.0, 168.0, 160.0, 189.0), (197.0, 165.0, 215.0, 186.0), (317.0, 131.0, 332.0, 149.0);
"brainwash_11_24_2014_images/00000500_640x480.jpg": (385.0, 132.0, 399.0, 143.0), (152.0, 162.0, 168.0, 181.0), (120.0, 171.0, 140.0, 196.0), (468.0, 171.0, 490.0, 190.0);
"brainwash_11_24_2014_images/00001000_640x480.jpg": (160.0, 156.0, 176.0, 175.0), (121.0, 175.0, 140.0, 196.0), (357.0, 159.0, 379.0, 184.0);
The next step is to read each line in this format and convert it to the yolo format, one line per box:
<category> <normalized center x> <normalized center y> <normalized box width> <normalized box height>
<category> <normalized center x> <normalized center y> <normalized box width> <normalized box height>
........
<category> <normalized center x> <normalized center y> <normalized box width> <normalized box height>
Then write these lines to a txt file named after the picture.
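The conversion for a single box can be sketched as a small function. This is only a minimal illustration: the name to_yolo_line, the default picture size of 640x480 and the six-decimal output are my assumptions, not part of the script in this article.

```python
def to_yolo_line(xmin, ymin, xmax, ymax, img_w=640, img_h=480, class_id=0):
    """Convert one [xmin, ymin, xmax, ymax] box to one yolo label line."""
    x_center = (xmin + xmax) / 2 / img_w  # normalized center x
    y_center = (ymin + ymax) / 2 / img_h  # normalized center y
    width = (xmax - xmin) / img_w         # normalized box width
    height = (ymax - ymin) / img_h        # normalized box height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# First box of the first sample line above.
print(to_yolo_line(316.0, 132.0, 332.0, 150.0))
# prints: 0 0.506250 0.293750 0.025000 0.037500
```

All four values are fractions of the picture size, so they should always fall between 0 and 1.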
Format conversion
Split:
Some parts of each line are not needed, some should become the file name, and some should become the file content, so the line is split first and each part is handled separately. I first divide each line into two parts: the image path and the image annotation. The two parts are separated by ":" and the line ends with ";", so replacing ":" with ";" and then splitting on ";" removes both extra symbols in one go.
import os
idl_file_dir = "brainwash_train.idl"  # relative path to the annotation file
txt_files_dir = "txt_files"
if not os.path.exists(txt_files_dir):
    os.mkdir(txt_files_dir)  # folder for the generated txt files
with open(idl_file_dir, 'r') as f1:  # read-only access is enough
    lines = f1.readlines()
#print(range(len(lines)))
for i in range(len(lines)):
    line = lines[i]
    line = line.replace(":", ";")  # replace ":" with ";"
    #print(line)
After this replacement, each line splits into two parts: the path is in line.split(";")[0] and the annotation is in line.split(";")[1]. The path is processed first. It is wrapped in double quotes, which are redundant, so they are deleted first. Splitting on "/" then gives the picture name, and splitting that on "." gives the name without the extension and the extension itself. The name without the extension is also used as the name of the label file generated later.
    img_dir = line.split(";")[0]
    #print(img_dir)
    img_boxs = line.split(";")[1]
    img_dir = img_dir.replace('"', "")  # delete the double quotes
    #print(img_dir)
    img_name = img_dir.split("/")[1]
    txt_name = img_name.split(".")[0]  # file name without the extension
    img_extension = img_name.split(".")[1]  # extension
    #print(txt_name)
    #print(img_extension)
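The same path handling can also be sketched with the standard os.path helpers, which do not depend on the path having exactly one "/" and one ".". The function name split_image_path is hypothetical, and this is an alternative illustration rather than the code used in this article.

```python
import os


def split_image_path(quoted_path):
    """Return (name without extension, extension) for a quoted idl path."""
    path = quoted_path.strip().strip('"')   # drop whitespace and the double quotes
    img_name = os.path.basename(path)       # e.g. 00000500_640x480.jpg
    stem, ext = os.path.splitext(img_name)  # ext keeps its leading dot
    return stem, ext.lstrip(".").lower()


print(split_image_path('"brainwash_11_24_2014_images/00000500_640x480.jpg"'))
# prints: ('00000500_640x480', 'jpg')
```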
Now both the file name and the extension are available: the former becomes the name of the label file and the latter is used as a filter condition. Next, the annotation part is split. First delete every "," (each "," in the original file is already followed by a space, so replacing "," with a space would leave extra spaces after the final split). Then delete every "(", and finally split on ")" to separate the boxes. The annotation part looks like this:
(316.0, 132.0, 332.0, 150.0), (201.0, 163.0, 221.0, 185.0), (136.0, 167.0, 156.0, 186.0), (349.0, 144.0, 372.0, 166.0), (606.0, 249.0, 639.0, 290.0);
(485.0, 233.0, 527.0, 273.0), (483.0, 198.0, 513.0, 230.0), (291.0, 199.0, 326.0, 239.0), (208.0, 168.0, 242.0, 202.0), (137.0, 168.0, 160.0, 189.0), (197.0, 165.0, 215.0, 186.0), (317.0, 131.0, 332.0, 149.0);
(385.0, 132.0, 399.0, 143.0), (152.0, 162.0, 168.0, 181.0), (120.0, 171.0, 140.0, 196.0), (468.0, 171.0, 490.0, 190.0);
(160.0, 156.0, 176.0, 175.0), (121.0, 175.0, 140.0, 196.0), (357.0, 159.0, 379.0, 184.0);
    img_boxs = img_boxs.replace(",", "")  # delete ","
    #print(img_boxs)
    img_boxs = img_boxs.replace("(", "")  # delete "("
    img_boxs = img_boxs.split(")")  # split on ")" to separate the boxes
    #print(img_boxs)
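As an alternative to the replace-and-split steps above, the boxes can also be pulled out with a regular expression. This is only a sketch, not the method used in this article; parse_boxes and BOX_PATTERN are hypothetical names.

```python
import re

# Everything between one "(" and the next ")" is treated as one box.
BOX_PATTERN = re.compile(r"\(([^()]*)\)")


def parse_boxes(annotation):
    """Return a list of (xmin, ymin, xmax, ymax) tuples from an annotation string."""
    boxes = []
    for group in BOX_PATTERN.findall(annotation):
        if not group.strip():
            continue  # ignore empty parentheses
        coords = [float(value) for value in group.split(",")]
        if len(coords) == 4:  # keep only well-formed boxes
            boxes.append(tuple(coords))
    return boxes


sample = "(385.0, 132.0, 399.0, 143.0), (152.0, 162.0, 168.0, 181.0);"
print(parse_boxes(sample))
# prints: [(385.0, 132.0, 399.0, 143.0), (152.0, 162.0, 168.0, 181.0)]
```

With this approach there is no trailing leftover item to skip, and a line without any boxes simply yields an empty list.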
At this point the data is basically split. Each picture has one img_boxs list, and each element of the list is one box. Note that the last element of img_boxs is not a box: it is whatever remains after the final ")", usually an empty string. The loop therefore only runs to len(img_boxs)-1 to skip it; otherwise accessing box[m] with m > 0 on that element raises a list index out of range error.
    if img_extension == 'jpg':  # only keep jpg pictures
        for n in range(len(img_boxs) - 1):  # skip the trailing leftover item
            box = img_boxs[n]
            box = box.split(" ")
            #print(box)
            #print(box[4])
The last step converts brainwash's [xmin, ymin, xmax, ymax] format to yolo's <category> <normalized center x> <normalized center y> <normalized box width> <normalized box height>. Because each box string starts with a space, box[0] is empty and the four coordinates are in box[1] to box[4]. Compute the normalized center coordinates, width and height (the code assumes 640x480 pictures, as the file names indicate, so x values are divided by 640 and y values by 480), then append each box to the txt file named after the picture, and the conversion is complete. See the appendix for the complete code.
            with open(txt_files_dir + "/" + txt_name + ".txt", 'a') as f:  # append mode: one line per box
                f.write(' '.join(['0', str((float(box[1]) + float(box[3]))/(2*640)), str((float(box[2]) + float(box[4]))/(2*480)), str((float(box[3]) - float(box[1]))/640), str((float(box[4]) - float(box[2]))/480)]) + '\n')
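The write call above hard-codes 640 and 480 as the picture size. Since the file names in this data set end in _640x480, one possible alternative is to read the size from the name. The helper size_from_name and its fallback default are hypothetical, and the sketch assumes the suffix really matches the picture's dimensions, which this article does not verify.

```python
import re

# Matches a trailing "_<width>x<height>" in a name without extension.
SIZE_PATTERN = re.compile(r"_(\d+)x(\d+)$")


def size_from_name(txt_name, default=(640, 480)):
    """Return (width, height) parsed from the name, or the default if absent."""
    match = SIZE_PATTERN.search(txt_name)
    if match is None:
        return default
    return int(match.group(1)), int(match.group(2))


print(size_from_name("00000500_640x480"))
# prints: (640, 480)
```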
yolo_mark verification
yolo_mark expects the labels and pictures to be in the same folder, but here they are in different folders, and labels were only generated for the filtered pictures. The script I wrote in another article can easily merge the contents of the two folders based on the labels. Then open yolo_mark to verify; for how to use yolo_mark, see my other article. I checked two pictures at random and the labels were correct, which indicates that the conversion succeeded.
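Besides the visual check in yolo_mark, a quick numeric check is to convert a yolo line back to pixel corners and compare it with the original idl box. This is a minimal sketch under my own assumptions: yolo_to_corners is a hypothetical helper and the picture size defaults to 640x480.

```python
def yolo_to_corners(line, img_w=640, img_h=480):
    """Convert one yolo label line back to (xmin, ymin, xmax, ymax) in pixels."""
    _, x_center, y_center, width, height = line.split()
    x_center = float(x_center) * img_w
    y_center = float(y_center) * img_h
    width = float(width) * img_w
    height = float(height) * img_h
    corners = (x_center - width / 2, y_center - height / 2,
               x_center + width / 2, y_center + height / 2)
    # Round away tiny floating point errors from the round trip.
    return tuple(round(value, 2) for value in corners)


# Label line for the idl box (316.0, 132.0, 332.0, 150.0).
print(yolo_to_corners("0 0.50625 0.29375 0.025 0.0375"))
```

If the printed corners match the box in the idl file, the arithmetic of the conversion is consistent for that box.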
Summary
Converting between annotation formats is nothing more than splitting the data according to certain rules, then computing and rearranging it into the other format. If you happened to read this article, I am honoured. If you have any questions, please leave a message below and I will reply when I see it.
References
https://blog.csdn.net/sinat_35907936/article/details/88911770
https://blog.csdn.net/sinat_35907936/article/details/89605978
https://blog.csdn.net/sinat_35907936/article/details/89086081
http://arxiv.org/abs/1506.04878
Appendix
import os
idl_file_dir = "brainwash_train.idl"  # relative path to the annotation file
txt_files_dir = "txt_files"
if not os.path.exists(txt_files_dir):
    os.mkdir(txt_files_dir)  # folder for the generated txt files
with open(idl_file_dir, 'r') as f1:  # read-only access is enough
    lines = f1.readlines()
#print(range(len(lines)))
for i in range(len(lines)):
    line = lines[i]
    line = line.replace(":", ";")  # replace ":" with ";"
    #print(line)
    img_dir = line.split(";")[0]
    #print(img_dir)
    img_boxs = line.split(";")[1]
    img_dir = img_dir.replace('"', "")  # delete the double quotes
    #print(img_dir)
    img_name = img_dir.split("/")[1]
    txt_name = img_name.split(".")[0]  # file name without the extension
    img_extension = img_name.split(".")[1]  # extension
    #print(txt_name)
    #print(img_extension)
    img_boxs = img_boxs.replace(",", "")  # delete ","
    #print(img_boxs)
    img_boxs = img_boxs.replace("(", "")  # delete "("
    img_boxs = img_boxs.split(")")  # split on ")" to separate the boxes
    #print(img_boxs)
    if img_extension == 'jpg':  # only keep jpg pictures
        for n in range(len(img_boxs) - 1):  # skip the trailing leftover item
            box = img_boxs[n]
            box = box.split(" ")
            #print(box)
            #print(box[4])
            with open(txt_files_dir + "/" + txt_name + ".txt", 'a') as f:  # append mode: one line per box
                f.write(' '.join(['0', str((float(box[1]) + float(box[3]))/(2*640)), str((float(box[2]) + float(box[4]))/(2*480)), str((float(box[3]) - float(box[1]))/640), str((float(box[4]) - float(box[2]))/480)]) + '\n')