Sometimes a dataset arrives with XML annotation files in the following format:
<?xml version="1.0" ?>
<doc>
<path>C:\Users\Administrator\Desktop\test\000000000074.jpg</path>
<outputs>
<object>
<item>
<name>dog</name>
<bndbox>
<xmin>64</xmin>
<ymin>271</ymin>
<xmax>361</xmax>
<ymax>385</ymax>
</bndbox>
</item>
</object>
</outputs>
<time_labeled>1692452783787</time_labeled>
<labeled>true</labeled>
<size>
<width>640</width>
<height>426</height>
<depth>3</depth>
</size>
</doc>
Or the annotations may be in the following JSON format:
{"path":"C:\\Users\\Administrator\\Desktop\\test\\000000000042.jpg","outputs":{"object":[{"name":"dog","bndbox":{"xmin":228,"ymin":32,"xmax":576,"ymax":286}}]},"time_labeled":1692452770011,"labeled":true,"size":{"width":640,"height":478,"depth":3}}
This XML is clearly not VOC format. A real VOC annotation looks like the following:
<annotation>
<folder>VOC</folder>
<filename>000000000074.jpg</filename>
<path>C:\Users\Administrator\Desktop\test\000000000074.jpg</path>
<source>
<database>FIRC</database>
</source>
<size>
<width>640</width>
<height>426</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>dog</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>64</xmin>
<ymin>271</ymin>
<xmax>361</xmax>
<ymax>385</ymax>
</bndbox>
</object>
</annotation>
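To make the mapping between the two layouts concrete, here is a rough Python sketch that builds a VOC-style `<annotation>` element from already-parsed values. It is not the tool's implementation. The input dictionary layout and the function name `to_voc_xml` are hypothetical, and the `pose`, `truncated`, `difficult`, and `segmented` values are simply copied from the example above because the annotation wizard format does not record them.

```python
import ntpath
import xml.etree.ElementTree as ET

def to_voc_xml(ann, folder="VOC", database="Unknown"):
    """Build a VOC-style XML string from a dict of path, size, and boxes.

    `folder` and `database` are placeholders chosen by the caller; the
    source annotation does not contain them.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = folder
    # ntpath handles Windows-style paths regardless of the host OS.
    ET.SubElement(root, "filename").text = ntpath.basename(ann["path"])
    ET.SubElement(root, "path").text = ann["path"]
    source = ET.SubElement(root, "source")
    ET.SubElement(source, "database").text = database
    size = ET.SubElement(root, "size")
    for key in ("width", "height", "depth"):
        ET.SubElement(size, key).text = str(ann[key])
    ET.SubElement(root, "segmented").text = "0"
    for name, xmin, ymin, xmax, ymax in ann["boxes"]:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        # Assumed defaults, since the source format has no such fields.
        ET.SubElement(obj, "pose").text = "Unspecified"
        ET.SubElement(obj, "truncated").text = "0"
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for key, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Values taken from the XML sample at the top of this article.
example = {
    "path": r"C:\Users\Administrator\Desktop\test\000000000074.jpg",
    "width": 640, "height": 426, "depth": 3,
    "boxes": [("dog", 64, 271, 361, 385)],
}
print(to_voc_xml(example))
```

The main structural difference is that VOC places each `<object>` directly under `<annotation>`, while the annotation wizard nests `<item>` elements under `<outputs>/<object>`.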
VOC and YOLO are among the most commonly used dataset formats for object detection. The annotation wizard format cannot be used directly for training, so it must first be converted to VOC or YOLO format. A small tool has been developed to handle this conversion. First, open the software.
Drag and drop the annotated XML or JSON files into the list, then click Start to convert them automatically. For details, refer to the video tutorial.
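Separately from the tool, the YOLO side of the conversion can be sketched in a few lines of Python. This assumes the common YOLO label convention of one line per box, `class_id x_center y_center width height`, with all four values normalized by the image size. The function name `to_yolo_lines`, the input dictionary, and the `class_names` list are hypothetical, and the tool itself may work differently.

```python
def to_yolo_lines(ann, class_names):
    """Convert pixel boxes to normalized YOLO label lines.

    `class_names` defines the class ids: the id is the index of the
    box name in this list.
    """
    lines = []
    for name, xmin, ymin, xmax, ymax in ann["boxes"]:
        class_id = class_names.index(name)
        # Box center and size as fractions of the image width and height.
        x_center = (xmin + xmax) / 2 / ann["width"]
        y_center = (ymin + ymax) / 2 / ann["height"]
        box_w = (xmax - xmin) / ann["width"]
        box_h = (ymax - ymin) / ann["height"]
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"
        )
    return lines

# Values taken from the XML sample at the top of this article.
example_box = {
    "width": 640, "height": 426,
    "boxes": [("dog", 64, 271, 361, 385)],
}
for line in to_yolo_lines(example_box, class_names=["dog"]):
    print(line)
```

Each image gets one text file of such lines, usually named after the image, and the order of `class_names` must stay the same across the whole dataset.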