Several bounding box representations in object detection
1. xyxy type (x_min, y_min, x_max, y_max) - Pascal VOC bounding box format
stored as one .xml file per image
2. tlwh type (x_min, y_min, width, height) - COCO bounding box format
stored in a single .json file
3. xywh type (x_center, y_center, width, height) - YOLO bounding box format
stored as one .txt file per image, with values normalized by the image width and height
There is also an annotation format that uses multiple .json files, but I have forgotten its name.
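The three coordinate conventions describe the same rectangle, so moving between them is simple arithmetic. Here is a minimal sketch; the function names are my own for illustration and do not come from any library, and the YOLO conversion assumes you know the image width and height in pixels.

```python
def xyxy_to_tlwh(box):
    """(x_min, y_min, x_max, y_max) -> (x_min, y_min, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


def tlwh_to_xyxy(box):
    """(x_min, y_min, width, height) -> (x_min, y_min, x_max, y_max)."""
    x_min, y_min, w, h = box
    return (x_min, y_min, x_min + w, y_min + h)


def xyxy_to_yolo(box, img_w, img_h):
    """(x_min, y_min, x_max, y_max) in pixels -> normalized
    (x_center, y_center, width, height), each divided by the image size."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return (x_center, y_center, width, height)


def yolo_to_xyxy(box, img_w, img_h):
    """Normalized (x_center, y_center, width, height) -> pixel
    (x_min, y_min, x_max, y_max)."""
    x_center, y_center, width, height = box
    x_min = (x_center - width / 2.0) * img_w
    y_min = (y_center - height / 2.0) * img_h
    x_max = (x_center + width / 2.0) * img_w
    y_max = (y_center + height / 2.0) * img_h
    return (x_min, y_min, x_max, y_max)
```

For example, calling `xyxy_to_tlwh((10, 20, 50, 80))` turns a VOC-style box into a COCO-style one, and `xyxy_to_yolo((10, 20, 50, 80), 100, 200)` turns it into a YOLO-style one for a 100 x 200 image.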
When you train different networks on your own data, you need to convert the annotations into the format each network can read. For example: from VOC .xml to .json, from multiple .json files to a single COCO .json, or from VOC .xml to YOLO .txt.
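As an illustration of one such conversion, here is a rough sketch that turns the contents of a single VOC .xml file into YOLO .txt lines. It assumes the usual VOC layout (a `size` element with `width` and `height`, and one `object` element per box with `name` and `bndbox`). The function name and the `class_names` list are hypothetical: you have to supply your own class order, because YOLO labels store a class index rather than a name.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_yolo_lines(xml_text, class_names):
    """Convert one VOC annotation (as an XML string) into YOLO label lines.

    class_names is an assumed, user-supplied list; the index of a name in
    this list becomes the class id written to the .txt file.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        if name not in class_names:
            continue  # skip classes we are not training on
        class_id = class_names.index(name)

        bndbox = obj.find("bndbox")
        x_min = float(bndbox.find("xmin").text)
        y_min = float(bndbox.find("ymin").text)
        x_max = float(bndbox.find("xmax").text)
        y_max = float(bndbox.find("ymax").text)

        # xyxy in pixels -> normalized center x, center y, width, height
        x_center = (x_min + x_max) / 2.0 / img_w
        y_center = (y_min + y_max) / 2.0 / img_h
        width = (x_max - x_min) / img_w
        height = (y_max - y_min) / img_h

        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines
```

To convert a whole dataset you would read each .xml file, pass its text to this function, and write the returned lines to a .txt file with the same base name as the image.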