Explain the VOC dataset in detail

Companion video for this article: https://www.bilibili.com/video/BV1ZL4y1p7Cz/

Let's start with a classic dataset: the VOC dataset. VOC stands for Visual Object Classes, and its official site is http://host.robots.ox.ac.uk/pascal/VOC/.

Although most people now lean toward the COCO dataset, which we will introduce later, the VOC dataset is still very important, and you will often see it in papers.

You will find that many earlier datasets were released alongside competitions. The organizer gives contestants the training images together with their annotations, and also gives them the test images without any labels. The annotations for the test images stay in the organizer's hands. Contestants train their models on the labeled training set, use the trained models to predict labels for the unlabeled test set, and submit those predictions in the format the competition specifies. The organizer then compares each submission against the annotations it holds, computes a score for how accurate the predictions are, and ranks the contestants by that score.
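The workflow above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the image IDs, labels, and team names are made up, and the score is plain per-image accuracy, whereas the real VOC challenges use more involved metrics such as average precision.

```python
def accuracy(predicted, ground_truth):
    """Fraction of test images whose predicted label matches the held-out label."""
    correct = sum(
        1 for image_id, label in ground_truth.items()
        if predicted.get(image_id) == label
    )
    return correct / len(ground_truth)


def rank_submissions(submissions, ground_truth):
    """Return (team, score) pairs sorted from best to worst score."""
    scores = {team: accuracy(pred, ground_truth) for team, pred in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Test-set labels that only the organizer can see (made-up example data).
ground_truth = {"img1": "cat", "img2": "dog", "img3": "car", "img4": "person"}

# Predictions submitted by two hypothetical teams.
submissions = {
    "team_a": {"img1": "cat", "img2": "dog", "img3": "bus", "img4": "person"},
    "team_b": {"img1": "cat", "img2": "cat", "img3": "bus", "img4": "person"},
}

leaderboard = rank_submissions(submissions, ground_truth)
print(leaderboard)  # [('team_a', 0.75), ('team_b', 0.5)]
```

The key point is that contestants never see `ground_truth`; only the organizer can compute the scores.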

The VOC dataset also grew out of a competition. Because the competition was discontinued in 2012, the dataset's updates also stopped in 2012.

[Figure: the VOC competitions held each year from 2005 to 2012]

As you can see in the picture above, a competition was held every year from 2005 to 2012, and each one provided a dataset for that year. So the VOC dataset actually consists of 8 yearly datasets, from 2005 to 2012. With 8 datasets to choose from, which year should we use?

Let me give you the conclusion first, and then explain why these particular years are chosen.

The conclusion is that the 2007 and 2012 datasets are the ones most people use. Here is why.

In 2005, the organizer of the VOC competition provided the VOC 2005 dataset. At that time the dataset had only 4 object categories and only about 1,500 images, so VOC 2005 was relatively small in both image count and variety of objects. By 2006, the VOC 2006 dataset had 10 categories and about 2,600 images.

In 2007, however, the VOC 2007 dataset jumped to 20 categories, and the number of images grew to about 9,900.

The VOC 2007 dataset was a major turning point. In both data volume and number of object categories, it was large enough to meet the training needs of most models. That is why you will find that many models are trained on the VOC 2007 dataset.

In 2008, however, the VOC organizers shook things up again. They rebuilt the dataset from scratch, so the VOC 2008 dataset has 20 categories and around 4,000 images, noticeably fewer than VOC 2007. From 2009 to 2012, the 2008 dataset was expanded year after year, and by 2012 it had grown to about 11,000 images.

So you can imagine why the VOC 2012 dataset is so attractive to researchers.

The following figure illustrates how the VOC dataset changed from year to year.

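[Figure: changes in the VOC dataset from year to year, with colors marking the content of each year's dataset]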

In the figure above, each color represents a distinct set of data. For example, 2005 and 2006 have different colors, which means the 2005 and 2006 datasets contain completely different content. You can see that 2005 to 2007 forms one stage, in which each year's dataset is different from the others. As explained earlier, the 2007 dataset is ahead in both the number of images and the number of categories, so within the 2005 to 2007 stage, VOC 2007 is the clear winner.

2008 to 2012 is a new stage. In 2008, VOC built a brand-new dataset and then kept expanding it every year. By 2011 and 2012, the dataset had reached its peak in both size and variety. Note that the 2011 and 2012 datasets have the same number of images and categories, but the 2012 release refines and improves the annotations of the 2011 one, so people generally prefer the 2012 dataset.
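The reasoning across the two stages can be summarized with a small Python sketch. It uses only the approximate figures quoted in this article, so years whose numbers are not given here are simply left out, and the names `voc_stats`, `stages`, and `best_year` are illustrative rather than part of any VOC tooling.

```python
# Approximate figures quoted in this article: year -> (categories, images).
# Years whose numbers are not stated in the article are omitted.
voc_stats = {
    2005: (4, 1500),
    2006: (10, 2600),
    2007: (20, 9900),
    2008: (20, 4000),
    2012: (20, 11000),
}

# The two stages of the dataset's history described above.
stages = {
    "2005-2007": range(2005, 2008),
    "2008-2012": range(2008, 2013),
}


def best_year(years):
    """Pick the year with the most categories, breaking ties by image count."""
    candidates = [year for year in years if year in voc_stats]
    return max(candidates, key=lambda year: voc_stats[year])


preferred = {name: best_year(years) for name, years in stages.items()}
print(preferred)  # {'2005-2007': 2007, '2008-2012': 2012}
```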

So, to sum up, you should now understand why people prefer the 2007 and 2012 datasets.

From Xiaotudui's tutorials: this article accompanies the video tutorial series on getting started with object detection.

Origin blog.csdn.net/xiaotudui/article/details/122163725