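Introduction

According to the official website, the WIDER FACE dataset was first released in 2015 (v1.0). Its images are drawn from the WIDER dataset: 32,203 images were selected and annotated, for a total of 393,703 labeled faces. Each face also carries more detailed attributes, including blur, expression, illumination, occlusion, and pose, which are described further below. The dataset is organized into 61 event classes by scene type, and each class is split into training, validation, and test sets at a ratio of 40% / 10% / 50%.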

0--Parade             29--Students_Schoolkids  48--Parachutist_Paratrooper
10--People_Marching   2--Demonstration         49--Greeting
11--Meeting           30--Surgeons             4--Dancing
12--Group             31--Waiter_Waitress      50--Celebration_Or_Party
13--Interview         32--Worker_Laborer       51--Dresses
14--Traffic           33--Running              52--Photographers
15--Stock_Market      34--Baseball             53--Raid
16--Award_Ceremony    35--Basketball           54--Rescue
17--Ceremony          36--Football             55--Sports_Coach_Trainer
18--Concerts          37--Soccer               56--Voter
19--Couple            38--Tennis               57--Angler
1--Handshaking        39--Ice_Skating          58--Hockey
20--Family_Group      3--Riot                  59--people--driving--car
21--Festival          40--Gymnastics           5--Car_Accident
22--Picnic            41--Swimming             61--Street_Battle
23--Shoppers          42--Car_Racing           6--Funeral
24--Soldier_Firing    43--Row_Boat             7--Cheering
25--Soldier_Patrol    44--Aerobics             8--Election_Campain
26--Soldier_Drilling  45--Balloonist           9--Press_Conference
27--Spa               46--Jockey
28--Sports_Fan        47--Matador_Bullfighter
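
The listing above is in lexicographic order, so 10--People_Marching comes before 2--Demonstration. As a small sketch (the helper name parse_event_dir is hypothetical, not part of the dataset tooling), the directory names can be split into a numeric id and an event name and then sorted numerically:

```python
def parse_event_dir(name: str) -> tuple[int, str]:
    """Split a directory name such as '0--Parade' into (0, 'Parade')."""
    # Split on the first '--' only, so that names like
    # '59--people--driving--car' keep the rest of the name intact.
    event_id, _, event_name = name.partition("--")
    return int(event_id), event_name


dirs = ["10--People_Marching", "2--Demonstration", "0--Parade",
        "59--people--driving--car", "61--Street_Battle"]

# Sorting the (id, name) tuples orders the classes by numeric id.
for event_id, event_name in sorted(parse_event_dir(d) for d in dirs):
    print(event_id, event_name)
```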

Dataset structure

Next, download the dataset. Only the training images (Training Images), validation images (Validation Images), and annotation files (Face annotations) are needed here.

After downloading, extract the archives and arrange the files as follows:

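├── wider_face:  dataset root directory
      ├── WIDER_train: directory extracted from the training set archive
      │          └──  images: 
      │                   ├──  0--Parade:         all images of this class
      │                   ├──  ........
      │                   └──  61--Street_Battle: all images of this class
      │
      ├── WIDER_val: directory extracted from the validation set archive
      │          └──  images: 
      │                   ├──  0--Parade:         all images of this class
      │                   ├──  ........
      │                   └──  61--Street_Battle: all images of this class
      │
      └── wider_face_split: directory extracted from the annotation archive
                  ├──  wider_face_train.mat:         training set annotations, MATLAB format
                  ├──  wider_face_train_bbx_gt.txt:  training set annotations, txt format
                  ├──  wider_face_val.mat:           validation set annotations, MATLAB format
                  ├──  wider_face_val_bbx_gt.txt:    validation set annotations, txt format
                  ├──  wider_face_test.mat:          test set file list, MATLAB format (no bounding boxes)
                  ├──  wider_face_test_filelist.txt: test set file list, txt format (no bounding boxes)
                  └──  readme.txt:                   description of the annotation files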
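
Before going further, it can help to confirm that the files ended up where expected. The following is a minimal sketch, assuming the layout shown above; the helper name missing_entries and the EXPECTED list are hypothetical, and the root path is only an example:

```python
import os

# Paths (relative to the dataset root) that the rest of this article relies on.
EXPECTED = [
    os.path.join("WIDER_train", "images"),
    os.path.join("WIDER_val", "images"),
    os.path.join("wider_face_split", "wider_face_train_bbx_gt.txt"),
    os.path.join("wider_face_split", "wider_face_val_bbx_gt.txt"),
]


def missing_entries(data_root: str) -> list[str]:
    """Return the expected relative paths that do not exist under data_root."""
    return [p for p in EXPECTED
            if not os.path.exists(os.path.join(data_root, p))]


# Example root; replace with the actual location of wider_face.
for path in missing_entries("/data/wider_face/"):
    print("missing:", path)
```

An empty result means the four entries used later were found.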

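Annotation file parsing

The annotations come in two versions, .mat and .txt; either one can be used. Here we walk through the txt format.
First, look at the notes in readme.txt: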

Attached the mappings between attribute names and label values.

blur:
  clear->0
  normal blur->1
  heavy blur->2

expression:
  typical expression->0
  exaggerate expression->1

illumination:
  normal illumination->0
  extreme illumination->1

occlusion:
  no occlusion->0
  partial occlusion->1
  heavy occlusion->2

pose:
  typical pose->0
  atypical pose->1

invalid:
  false->0(valid image)
  true->1(invalid image)

The format of txt ground truth.
File name
Number of bounding box
x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pose

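The readme gives a detailed description of the label format:

The first line, File name, is the path of the image.
The second line, Number of bounding box, is the number of annotated faces in that image.
The following Number of bounding box lines each describe one face: x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pose

Let us look more closely at the per-face fields x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pose: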

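x1, y1, w, h are the x and y coordinates of the top-left corner of the face bounding box, followed by its width and height. Note that these are absolute pixel coordinates.
blur is the degree of blur: 0 means clear, 1 means slightly blurred, 2 means heavily blurred.
expression is the facial expression: 0 means a typical expression, 1 means an exaggerated expression.
illumination is the lighting condition: 0 means normal illumination, 1 means extreme illumination.
invalid is somewhat puzzling. After plotting a number of faces marked invalid, I found that they are mostly very small faces that are hard to make out (easy to miss unless you look closely). Personally, I think faces with invalid set to 1 can be ignored when using the dataset.
occlusion is the degree of occlusion: 0 means no occlusion, 1 means partial occlusion (1%-30%), 2 means heavy occlusion (over 30%).
pose is the face pose: 0 means typical pose, 1 means atypical pose. The paper explains: "Face is annotated as atypical under two conditions: either the roll or pitch degree is larger than 30-degree; or the yaw is larger than 90-degree." If this is hard to picture, see the Atypical pose label in the figure below.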

[Figure: annotated example showing Atypical pose]
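
The field order and label mappings described above can be sketched as a small record type. This is only an illustration: the names FaceAnnotation, parse_face_line, BLUR and OCCLUSION are hypothetical, while the mappings themselves are taken from readme.txt.

```python
from dataclasses import dataclass

# Label-to-name mappings copied from readme.txt.
BLUR = {0: "clear", 1: "normal blur", 2: "heavy blur"}
OCCLUSION = {0: "no occlusion", 1: "partial occlusion", 2: "heavy occlusion"}


@dataclass
class FaceAnnotation:
    # Field order matches the txt format:
    # x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pose
    x1: int
    y1: int
    w: int
    h: int
    blur: int
    expression: int
    illumination: int
    invalid: int
    occlusion: int
    pose: int


def parse_face_line(line: str) -> FaceAnnotation:
    """Convert one space-separated annotation row into a FaceAnnotation."""
    values = [int(v) for v in line.split()]
    if len(values) != 10:
        raise ValueError(f"expected 10 fields, got {len(values)}: {line!r}")
    return FaceAnnotation(*values)


face = parse_face_line("304 265 16 17 2 0 0 0 2 1 ")
print(face)
print(BLUR[face.blur], "/", OCCLUSION[face.occlusion])
```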

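To make this more concrete, open wider_face_train_bbx_gt.txt. The first line, 0--Parade/0_Parade_marchingband_1_849.jpg, is the image path. The 1 on the second line means the image contains one face. The third line, 449 330 122 149 0 0 0 0 0 0, holds the details of that face. The fourth line starts the next image, and so on.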

0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0 
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0 
0--Parade/0_Parade_marchingband_1_799.jpg
21
78 221 7 8 2 0 0 0 0 0 
78 238 14 17 2 0 0 0 0 0 
113 212 11 15 2 0 0 0 0 0 
134 260 15 15 2 0 0 0 0 0 
163 250 14 17 2 0 0 0 0 0 
201 218 10 12 2 0 0 0 0 0 
182 266 15 17 2 0 0 0 0 0 
245 279 18 15 2 0 0 0 0 0 
304 265 16 17 2 0 0 0 2 1 
328 295 16 20 2 0 0 0 0 0 
389 281 17 19 2 0 0 0 2 0 
406 293 21 21 2 0 1 0 0 0 
436 290 22 17 2 0 0 0 0 0 
522 328 21 18 2 0 1 0 0 0 
643 320 23 22 2 0 0 0 0 0 
653 224 17 25 2 0 0 0 0 0 
793 337 23 30 2 0 0 0 0 0 
535 311 16 17 2 0 0 0 1 0 
29 220 11 15 2 0 0 0 0 0 
3 232 11 15 2 0 0 0 2 0 
20 215 12 16 2 0 0 0 2 0

From my own analysis, the training set contains 12,880 images, 4 of which have no face annotations.
Training images without any face:

0--Parade/0_Parade_Parade_0_452.jpg
2--Demonstration/2_Demonstration_Political_Rally_2_444.jpg
39--Ice_Skating/39_Ice_Skating_iceskiing_39_380.jpg
46--Jockey/46_Jockey_Jockey_46_576.jpg

The validation set contains 3,226 images, 4 of which have no face annotations.
Validation images without any face:

0--Parade/0_Parade_Parade_0_275.jpg
7--Cheering/7_Cheering_Cheering_7_426.jpg
37--Soccer/37_Soccer_soccer_ball_37_281.jpg
50--Celebration_Or_Party/50_Celebration_Or_Party_houseparty_50_715.jpg
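
A count like the ones above could be reproduced with a short scan over the txt file. The sketch below is not the analysis script used for this article; the function name images_without_faces is hypothetical, and it assumes that an image with 0 faces is still followed by one placeholder row of zeros. The sample text is hand-built for illustration.

```python
def images_without_faces(lines):
    """Return the paths of images whose face count is 0."""
    # Drop blank lines and surrounding whitespace first.
    lines = [ln.strip() for ln in lines if ln.strip()]
    empty, i = [], 0
    while i < len(lines):
        path, count = lines[i], int(lines[i + 1])
        if count == 0:
            empty.append(path)
        # Assumption: a 0-face image is still followed by one placeholder row,
        # so at least one row is skipped after the count line.
        i += 2 + max(count, 1)
    return empty


# Hand-built sample in the same layout as wider_face_train_bbx_gt.txt.
sample = """0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_452.jpg
0
0 0 0 0 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0
""".splitlines()

print(images_without_faces(sample))
```

For a real file, pass the result of open(txt_path).readlines() instead of the sample.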

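Parsing the label file with Python

Below is a function I wrote with reference to how torchvision.datasets parses the .txt file. When calling parse_wider_txt, pass data_root pointing to the wider_face directory, and split to choose which label file to parse (train or val).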

import os

from tqdm import tqdm


def parse_wider_txt(data_root: str, split: str):
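    """
    Parse a WIDER FACE txt annotation file.
    refer to: torchvision/datasets/widerface.py
    :param data_root: path to the wider_face root directory
    :param split: 'train' or 'val'
    :return: None (the path and labels of each image are printed)
    """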
    assert split in ['train', 'val'], f"split must be in ['train', 'val'], got {split}"

    txt_path = os.path.join(data_root, 'wider_face_split', f'wider_face_{split}_bbx_gt.txt')
    img_root = os.path.join(data_root, f'WIDER_{split}', 'images')
    with open(txt_path, "r") as f:
        lines = f.readlines()
        file_name_line, num_boxes_line, box_annotation_line = True, False, False
        num_boxes, box_counter, idx = 0, 0, 0
        labels = []
        progress_bar = tqdm(lines)
        for line in progress_bar:
            line = line.rstrip()
            if file_name_line:
                img_path = line
                file_name_line = False
                num_boxes_line = True
            elif num_boxes_line:
                num_boxes = int(line)
                num_boxes_line = False
                box_annotation_line = True
            elif box_annotation_line:
                box_counter += 1
                # Convert the 10 space-separated fields to integers
                line_values = [int(x) for x in line.split()]
                labels.append(line_values)
                if box_counter >= num_boxes:
                    box_annotation_line = False
                    file_name_line = True

                    if num_boxes == 0:
                        print(f"in {img_path}, no object, skip.")
                    else:
                        # Add your own processing here. Copy `labels` if you
                        # need to keep it: the list is cleared below.
                        print(img_path)
                        print(labels)

                    box_counter = 0
                    labels.clear()
                    idx += 1
                    progress_bar.set_description(f"{idx} images")
            else:
                raise RuntimeError("Error parsing annotation file {}".format(txt_path))


parse_wider_txt("/data/wider_face/",
                "val")

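As one possible way to fill in the processing step, the sketch below drops faces marked invalid (as suggested earlier) as well as boxes with zero width or height, and converts the rest to corner coordinates. The function name keep_valid_faces and the min_size parameter are hypothetical, and computing xmax as x1 + w is an assumption about the box convention.

```python
def keep_valid_faces(labels, min_size=1):
    """Filter one image's annotation rows and return [xmin, ymin, xmax, ymax] boxes."""
    kept = []
    for row in labels:
        # Accept rows of ints or of strings, as read from the txt file.
        x1, y1, w, h, _blur, _expr, _illum, invalid, _occl, _pose = (int(v) for v in row)
        if invalid == 1 or w < min_size or h < min_size:
            continue  # skip invalid or degenerate boxes
        # Assumption: xmax = x1 + w and ymax = y1 + h.
        kept.append([x1, y1, x1 + w, y1 + h])
    return kept


rows = [
    [449, 330, 122, 149, 0, 0, 0, 0, 0, 0],              # kept
    ["3", "232", "0", "15", "2", "0", "0", "0", "2", "0"],  # zero width, dropped
    [1, 1, 5, 5, 0, 0, 0, 1, 0, 0],                      # invalid, dropped
]
print(keep_valid_faces(rows))
```

Whether to drop invalid faces at all depends on the training setup; this only shows where such a filter would go.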
If you want to convert the annotations to XML format, see the script in my GitHub repository:
https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/others_project/trans_widerface_to_xml

A sample of the converted XML file:

<?xml version="1.0" encoding="utf-8"?>
<annotation>
	<folder>WIDERFACE2017</folder>
	<filename>0--Parade/0_Parade_marchingband_1_45.jpg</filename>
	<source>
		<database>The WIDERFACE2017 Database</database>
		<annotation>WIDERFACE 2017</annotation>
		<image>WIDERFACE</image>
	</source>
	<size>
		<height>681</height>
		<width>1024</width>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<bndbox>
			<xmin>676</xmin>
			<ymin>599</ymin>
			<xmax>717</xmax>
			<ymax>649</ymax>
		</bndbox>
		<name>face</name>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<blur>1</blur>
		<expression>0</expression>
		<illumination>1</illumination>
		<invalid>0</invalid>
		<occlusion>0</occlusion>
		<pose>0</pose>
	</object>
	<object>
		<bndbox>
			<xmin>21</xmin>
			<ymin>544</ymin>
			<xmax>40</xmax>
			<ymax>563</ymax>
		</bndbox>
		<name>face</name>
		<truncated>0</truncated>
		<difficult>1</difficult>
		<blur>2</blur>
		<expression>0</expression>
		<illumination>1</illumination>
		<invalid>0</invalid>
		<occlusion>0</occlusion>
		<pose>0</pose>
	</object>
</annotation>
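
To give a rough idea of how one annotation row maps onto an object element like those above, here is a minimal sketch using the standard library's xml.etree.ElementTree. It is not the conversion script from the repository: the function name face_to_xml_object is hypothetical, xmax and ymax are assumed to be x1 + w and y1 + h, and the truncated and difficult tags are left out because the rule used to set them is not described here.

```python
import xml.etree.ElementTree as ET

# Attribute columns that follow x1, y1, w, h in the txt format.
ATTRS = ["blur", "expression", "illumination", "invalid", "occlusion", "pose"]


def face_to_xml_object(values):
    """Build an <object> element from one 10-field annotation row."""
    x1, y1, w, h = values[:4]
    obj = ET.Element("object")
    bndbox = ET.SubElement(obj, "bndbox")
    # Assumption: corner coordinates are derived as x1 + w and y1 + h.
    corners = [x1, y1, x1 + w, y1 + h]
    for tag, value in zip(["xmin", "ymin", "xmax", "ymax"], corners):
        ET.SubElement(bndbox, tag).text = str(value)
    ET.SubElement(obj, "name").text = "face"
    for tag, value in zip(ATTRS, values[4:]):
        ET.SubElement(obj, tag).text = str(value)
    return obj


obj = face_to_xml_object([449, 330, 122, 149, 0, 0, 0, 0, 0, 0])
print(ET.tostring(obj, encoding="unicode"))
```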
