TensorFlow|基于深度学习的人脸表情识别系统

版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.net/shillyshally/article/details/84934174

更新(2019-4-12)

上传了模型权重和模型结构,因GItHub不支持25MB以上的文件,因此上传在此处,如果急用可以在此下载,也是作为对我工作的一些支持

地址:https://download.csdn.net/download/shillyshally/11110754

如果不急用可以在下方留下邮箱,我在看博客的时候会回复,但会有一段时间的延迟


更新(2019-1-1)

增加了resnet模型,可在cnn.py中切换


正好在学习tensorflow,使用tensorflow重构了一下之前自己做的那个表情识别系统,直接使用fer2013.csv转tfrecord训练,不需再逐张转为图片,训练更快,代码更精简,支持中断训练之后载入模型继续训练等等

已在github上开源

地址:https://github.com/shillyshallysxy/emotion_classifier/tree/master/emotion_classifier_tensorflow_version

提供给需要这个表情识别系统的tensorflow版本的人


原Keras版本地址:https://blog.csdn.net/shillyshally/article/details/80912854

Keras版本Github地址:https://github.com/shillyshallysxy/emotion_classifier

提供给需要原Keras版本的人


使用TensorFlow搭建并训练了卷积神经网络模型,用于人脸表情识别,训练集和测试集均采用kaggle的fer2013数据集。

达到如下效果:

                    

     打上了小小马赛克的博主。

整个表情识别系统分为两个过程:卷积神经网络模型的训练 与 面部表情的识别。

1.卷积神经网络模型的训练

1.1获取数据集

使用公开的数据集一方面可以节约收集数据的时间,另一方面可以更公平地评价模型以及人脸表情分类器的性能,因此,使用了kaggle面部表情识别竞赛所使用的fer2013人脸表情数据库。图片统一以csv的格式存储。首先用python将csv文件转为单通道灰度图片并根据标签将其分类在不同的文件夹中。

fer2013数据集链接:  https://pan.baidu.com/s/1M6XS8ovXbn8-UfQwcUnvVQ 密码: jueq

1.2预处理数据集

将数据集转化为tfrecord格式

图片直接全部载入内存,每次训练全部载入的过程缓慢,耗时长,而且必然会造成内存巨大的开销,16G的内存全部被占用之后还不够,因此考虑构建一个队列,每次从外部磁盘读取部分数据,shuffle打乱后存放到内存中的队列中,此时内存只需要维护队列大小的空间,并且一次只需要载入部分数据,载入速度快了数十倍,同时训练过程中从内存中读取数据,训练过程的速度未收到影响。

with open(csv_path, 'r') as f:
    csvr = csv.reader(f)
    header = next(csvr)
    rows = [row for row in csvr]
    trn = [row[:-1] for row in rows if row[-1] == 'Training']
    val = [row[:-1] for row in rows if row[-1] == 'PublicTest']
    tst = [row[:-1] for row in rows if row[-1] == 'PrivateTest']


def write_binary(record_name_, labels_images_, height_=default_height, width_=default_width):
    writer_ = tf.python_io.TFRecordWriter(record_name_)
    for label_image_ in tqdm(labels_images_):
        label_ = int(label_image_[0])
        image_ = np.asarray([int(p) for p in label_image_[-1].split()])

        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    "image/label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label_])),
                    "image/height": tf.train.Feature(int64_list=tf.train.Int64List(value=[height_])),
                    "image/width": tf.train.Feature(int64_list=tf.train.Int64List(value=[width_])),
                    "image/raw": tf.train.Feature(int64_list=tf.train.Int64List(value=image_))
                }
            )
        )
        writer_.write(example.SerializeToString())
    writer_.close()


write_binary(record_path_train, trn)
write_binary(record_path_test, tst)
write_binary(record_path_eval, val)

1.3搭建卷积神经网络模型

接下来就是建立卷积神经网络模型

博主在google的论文Going deeper with convolutions中获得灵感,在输入层之后加入了1*1的卷积层使输入增加了非线性的表示、加深了网络、提升了模型的表达能力,同时基本不增加计算量。之后根据VGG网络的想法,尝试将5*5网络拆分为两层3*3但最后效果并不理想,在多次尝试了多种不同的模型并不断调整之后

最终网络模型结构如下:

种类

步长

填充

输出

丢弃

输入

48*48*1

卷积层1

1*1

1

48*48*32

卷积层2

5*5

1

2

48*48*32

池化层1

3*3

2

23*23*32

卷积层3

3*3

1

1

23*23*32

池化层2

3*3

2

11*11*32

卷积层4

5*5

1

2

11*11*64

池化层3

3*3

2

5*5*64

全连接层1

1*1*2048

50%

全连接层2

1*1*1024

50%

输出

1*1*7

模型的代码:

结构清晰,就不多解释了

import tensorflow as tf
from tensorflow.contrib.layers.python.layers import initializers


class CNN_Model():
    def __init__(self, num_tags_=7, lr_=0.001, channel_=1, hidden_dim_=1024, full_shape_=2304, optimizer_='Adam'):
        self.num_tags = num_tags_
        self.lr = lr_
        self.full_shape = full_shape_
        self.channel = channel_
        self.hidden_dim = hidden_dim_
        self.conv_feature = [32, 32, 32, 64]
        self.conv_size = [1, 5, 3, 5]
        self.maxpool_size = [0, 3, 3, 3]
        self.maxpool_stride = [0, 2, 2, 2]
        # self.initializer = tf.truncated_normal_initializer(stddev=0.05)
        self.initializer = initializers.xavier_initializer()
        self.dropout = tf.placeholder(dtype=tf.float32, name='dropout')
        self.x_input = tf.placeholder(dtype=tf.float32, shape=[None, None, None, self.channel], name='x_input')
        self.y_target = tf.placeholder(dtype=tf.int32, shape=[None], name='y_target')
        self.batch_size = tf.shape(self.x_input)[0]
        self.logits = self.project_layer(self.cnn_layer())
        with tf.variable_scope("loss"):
            self.loss = self.loss_layer(self.logits)
            self.train_step = self.optimizer(self.loss, optimizer_)

    def cnn_layer(self):
        with tf.variable_scope("conv1"):
            conv1_weight = tf.get_variable('conv1_weight', [self.conv_size[0], self.conv_size[0],
                                                            self.channel, self.conv_feature[0]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv1_bias = tf.get_variable('conv1_bias', [self.conv_feature[0]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv1 = tf.nn.conv2d(self.x_input, conv1_weight, [1, 1, 1, 1], padding='SAME')
            conv1_add_bias = tf.nn.bias_add(conv1, conv1_bias)
            conv1_relu = tf.nn.relu(conv1_add_bias)
            norm1 = tf.nn.lrn(conv1_relu, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm1')
        with tf.variable_scope("conv2"):
            conv2_weight = tf.get_variable('conv2_weight', [self.conv_size[1], self.conv_size[1],
                                                            self.conv_feature[0], self.conv_feature[1]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv2_bias = tf.get_variable('conv2_bias', [self.conv_feature[1]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv2 = tf.nn.conv2d(norm1, conv2_weight, [1, 1, 1, 1], padding='SAME')
            conv2_add_bias = tf.nn.bias_add(conv2, conv2_bias)
            conv2_relu = tf.nn.relu(conv2_add_bias)
            pool2 = tf.nn.max_pool(conv2_relu, ksize=[1, self.maxpool_size[1], self.maxpool_size[1], 1],
                                   strides=[1, self.maxpool_stride[1], self.maxpool_stride[1], 1],
                                   padding='SAME', name='pool_layer2')
            norm2 = tf.nn.lrn(pool2, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm2')
        with tf.variable_scope("conv3"):
            conv3_weight = tf.get_variable('conv3_weight', [self.conv_size[2], self.conv_size[2],
                                                            self.conv_feature[1], self.conv_feature[2]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv3_bias = tf.get_variable('conv3_bias', [self.conv_feature[2]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv3 = tf.nn.conv2d(norm2, conv3_weight, [1, 1, 1, 1], padding='SAME')
            conv3_add_bias = tf.nn.bias_add(conv3, conv3_bias)
            conv3_relu = tf.nn.relu(conv3_add_bias)
            pool3 = tf.nn.max_pool(conv3_relu, ksize=[1, self.maxpool_size[2], self.maxpool_size[2], 1],
                                   strides=[1, self.maxpool_stride[2], self.maxpool_stride[2], 1],
                                   padding='SAME', name='pool_layer3')
            norm3 = tf.nn.lrn(pool3, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm3')
        with tf.variable_scope("conv4"):
            conv4_weight = tf.get_variable('conv4_weight', [self.conv_size[3], self.conv_size[3],
                                                            self.conv_feature[2], self.conv_feature[3]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv4_bias = tf.get_variable('conv4_bias', [self.conv_feature[3]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv4 = tf.nn.conv2d(norm3, conv4_weight, [1, 1, 1, 1], padding='SAME')
            conv4_add_bias = tf.nn.bias_add(conv4, conv4_bias)
            conv4_relu = tf.nn.relu(conv4_add_bias)
            pool4 = tf.nn.max_pool(conv4_relu, ksize=[1, self.maxpool_size[3], self.maxpool_size[3], 1],
                                   strides=[1, self.maxpool_stride[3], self.maxpool_stride[3], 1],
                                   padding='SAME', name='pool_layer4')
            norm4 = tf.nn.lrn(pool4, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm4')
        return norm4

    def cnn_layer_single(self):
        with tf.variable_scope("conv1"):
            conv1_weight = tf.get_variable('conv1_weight', [self.conv_size[0], self.conv_size[0],
                                                            self.channel, self.conv_feature[0]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv1_bias = tf.get_variable('conv1_bias', [self.conv_feature[0]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv1 = tf.nn.conv2d(self.x_input, conv1_weight, [1, 1, 1, 1], padding='SAME')
            conv1_add_bias = tf.nn.bias_add(conv1, conv1_bias)
            conv1_relu = tf.nn.relu(conv1_add_bias)
        with tf.variable_scope("conv2"):
            conv2_weight = tf.get_variable('conv2_weight', [self.conv_size[1], self.conv_size[1],
                                                            self.conv_feature[0], self.conv_feature[1]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv2_bias = tf.get_variable('conv2_bias', [self.conv_feature[1]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv2 = tf.nn.conv2d(conv1_relu, conv2_weight, [1, 1, 1, 1], padding='SAME')
            conv2_add_bias = tf.nn.bias_add(conv2, conv2_bias)
            conv2_relu = tf.nn.relu(conv2_add_bias)
            pool2 = tf.nn.max_pool(conv2_relu, ksize=[1, self.maxpool_size[1], self.maxpool_size[1], 1],
                                   strides=[1, self.maxpool_stride[1], self.maxpool_stride[1], 1],
                                   padding='SAME', name='pool_layer2')
        with tf.variable_scope("conv3"):
            conv3_weight = tf.get_variable('conv3_weight', [self.conv_size[2], self.conv_size[2],
                                                            self.conv_feature[1], self.conv_feature[2]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv3_bias = tf.get_variable('conv3_bias', [self.conv_feature[2]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv3 = tf.nn.conv2d(pool2, conv3_weight, [1, 1, 1, 1], padding='SAME')
            conv3_add_bias = tf.nn.bias_add(conv3, conv3_bias)
            conv3_relu = tf.nn.relu(conv3_add_bias)
            pool3 = tf.nn.max_pool(conv3_relu, ksize=[1, self.maxpool_size[2], self.maxpool_size[2], 1],
                                   strides=[1, self.maxpool_stride[2], self.maxpool_stride[2], 1],
                                   padding='SAME', name='pool_layer3')
        with tf.variable_scope("conv4"):
            conv4_weight = tf.get_variable('conv4_weight', [self.conv_size[3], self.conv_size[3],
                                                            self.conv_feature[2], self.conv_feature[3]],
                                           dtype=tf.float32, initializer=self.initializer)
            conv4_bias = tf.get_variable('conv4_bias', [self.conv_feature[3]], dtype=tf.float32,
                                         initializer=tf.constant_initializer(0.0))
            conv4 = tf.nn.conv2d(pool3, conv4_weight, [1, 1, 1, 1], padding='SAME')
            conv4_add_bias = tf.nn.bias_add(conv4, conv4_bias)
            conv4_relu = tf.nn.relu(conv4_add_bias)
            pool4 = tf.nn.max_pool(conv4_relu, ksize=[1, self.maxpool_size[3], self.maxpool_size[3], 1],
                                   strides=[1, self.maxpool_stride[3], self.maxpool_stride[3], 1],
                                   padding='SAME', name='pool_layer4')
        return pool4

    def project_layer(self, x_in_):
        with tf.variable_scope("project"):
            with tf.variable_scope("hidden"):
                x_in_ = tf.reshape(x_in_, [self.batch_size, -1])
                w_tanh1 = tf.get_variable("w_tanh1", [self.full_shape, self.hidden_dim*2], initializer=self.initializer,
                                          regularizer=tf.contrib.layers.l2_regularizer(0.001))
                b_tanh1 = tf.get_variable("b_tanh1", [self.hidden_dim*2], initializer=tf.zeros_initializer())
                w_tanh2 = tf.get_variable("w_tanh2", [self.hidden_dim*2, self.hidden_dim], initializer=self.initializer,
                                          regularizer=tf.contrib.layers.l2_regularizer(0.001))
                b_tanh2 = tf.get_variable("b_tanh2", [self.hidden_dim], initializer=tf.zeros_initializer())
                output1 = tf.nn.dropout(tf.nn.relu(tf.add(tf.matmul(x_in_, w_tanh1),
                                                          b_tanh1)), keep_prob=self.dropout)
                output2 = tf.nn.dropout(tf.nn.relu(tf.add(tf.matmul(output1, w_tanh2),
                                                          b_tanh2)), keep_prob=self.dropout)
            with tf.variable_scope("output"):
                w_out = tf.get_variable("w_out", [self.hidden_dim, self.num_tags], initializer=self.initializer,
                                        regularizer=tf.contrib.layers.l2_regularizer(0.001))
                b_out = tf.get_variable("b_out", [self.num_tags], initializer=tf.zeros_initializer())
                pred_ = tf.add(tf.matmul(output2, w_out), b_out, name='logits')
        return pred_

    def loss_layer(self, project_logits):
        with tf.variable_scope("loss"):
            loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
                logits=project_logits, labels=self.y_target), name='softmax_loss')
        return loss

    def optimizer(self, loss_, method=''):
        if method == 'Momentum':
            step = tf.Variable(0, trainable=False)
            model_learning_rate = tf.train.exponential_decay(0.01, step,
                                                             100, 0.99, staircase=True)
            my_optimizer = tf.train.MomentumOptimizer(model_learning_rate, momentum=0.9)
            train_step_ = my_optimizer.minimize(loss_, global_step=step, name='train_step')
            print('Using ', method)
        elif method == 'SGD':
            step = tf.Variable(0, trainable=False)
            model_learning_rate = tf.train.exponential_decay(0.1, step,
                                                             200., 0.96, staircase=True)
            my_optimizer = tf.train.GradientDescentOptimizer(model_learning_rate)
            train_step_ = my_optimizer.minimize(loss_, name='train_step')
            print('Using ', method)
        elif method == 'Adam':
            train_step_ = tf.train.AdamOptimizer(self.lr).minimize(loss_, name='train_step')
            print('Using ', method)
        else:
            train_step_ = tf.train.MomentumOptimizer(0.005, momentum=0.9).minimize(loss_, name='train_step')
            print('Using Default')
        return train_step_

1.4训练模型

通过水平翻转,调节亮度饱和度,随机裁切来进行数据增强,结构清晰代码简单,就不多解释了

# 数据增强
def pre_process_img(image):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=32./255)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    image = tf.random_crop(image, [default_height-np.random.randint(0, 4), default_width-np.random.randint(0, 4), 1])
    image = tf.image.resize_images(image, [default_height, default_width])
    return image


def __parse_function_image(serial_exmp_):
    features_ = tf.parse_single_example(serial_exmp_, features={"image/label": tf.FixedLenFeature([], tf.int64),
                                                                "image/height": tf.FixedLenFeature([], tf.int64),
                                                                "image/width": tf.FixedLenFeature([], tf.int64),
                                                                "image/raw": tf.FixedLenFeature([], tf.string)})
    label_ = tf.cast(features_["image/label"], tf.int32)
    height_ = tf.cast(features_["image/height"], tf.int32)
    width_ = tf.cast(features_["image/width"], tf.int32)
    image_ = tf.image.decode_jpeg(features_["image/raw"])
    image_ = tf.reshape(image_, [height_, width_, channel])
    image_ = tf.image.convert_image_dtype(image_, dtype=tf.float32)
    image_ = tf.image.resize_images(image_, [default_height, default_width])
    # image_ = pre_process_img(image_)
    return image_, label_


def __parse_function_csv(serial_exmp_):
    features_ = tf.parse_single_example(serial_exmp_,
                                        features={"image/label": tf.FixedLenFeature([], tf.int64),
                                                  "image/height": tf.FixedLenFeature([], tf.int64),
                                                  "image/width": tf.FixedLenFeature([], tf.int64),
                                                  "image/raw": tf.FixedLenFeature([default_width*default_height*channel]
                                                                                  , tf.int64)})
    label_ = tf.cast(features_["image/label"], tf.int32)
    height_ = tf.cast(features_["image/height"], tf.int32)
    width_ = tf.cast(features_["image/width"], tf.int32)
    image_ = tf.cast(features_["image/raw"], tf.int32)
    image_ = tf.reshape(image_, [height_, width_, channel])
    image_ = tf.multiply(tf.cast(image_, tf.float32), 1. / 255)
    image_ = pre_process_img(image_)
    return image_, label_


def get_dataset(record_name_):
    record_path_ = os.path.join(data_folder_name, data_path_name, record_name_)
    data_set_ = tf.data.TFRecordDataset(record_path_)
    return data_set_.map(__parse_function_csv)


def evaluate(logits_, y_):
    return np.mean(np.equal(np.argmax(logits_, axis=1), y_))


def main(argv):
    with tf.Session() as sess:
        summary_writer = tf.summary.FileWriter(tensorboard_path, sess.graph)
        data_set_train = get_dataset(record_name_train)
        data_set_train = data_set_train.shuffle(shuffle_pool_size).batch(batch_size).repeat()
        data_set_train_iter = data_set_train.make_one_shot_iterator()
        train_handle = sess.run(data_set_train_iter.string_handle())

        data_set_test = get_dataset(record_name_test)
        data_set_test = data_set_test.shuffle(shuffle_pool_size).batch(test_batch_size).repeat()
        data_set_test_iter = data_set_test.make_one_shot_iterator()
        test_handle = sess.run(data_set_test_iter.string_handle())

        handle = tf.placeholder(tf.string, shape=[], name='handle')
        iterator = tf.data.Iterator.from_string_handle(handle, data_set_train.output_types, data_set_train.output_shapes)
        x_input_bacth, y_target_batch = iterator.get_next()

        cnn_model = cnn.CNN_Model()
        x_input = cnn_model.x_input
        y_target = cnn_model.y_target
        logits = tf.nn.softmax(cnn_model.logits)
        loss = cnn_model.loss
        train_step = cnn_model.train_step
        dropout = cnn_model.dropout
        sess.run(tf.global_variables_initializer())

        if retrain:
            print('retraining')
            ckpt_name = 'cnn_emotion_classifier.ckpt'
            ckpt_path = os.path.join(data_folder_name, data_path_name, ckpt_name)
            saver = tf.train.Saver()
            saver.restore(sess, ckpt_path)

        with tf.name_scope('Loss_and_Accuracy'):
            tf.summary.scalar('Loss', loss)
        summary_op = tf.summary.merge_all()

        print('start training')
        saver = tf.train.Saver(max_to_keep=1)
        max_accuracy = 0
        temp_train_loss = []
        temp_test_loss = []
        temp_train_acc = []
        temp_test_acc = []
        for i in range(generations):
            x_batch, y_batch = sess.run([x_input_bacth, y_target_batch], feed_dict={handle: train_handle})
            train_feed_dict = {x_input: x_batch, y_target: y_batch,
                               dropout: 0.5}
            sess.run(train_step, train_feed_dict)
            if (i + 1) % 100 == 0:
                train_loss, train_logits = sess.run([loss, logits], train_feed_dict)
                train_accuracy = evaluate(train_logits, y_batch)
                print('Generation # {}. Train Loss : {:.3f} . '
                      'Train Acc : {:.3f}'.format(i, train_loss, train_accuracy))
                temp_train_loss.append(train_loss)
                temp_train_acc.append(train_accuracy)
                summary_writer.add_summary(sess.run(summary_op, train_feed_dict), i)
            if (i + 1) % 400 == 0:
                test_x_batch, test_y_batch = sess.run([x_input_bacth, y_target_batch], feed_dict={handle: test_handle})
                test_feed_dict = {x_input: test_x_batch, y_target: test_y_batch,
                                  dropout: 1.0}
                test_loss, test_logits = sess.run([loss, logits], test_feed_dict)
                test_accuracy = evaluate(test_logits, test_y_batch)
                print('Generation # {}. Test Loss : {:.3f} . '
                      'Test Acc : {:.3f}'.format(i, test_loss, test_accuracy))
                temp_test_loss.append(test_loss)
                temp_test_acc.append(test_accuracy)
                if test_accuracy >= max_accuracy and save_flag and i > generations // 2:
                    max_accuracy = test_accuracy
                    saver.save(sess, os.path.join(data_folder_name, data_path_name, save_ckpt_name))
                    print('Generation # {}. --model saved--'.format(i))
        print('Last accuracy : ', max_accuracy)
        with open(model_log_path, 'w') as f:
            f.write('train_loss: ' + str(temp_train_loss))
            f.write('\n\ntest_loss: ' + str(temp_test_loss))
            f.write('\n\ntrain_acc: ' + str(temp_train_acc))
            f.write('\n\ntest_acc: ' + str(temp_test_acc))
        print(' --log saved--')


if __name__ == '__main__':
    tf.app.run()

2.人脸表情识别模块

2.1加载模型

# config=tf.ConfigProto(log_device_placement=True)
sess = tf.Session()

saver = tf.train.import_meta_graph(ckpt_path+'.meta')
saver.restore(sess, ckpt_path)
graph = tf.get_default_graph()
name = [n.name for n in graph.as_graph_def().node]
print(name)
x_input = graph.get_tensor_by_name('x_input:0')
dropout = graph.get_tensor_by_name('dropout:0')
logits = graph.get_tensor_by_name('project/output/logits:0')

2.2表情识别

兼带生成测试集以及验证集混淆矩阵的代码,将confusion_matrix值设置为True即生成混淆矩阵False为识别文件夹中的所有图片

img_size = 48
confusion_matrix = False
emotion_labels = ['angry', 'disgust:', 'fear', 'happy', 'sad', 'surprise', 'neutral']
num_class = len(emotion_labels)


def prodece_confusion_matrix(images_, total_num_):
    results = np.array([0]*num_class)
    total = []
    for imgs_ in images_:
        for img_ in imgs_:
            results[np.argmax(predict_emotion(img_))] += 1
        print(results, np.around(results/len(imgs_), decimals=3))
        total.append(results)
        results = np.array([0]*num_class)
    sum = 0
    for i_ in range(num_class):
        sum += total[i_][i_]
    print('acc: {:.3f} %'.format(sum*100./total_num_))
    print('Using ', ckpt_name)


def predict_emotion(face_img_, img_size_=48):
    face_img_ = face_img_ * (1. / 255)
    resized_img_ = cv2.resize(face_img_, (img_size_, img_size_))  # ,interpolation=cv2.INTER_LINEAR
    rsz_img = []
    rsz_img.append(resized_img_[:, :])
    rsz_img.append(resized_img_[2:45, :])
    rsz_img.append(cv2.flip(rsz_img[0], 1))
    for i_, rsz_image in enumerate(rsz_img):
        rsz_img[i_] = cv2.resize(rsz_image, (img_size_, img_size_)).reshape(img_size_, img_size_, 1)
    rsz_img = np.array(rsz_img)
    feed_dict_ = {x_input: rsz_img, dropout: 1.0}
    pred_logits_ = sess.run([tf.reduce_sum(tf.nn.softmax(logits), axis=0)], feed_dict_)
    return np.squeeze(pred_logits_)


def face_detect(image_path, casc_path_=casc_path):
    if os.path.isfile(casc_path_):
        face_casccade_ = cv2.CascadeClassifier(casc_path_)
        img_ = cv2.imread(image_path)
        img_gray_ = cv2.cvtColor(img_, cv2.COLOR_BGR2GRAY)
        # face detection
        faces = face_casccade_.detectMultiScale(
            img_gray_,
            scaleFactor=1.1,
            minNeighbors=1,
            minSize=(30, 30),
        )
        return faces, img_gray_, img_
    else:
        print("There is no {} in {}".format(casc_name, casc_path_))


if __name__ == '__main__':
    if not confusion_matrix:
        images_path = []
        files = os.listdir(pic_path)
        for file in files:
            if file.lower().endswith('jpg') or file.endswith('png'):
                images_path.append(os.path.join(pic_path, file))
        for image in images_path:
            faces, img_gray, img = face_detect(image)
            spb = img.shape
            sp = img_gray.shape
            height = sp[0]
            width = sp[1]
            size = 600
            emotion_pre_dict = {}
            face_exists = 0
            for (x, y, w, h) in faces:
                face_exists = 1
                face_img_gray = img_gray[y:y + h, x:x + w]
                results_sum = predict_emotion(face_img_gray)  # face_img_gray
                for i, emotion_pre in enumerate(results_sum):
                    emotion_pre_dict[emotion_labels[i]] = emotion_pre
                # 输出所有情绪的概率
                print(emotion_pre_dict)
                label = np.argmax(results_sum)
                emo = emotion_labels[int(label)]
                print('Emotion : ', emo)
                # 输出最大概率的情绪
                # 使框的大小适应各种像素的照片
                t_size = 2
                ww = int(spb[0] * t_size / 300)
                www = int((w + 10) * t_size / 100)
                www_s = int((w + 20) * t_size / 100) * 2 / 5
                cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), ww)
                cv2.putText(img, emo, (x + 2, y + h - 2), cv2.FONT_HERSHEY_SIMPLEX,
                            www_s, (255, 0, 255), thickness=www, lineType=1)
                # img_gray full face     face_img_gray part of face
            if face_exists:
                cv2.namedWindow('Emotion_classifier', 0)
                cent = int((height * 1.0 / width) * size)
                cv2.resizeWindow('Emotion_classifier', size, cent)
                cv2.imshow('Emotion_classifier', img)
                k = cv2.waitKey(0)
                cv2.destroyAllWindows()
                # if k & 0xFF == ord('q'):
                #     break
    if confusion_matrix:
        with open(csv_path, 'r') as f:
            csvr = csv.reader(f)
            header = next(csvr)
            rows = [row for row in csvr]
            val = [row[:-1] for row in rows if row[-1] == 'PublicTest']
            # tst = [row[:-1] for row in rows if row[-1] == 'PrivateTest']
        confusion_images_total = []
        confusion_images = {0: [], 1: [], 2: [], 3: [], 4: [], 5: [], 6: []}
        test_set = val
        total_num = len(test_set)
        for label_image_ in test_set:
            label_ = int(label_image_[0])
            image_ = np.reshape(np.asarray([int(p) for p in label_image_[-1].split()]), [img_size, img_size, 1])
            confusion_images[label_].append(image_)
        prodece_confusion_matrix(confusion_images.values(), total_num)

3.效果展示

最后附上实验环境

    系统:win10

    语言:python3.6

    显卡:GTX1080ti

参考文献

  1. Jeon J, Park J C, Jo Y J, et al. A Real-time Facial Expression Recognizer using Deep Neural Network[J].ACM 2016:1-4.
  2. He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2016:770-778.
  3. Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]// International Conference on Neural Information Processing Systems. Curran Associates Inc. 2012:1097-1105.
  4. Zeiler M D, Fergus R. Visualizing and Understanding Convolutional Networks[C]// European Conference on Computer Vision. Springer, Cham, 2014:818-833.
  5. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. Going Deeper with Convolutions[C]. 2014
  6. Samaa M, Shohieb. SignsWorld Facial Expression Recognition System (FERS)[J]. Intelligent Automation & Soft Computing,2015
  7. Srivastava, Nitish. Dropout: A simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15.1: 1929-1958.
  8. Jia, Yangqing, Shelhamer, et al. Caffe: Convolutional Architecture for Fast Feature Embedding[J]. 2014:675-678.

猜你喜欢

转载自blog.csdn.net/shillyshally/article/details/84934174