Reading CVPR 2018 papers on detecting facial action units

List

  •   1. Learning Facial Action Units From Web Images With Scalable Weakly Supervised Clustering
  •   2. Weakly Supervised Facial Action Unit Recognition Through Adversarial Training
  •   3. Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition
  •   4. Classifier Learning With Prior Probabilities for Facial Action Unit Recognition

Weakly-Supervised Attention and Relation Learning for Facial Action Unit Detection. PAMI submission.

It uses a CRF to learn the relations.

Dataset

  AFEW dataset: collected from the Web; mostly dark, and subjects may be wearing glasses.

  EmotioNet

  BP4D

Thoughts

 This is already a field with a lot of ongoing research. The bottleneck is still the dataset. BP4D is comprehensive but fairly small-scale. EmotioNet is large-scale, yet it is image-based and restricts how many images are accessible.

The general tasks for facial action units are spatial localization (AU detection), temporal localization (a.k.a. AU event detection, not just facial event detection), and recognition (including intensity estimation). For (joint) spatial localization and recognition in particular, there is a large body of existing work. What is especially interesting in this set of problems is that the action units have strong configurations, such as co-occurrence, geometric relations, and so on. As a result, methods such as Bayesian networks and CRFs have been introduced. I think the configuration is even stronger than in deformable part-based models.
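To make the co-occurrence idea concrete, here is a minimal sketch that estimates P(AU_j active | AU_i active) from binary per-frame labels. The function name `cooccurrence` and the toy label matrix are made up for illustration; they are not taken from any of the papers or datasets above.

```python
def cooccurrence(labels):
    """Estimate P(AU_j = 1 | AU_i = 1) from binary AU labels.

    labels: list of rows, one per frame (or image); each row is a list
            of 0/1 activations, one entry per AU.
    Returns a square list-of-lists; row i is None-filled if AU_i never fires.
    """
    n_aus = len(labels[0])
    counts = [[0] * n_aus for _ in range(n_aus)]
    for row in labels:
        active = [k for k, v in enumerate(row) if v]
        # Count every ordered pair of AUs that are active together.
        for i in active:
            for j in active:
                counts[i][j] += 1
    cond = []
    for i in range(n_aus):
        total = counts[i][i]  # number of samples in which AU_i is active
        cond.append([counts[i][j] / total if total else None
                     for j in range(n_aus)])
    return cond


# Toy labels (hypothetical) for three AUs, four samples.
toy = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
cond = cooccurrence(toy)
```

A table like this is the kind of prior that a Bayesian network or a CRF pairwise term could encode; strongly asymmetric entries (AU_i implies AU_j but not the reverse) are what make the configuration informative.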

Nowadays, the workhorse for AU recognition is the CNN, just as for any other recognition problem. Attention mechanisms can be introduced, such as spatial attention learned from weak supervision (say, the PAMI submission listed above). However, remember that an AU unfolds over time, no matter whether it lasts short or long. Over the whole duration of a facial expression, an activated AU may last only briefly, which is why temporal detection (AU event detection) matters. If the task is recognition, though, a rough temporal localization might be acceptable. As a result, temporal attention is helpful, say, via phase discrepancy or optical flow (salient where the AU muscle moves).
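As a rough illustration of motion-driven temporal attention, the sketch below turns per-transition motion energy into softmax weights. It uses the mean absolute frame difference as a crude stand-in for optical flow magnitude; that substitution, the function name `temporal_attention`, and the `temperature` parameter are all assumptions for illustration, not a method from the papers above.

```python
import math


def temporal_attention(frames, temperature=1.0):
    """Softmax attention over consecutive-frame transitions.

    frames: list of flattened frames (lists of pixel intensities, same length).
    Returns one weight per transition (len(frames) - 1 values summing to 1).
    Motion energy is the mean absolute frame difference, used here only
    as a cheap proxy for optical flow magnitude.
    """
    motion = []
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        motion.append(diff)
    peak = max(motion)  # subtract the max for numerical stability
    exps = [math.exp((m - peak) / temperature) for m in motion]
    z = sum(exps)
    return [e / z for e in exps]


# Toy clip (hypothetical): the face only moves between frames 1 and 2.
frames = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
weights = temporal_attention(frames)
```

The largest weight falls on the transition where the pixels change, which is the rough temporal localization that may be enough for recognition. A lower `temperature` sharpens the weights toward that transition.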

Reading notes

  1. Learning Facial Action Units From Web Images With Scalable Weakly Supervised Clustering

This is from Martinez's group at OSU and is tested on their EmotioNet dataset. They extend AlexNet with batch normalization.

  2. Weakly Supervised Facial Action Unit Recognition Through Adversarial Training
  3. Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition
  4. Classifier Learning With Prior Probabilities for Facial Action Unit Recognition

Reposted from blog.csdn.net/eglxiang/article/details/81747226