【CVPR 2023】Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition (CCF-A top-tier computer vision conference)

Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition
(A top international result in the computable affect series)

Abstract
Dynamic Facial Expression Recognition (DFER) is a rapidly developing field that focuses on recognizing facial expressions in video. Previous research has treated non-target frames as noisy frames, but we propose that DFER should instead be treated as a weakly supervised problem. We also identify an imbalance between short- and long-term temporal relationships in DFER. We therefore introduce the Multi-3D Dynamic Facial Expression Learning (M3DFEL) framework, which utilizes Multi-Instance Learning (MIL) to handle inexact labels. M3DFEL generates 3D-instances to model the strong short-term temporal relationships and utilizes 3DCNNs for feature extraction. A Dynamic Long-term Instance Aggregation Module (DLIAM) then learns the long-term temporal relationships and dynamically aggregates the instances. Our experiments on the DFEW and FERV39K datasets show that M3DFEL, with a vanilla R3D18 backbone, outperforms existing state-of-the-art approaches. The source code is available at https://github.com/faceeyes/M3DFEL.
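To make the MIL pipeline described above concrete, here is a minimal NumPy sketch of the data flow only: a video is cut into short clips ("3D-instances"), each instance is mapped to a feature vector, and the instances are aggregated into one bag-level feature with attention-style weights. The random-projection "feature extractor" is a stand-in for the paper's R3D18 backbone, and the simple attention pooling is a stand-in for DLIAM; the function names, dimensions, and weighting scheme here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def make_3d_instances(video, clip_len=4):
    """Split a video (T, H, W, C) into N non-overlapping 3D-instances
    of shape (clip_len, H, W, C); trailing frames are dropped."""
    n = video.shape[0] // clip_len
    return video[: n * clip_len].reshape(n, clip_len, *video.shape[1:])

def mock_3d_features(instances, dim=8, seed=0):
    """Stand-in for a 3DCNN backbone (e.g. R3D18): a fixed random
    projection mapping each 3D-instance to a `dim`-d feature vector."""
    rng = np.random.default_rng(seed)
    flat = instances.reshape(len(instances), -1)          # (N, clip_len*H*W*C)
    W = rng.standard_normal((flat.shape[1], dim)) / np.sqrt(flat.shape[1])
    return flat @ W                                       # (N, dim)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_instances(feats, w):
    """Attention-style MIL pooling (illustrative stand-in for DLIAM):
    score each instance, normalize scores into weights, weighted-sum."""
    alpha = softmax(feats @ w)                            # (N,) instance weights
    return alpha @ feats                                  # (dim,) bag feature

# Usage: a 16-frame dummy video -> 4 instances -> one bag-level feature.
video = np.zeros((16, 4, 4, 3))
instances = make_3d_instances(video, clip_len=4)          # (4, 4, 4, 4, 3)
feats = mock_3d_features(instances)                       # (4, 8)
bag_feature = aggregate_instances(feats, np.ones(8))      # (8,)
```

The bag-level feature would then feed a classifier, so only the video-level (inexact) label is needed during training, which is the weakly supervised formulation the abstract argues for.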



Reposted from blog.csdn.net/lsttoy/article/details/130952685