Hello everyone, today we look at a current SOTA paper on speech emotion recognition. The title of the paper, translated from Chinese, is "The Importance of Temporal Modeling: A New Spatiotemporal Emotion Modeling Method for Speech Emotion Recognition". The paper trains on the common speech emotion datasets in the field, covering English, German, and other languages, and compares the accuracy achieved on each. The number of emotion classes differs from dataset to dataset; you can refer to the following code:
CASIA_CLASS_LABELS = ("angry", "fear", "happy", "neutral", "sad", "surprise")  # CASIA
EMODB_CLASS_LABELS = ("angry", "boredom", "disgust", "fear", "happy", "neutral", "sad")  # EMODB
SAVEE_CLASS_LABELS = ("angry", "disgust", "fear", "happy", "neutral", "sad", "surprise")  # SAVEE
RAVDE_CLASS_LABELS = ("angry", "calm", "disgust", "fear", "happy", "neutral", "sad", "surprise")  # RAVDESS
IEMOCAP_CLASS_LABELS = ("angry", "happy", "neutral", "sad")  # IEMOCAP
EMOVO_CLASS_LABELS = (