Multimodal Emotion Recognition Using Non-Inertial Loss Function

Orgil, Jargalsaikhan; Karungaru, Stephen; Terada, Kenji; Shagdar, Ganbold

doi:10.2299/jsp.25.73

Please use the following URL to refer to this document: https://repo.lib.tokushima-u.ac.jp/116447

ID	116447
Authors	Orgil, Jargalsaikhan (Tokushima University); Karungaru, Stephen (Tokushima University); Terada, Kenji (Tokushima University); Shagdar, Ganbold (Mongolian University of Science and Technology)
Keywords	deep emotion recognition emotion recognition emotion body language intonation
Resource Type	Journal Article
Abstract	Automatic understanding of human emotion in the wild from audiovisual signals is extremely challenging. Latent continuous dimensions can be used to analyze human emotional states, behaviors, and reactions displayed in real-world settings, and combinations of valence and arousal are a well-known and effective representation of emotions. In this paper, a new Non-inertial loss function is proposed for training deep learning models for emotion recognition. It is evaluated in wild settings using four types of candidate networks with different pipelines and sequence lengths, and compared with the Concordance Correlation Coefficient (CCC) and Mean Squared Error (MSE) losses commonly used for training. To assess its efficiency and stability on continuous and non-continuous input data, experiments were performed using the Aff-Wild dataset. Encouraging results were obtained.
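Journal Title	Journal of Signal Processing
ISSN	18801013
Publisher	Research Institute of Signal Processing
Volume	25
Issue	2
Start Page	73
End Page	85
Publication Date	2021-03-01
Notes	Use is limited to the scope permitted by copyright.
EDB ID	372832
Publisher DOI	10.2299/jsp.25.73
Publisher URL	https://doi.org/10.2299/jsp.25.73
Full-Text File	jsp_25_2_73.pdf (624 KB)
Language	eng
Author Version Flag	Publisher version
Department	Science and Technology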
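
The abstract names two baseline training losses, the Concordance Correlation Coefficient (CCC) and the Mean Squared Error (MSE). As a rough illustration of those two baselines only, the sketch below uses the standard textbook definitions with population (divide-by-n) moments. The function names `mse_loss`, `ccc`, and `ccc_loss` are hypothetical and are not taken from the paper, and the proposed Non-inertial loss is not shown because this record does not define it.

```python
def _mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def mse_loss(pred, target):
    """Mean squared error between two equal-length sequences."""
    return _mean((p - t) ** 2 for p, t in zip(pred, target))


def ccc(pred, target):
    """Concordance Correlation Coefficient, in [-1, 1].

    CCC = 2 * cov(pred, target) / (var(pred) + var(target) + (mean_diff)^2)
    Undefined (division by zero) when both inputs are constant and equal.
    """
    mean_p, mean_t = _mean(pred), _mean(target)
    var_p = _mean((p - mean_p) ** 2 for p in pred)
    var_t = _mean((t - mean_t) ** 2 for t in target)
    cov = _mean((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    return 2.0 * cov / (var_p + var_t + (mean_p - mean_t) ** 2)


def ccc_loss(pred, target):
    """A common way to turn CCC into a loss: 1 - CCC (assumed form, not the paper's)."""
    return 1.0 - ccc(pred, target)
```

Unlike MSE, CCC penalizes a constant offset between predictions and labels through the squared mean difference in the denominator, even when the two sequences are perfectly correlated.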