ID | 116967 |
Alternative title | A Study on Human Emotion Recognition in Video Images Using Deep Learning (深層学習を用いたビデオ画像における人間の感情認識に関する研究)
|
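Author |
Jargalsaikhan, Orgil (ジャルガルサイハン, オリギル)
Graduate School of Advanced Technology and Science, Tokushima University (Systems Innovation Engineering)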
|
Keywords | Emotion recognition
EVM-Transformer network
emotion classification
video to sequence
facial emotion recognition
|
Resource type |
Thesis or Dissertation
|
Abstract | Since the beginning of this century, Artificial Intelligence (AI) has evolved to handle problems such as image recognition, classification, and segmentation. AI learning is categorized as supervised, semi-supervised, unsupervised, or reinforcement learning. Some researchers have said that the future of AI is self-awareness, built on reinforcement learning in which rewards depend on task success. It is also said that these rewards would be harvested from human reactions, especially through emotion recognition. Emotion recognition is a new and inspiring field, but its major problem is the lack of sufficient data for training an AI system. Fortunately, image and video dataset availability is increasing rapidly, and in the near future it will be necessary to recognize human emotions correctly.
Emotions are mental reactions (such as anger or fear) marked by relatively strong feelings. They usually cause physical reactions to previous actions, last a short time, and are focused on specific objects. In this work, we focus on emotion recognition using the face, body parts, and intonation. Automatic understanding of human emotion in the wild from audiovisual signals is extremely challenging. Latent continuous dimensions can be used to analyze human emotional states, behaviors, and reactions displayed in real-world settings, and combinations of valence and arousal are a well-known and effective representation of emotions. In this thesis, a new Non-Inertial loss function is proposed for training deep learning models for emotion recognition. It is evaluated in wild settings using four types of candidate networks with different pipelines and sequence lengths, and it is compared with the Concordance Correlation Coefficient (CCC) and Mean Squared Error (MSE) losses commonly used for training. To demonstrate its efficiency and stability on continuous and non-continuous input data, experiments were performed using the Aff-Wild dataset. Encouraging results were obtained. The contributions of the proposed Non-Inertial loss function are as follows: 1. The new loss function allows valence and arousal to be viewed together. 2. It can be trained on less data. 3. Better results. 4. Faster training times. The rest of this thesis explains our motivation, describes the proposed methods, and finally presents our results. |
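For readers unfamiliar with the two baseline losses named in the abstract, the following is a minimal sketch of MSE and a CCC-based loss in plain Python. It is not taken from the thesis: the function names, the use of population (biased) moments, and the `1 - CCC` form of the loss are assumptions made for illustration. The proposed Non-Inertial loss is not defined in this record, so it is not sketched here.

```python
from statistics import fmean


def mse_loss(pred, target):
    """Mean squared error between two equal-length sequences."""
    return fmean((p - t) ** 2 for p, t in zip(pred, target))


def ccc(pred, target):
    """Concordance Correlation Coefficient.

    Assumption: population (biased) variance and covariance are used.
    Raises ZeroDivisionError if both sequences are constant and equal.
    """
    mean_p, mean_t = fmean(pred), fmean(target)
    var_p = fmean((p - mean_p) ** 2 for p in pred)
    var_t = fmean((t - mean_t) ** 2 for t in target)
    cov = fmean((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    return 2 * cov / (var_p + var_t + (mean_p - mean_t) ** 2)


def ccc_loss(pred, target):
    """Loss that is 0 for perfect agreement and grows as agreement drops."""
    return 1.0 - ccc(pred, target)


if __name__ == "__main__":
    # Hypothetical per-frame valence values, for illustration only.
    predicted = [0.1, 0.3, 0.2, 0.5]
    annotated = [0.2, 0.4, 0.1, 0.6]
    print("MSE loss:", mse_loss(predicted, annotated))
    print("CCC loss:", ccc_loss(predicted, annotated))
```

Unlike MSE, CCC penalizes a constant offset between prediction and annotation as well as poor correlation, which is one reason it is commonly used for valence and arousal regression.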
Date of issue | 2022-03-01
|
Notes | Summary of contents, summary of examination, and full text of the thesis are publicly available
|
Full-text file | |
Language |
eng
|
Author version flag |
Includes full text of doctoral thesis
|
MEXT report number | 甲第3579号
|
Diploma number | 甲先第421号
|
Date of degree conferral | 2022-03-01
|
Degree |
Doctor of Engineering
|
Degree-granting institution |
Tokushima University
|