ID 116967
Title Alternative
A Study on Human Emotion Recognition in Video Images Using Deep Learning
Author
Jargalsaikhan, Orgil (Tokushima University)
Keywords
Emotion recognition
EVM-Transformer network
emotion classification
video to sequence
facial emotion recognition
Content Type
Thesis or Dissertation
Description
Since the beginning of this century, Artificial Intelligence (AI) has evolved to handle problems in image recognition, classification, segmentation, and related tasks. AI learning is categorized as supervised, semi-supervised, unsupervised, or reinforcement learning. Some researchers have suggested that the future of AI lies in self-awareness, which builds on reinforcement learning driven by rewards for task success. Moreover, it has been argued that such rewards could be harvested from human reactions, especially through emotion recognition. Emotion recognition is an inspiring new field, but the lack of sufficient data for training an AI system remains its major problem. Fortunately, the availability of image and video datasets is growing rapidly, which should make accurate recognition of human emotions feasible in the near future.
Emotions are mental reactions (such as anger or fear) marked by relatively strong feelings, usually triggering short-lived physical responses to preceding events and directed at specific objects. In this work, we focus on emotion recognition using the face, body parts, and intonation.
As stated earlier, automatically understanding human emotion from audiovisual signals in in-the-wild settings is extremely challenging. Latent continuous dimensions can be used to analyze human emotional states, behaviors, and reactions displayed in real-world settings, and combinations of Valence and Arousal constitute a well-known and effective representation of emotion. In this thesis, a new Non-Inertial loss function is proposed for training emotion recognition deep learning models. It is evaluated in in-the-wild settings using four types of candidate networks with different pipelines and sequence lengths, and compared to the Concordance Correlation Coefficient (CCC) and Mean Squared Error (MSE) losses commonly used for training. To demonstrate its efficiency and stability on continuous and non-continuous input data, experiments were performed on the Aff-Wild dataset, and encouraging results were obtained.
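For context, the sketch below shows the standard CCC-based loss mentioned above as a training baseline for Valence-Arousal regression; it is not the proposed Non-Inertial loss, whose formulation is given in the thesis body. The function names and the NumPy dependency are illustrative assumptions.

    import numpy as np

    def ccc(pred, target):
        # Concordance Correlation Coefficient between two 1-D arrays.
        pred_mean, target_mean = pred.mean(), target.mean()
        covariance = ((pred - pred_mean) * (target - target_mean)).mean()
        return (2.0 * covariance) / (
            pred.var() + target.var() + (pred_mean - target_mean) ** 2
        )

    def ccc_loss(pred_va, target_va):
        # 1 - mean CCC over the valence and arousal dimensions.
        # pred_va, target_va: shape (num_frames, 2), columns = (valence, arousal).
        return 1.0 - np.mean(
            [ccc(pred_va[:, d], target_va[:, d]) for d in range(2)]
        )

Minimizing 1 - CCC rewards predictions that match both the trend and the scale of the annotations, which is why it is commonly preferred over plain MSE for continuous Valence-Arousal labels.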
The contributions of the proposed Non-Inertial loss function are as follows:
1. The new loss function allows Valence and Arousal to be considered jointly.
2. The ability to train on less data.
3. Better results.
4. Faster training times.
The rest of this thesis explains our motivation, describes the proposed methods, and finally presents our results.
Published Date
2022-03-01
Remark
Publication of the abstract, examination summary, and full text of the thesis
FullText File
language
eng
TextVersion
ETD
MEXT report number
甲第3579号
Diploma Number
甲先第421号
Granted Date
2022-03-01
Degree Name
Doctor of Engineering
Grantor
Tokushima University