Human Emotion Recognition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (31 March 2022) | Viewed by 26600

Special Issue Editors


Guest Editor
Dr. Dorota Kamińska
Institute of Mechatronics and Information Systems, Lodz University of Technology, 90-924 Lodz, Poland
Interests: human behavior analysis; affective computing; universal design

Guest Editor
Prof. Dr. Gholamreza Anbarjafari
Intelligent Computer Vision (iCV) Research Lab, Institute of Technology, University of Tartu, 50411 Tartu, Estonia
Interests: machine learning; computer vision; human–computer interaction; emotion recognition; deep learning; human behaviour analysis

Guest Editor
Dr. Frane Urem
Veleučilište u Šibeniku (Polytechnic of Šibenik), Šibenik, Croatia
Interests: data processing; databases; machine learning

Guest Editor
Prof. Rui Raposo
DigiMedia Research Center, University of Aveiro, 3810-193 Aveiro, Portugal
Interests: virtual reality; multimedia; digital heritage; UX

Guest Editor
Dr. Mário Vairinhos
University of Aveiro, Aveiro, Portugal
Interests: mixed realities; tangible media

Special Issue Information

Dear Colleagues,

Robotic systems and computers have become a prominent part of our lives, and their presence gives rise to new technologies. Human–computer interaction becomes more natural when computers are capable of recognising human expressions during interaction. Emotion and expression recognition are intuitive for humans yet extremely complicated computational tasks, and they lie at the base of human–computer interaction applications ranging from mobile computing and gaming to health monitoring and robotics. Emotions are evoked by different mechanisms, such as events, objects, other people, or phenomena, and lead to various consequences that manifest in our bodies. Automatic affect recognition methods utilize various input types, such as facial expressions, speech, gestures and body language, and physiological signals such as electroencephalography (EEG), electromyography (EMG), and electrodermal activity. Although emotion recognition has been investigated for many years, it remains an active research area because of the growing interest in applications exploiting avatars, animation, neuromarketing, and sociable robots.

This Special Issue invites contributions that address (i) systems and devices for capturing emotions, and (ii) machine-learning techniques relevant to tackling the issues above. In particular, submitted papers should clearly present novel contributions and innovative applications covering, but not limited to, any of the following topics around emotion recognition:

  • Systems and devices for capturing bio-signals;
  • Affective database creation, experiment datasets;
  • Data pre-processing;
  • Non-intrusive sensor technologies;
  • Emotion recognition using mobile phones and smart bracelets;
  • Smart clothes and emotion;
  • Machine-learning techniques for emotion recognition;   
  • Deep learning for emotion recognition.

Dr. Dorota Kamińska
Prof. Dr. Gholamreza Anbarjafari
Dr. Frane Urem
Prof. Rui Raposo
Dr. Mário Vairinhos
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Systems and devices for capturing bio-signals
  • Affective database creation, experiment datasets
  • Data pre-processing
  • Non-intrusive sensor technologies
  • Emotion recognition using mobile phones and smart bracelets
  • Smart clothes and emotion
  • Machine-learning techniques for emotion recognition
  • Deep learning for emotion recognition

Published Papers (7 papers)


Research

13 pages, 1870 KiB  
Article
Gender Neutralisation for Unbiased Speech Synthesising
by Davit Rizhinashvili, Abdallah Hussein Sham and Gholamreza Anbarjafari
Electronics 2022, 11(10), 1594; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11101594 - 17 May 2022
Cited by 4 | Viewed by 1858
Abstract
Machine learning can encode and amplify negative biases or stereotypes already present in humans, resulting in high-profile cases. There can be multiple sources encoding the negative bias in these algorithms, such as errors from human labelling, inaccurate representation of different population groups in training datasets, and the chosen model structures and optimization methods. Our paper proposes a novel approach to speech processing that can resolve the gender bias problem by eliminating the gender parameter. Therefore, we devised a system that transforms the input sound (the speech of a person) into a neutralised voice to the point where the gender of the speaker becomes indistinguishable by both humans and AI. A Wav2Vec-based network was utilised to conduct speech gender recognition in order to validate the main claim of this research work, which is the neutralisation of gender from speech. Such a system can be used as a batch pre-processing layer for training models, thus making the associated gender bias irrelevant. Further, such a system can also find application where speaker gender bias by humans is prominent, as the listener will not be able to judge the gender from the speech. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
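
The abstract above does not detail the transformation itself, so as a loose illustration of voice neutralisation, the sketch below simply shifts an utterance's pitch towards a fixed, gender-ambiguous fundamental frequency. The file names, the 165 Hz target, and the pitch-shifting approach are assumptions for illustration only, not the authors' Wav2Vec-validated pipeline.

```python
# Minimal pitch-based "neutralisation" baseline: shift each utterance so its
# median pitch lands on a gender-ambiguous target. Illustrative only.
import librosa
import numpy as np
import soundfile as sf

TARGET_F0_HZ = 165.0  # assumed gender-ambiguous pitch target

def neutralise(in_path: str, out_path: str) -> None:
    y, sr = librosa.load(in_path, sr=None, mono=True)
    # Estimate the speaker's median fundamental frequency.
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)
    median_f0 = float(np.median(f0[np.isfinite(f0)]))
    # Shift the whole utterance so its median pitch matches the target.
    n_steps = 12.0 * np.log2(TARGET_F0_HZ / median_f0)
    y_neutral = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    sf.write(out_path, y_neutral, sr)

if __name__ == "__main__":
    neutralise("speaker.wav", "speaker_neutral.wav")  # hypothetical file names
```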

17 pages, 783 KiB  
Article
Machine Learning Models and Videos of Facial Regions for Estimating Heart Rate: A Review on Patents, Datasets, and Literature
by Tiago Palma Pagano, Victor Rocha Santos, Yasmin da Silva Bonfim, José Vinícius Dantas Paranhos, Lucas Lemos Ortega, Paulo Henrique Miranda Sá, Lian Filipe Santana Nascimento, Ingrid Winkler and Erick Giovani Sperandio Nascimento
Electronics 2022, 11(9), 1473; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11091473 - 04 May 2022
Cited by 6 | Viewed by 3308
Abstract
Estimating heart rate is important for monitoring users in various situations. Estimates based on facial videos are increasingly being researched because they allow the monitoring of cardiac information in a non-invasive way and because the devices are simpler, as they require only cameras that capture the user’s face. From these videos of the user’s face, machine learning can estimate heart rate. This study investigates the benefits and challenges of using machine learning models to estimate heart rate from facial videos through a review of patents, datasets, and articles. We have searched the Derwent Innovation, IEEE Xplore, Scopus, and Web of Science knowledge bases and identified seven patent filings, eleven datasets, and twenty articles on heart rate, photoplethysmography, or electrocardiogram data. In terms of patents, we note the advantages of inventions related to heart rate estimation, as described by the authors. In terms of datasets, we have discovered that most of them are intended for academic purposes and contain different signals and annotations that allow coverage of subjects other than heartbeat estimation. In terms of articles, we have discovered techniques, such as extracting regions of interest for heart rate reading and using video magnification for small motion extraction, and models, such as EVM-CNN and VGG-16, that extract the observed individual’s heart rate, the best regions of interest for signal extraction, and ways to process them. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
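
A minimal remote-photoplethysmography sketch of the pipeline surveyed above: average the green channel over a detected face region, band-pass the trace to the cardiac band, and read the heart rate from the dominant spectral peak. The file name, detector, and filter settings are illustrative assumptions, not any specific reviewed method.

```python
# Toy rPPG heart-rate estimate from a face video.
import cv2
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_bpm(video_path: str) -> float:
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    trace = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
        if len(faces) == 0:
            continue
        x, y, w, h = faces[0]
        roi = frame[y:y + h, x:x + w]
        trace.append(roi[:, :, 1].mean())  # green channel carries most of the PPG signal
    cap.release()
    sig = np.asarray(trace) - np.mean(trace)
    # Band-pass 0.7-4 Hz (roughly 42-240 beats per minute).
    b, a = butter(3, [0.7, 4.0], btype="bandpass", fs=fps)
    filtered = filtfilt(b, a, sig)
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    spectrum = np.abs(np.fft.rfft(filtered))
    return 60.0 * freqs[np.argmax(spectrum)]

print(estimate_bpm("face_video.mp4"))  # hypothetical input file
```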

18 pages, 2901 KiB  
Article
Advanced Fusion-Based Speech Emotion Recognition System Using a Dual-Attention Mechanism with Conv-Caps and Bi-GRU Features
by Bubai Maji, Monorama Swain and Mustaqeem
Electronics 2022, 11(9), 1328; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11091328 - 22 Apr 2022
Cited by 29 | Viewed by 2825
Abstract
Recognizing the speaker’s emotional state from speech signals plays a very crucial role in human–computer interaction (HCI). Nowadays, numerous linguistic resources are available, but most of them contain samples of a discrete length. In this article, we address the leading challenge in Speech Emotion Recognition (SER), which is how to extract the essential emotional features from utterances of a variable length. To obtain better emotional information from the speech signals and increase the diversity of the information, we present an advanced fusion-based dual-channel self-attention mechanism using convolutional capsule (Conv-Cap) and bi-directional gated recurrent unit (Bi-GRU) networks. We extracted six spectral features (Mel-spectrograms, Mel-frequency cepstral coefficients, chromagrams, the contrast, the zero-crossing rate, and the root mean square). The Conv-Cap module was used to obtain Mel-spectrograms, while the Bi-GRU was used to obtain the rest of the spectral features from the input tensor. The self-attention layer was employed in each module to selectively focus on optimal cues and determine the attention weight to yield high-level features. Finally, we utilized a confidence-based fusion method to fuse all high-level features and pass them through the fully connected layers to classify the emotional states. The proposed model was evaluated on the Berlin (EMO-DB), Interactive Emotional Dyadic Motion Capture (IEMOCAP), and Odia (SITB-OSED) datasets to improve the recognition rate. During experiments, we found that our proposed model achieved high weighted accuracy (WA) and unweighted accuracy (UA) values, i.e., 90.31% and 87.61%, 76.84% and 70.34%, and 87.52% and 86.19%, respectively, demonstrating that the proposed model outperformed the state-of-the-art models using the same datasets. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
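
A greatly simplified sketch of the dual-branch idea described above: a convolutional branch for Mel-spectrograms and a Bi-GRU branch with self-attention for the remaining frame-level spectral features, fused before classification. Capsule layers and the paper's confidence-based fusion are omitted, and all shapes and layer sizes are assumptions.

```python
# Simplified two-branch fusion model for speech emotion recognition (sketch).
from tensorflow.keras import layers, Model

N_MELS, N_FRAMES, N_AUX, N_CLASSES = 128, 300, 33, 7  # illustrative sizes

# Branch 1: CNN over the Mel-spectrogram (stand-in for the Conv-Cap module).
mel_in = layers.Input(shape=(N_MELS, N_FRAMES, 1))
x = layers.Conv2D(32, 3, activation="relu")(mel_in)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Branch 2: Bi-GRU with self-attention over per-frame spectral features
# (MFCCs, chroma, spectral contrast, zero-crossing rate, RMS).
aux_in = layers.Input(shape=(N_FRAMES, N_AUX))
h = layers.Bidirectional(layers.GRU(64, return_sequences=True))(aux_in)
h = layers.Attention()([h, h])          # self-attention over the GRU states
h = layers.GlobalAveragePooling1D()(h)

# Fuse the two high-level representations and classify the emotional state.
fused = layers.Concatenate()([x, h])
fused = layers.Dense(128, activation="relu")(fused)
out = layers.Dense(N_CLASSES, activation="softmax")(fused)

model = Model([mel_in, aux_in], out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```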

21 pages, 1263 KiB  
Article
Real-Time Facial Expression Recognition Using Deep Learning with Application in the Active Classroom Environment
by David Dukić and Ana Sovic Krzic
Electronics 2022, 11(8), 1240; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11081240 - 14 Apr 2022
Cited by 11 | Viewed by 3807
Abstract
The quality of a teaching method used in a classroom can be assessed by observing the facial expressions of students. To automate this, Facial Expression Recognition (FER) can be employed. Based on the recognized emotions of students, teachers can improve their lectures by determining which activities during the lecture evoke which emotions and how these emotions are related to the tasks solved by the students. Previous work mostly addresses the problem in the context of passive teaching, where teachers present while students listen and take notes, and usually in online courses. We take this a step further and develop predictive models that can classify emotions in the context of active teaching, specifically a robotics workshop, which is more challenging. The two best generalizing models (Inception-v3 and ResNet-34) on the test set were combined with the goal of real-time emotion prediction on videos of workshop participants solving eight tasks using an educational robot. As a proof of concept, we applied the models to the video data and analyzed the predicted emotions with regard to activities, tasks, and gender of the participants. Statistical analysis showed that female participants were more likely to show emotions in almost all activity types. In addition, for all activity types, the emotion of happiness was most likely regardless of gender. Finally, the activity type in which the analyzed emotions were the most frequent was programming. These results indicate that students’ facial expressions are related to the activities they are currently engaged in and contain valuable information for teachers about what they can improve in their teaching practice. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
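
A sketch of the inference side only: detect faces in classroom video frames and classify each crop with a ResNet-34 retrained for facial expressions. The checkpoint path, class list, and preprocessing are assumptions, and the paper's actual system additionally combines ResNet-34 with Inception-v3.

```python
# Per-frame facial expression classification on workshop video (sketch).
import cv2
import torch
from torchvision import models, transforms

CLASSES = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]  # assumed labels

model = models.resnet34(weights=None, num_classes=len(CLASSES))
model.load_state_dict(torch.load("fer_resnet34.pt", map_location="cpu"))  # hypothetical checkpoint
model.eval()

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("workshop_session.mp4")  # hypothetical recording
while True:
    ok, frame = cap.read()
    if not ok:
        break
    faces = detector.detectMultiScale(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 1.3, 5)
    for (x, y, w, h) in faces:
        crop = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2RGB)
        with torch.no_grad():
            probs = model(preprocess(crop).unsqueeze(0)).softmax(dim=1)
        print(CLASSES[int(probs.argmax())])
cap.release()
```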

22 pages, 5365 KiB  
Article
Analysis of Physiological Signals for Stress Recognition with Different Car Handling Setups
by Pamela Zontone, Antonio Affanni, Riccardo Bernardini, Leonida Del Linz, Alessandro Piras and Roberto Rinaldo
Electronics 2022, 11(6), 888; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11060888 - 11 Mar 2022
Cited by 10 | Viewed by 2149
Abstract
When designing a car, the vehicle dynamics and handling are important aspects, as they can satisfy a purpose in professional racing, as well as contributing to driving pleasure and safety, real and perceived, in regular drivers. In this paper, we focus on the assessment of the emotional response in drivers while they are driving on a track with different car handling setups. The experiments were performed using a dynamic professional simulator prearranged with different car setups. We recorded various physiological signals, allowing us to analyze the response of the drivers and analyze which car setup is more influential in terms of stress arising in the subjects. We logged two skin potential responses (SPRs), the electrocardiogram (ECG) signal, and eye tracking information. In the experiments, three car setups were used (neutral, understeering, and oversteering). To evaluate how these affect the drivers, we analyzed their physiological signals using two statistical tests (t-test and Wilcoxon test) and various machine learning (ML) algorithms. The results of the Wilcoxon test show that SPR signals provide higher statistical significance when evaluating stress among different drivers, compared to the ECG and eye tracking signals. As for the ML classifiers, we count the number of positive or “stress” labels of 15 s SPR time intervals for each subject and each particular car setup. With the support vector machine classifier, the mean value of the number of positive labels for the four subjects is equal to 13.13% for the base setup, 44.16% for the oversteering setup, and 39.60% for the understeering setup. In the end, our findings show that the base car setup appears to be the least stressful, and that our system enables us to effectively recognize stress while the subjects are driving in the different car configurations. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
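
A toy sketch of the label-counting step described above: a support vector machine classifies each 15-second skin potential response window as stress or no stress, and the share of positive windows is reported per run. The features, sampling rate, and synthetic data are placeholders, not the study's processing chain.

```python
# Count the percentage of "stress" labels over 15-second SPR windows (sketch).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def window_features(spr: np.ndarray, fs: int, win_s: int = 15) -> np.ndarray:
    """Simple per-window statistics of the SPR signal (mean, std, peak-to-peak)."""
    n = fs * win_s
    wins = [spr[i:i + n] for i in range(0, len(spr) - n + 1, n)]
    return np.array([[w.mean(), w.std(), w.max() - w.min()] for w in wins])

# Hypothetical labelled training windows (1 = stress, 0 = no stress).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
y_train = rng.integers(0, 2, size=200)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)

# Share of "stress" windows for one subject and one car setup.
spr_run = rng.normal(size=512 * 60 * 5)      # assumed 5 minutes of SPR at 512 Hz
X_run = window_features(spr_run, fs=512)
print(f"stress windows: {100.0 * clf.predict(X_run).mean():.2f}%")
```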

16 pages, 8176 KiB  
Article
Two-Stage Recognition and beyond for Compound Facial Emotion Recognition
by Dorota Kamińska, Kadir Aktas, Davit Rizhinashvili, Danila Kuklyanov, Abdallah Hussein Sham, Sergio Escalera, Kamal Nasrollahi, Thomas B. Moeslund and Gholamreza Anbarjafari
Electronics 2021, 10(22), 2847; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10222847 - 19 Nov 2021
Cited by 20 | Viewed by 3136
Abstract
Facial emotion recognition is an inherently complex problem due to individual diversity in facial features and racial and cultural differences. Moreover, facial expressions typically reflect the mixture of people’s emotional statuses, which can be expressed using compound emotions. Compound facial emotion recognition makes the problem even more difficult because the discrimination between dominant and complementary emotions is usually weak. We have created a database that includes 31,250 facial images with different emotions of 115 subjects whose gender distribution is almost uniform to address compound emotion recognition. In addition, we have organized a competition based on the proposed dataset, held at FG workshop 2020. This paper analyzes the winner’s approach—a two-stage recognition method (1st stage, coarse recognition; 2nd stage, fine recognition), which enhances the classification of symmetrical emotion labels. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
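
A toy illustration of the two-stage idea analysed in the paper: a coarse classifier first predicts the dominant emotion, and a per-class fine classifier then picks the complementary emotion. The embeddings, label structure, and classifiers are placeholders, not the challenge winner's model.

```python
# Coarse-to-fine compound emotion recognition (sketch).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 64))                   # stand-in face embeddings
dominant = rng.integers(0, 3, size=600)          # e.g. dominant emotion classes
complementary = rng.integers(0, 2, size=600)     # e.g. complementary emotion classes

# Stage 1: coarse classifier for the dominant emotion.
coarse = LogisticRegression(max_iter=1000).fit(X, dominant)

# Stage 2: one fine classifier per dominant class for the complementary emotion.
fine = {
    d: LogisticRegression(max_iter=1000).fit(X[dominant == d], complementary[dominant == d])
    for d in np.unique(dominant)
}

def predict_compound(x: np.ndarray) -> tuple[int, int]:
    d = int(coarse.predict(x.reshape(1, -1))[0])
    c = int(fine[d].predict(x.reshape(1, -1))[0])
    return d, c

print(predict_compound(X[0]))
```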

26 pages, 14735 KiB  
Article
Detection of Mental Stress through EEG Signal in Virtual Reality Environment
by Dorota Kamińska, Krzysztof Smółka and Grzegorz Zwoliński
Electronics 2021, 10(22), 2840; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10222840 - 18 Nov 2021
Cited by 19 | Viewed by 7717
Abstract
This paper investigates the use of an electroencephalogram (EEG) signal to classify a subject’s stress level while using virtual reality (VR). For this purpose, we designed an acquisition protocol based on alternating relaxing and stressful scenes in the form of a VR interactive simulation, accompanied by an EEG headset to monitor the subject’s psycho-physical condition. Relaxation scenes were developed based on scenarios created for psychotherapy treatment utilizing bilateral stimulation, while the Stroop test worked as a stressor. The experiment was conducted on a group of 28 healthy adult volunteers (office workers) participating in a VR session. The subjects’ EEG signals were continuously monitored using the EMOTIV EPOC Flex wireless EEG head cap system. After the session, volunteers were asked to fill in questionnaires again regarding their current stress level and mood. Then, we classified the stress level using a convolutional neural network (CNN) and compared the classification performance with conventional machine learning algorithms. The best results were obtained considering all brain waves (96.42%) with the multilayer perceptron (MLP) and support vector machine (SVM) classifiers. Full article
(This article belongs to the Special Issue Human Emotion Recognition)
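
A rough sketch of the conventional-baseline comparison mentioned above: band-power features from multi-channel EEG windows fed to MLP and SVM classifiers. The channel count, sampling rate, band definitions, and synthetic data are assumptions, and the study's CNN pipeline is not reproduced here.

```python
# EEG band-power features with MLP and SVM stress classifiers (sketch).
import numpy as np
from scipy.signal import welch
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

FS, N_CH = 128, 32  # assumed sampling rate and channel count
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_powers(window: np.ndarray) -> np.ndarray:
    """Mean spectral power per channel and per EEG band for one window."""
    freqs, psd = welch(window, fs=FS, nperseg=FS, axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, mask].mean(axis=-1))
    return np.concatenate(feats)

# Synthetic stand-in for labelled relax/stress windows of shape (channels, samples).
rng = np.random.default_rng(2)
windows = rng.normal(size=(300, N_CH, FS * 4))   # 4-second windows
labels = rng.integers(0, 2, size=300)            # 0 = relaxed, 1 = stressed
X = np.array([band_powers(w) for w in windows])

for name, clf in [("MLP", MLPClassifier(max_iter=500)), ("SVM", SVC())]:
    print(name, cross_val_score(clf, X, labels, cv=5).mean())
```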