Research on Facial Expression Recognition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 November 2022) | Viewed by 34069

Special Issue Editors


Dr. Monica Perusquia Hernandez
Guest Editor
Department of Psychology, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
Interests: affective computing; biosignal processing; augmented human technology; artificial intelligence

Prof. Dr. Saho Ayabe-Kanamura
Guest Editor
Institute of Psychology, University of Tsukuba, Tsukuba, Ibaraki 305, Japan
Interests: perceptual learning; chemosensory perception and behavior; haptics and peripersonal space representation; facial expression recognition

Special Issue Information

Dear Colleagues,

Facial expressions are a basic communication tool and are often interpreted as an important cue to affect. However, the same facial movement patterns might not universally represent a felt emotion. Therefore, the debate on the meaning of facial expressions, and on how technological developments might ethically use this information, is still ongoing. Nevertheless, automatically recognizing facial expressions is undoubtedly a key step in advancing emotion communication research. This Special Issue welcomes original research papers concerned with both theoretical and applied aspects of facial expression analysis. Both affective computing and psychological contributions are encouraged. Submissions will be considered based on their technological novelty and scientific merit.

Possible topics include but are not limited to the following:

  • theoretical aspects of facial expression production and perception;
  • physiological mechanisms involved in facial expression production and perception;
  • impairments in facial expression recognition;
  • machine learning algorithms involving facial expressions for affective computing;
  • combination and fusion of modalities for facial expression recognition;
  • data-driven approaches to model the perception and production of facial expressions;
  • virtual facial expression synthesis;
  • individual differences such as gender and age in facial expression production and perception;
  • cultural differences in facial expression production and perception;
  • automatic facial expression recognition systems;
  • models on the relationship between facial expressions and posture;
  • context-dependent facial expression recognition;
  • modelling of the spatiotemporal characteristics of facial expressions;
  • multimodal models of emotion combining facial expressions and other behavioral and physiological cues;
  • novel sensing devices for facial expression recognition.

Dr. Monica Perusquia Hernandez
Prof. Dr. Saho Ayabe-Kanamura
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website and then completing the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • facial expression recognition
  • nonverbal communication
  • emotion
  • affective behavior
  • production and perception modelling
  • artificial intelligence
  • machine learning
  • affective computing
  • human computing

Published Papers (12 papers)

Research

19 pages, 2724 KiB  
Article
A Comparative Study of Local Descriptors and Classifiers for Facial Expression Recognition
by Antoine Badi Mame and Jules-Raymond Tapamo
Appl. Sci. 2022, 12(23), 12156; https://0-doi-org.brum.beds.ac.uk/10.3390/app122312156 - 28 Nov 2022
Cited by 2 | Viewed by 1270
Abstract
Facial Expression Recognition (FER) is a growing area of research due to its numerous applications in market research, video gaming, healthcare, security, e-learning, and robotics. One of the most common approaches to recognizing facial expressions is to extract facial features from an image and classify them as one of several prototypic expressions. Despite recent advances, developing robust facial expression descriptors remains challenging. This study analyzed the performance of various local descriptors and classifiers on the FER problem. Several experiments were conducted under different settings, such as varied extraction parameters, different numbers of expressions, and two datasets, to discover the best combinations of local descriptors and classifiers. Of all the considered descriptors, HOG (Histogram of Oriented Gradients) and ALDP (Angled Local Directional Patterns) were among the most promising, while SVM (Support Vector Machines) and MLP (Multi-Layer Perceptron) were the best of the considered classifiers. The results indicate that conventional FER approaches are still comparable to state-of-the-art methods based on deep learning.
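
As an illustration of the descriptor-plus-classifier pipeline compared here, a minimal sketch using scikit-image and scikit-learn; the `faces` and `labels` arrays are assumed placeholders for a face-crop dataset, and the HOG parameters stand in for the extraction parameters the study varies:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def hog_features(faces):
    # One HOG descriptor per grayscale face crop; cell and block sizes
    # are the kind of extraction parameters varied in the experiments.
    return np.array([
        hog(f, orientations=8, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        for f in faces
    ])

X = hog_features(faces)                       # faces: (n, H, W) array, assumed
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2)
clf = SVC(kernel="rbf", C=10.0).fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```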

16 pages, 9736 KiB  
Article
Learning Robust Shape-Indexed Features for Facial Landmark Detection
by Xintong Wan, Yifan Wu and Xiaoqiang Li
Appl. Sci. 2022, 12(12), 5828; https://0-doi-org.brum.beds.ac.uk/10.3390/app12125828 - 08 Jun 2022
Cited by 1 | Viewed by 1841
Abstract
In facial landmark detection, extracting shape-indexed features is widely used in existing methods to impose a shape constraint over landmarks. Commonly, these methods crop shape-indexed patches surrounding the landmarks of a given initial shape. All landmarks are then detected jointly based on these patches, with the shape constraint naturally embedded in the regressor. However, two remaining challenges cause the degradation of these methods. First, the initial shape may seriously deviate from the ground truth under large poses, resulting in considerable noise in the shape-indexed features. Second, extracting local patch features is vulnerable to occlusions, since facial context information is missing under severe occlusion. To address these issues, this paper proposes a facial landmark detection algorithm named the Sparse-To-Dense Network (STDN). First, STDN employs a lightweight network to detect sparse facial landmarks and form a reinitialized shape, which efficiently improves the quality of the cropped patches under large poses. Then, a group-relational module exploits the inherent geometric relations of the face, which further strengthens the shape constraint against occlusion. Our method achieves a 4.64% mean error with a 1.97% failure rate on the COFW68 dataset, a 3.48% mean error with a 0.43% failure rate on the 300W dataset, and a 7.12% mean error with an 11.61% failure rate on the Masked 300W dataset. The results demonstrate that STDN achieves outstanding performance compared to state-of-the-art methods, especially on occlusion datasets.
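
The step STDN improves on, cropping shape-indexed patches around an initial landmark estimate, can be sketched minimally as below; `image` and `landmarks` are assumed placeholders, not the paper's code:

```python
import numpy as np

def crop_shape_indexed_patches(image, landmarks, size=16):
    # image: (H, W) array; landmarks: (n_points, 2) array of (x, y) estimates.
    half = size // 2
    h, w = image.shape[:2]
    patches = []
    for x, y in landmarks.astype(int):
        # Clamp so patches near the border stay inside the image; a poor
        # initial shape (e.g., under a large pose) injects noise here.
        x0 = np.clip(x - half, 0, w - size)
        y0 = np.clip(y - half, 0, h - size)
        patches.append(image[y0:y0 + size, x0:x0 + size])
    return np.stack(patches)                 # (n_points, size, size)
```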

21 pages, 3306 KiB  
Article
An Approach for Selecting the Most Explanatory Features for Facial Expression Recognition
by Pedro D. Marrero-Fernandez, Jose M. Buades-Rubio, Antoni Jaume-i-Capó and Tsang Ing Ren
Appl. Sci. 2022, 12(11), 5637; https://0-doi-org.brum.beds.ac.uk/10.3390/app12115637 - 01 Jun 2022
Viewed by 1558
Abstract
The objective of this work is to analyze which features are most important in the recognition of facial expressions. To achieve this, we built a facial expression recognition system that learns from a controlled capture data set. The system uses different representations and combines them from a learned model. We studied the most important features by applying different feature extraction methods for facial expression representation, transforming each obtained representation into a sparse representation (SR) domain, and training combination models to classify signals, using the extended Cohn–Kanade (CK+), BU-3DFE, and JAFFE data sets for validation. We compared 14 combination methods over 247 possible combinations of eight different feature spaces and obtained the most explanatory features for each facial expression. The results indicate that the LPQ (83%), HOG (82%), and RAW (82%) features are those best able to improve the classification of expressions, and that some features apply specifically to one expression (e.g., RAW for neutral, LPQ for angry and happy, LBP for disgust, and HOG for surprise).
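
The transformation of a feature representation into an SR domain can be sketched with scikit-learn's sparse coder; the dictionary `D` and feature batch `X` are assumed placeholders, as is the sparsity level:

```python
from sklearn.decomposition import SparseCoder

# D: learned dictionary, shape (n_atoms, n_features); X: feature vectors,
# shape (n_samples, n_features). Both are assumed to exist already.
coder = SparseCoder(
    dictionary=D,
    transform_algorithm="omp",        # orthogonal matching pursuit
    transform_n_nonzero_coefs=10,     # sparsity level (assumed)
)
X_sparse = coder.transform(X)         # (n_samples, n_atoms) sparse codes
```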

20 pages, 2781 KiB  
Article
Hybrid Approach for Facial Expression Recognition Using Convolutional Neural Networks and SVM
by Jin-Chul Kim, Min-Hyun Kim, Han-Enul Suh, Muhammad Tahir Naseem and Chan-Su Lee
Appl. Sci. 2022, 12(11), 5493; https://0-doi-org.brum.beds.ac.uk/10.3390/app12115493 - 28 May 2022
Cited by 13 | Viewed by 3095
Abstract
Facial expression recognition is very useful for effective human–computer interaction, robot interfaces, and emotion-aware smart agent systems. This paper presents a new framework for facial expression recognition using a hybrid model: a combination of convolutional neural networks (CNNs) and a support vector machine (SVM) classifier applied to dynamic facial expression data. To extract facial motion characteristics, dense facial motion flows and geometry landmark flows of facial expression sequences were used as inputs to the CNN and the SVM classifier, respectively. CNN architectures for facial expression recognition from dense facial motion flows are proposed. The optimal weighted combination of the hybrid classifiers provides better facial expression recognition results than the individual classifiers. The system successfully classified seven facial expressions (anger, contempt, disgust, fear, happiness, sadness, and surprise) for the CK+ database, and six facial expressions (anger, disgust, fear, happiness, sadness, and surprise) for the BU4D database. The recognition performance of the proposed system is 99.69% for the CK+ database and 94.69% for the BU4D database. The proposed method shows state-of-the-art results on the CK+ database and proves effective on the BU4D database when compared with previous schemes.
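
The weighted combination of the two classifiers' outputs can be sketched minimally as below; `p_cnn` and `p_svm` are assumed (n_samples, n_classes) probability estimates (e.g., a CNN softmax and a probability-calibrated SVM), and the weight is assumed to be tuned on a validation set:

```python
import numpy as np

def fuse(p_cnn, p_svm, alpha=0.6):
    # Convex combination of the two probability estimates, then argmax.
    p = alpha * p_cnn + (1.0 - alpha) * p_svm
    return p.argmax(axis=1)           # fused expression prediction per sample
```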

17 pages, 1078 KiB  
Article
Facial Micro-Expression Recognition Based on Deep Local-Holistic Network
by Jingting Li, Ting Wang and Su-Jing Wang
Appl. Sci. 2022, 12(9), 4643; https://0-doi-org.brum.beds.ac.uk/10.3390/app12094643 - 05 May 2022
Cited by 12 | Viewed by 3140
Abstract
A micro-expression is a subtle, local, and brief facial movement. It can reveal the genuine emotion that a person tries to conceal and is considered an important clue for lie detection. Micro-expression research has attracted much attention due to its promising applications in various fields. However, because of the short duration and low intensity of micro-expression movements, micro-expression recognition faces great challenges, and accuracy still leaves room for improvement. To improve the efficiency of micro-expression feature extraction, and inspired by the psychological study of attentional resource allocation in micro-expression cognition, we propose a deep local-holistic network method for micro-expression recognition. The proposed algorithm consists of two sub-networks. The first is a Hierarchical Convolutional Recurrent Neural Network (HCRNN), which extracts local and abundant spatio-temporal micro-expression features. The second is a robust principal-component-analysis-based recurrent neural network (RPRNN), which extracts global and sparse features with micro-expression-specific representations. The extracted features are employed for micro-expression recognition through the fusion of the sub-networks. We evaluate the proposed method on combined databases consisting of the four most commonly used databases: CASME, CASME II, CAS(ME)², and SAMM. The experimental results show that our method achieves reasonably good performance.
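
The robust PCA underlying RPRNN separates a data matrix into a low-rank part and a sparse part; a minimal sketch of principal component pursuit via inexact ALM, assuming `M` is a matrix of vectorized frames (this is the generic algorithm, not the paper's implementation):

```python
import numpy as np

def rpca_pcp(M, lam=None, tol=1e-7, max_iter=500):
    """Decompose M into low-rank L and sparse S by minimizing
    ||L||_* + lam * ||S||_1 subject to L + S = M."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = (m * n) / (4.0 * np.abs(M).sum())    # common heuristic step size
    norm_M = np.linalg.norm(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(max_iter):
        # Singular value thresholding for the low-rank term.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Soft-thresholding (shrinkage) for the sparse term.
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        Y = Y + mu * (M - L - S)
        if np.linalg.norm(M - L - S) / norm_M < tol:
            break
    return L, S
```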

11 pages, 1557 KiB  
Article
Comparative Analysis of Emotion Classification Based on Facial Expression and Physiological Signals Using Deep Learning
by SeungJun Oh and Dong-Keun Kim
Appl. Sci. 2022, 12(3), 1286; https://0-doi-org.brum.beds.ac.uk/10.3390/app12031286 - 26 Jan 2022
Cited by 5 | Viewed by 3708
Abstract
This study aimed to classify emotion based on facial expressions and physiological signals using deep learning and to compare the results. We asked 53 subjects to make facial expressions expressing four types of emotion. Each subject then watched an emotion-inducing video for 1 min while physiological signals were recorded. We grouped the four emotions into positive and negative emotions and designed three types of deep-learning models to classify them: one using facial expressions as input, one using physiological signals, and one in which both types of input were applied simultaneously. The accuracy of the model was 81.54% when physiological signals were used, 99.9% when facial expressions were used, and 86.2% when both were used. A deep-learning model built with only facial expressions thus showed good performance. These results suggest that the best approach for classifying emotion is to use facial expressions alone rather than multiple inputs. However, this conclusion is based on accuracy alone, without considering computational cost, and physiological signals and multiple inputs may still be preferable depending on the situation and research purpose.
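
A minimal sketch of the two-input ("both") configuration: one branch per modality, fused before a shared classification head. The input shapes, layer sizes, and channel counts below are assumptions for illustration, not the study's values:

```python
import tensorflow as tf
from tensorflow.keras import layers

face_in = layers.Input(shape=(48, 48, 1), name="face")
phys_in = layers.Input(shape=(600, 4), name="physio")   # e.g., 4 signal channels

x = layers.Conv2D(32, 3, activation="relu")(face_in)
x = layers.GlobalAveragePooling2D()(x)

y = layers.Conv1D(32, 5, activation="relu")(phys_in)
y = layers.GlobalAveragePooling1D()(y)

z = layers.concatenate([x, y])                          # late fusion of modalities
out = layers.Dense(2, activation="softmax")(z)          # positive vs. negative

model = tf.keras.Model([face_in, phys_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```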

16 pages, 2757 KiB  
Article
Racial Identity-Aware Facial Expression Recognition Using Deep Convolutional Neural Networks
by Muhammad Sohail, Ghulam Ali, Javed Rashid, Israr Ahmad, Sultan H. Almotiri, Mohammed A. AlGhamdi, Arfan A. Nagra and Khalid Masood
Appl. Sci. 2022, 12(1), 88; https://0-doi-org.brum.beds.ac.uk/10.3390/app12010088 - 22 Dec 2021
Cited by 14 | Viewed by 3476
Abstract
Multi-culture facial expression recognition remains challenging due to cross-cultural variations in the representation of facial expressions, caused by variations in facial structure and culture-specific facial characteristics. In this research, a joint deep learning approach called the racial-identity-aware deep convolutional neural network is developed to recognize multicultural facial expressions. In the proposed model, a pre-trained racial identity network learns racial features. Then, the racial-identity-aware network and the racial identity network jointly learn racial-identity-aware facial expressions. By enforcing the marginal independence of facial expression and racial identity, the proposed joint learning approach is expected to yield purer expression representations and to be robust to variations in facial structure and culture-specific facial characteristics. To assess the reliability of the proposed joint learning technique, extensive experiments were performed both with and without racial identity features. Moreover, culture-wise facial expression recognition was performed to analyze the effect of inter-cultural variations in facial expression representation. A large-scale multi-culture dataset was developed by combining four facial expression datasets: JAFFE, TFEID, CK+, and RaFD. It contains facial expression images of Japanese, Taiwanese, American, Caucasian, and Moroccan cultures. We achieved 96% accuracy with racial identity features and 93% accuracy without them.
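
The general pattern of joint learning, a shared backbone with one head per task, can be sketched as below; the architecture sizes, class counts, and loss weighting are assumptions and omit the paper's marginal-independence constraint:

```python
import tensorflow as tf
from tensorflow.keras import layers

inp = layers.Input(shape=(96, 96, 1))
x = layers.Conv2D(32, 3, activation="relu")(inp)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Two heads trained jointly over the shared features.
expr = layers.Dense(7, activation="softmax", name="expression")(x)
race = layers.Dense(5, activation="softmax", name="identity")(x)

model = tf.keras.Model(inp, [expr, race])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    loss_weights={"expression": 1.0, "identity": 0.5},  # weighting assumed
)
```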

21 pages, 689 KiB  
Article
Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks
by Thomas Teixeira, Éric Granger and Alessandro Lameiras Koerich
Appl. Sci. 2021, 11(24), 11738; https://0-doi-org.brum.beds.ac.uk/10.3390/app112411738 - 10 Dec 2021
Cited by 3 | Viewed by 2371
Abstract
Facial expressions are one of the most powerful ways to depict specific patterns in human behavior and describe the human emotional state. However, despite the impressive advances of affective computing over the last decade, automatic video-based systems for facial expression recognition still cannot correctly handle variations in facial expression among individuals, as well as cross-cultural and demographic aspects. Indeed, recognizing facial expressions is a difficult task, even for humans. This paper investigates the suitability of state-of-the-art deep learning architectures based on convolutional neural networks (CNNs) for dealing with long video sequences captured in the wild for continuous emotion recognition. To this end, several 2D CNN models designed to model spatial information are extended to allow spatiotemporal representation learning from videos, considering a complex and multi-dimensional emotion space where continuous values of valence and arousal must be predicted. We developed and evaluated convolutional recurrent neural networks, combining 2D CNNs and long short-term memory units, and inflated 3D CNN models, which are built by inflating the weights of a pre-trained 2D CNN model during fine-tuning, using application-specific videos. Experimental results on the challenging SEWA-DB dataset show that these architectures can effectively be fine-tuned to encode spatiotemporal information from successive raw pixel images and achieve state-of-the-art results on this dataset.
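
The convolutional-recurrent idea, a 2D CNN applied per frame and an LSTM over time with a two-unit regression head for valence and arousal, can be sketched minimally as below; the clip length, resolution, and layer sizes are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

frames = layers.Input(shape=(16, 112, 112, 3))            # 16-frame clips (assumed)
x = layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu"))(frames)
x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)
x = layers.LSTM(64)(x)                                    # temporal modelling
out = layers.Dense(2, activation="tanh")(x)               # valence, arousal in [-1, 1]

model = tf.keras.Model(frames, out)
model.compile(optimizer="adam", loss="mse")               # a CCC-based loss is common
```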

9 pages, 1354 KiB  
Article
Viewpoint Robustness of Automated Facial Action Unit Detection Systems
by Shushi Namba, Wataru Sato and Sakiko Yoshikawa
Appl. Sci. 2021, 11(23), 11171; https://0-doi-org.brum.beds.ac.uk/10.3390/app112311171 - 25 Nov 2021
Cited by 7 | Viewed by 1845
Abstract
Automatic facial action detection is important, but no previous studies have evaluated pre-trained models on the accuracy of facial action detection as the angle of the face changes from frontal to profile. Using static facial images obtained at various angles (0°, 15°, 30°, and 45°), we investigated the performance of three automated facial action detection systems (FaceReader, OpenFace, and Py-Feat). The overall performance was best for OpenFace, followed by FaceReader and Py-Feat. The performance of FaceReader decreased significantly at 45° compared to the other angles, while the performance of Py-Feat did not differ among the four angles. The performance of OpenFace decreased as the target face turned sideways. Prediction accuracy and robustness to angle changes varied with the target facial components and the action detection system.
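
The kind of evaluation reported here, per-angle agreement between detected and FACS-coded action units, can be sketched as below; `pred` and `truth` are assumed dicts mapping angle to (n_images, n_aus) binary arrays for one detection system and the ground truth:

```python
from sklearn.metrics import f1_score

for angle in (0, 15, 30, 45):
    # Per-AU F1 between multilabel predictions and FACS ground truth.
    scores = f1_score(truth[angle], pred[angle], average=None)
    print(f"{angle} deg:", [round(s, 2) for s in scores])
```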

17 pages, 3899 KiB  
Article
Real-Time Facial Emotion Recognition Framework for Employees of Organizations Using Raspberry-Pi
by Navjot Rathour, Zeba Khanam, Anita Gehlot, Rajesh Singh, Mamoon Rashid, Ahmed Saeed AlGhamdi and Sultan S. Alshamrani
Appl. Sci. 2021, 11(22), 10540; https://0-doi-org.brum.beds.ac.uk/10.3390/app112210540 - 09 Nov 2021
Cited by 5 | Viewed by 4466
Abstract
There is significant interest in facial emotion recognition in the fields of human–computer interaction and the social sciences. With the advancements in artificial intelligence (AI), the field of human behavioral prediction and analysis, especially for human emotion, has evolved significantly. The most common emotion recognition methods currently run in models deployed on remote servers. We believe that reducing the distance between the input device and the model can lead to better efficiency and effectiveness in real-life applications, and computational methodologies such as edge computing can serve this purpose. Edge computing also enables time-critical applications in sensitive fields. In this study, we propose a Raspberry-Pi-based standalone edge device that detects facial emotions in real time. Although this edge device can be used in a variety of applications where human facial emotions play an important role, this article focuses on a dataset of employees working in organizations. The device was implemented using the Mini-Xception deep network because of its computational efficiency compared to other networks. It achieved 100% accuracy for detecting faces in real time and 68% accuracy for emotion recognition, higher than the accuracy reported in the state of the art on the FER-2013 dataset. Future work will implement a deep network on the Raspberry Pi with an Intel Movidius neural compute stick to reduce processing time and achieve a fast real-time implementation of the facial emotion recognition system.
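
A minimal sketch of such an edge-device loop: OpenCV face detection followed by expression classification with a pre-trained network. The model filename and its 64x64 grayscale input are assumptions based on common Mini-Xception implementations, not the paper's exact setup:

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("mini_xception.h5")                 # filename assumed
labels = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                              # Pi camera / webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
        # Crop, resize, and normalize the face before classification.
        face = cv2.resize(gray[y:y + h, x:x + w], (64, 64)) / 255.0
        probs = model.predict(face[None, :, :, None], verbose=0)[0]
        print(labels[int(np.argmax(probs))])
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
```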

26 pages, 2647 KiB  
Article
A Unified Framework of Deep Learning-Based Facial Expression Recognition System for Diversified Applications
by Sanoar Hossain, Saiyed Umer, Vijayan Asari and Ranjeet Kumar Rout
Appl. Sci. 2021, 11(19), 9174; https://0-doi-org.brum.beds.ac.uk/10.3390/app11199174 - 02 Oct 2021
Cited by 21 | Viewed by 2892
Abstract
This work proposes a facial expression recognition system for a diversified field of applications. The purpose of the proposed system is to predict the type of expression in a human face region. The implementation of the proposed method has three components. In the first component, a tree-structured part model is applied to the input image to predict landmark points and detect the facial region. The detected face region is normalized to a fixed size and then down-sampled to varying sizes so that the advantages of multi-resolution images can be exploited. In the second component, several convolutional neural network (CNN) architectures are proposed to analyze the texture patterns in the facial regions. To enhance the proposed CNN models' performance, advanced techniques such as data augmentation, progressive image resizing, transfer learning, and fine-tuning of the parameters are employed in the third component to extract more distinctive and discriminant features. The outputs of the different CNN models are fused to achieve better performance than the existing state-of-the-art methods; to this end, extensive experimentation was carried out on the Karolinska Directed Emotional Faces (KDEF), GENKI-4k, Cohn-Kanade (CK+), and Static Facial Expressions in the Wild (SFEW) benchmark databases. The performance was compared with existing methods on these databases, showing that the proposed facial expression recognition system outperforms the other competing methods.
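
Two of the third component's techniques, transfer learning and progressive image resizing (train at a small resolution, then continue at a larger one), can be sketched as below; the backbone choice, resolutions, epochs, and the `train_96`/`train_160` datasets are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build(size, n_classes=7):
    # Pre-trained backbone (transfer learning) plus a small head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(size, size, 3), include_top=False, weights="imagenet")
    x = layers.GlobalAveragePooling2D()(base.output)
    out = layers.Dense(n_classes, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

# Stage 1: train at low resolution. Stage 2: rebuild at a higher
# resolution, reuse the learned weights (the convolutional weights are
# independent of spatial size), and fine-tune.
small = build(96)
small.fit(train_96, epochs=5)
large = build(160)
large.set_weights(small.get_weights())
large.fit(train_160, epochs=5)
```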

Other

Jump to: Research

13 pages, 2391 KiB  
Brief Report
Event-Related Potentials during Verbal Recognition of Naturalistic Neutral-to-Emotional Dynamic Facial Expressions
by Vladimir Kosonogov, Ekaterina Kovsh and Elena Vorobyeva
Appl. Sci. 2022, 12(15), 7782; https://0-doi-org.brum.beds.ac.uk/10.3390/app12157782 - 02 Aug 2022
Cited by 1 | Viewed by 2127
Abstract
Event-related potentials (ERPs) during facial emotion recognition have been studied for more than twenty years, and there has recently been a growing interest in the use of naturalistic stimuli. This research therefore aimed to study ERPs during the recognition of dynamic neutral-to-emotional facial expressions, which are more ecologically valid than static faces. We recorded the ERPs of 112 participants who watched 144 dynamic morphs depicting a gradual change from a neutral expression to a basic emotional expression (anger, disgust, fear, happiness, sadness, or surprise) and labelled those emotions verbally. We observed typical ERP components, such as the N170, P2, EPN, and LPP. Participants with lower accuracy exhibited a larger posterior P2. Participants with faster correct responses exhibited larger P2 and LPP amplitudes. We also conducted a classification analysis that predicted, with 76% accuracy, which participants recognize emotions quickly, on the basis of the amplitudes of the posterior P2 and LPP. These results extend previous findings on the electroencephalographic correlates of facial emotion recognition.
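
The basic ERP extraction behind components such as the P2 and LPP, epoching the continuous EEG around stimulus onsets, baseline-correcting, and averaging, can be sketched as below; `eeg`, `onsets`, and `sfreq` are assumed placeholders, not the study's pipeline:

```python
import numpy as np

def erp(eeg, onsets, sfreq, tmin=-0.2, tmax=0.8):
    # eeg: (n_channels, n_samples) continuous recording at sfreq Hz;
    # onsets: stimulus-onset sample indices.
    pre, post = int(-tmin * sfreq), int(tmax * sfreq)
    epochs = np.stack([eeg[:, s - pre:s + post] for s in onsets])
    # Subtract the mean of the pre-stimulus baseline per epoch and channel.
    baseline = epochs[:, :, :pre].mean(axis=2, keepdims=True)
    return (epochs - baseline).mean(axis=0)   # (n_channels, n_times) average ERP
```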
