1. Introduction
Fatigue is a phenomenon that has not been conventionally defined and relates, in particular, to reactions to various loads and conditions, including experiences and states of mind. Fatigue is also defined as a subjective lack of physical and/or mental energy perceived by an individual to interfere with their usual or desired activities [
1].
Usually, fatigue is a state associated with a weakening or depletion of an individual’s physical and/or mental resources, ranging from a general state of lethargy to a specific burning sensation in a particular muscle. Physical fatigue leads to an inability to continue functioning at a normal level of activity. Mental fatigue is a state of tiredness that sets in when brain energy levels are depleted. In the literature, fatigue is differentiated into six types: social, emotional, physical, pain, mental, and chronic illnesses; furthermore, these types are often distinguished in terms of physical and mental fatigue [
2].
Many people experience mental fatigue (MF) in daily life or work activities that require sustained mental efficiency [
3]. MF can be defined as a psychobiological state caused by prolonged episodes of cognitive exertion [
4]. Overwork-related disorders, such as cerebrovascular/cardiovascular diseases, diabetes, and cancer, are major health issues worldwide [
5,
6]. However, fatigue is a common symptom in both sick and healthy people [
7]. Fatigue is one of the most crucial factors contributing to decreased performance among aircraft pilots; car drivers [
8]; individual athletes [
9]; and team sport athletes, such as soccer players [
10]; among other professions. Furthermore, mental fatigue may reduce cognitive resources and impair balance performance [
11]. In some literature reports, cognitive fatigue (CF) is considered the main component of mental fatigue. CF is known to cause attention deficits, leading to poor situational awareness and impaired vigilance [
12]. Various cognitive tests are used to detect mental fatigue, such as the psychomotor vigilance task, the Stroop task, the AX-continuous performance test, and the TloadDback test; however, such tasks take time and require additional effort from the participant [
13].
In the literature, electroencephalographic (EEG) signal features are studied and analyzed as a relevant tool for the detection of mental fatigue [
14,
15]. However, it is not always possible to record EEG signals and conduct measurements in real-life environments due to electric line noise or noise from electronic equipment [
16]. In other research, electromyography (EMG) and electro-oculography (EOG) signals, as well as inertial measurement unit (IMU) sensors, 3D optical tracking techniques, infrared cameras, and accelerometer signals, have been analyzed [
17,
18]. Scientists have proposed artificial intelligence and expert system-based solutions that combine several sensors and devices [
19]. EEG, together with ECG signal recordings, are very common in fatigue detection tasks for drivers, for whom exhaustion and distraction may lead to serious accidents [
20]. In the literature, real-time monitoring systems are used to detect heart anomalies [
21]. In such systems, ECG signals are classified, and an alert is sent to notify designated healthcare providers. However, most of these systems are designed to detect heart anomalies and might not be applicable to fatigue recognition.
When the sympathetic nervous system is more active, the heart beats faster, whereas the opposite occurs when the parasympathetic nervous system is more active. Therefore, during mental fatigue or stress, heart rate variability (HRV) is lower than normal. This parameter is a convenient tool to monitor personal health using simple smart devices, such as watches; however, the results are often inaccurate and depend on individual characteristics. Furthermore, to acquire accurate results, the non-linear characteristics of heart rate (HR) must be investigated [
22].
Usually, a classification algorithm consists of two main parts: primary signal transformation and classification. The primary transformation is based on feature analysis, which extracts features from the raw signal and reduces its dimensionality [
23]. Classifying ECG signals into pathologies or health stages is a complicated task that requires recognition of the signal structure. Generally, a combination of several classification algorithms is used to solve this problem [
24]. A similar classification problem is considered in the fatigue identification process. In this article, the research object is not a continuous ECG signal or its segments but separate signals that were recorded at different times of the day (in the morning and the evening). Usually, fatigue occurs after intensive physical or mental activity, mostly at the end of the working day. Instant physical fatigue detection after an intensive training session is a simple task because the heart is loaded and works faster. However, mental fatigue detection is a more complicated task because there is no clear difference in terms of ECG signal parameters.
Another technique used in medicine is principal component analysis (PCA). This method of analysis is used to detect early stages of diseases and to diagnose the cardiac health of patients [
25,
26,
27,
28,
29,
30]. This technique was used in the present study to evaluate the differences between ECG features in different states and to detect mental fatigue symptoms.
This research focuses on mental fatigue recognition in healthy individuals based on their health condition at different times of the day. All data are split into two datasets. Data gathered in the first subset corresponds to a normal state without fatigue, whereas the second subset consists of data recorded in the evening, representing a fatigued state. The main purpose of this paper is to determine whether ECG signal features reveal differences in mental states. However, the proposed framework is not designed for diagnostic purposes and should be tested on clinical patients for use as a specific criterion for mental fatigue diagnosis.
The structure of this article is as follows:
Section 2 highlights the recent literature on mental fatigue detection using various biosignal classification methods and other techniques. The experimental design, data description, and applied methods are presented in
Section 3.
Section 4 consists of data analysis using the PCA method, and analysis of the performance of ML algorithms. Finally, a discussion and conclusions are presented in
Section 5.
2. Related Work
Modern wearable devices, such as eye-tracking technologies, are becoming increasingly popular. Li et al. [
31] demonstrated the feasibility of applying wearable eye-tracking technology to identify and classify mental fatigue in construction equipment operators. The Toeplitz inverse covariance-based clustering (TICC) method was used to determine multiple levels of mental fatigue, and the classification task was performed using support vector machine (SVM) methods. However, this study consisted of a narrow target group and might not be applicable in other fields.
In [
32], EEG and HRV signals were observed and analyzed to detect the impacts of prolonged cognitive activity on the central nervous system and the autonomic nervous system. EEG signal wavelet packet parameters and HRV spectral indices were combined to measure changes in mental fatigue. Although 91% classification accuracy was achieved, two separate devices for EEG and HRV recordings are not efficient and barely usable in daily life activities. Furthermore, EEG signals are most likely contaminated by muscle artifacts, which may lead to incorrect interpretation. For this reason, various filtering and feature extraction methods have been proposed [
33,
34]. Preprocessed EEG signals can be used in multilevel fatigue recognition tasks. In [
35], EEG signals were classified using a K-nearest neighbor (KNN) classifier, achieving 100% accuracy. These results demonstrate the feasibility of using EEG signals and extracted features to successfully detect mental fatigue. However, in this case, the data were collected using a driving simulator and a brain cap with 32 electrodes placed on the skin surface, which may not be applicable in real-world environments.
Portable single-channel electrocardiogram equipment (“LaPatch”) was used in [
5] to record and analyze ECG signals. Eight heart rate variability (HRV) indicators were considered and classified using SVM, KNN, naïve Bayes (NB), and logistic regression (LR) models. Although the technique is promising, due to the small sample size, only 75.5% accuracy was achieved. In another study, researchers developed an automatic mental stress detection system based on ECG signals recorded from T-shirts and analyzed using machine learning (ML) classifiers: decision tree (DT), random forest (RF), NB, and LR [
6]. The best-performing model achieved an accuracy of 94.1%. However, in this research only mental stress detection was considered, and the same technique may not be applicable to mental fatigue recognition.
Wearable devices for HRV recordings are usually user-friendly and convenient. Furthermore, they do not require electrodes to be attached directly to the skin surface. Many studies have focused on heart rate (HR) and time- or spectral-domain HRV analysis. For example, in [
36], mental and physical fatigue detection methods were applied based on HR, HRV, skin temperature, and pulse. Causal convolutional neural networks (cCNN) and RF models were used to detect and distinguish between mental and physical fatigue. However, only 66.2% accuracy was achieved in the mental fatigue recognition task. A similar study used a Polar H10 chest strap and photoplethysmography (PPG) technology for HRV detection [
37]. Results were compared with those obtained with a Bittium Faros™ 360 device, which records a single ECG lead. Furthermore, the study included several watches, such as the Actigraph wGT3X-BT, Garmin, and Polar Vantage V. Various time- and spectral-domain HRV parameters were estimated and compared. However, no decision-making or fatigue recognition techniques were applied.
Modern wearable electronics have been developed in recent years, such as epidermal electronics systems (EES) and electronic tattoos (E-tattoos), with which ECG signals, respiration rate, and galvanic skin responses (GSR) can be recorded [
38]. When three ML models (SVM, KNN, and DT) were compared, the obtained signals were classified with 89% accuracy. Although these technologies are promising, the equipment has not been fully tested and prepared for production. A transparent eye detection system can also be considered a modern wearable device [
39]. Such a system can acquire movement in the pupil and detect blinking based on the light that is reflected from the eye. A summary of these and similar wearable devices and corresponding research in recent literature is presented in
Table 1.
Because HRV analysis cannot achieve high accuracy in mental fatigue recognition and EEG signals are not reliable in daily life activities, a novel framework is proposed in this paper for mental fatigue detection that involves analysis and classification of ECG signal features. Furthermore, principal component analysis (PCA) is applied to distinguish between ECG signal parameters in two different states (in the morning, i.e., a non-fatigued state, and in the evening, i.e., a fatigued state). The classification task is performed using several models: KNN, LDA, DT, and RF.
3. Materials and Methods
In this section, we describe the proposed data analysis and classification processes that are essential for mental fatigue recognition. The whole process flow consists of five main parts: ECG signal recording, ECG signal preprocessing, feature extraction, PCA analysis, and ML performance (see
Figure 1). The experiment was designed with ECG signals recorded twice a day (in the morning and in the evening). Because HRV analysis and whole-ECG-signal classification techniques performed poorly on the mental fatigue recognition task, we propose extracting ECG signal features and applying classification algorithms, such as KNN, DT, and RF, to those features only. Before a machine learning technique was applied, PCA was used to verify that the ECG signal features differed noticeably between the two states. This research was conducted with the approval of the Kaunas Regional Research Ethics Committee of our institution under the project name, “Various directionalities on physical exercise effects that are based on differential learning methodology, and impact on heart and cardiovascular system” (biomedical ethics permission number BE-2–38, Lithuania).
3.1. ECG Signal Characteristics and Data Analysis
In this study, various ECG signal features were analyzed and classified. The protocol consisted of two 60 s recordings for each participant. These recordings enabled the detection of differences in ECG signal parameters estimated in the morning and in the evening. The V5 lead was selected in this research (see an example in
Figure 2), with each parameter representing a separate component of heart activity (see
Table 2).
All ECG signal features are visible in properly filtered data. Numerous methods can be applied to ECG signal preprocessing, such as moving average (MA), exponential smoothing, or linear Fourier transformation. Usually, biological signals are contaminated with various environmental disturbances. The main purpose of signal filtering algorithms is to separate the signal into informative components and undesirable noise. Furthermore, biological signals that are recorded during movement are highly contaminated by various disturbances, and sometimes the noise overlaps the signal itself. The main problem associated with movement-contaminated signals is non-stationary, low-frequency noise (a trend resulting from movement artifacts). In such cases, ordinary filtering methods for signal processing are insufficient or unreliable. In this research, ECG signals were recorded while each participant was standing so that only small movement artifacts could affect the signal. Therefore, a Butterworth filter was used for noise reduction [
40].
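As a minimal sketch of this filtering step, the following Python example applies a zero-phase Butterworth band-pass filter to a synthetic signal. The 500 Hz sampling rate is the recording frequency reported in this study; the pass band (0.5 to 40 Hz), the filter order, and the test signal are illustrative assumptions, not the settings used in the experiment.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 500.0  # sampling frequency in Hz (recording rate reported in this study)


def bandpass_ecg(signal, fs=FS, low_hz=0.5, high_hz=40.0, order=4):
    """Zero-phase Butterworth band-pass filter.

    low_hz, high_hz, and order are assumed values chosen for illustration only.
    """
    # Second-order sections are numerically safer than (b, a) for low cutoffs.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)


# Synthetic test signal: a 10 Hz component plus a slow baseline drift,
# standing in for the low-frequency trend caused by small movements.
t = np.arange(0, 10, 1.0 / FS)
clean = np.sin(2 * np.pi * 10 * t)
drift = 2.0 * np.sin(2 * np.pi * 0.05 * t)
filtered = bandpass_ecg(clean + drift)
```

Forward-backward filtering is used in this sketch so that the filter does not shift wave positions in time, which matters when intervals such as QT or ST are measured afterwards.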
ECG signal preprocessing continues with feature extraction (ECG parameter estimation). ECG feature extraction starts with R peak detection and QRS complex identification. All other parameters, such as Q and S peaks, RR interval, and T wave, are based on R peak or QRS complex positions. In this research, 9 ECG parameters were estimated: Q, R, S, and T amplitudes; QT, ST, RR, and QRS intervals; and T-wave intervals. All ECG features were estimated using the NeuroKit2 toolbox in the Python programming language [
42].
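To illustrate why R peak detection is the starting point, the sketch below locates R peaks in a toy signal and derives RR intervals from them. It deliberately uses a simple amplitude threshold from SciPy rather than the NeuroKit2 routines applied in this study; the threshold, the minimum peak distance, and the synthetic signal are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

FS = 500  # sampling frequency in Hz (recording rate reported in this study)


def detect_r_peaks(ecg, fs=FS, min_rr_s=0.3, rel_height=0.6):
    """Return sample indices of R peaks using a simple amplitude threshold.

    min_rr_s and rel_height are assumed values, not taken from the paper.
    """
    height = rel_height * np.max(ecg)
    peaks, _ = find_peaks(ecg, height=height, distance=int(min_rr_s * fs))
    return peaks


def rr_intervals_ms(r_peaks, fs=FS):
    """RR intervals in milliseconds from consecutive R peak positions."""
    return np.diff(r_peaks) / fs * 1000.0


# Toy signal: narrow Gaussian "R waves" every 0.8 s (not real ECG data).
t = np.arange(0, 10, 1.0 / FS)
beat_times = np.arange(0.5, 10, 0.8)
ecg = sum(np.exp(-0.5 * ((t - bt) / 0.01) ** 2) for bt in beat_times)

r_peaks = detect_r_peaks(ecg)
rr = rr_intervals_ms(r_peaks)
```

Once the R peaks are known, the remaining waves can be searched for in windows placed relative to each peak, which is the reason the other parameters depend on this step.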
3.2. Research Design and Data Acquisition
In this research, a CardioScout Multi device was used to record ECG signals and transmit them to mobile devices or tablets. The sampling frequency was 500 Hz, and each recording was 60 s long. In this article, the analyzed experiments comprise data recorded twice a day (in the morning and in the evening) for signal parameter classification and fatigue recognition. In
Table 3, two different states are defined: A1 in the morning and A2 in the evening.
In total, 60 healthy adults were recruited (aged between 24 and 34 years) without a diagnosis of health pathologies or overwork-related problems. In this research, 8271 measurements were estimated from 60 participants via ECG signal recordings: 4195 corresponding to state A1 and 4076 corresponding to state A2.
3.3. Data Description and Visualization
As mentioned in the previous section, two states were analyzed in this study (A1 in the morning and A2 in the evening). All parameter data were normalized by subtracting the mean and dividing by the standard deviation. This type of data normalization is needed to eliminate differences in the individual heart rate characteristics of each person. For example, some participants may have higher (or lower) ECG signal amplitude values compared to others in both states (in the morning and in the evening), which may affect classification results by indicating a fatigued state in both datasets. Furthermore, normalization increases data integrity without distorting differences in the ranges of values. The distribution and scatter plots are shown in
Figure 3. Pearson correlation coefficients are presented in
Figure 4 (Y represents the state: a value of 1 corresponds to the fatigued condition, or state A2, and a value of 0 corresponds to the fatigue-free condition, or state A1). When data from the two states are compared, clear differences can be noticed. For example, histograms of the Sa parameter look similar, but A2 data are shifted, with higher values compared to state A1 values (see
Figure 3). Furthermore, some ECG signal parameter values overlap. For example, there is no significant difference between states A1 and A2 in terms of RR interval values. Therefore, typical HRV analysis, which relies on RR intervals, performs poorly in mental fatigue detection, yielding low classification accuracies.
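The normalization described above can be sketched as follows. The text does not state whether the mean and standard deviation are computed over the whole dataset or separately for each participant; the per-participant variant shown here is an assumption, chosen because it matches the stated aim of removing individual differences. The feature values are invented toy numbers.

```python
import numpy as np


def zscore(features):
    """Column-wise z-score: subtract the mean and divide by the standard deviation."""
    features = np.asarray(features, dtype=float)
    return (features - features.mean(axis=0)) / features.std(axis=0)


def zscore_per_participant(features, participant_ids):
    """Apply the z-score separately to the rows of each participant (assumed grouping)."""
    features = np.asarray(features, dtype=float)
    participant_ids = np.asarray(participant_ids)
    out = np.empty_like(features)
    for pid in np.unique(participant_ids):
        mask = participant_ids == pid
        out[mask] = zscore(features[mask])
    return out


# Toy data: two participants with different baseline amplitudes (columns are
# two hypothetical features, e.g., an amplitude and an interval).
X = np.array([[1.0, 0.40], [1.2, 0.42], [1.4, 0.44],
              [2.0, 0.38], [2.2, 0.40], [2.4, 0.42]])
ids = np.array([1, 1, 1, 2, 2, 2])
Z = zscore_per_participant(X, ids)
```

After this step, both toy participants have the same normalized values even though their raw amplitudes differ, which is the effect the normalization is intended to achieve.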
Figure 3 shows linear dependences between several parameters (for example, between parameters QT and ST). Similar results are shown in
Figure 4 (for example, the Pearson correlation coefficient for ST and QT reaches 0.98). Additionally, a strong dependence (Pearson correlation coefficient > 0.8) is evident between Tint and QT or ST. Although some parameters may be eliminated in the classification step, it is not clear which parameters have a greater impact on classification accuracy. In the initial stage of the classification process, all ECG signal features were included (all 9 ECG parameters).
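A dependence check of this kind can be sketched with a Pearson correlation matrix. The data below are synthetic and constructed so that two columns are strongly related; the column names and the 0.8 threshold mirror the text, but the numbers are not the study data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for three features; "QT" is built from "ST" plus noise
# so that the two are strongly correlated by construction.
st = rng.normal(size=500)
qt = st + 0.2 * rng.normal(size=500)
rr = rng.normal(size=500)

names = ["QT", "ST", "RR"]
features = np.column_stack([qt, st, rr])

# Pearson correlation matrix with features in columns.
corr = np.corrcoef(features, rowvar=False)

# List feature pairs whose absolute correlation exceeds the threshold.
threshold = 0.8
strong_pairs = [
    (names[i], names[j], corr[i, j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > threshold
]
```

Pairs flagged in this way are candidates for removal, although, as noted above, correlation alone does not show which of the two features contributes more to classification accuracy.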
3.4. Principal Component Analysis
Principal component analysis (PCA) was applied to distinguish between ECG features in the morning and in the evening. The general idea behind this technique is to obtain new latent variables based on the original data. The newly defined principal components reflect directions of maximal variance of the projected data and form a new orthonormal basis of the original vector space.
PCA is commonly used to reduce the dimensions of the collected data matrix by choosing k principal components (PC1, PC2, …, PCk) and evaluating the amount of information explained by the chosen components as follows:
$$\frac{\sum_{i=1}^{k} \lambda_i}{\mathrm{Tr}(C)}, \tag{1}$$
where $\lambda_i$ is the i-th eigenvalue of the covariance matrix (C), and Tr(C) is its trace, i.e., the sum of all entries on the main diagonal. Furthermore, the original data are projected to the low-dimension hyperplane spanned by components PC1, PC2, …, PCk, thus extracting the essential information from the initial data cloud. In this study, three principal components were considered, and the obtained results were visualized via 3D plots to emphasize the desired differences in mental states.
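The procedure can be sketched in a few lines of NumPy: the covariance matrix is decomposed into eigenvalues and eigenvectors, the share of variance explained by the first k components is computed as the ratio of their eigenvalues to the trace, and the centered data are projected onto those components. The random 9-column matrix stands in for the nine ECG features and is an assumption made for illustration.

```python
import numpy as np


def pca(data, k):
    """Project data onto the first k principal components.

    Returns the projected scores and the fraction of variance explained,
    i.e., the sum of the k largest eigenvalues divided by the trace of C.
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    explained = eigvals[:k].sum() / np.trace(cov)
    scores = centered @ eigvecs[:, :k]
    return scores, explained


# Synthetic 9-feature matrix with unequal variances (not the study data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9)) * np.array([5, 3, 2, 1, 1, 1, 1, 1, 1.0])
scores, explained = pca(X, k=3)
```

The three columns of `scores` are the coordinates that would be shown in a 3D plot, with points colored by state.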
3.5. Machine Learning Technique
The use of social media, smartphones, smartwatches, computers, and even portable devices provides big data about various mental and physical health disorders. Effective algorithms for big data processing are usually based on machine learning (ML) techniques.
Various ML algorithms have been created for data classification and prognosis. There are three main categories: Supervised learning: examples of such methods include support vector machine (SVM), k-nearest neighbors (KNN), decision tree (DT), and random forest (RF); Unsupervised learning: these methods include neural networks (NN) and clustering; Semi-supervised learning: this category includes methods such as semi-supervised SVM, mixed models, etc. [
43].
Supervised learning is used to analyze the labeled data and make predictions or classify data into different categories, whereas unsupervised learning methods can learn from unlabeled data and extract similar patterns. The third group is semi-supervised learning, which involves the analysis of data with and without labels; such methods are used when there is not enough labeled data for classification or prognosis.
One way to evaluate the potential accuracy of predictions is to use a confusion matrix [
44]. The entries in this matrix indicate how many predictions or classifications in each category agree with the actually observed values. To evaluate the quality of the predictions or data classifications, additional measures can be considered. Two widely used standard statistics are accuracy (acc) and the F1 score. These are estimated using Equation (2):
$$acc = \frac{TP + TN}{TP + TN + FP + FN}, \qquad F1 = \frac{2TP}{2TP + FP + FN}, \tag{2}$$
where TN corresponds to true-negative elements (correctly predicted as negative), TP represents true-positive elements (correctly predicted as positive), FP represents false-positive elements, and FN represents false-negative elements. Although popular in ML analysis, both acc and F1 ignore the size of each category in the confusion matrix. Therefore, an additional statistic called the Matthews correlation coefficient (MCC) was measured [
45]. This coefficient is calculated as follows:
$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}. \tag{3}$$
The value of this coefficient is in the range [−1; 1], where −1 is interpreted as the worst-case scenario, whereas 1 is the best possible value.
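These three statistics can be computed directly from the four confusion matrix entries, as in the following sketch. The counts used in the example are hypothetical and are not results from this study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion matrix counts for a two-state classifier.
tp, tn, fp, fn = 90, 85, 15, 10
acc_value = accuracy(tp, tn, fp, fn)
f1_value = f1_score(tp, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
```

Returning 0.0 for a zero denominator is a common convention for the degenerate case in which one row or column of the confusion matrix is empty.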
Additionally, in 1960, J. Cohen introduced a statistic that accounts for the agreement expected by chance, i.e., the level at which a prediction is only as accurate as a simple guess. This is Cohen’s kappa ($\kappa$) statistic, which can be expressed as follows:
$$\kappa = \frac{p_o - p_e}{1 - p_e}, \tag{4}$$
where $p_o = acc$ is the observed agreement and $p_e = \frac{(TP + FP)(TP + FN) + (TN + FP)(TN + FN)}{(TP + TN + FP + FN)^2}$ is the agreement expected by chance. Three main intervals are considered: if $\kappa > 0.75$, then the value is viewed as perfect; if $\kappa$ is in the range of [0.4, 0.75], the value is sufficient; and if $\kappa < 0.4$, it is considered weak [
46].
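A minimal sketch of this statistic for a two-class confusion matrix is given below, using the standard definition based on observed and chance agreement. The counts are hypothetical, and the interval labels follow the three ranges named in the text.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a 2x2 confusion matrix."""
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    # Chance agreement: product of predicted and actual marginals per class.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def interpret_kappa(kappa):
    """Map kappa to the three intervals used in the text."""
    if kappa > 0.75:
        return "perfect"
    if kappa >= 0.4:
        return "sufficient"
    return "weak"


# Hypothetical counts, not results from this study.
kappa_value = cohen_kappa(tp=95, tn=90, fp=10, fn=5)
label = interpret_kappa(kappa_value)
```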
In this research, we compared multiple ML methods, revealing that the RF algorithm classifies signal parameters with the highest accuracy. For RF algorithms in the feature extraction process, the Gini coefficient needs to be estimated. If $N_t$ is defined as the number of samples in node t and each node has c classes, then the number of samples belonging to class $i$ is $N_{t,i}$. The ratio ($p_{t,i}$) is expressed as:
$$p_{t,i} = \frac{N_{t,i}}{N_t}. \tag{5}$$
In this case, the Gini coefficient G for each node is defined as [
47]:
$$G(t) = 1 - \sum_{i=1}^{c} p_{t,i}^{2}. \tag{6}$$
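As a small worked example of the node-level Gini coefficient, the sketch below computes the class ratios from the labels that fall into a node and returns one minus the sum of their squares. The label lists are invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini coefficient of a node: 1 minus the sum of squared class ratios."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# A node with an even mix of two classes is maximally impure for c = 2,
# whereas a node containing a single class is pure.
mixed_node = gini([0, 0, 1, 1])
pure_node = gini([1, 1, 1])
```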
Generally, the RF classifier is based on DT and consists of three main steps: input all data into root nodes for every DT; minimize the Gini coefficient by dividing data into separate nodes; recursively repeat all steps at each node that needs to be split until the root mean square error (RMSE) value for the node falls below a threshold value or the tree reaches a defined depth.
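The splitting step can be illustrated for a single feature: every midpoint between neighboring feature values is tried as a threshold, and the one with the lowest size-weighted Gini coefficient of the two child nodes is kept. This is a simplified sketch of one split in one tree, with hypothetical toy data; a full RF would repeat it recursively over random subsets of samples and features.

```python
import numpy as np


def gini(labels):
    """Gini coefficient of a node given the class labels it contains."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    ratios = counts / labels.size
    return 1.0 - np.sum(ratios ** 2)


def best_split(feature, labels):
    """Return the threshold minimizing the weighted Gini of the child nodes."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    values = np.unique(feature)
    candidates = (values[:-1] + values[1:]) / 2.0  # midpoints between values

    best_threshold, best_score = None, np.inf
    for threshold in candidates:
        left = labels[feature <= threshold]
        right = labels[feature > threshold]
        score = (left.size * gini(left) + right.size * gini(right)) / labels.size
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy feature that separates the two states perfectly.
x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y = [0, 0, 0, 1, 1, 1]
threshold, score = best_split(x, y)
```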
RF may consist of many separate decision trees that are trained concurrently on random data samples. This type of model is also called a bagged tree algorithm. DT models that are trained consecutively are called boosted trees. In this case, every DT model learns from the errors of the previous model. Usually, this type of model has more nodes [
48]. Like any other classifier, the random forest algorithm requires two datasets: one for training and one for testing. In ML techniques, more training data generally leads to higher classification accuracy. Additionally, in every ML technique, overfitting of the training data should be considered, because it may reduce prediction accuracy on new data. Cross validation can be applied to detect and limit overfitting. This method involves splitting the data into several groups and estimating the classification accuracy for each group. In this case, the training dataset is divided into two groups: a training set and a validation set. If cross validation is performed several times, different data samples are assigned to the validation subset in each iteration [
49].
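The rotation of validation subsets can be sketched as a k-fold split of sample indices. The number of folds, the sample count, and the random seed are assumptions chosen for illustration; the text does not specify the cross validation settings used.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs so that each sample is validated once."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    folds = np.array_split(indices, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


# Example: 20 samples split into 5 folds of 4 validation samples each.
splits = list(k_fold_indices(n_samples=20, k=5))
```

A classifier would be trained on `train_idx` and scored on `val_idx` in each iteration, and the per-fold accuracies would then be averaged.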
5. Discussion and Conclusions
5.1. Discussion and Future Work
In this paper, we proposed a framework for mental fatigue detection combining ECG signal recording twice a day corresponding to different mental states: fatigued and without mental fatigue. Extracted ECG signal features, such as R, S, and T wave amplitude values, as well as QT intervals, increased the classification accuracy compared to similar methods reported in the literature. For example, in [
5], heart rate variability (HRV) indicators achieved an accuracy of 75.5%. Additionally, in [
6], the proposed methods achieved an accuracy of 93.3%. However, that study considered mentally stressed participants (after 12 h of intense work), which may have resulted in an increased impact on HRV parameter values. The present research enables the detection of smaller changes in mental health conditions compared to the previously mentioned reports. Furthermore, statistical analysis of several ECG signal features showed that RR interval values (used in HRV analysis) overlap between states, which is why HRV analysis and parameter estimation alone are not efficient and may reduce classification accuracy. PCA showed that other ECG features differ more between states. Owing to the use of several ECG signal features, an accuracy of 94.5% was achieved.
Although the proposed technique shows promising results, it is also subject to some weaknesses. ECG signals should be recorded using professional devices, such as the CardioScout Multi, which is expensive and inconvenient. User-friendly devices, such as a Polar H10 belt or Garmin watch, do not record full ECG signals, and data from such devices are not sufficiently reliable. Furthermore, the data gathered in this research are not suitable for diagnostic purposes because we did not include patients with diagnosed mental illness. Future research should focus on gathering more data and improving classification accuracy. We suggest that the 60 s ECG signal recordings could be extended and compared with HRV analysis results.
5.2. Conclusions
Considering gaps identified in recent literature, we presented a novel framework that combines ECG signal feature extraction, PCA analysis, and ML classification algorithms. The obtained results show that the proposed framework is feasible for automatic mental fatigue detection.
To ensure daily fatigue recognition, we designed an experiment involving separate recordings registered twice a day. Each recording represents a mental state, i.e., a state without fatigue recorded in the morning and a fatigued state recorded in the evening. A total of 60 healthy adults (ages 24 to 34) without a diagnosis of health pathologies or overwork-related problems were recruited for this experiment. All ECG signals were filtered using a Butterworth filter, and features were extracted using the Python toolbox NeuroKit2. Using these methods, the following high-quality ECG parameters were obtained: Q, R, S, and T wave amplitude values; QRS complexes; and RR, ST, QT, and T intervals.
Data visualization and statistical analysis show that RR interval values overlap between states, which is why RR interval analysis alone, such as HRV parameter estimation, is not an efficient way to detect mentally fatigued states. To overcome this issue, other ECG signal parameters were considered in this paper.
PCA showed a significant difference between states (with and without fatigue). Q and R wave amplitude values and QT and T intervals were observed to be the most representative ECG signal features. Changes in the first three principal components were evident, indicating the importance of ECG signal feature extraction for mental fatigue recognition.
Finally, machine learning algorithms were applied for automatic classification of ECG signal features into separate states. Four ECG signal parameters (Sa, Ra, Ta, and QT) were identified as the most important for the mental fatigue classification process. The final RF model was able to detect daily mental fatigue with an accuracy of more than 94.5%.
Although the proposed technique shows promising results, it is also subject to some weaknesses. Future work should focus on user-friendly devices for the ECG signal gathering process to ensure that a wide range of participants can be included in experiments.