
A Study of One-Class Classification Algorithms for Wearable Fall Sensors

1 Departamento de Tecnología Electrónica, Universidad de Málaga, 29071 Málaga, Spain
2 Departamento de Tecnología Electrónica, Universidad de Málaga, Instituto TELMA, 29071 Málaga, Spain
* Author to whom correspondence should be addressed.
Submission received: 23 July 2021 / Revised: 10 August 2021 / Accepted: 14 August 2021 / Published: 19 August 2021
(This article belongs to the Collection Wearable Biosensors for Healthcare Applications)

Abstract
In recent years, the popularity of wearable devices has fostered the investigation of automatic fall detection systems based on the analysis of the signals captured by transportable inertial sensors. Due to the complexity and variety of human movements, the detection algorithms that offer the best performance when discriminating falls from conventional Activities of Daily Living (ADLs) are those built on machine learning and deep learning mechanisms. In this regard, supervised machine learning binary classification methods have been massively employed in the related literature. However, the learning phase of these algorithms requires mobility patterns caused by falls, which are very difficult to obtain in realistic application scenarios. An interesting alternative is offered by One-Class Classifiers (OCCs), which can be exclusively trained and configured with movement traces of a single type (ADLs). In this paper, a systematic study of the performance of various typical OCCs (for diverse sets of input features and hyperparameters) is performed when applied to nine public repositories of falls and ADLs. The results show the potential of these classifiers, which are capable of achieving performance metrics very similar to those of supervised algorithms (with values of specificity and sensitivity higher than 95%). However, the study warns of the need to have a wide variety of types of ADLs when training OCCs, since activities with a high degree of mobility can significantly increase the frequency of false alarms (ADLs identified as falls) if not considered in the data subsets used for training.

1. Introduction

According to the World Health Organization (WHO), a fall is defined as an involuntary event that results in a person losing their balance and coming to lie unintentionally on the ground or another lower level [1]. Although the majority of falls are not fatal, an estimated 646,000 fatal falls occur annually, which makes them the second leading worldwide cause of death due to accidental injuries [1].
Fall-related health problems are particularly serious among older people, as they are strongly associated with loss of autonomy, impairment, and early death. Worldwide, about 28–35% of adults over 65 suffer one or more falls per year, while this percentage rises to 32–42% among those over 70 [2]. This situation poses a logistical and economic challenge for national health systems, especially considering that the share of the population aged over 60 will double by 2050, reaching a figure of 2 billion people, compared to 900 million in 2015 [3]. The problem is aggravated by the fact that a significant proportion of older adults live alone, so that if an accident occurs, a caregiver (a family member, medical or nursing staff, etc.) must be alerted to provide help. In this context, the time that elapses between a fall and the moment in which the person is assisted has been shown to determine the physical aftermath of the accident and even the probability of survival [4]. Consequently, the last decade has witnessed an increasing interest in the development of affordable Fall Detection Systems (FDSs), which are able to permanently monitor patients and trigger an automatic alarm message to a remote agent as soon as the occurrence of a fall is presumed.
Existing FDSs can be categorized into two generic groups. Firstly, context-aware systems are grounded on the deployment of cameras, microphones, and/or other environmental sensors in the specific locations where the user must be monitored. On the other hand, wearable-based systems utilize small transportable sensors that can be easily integrated or attached to the users’ clothing or garments to measure different parameters that describe their mobility.
When compared to context-aware solutions, the monitoring provided by wearable architectures offers a more ubiquitous service, as it is not restricted to the particular area where the contextual sensors are installed. In addition, wearable systems are less privacy-intrusive than camera-based methods and more robust to the presence of external artifacts or alterations of the user’s setting. Moreover, this type of FDS can benefit from the widespread acceptability and decreasing costs of wearable devices (smartwatches, sport bands, etc.).
The fundamental purpose of automatic fall detectors is to achieve the most accurate discernment between falls and other movements or Activities of Daily Living (ADLs), simultaneously minimizing the number of undetected falls and false alarms (ADLs misjudged as falls). The efficiency of an FDS relies on the algorithm that makes the detection decision after processing and analyzing the measurements that are constantly captured by the wearable sensors (mainly accelerometers; less frequently, gyroscopes; and, in some prototypes, magnetometers, barometers, or heart rate sensors).
Detection strategies can be roughly classified into two groups [5]: threshold-based and machine learning methods. Threshold-based algorithms assume that a fall has occurred when one or several parameters (derived from the sensor measurements) exceed or drop below a certain threshold. These algorithms are easy to implement and have a low computational load, although they are too simplistic and rigid to correctly classify many complex movements (especially those ADLs that involve intense physical activity). Conversely, algorithms based on machine learning models usually outperform thresholding schemes [6], as they have a greater potential to self-adapt to a wider typology of ADLs and falls by directly learning from a set of samples or movement traces, without requiring the explicit and heuristic definition of a threshold value.
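To make the contrast concrete, the following minimal sketch (in Python, with a hypothetical 3 g limit that is not taken from any of the cited works) illustrates a threshold-based detector operating on the magnitude of the triaxial acceleration:

```python
import numpy as np

def threshold_fall_detector(acc_xyz, smv_limit=3.0):
    """Flag a fall when the acceleration magnitude (in g) exceeds a fixed
    limit. The 3 g value is purely illustrative; practical detectors add
    further conditions (e.g., post-impact inactivity or posture checks)."""
    smv = np.linalg.norm(acc_xyz, axis=1)  # magnitude of each triaxial sample
    return bool(np.any(smv > smv_limit))
```

A single inequality of this kind is precisely what makes the approach cheap to compute but brittle when confronted with vigorous ADLs.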
In most studies in the related literature, machine learning algorithms follow a fully supervised approach, so they need to be trained with labeled examples of both ADLs and falls. However, falls are rare events, and most studies on FDSs are heavily conditioned by the lack of real-world fall examples. Owing to the evident difficulties of capturing samples of actual falls experienced by the target public of these systems (older adults), the falls used to train and test new FDS proposals normally have to be generated in a testbed through the movements of young, healthy volunteers who emulate falls on cushioned surfaces according to a systematic and predefined test plan.
The validity of this procedure is still under discussion. Some related studies [7,8] have compared the dynamics of the falls experienced by older people and those ‘mimicked’ by young subjects in an experimental environment. The authors concluded that, although there are similarities between the characteristics of both fall patterns, there are also relevant differences in the monitored magnitudes related to the reaction time and the mechanisms of the compensatory movements to avoid falling or further damage. In this respect, Aziz et al. showed in [9] that the effectiveness of some supervised learning algorithms may dramatically decrease when they are evaluated in real scenarios.
To cope with this problem, one-class classifiers (OCCs) are a subtype of machine learning architectures particularly well suited to building binary pattern classifiers with heavily unbalanced datasets [10]. OCCs bypass the need to obtain laboratory samples of the minority class (falls), as they are conceived to be exclusively trained with traces of the most common class (ADLs). In this way, in the case of FDSs, once the training of the system is accomplished, a fall is detected whenever a certain movement is classified as an ‘anomaly’ (‘novelty’ or ‘outlier’). This occurs when its features substantially diverge from the samples of the majority class used during the training phase.
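As a minimal sketch of this training regime (using scikit-learn’s OneClassSVM on synthetic feature vectors; the classifiers evaluated in this paper were implemented in Matlab, as detailed in Section 2.2), an OCC fitted only to ADL features labels any sufficiently divergent movement as an anomaly, i.e., a suspected fall:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
adl_features = rng.normal(0.0, 1.0, size=(500, 7))  # hypothetical ADL feature vectors

# Train exclusively on the majority class (ADLs); no fall samples are needed.
occ = OneClassSVM(kernel="rbf", nu=0.05).fit(adl_features)

# A movement whose features diverge from the training distribution is
# labeled -1 (anomaly/novelty), which the FDS interprets as a fall.
new_movement = rng.normal(4.0, 1.0, size=(1, 7))
print(occ.predict(new_movement)[0] == -1)  # True: flagged as a suspected fall
```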
In a real use scenario, FDSs will most likely have to be adjusted or ‘tuned’ to the particular dynamics of the movements of the user to be monitored. In this vein, Medrano et al. demonstrated in [11] the benefits of ‘personalizing’ the configuration of the FDS by training the models with movements generated by the final user. Obviously, this process should not oblige the patient to emulate or generate fall patterns to particularize the FDS. In this regard, OCCs may greatly ease the implementation of this system personalization, since any user could train a certain machine learning method from scratch just by wearing the system during a certain training period in which the sensors collect the traces generated by the daily routines of the user and feed the detector.
The idea of utilizing OCCs as the decision core of an FDS is not new. Table 1 summarizes the works that have assessed the performance of anomaly detectors when they are programmed to detect falls with a wearable device. In some specific cases, the FDS adopts a ‘hybrid’ approach by combining an OCC and a thresholding method (such as the proposal by Viet et al. in [12]) or an OCC and a fully supervised classifier (such as that proposed by Lisowska et al. in [13]).
In all cases, the algorithms are primarily based on the analysis of the signals captured by a triaxial accelerometer, a strategy that has been massively adopted in the related literature on wearable FDSs. In only six papers is the information provided by the accelerometer complemented with other inertial sensors (a gyroscope, a magnetometer, or an orientation sensor), and in just two cases is a more complex sensor-fusion policy applied, so that the classifiers are also fed with signals captured by other types of wearable sensing units (e.g., a heart rate monitor in the paper by Nho et al. [14]).
Table 1 indicates the best reported performance metrics (normally expressed in terms of sensitivity or specificity) of the corresponding OCC in the reviewed literature. When more than one type of classifier is compared, the best performing algorithm in each study is marked in bold in the third column of the table. The results show that, in some works, OCCs may achieve a noteworthy efficacy in discriminating ADLs from falls (with sensitivities and specificities higher than 0.98 or 98%). Furthermore, in [15], Medrano et al. illustrate that one-class classifiers may even exhibit a significantly better performance than their supervised counterparts. However, as can also be appreciated from the last column in the table, all the works employ only one or at most two datasets to evaluate these algorithms. In some studies, these datasets are not obtained from a public repository but directly generated (and not released) by the authors. Due to the limited number of subjects and types of ADLs and falls considered in these datasets, it is legitimate to question whether these results can be extrapolated to other repositories. Furthermore, the design criteria of these benchmarking datasets do not follow any particular recommendation and strongly rely on the particular decisions of their creators. In a recent work [16], we showed that even a deep learning method may achieve very divergent results when it is applied to different datasets. Thus, the good performance metrics obtained with a certain repository should be confirmed by training and testing the classifier with other datasets.
Another key problem of OCCs that is normally neglected in the related literature relates to the fact that these detectors may produce false alarms when tested with types of ADLs that were not part of the training subset [17]. This situation would not be uncommon in a realistic scenario, where the monitored user may execute unexpected movements (not caused by falls) that can consequently be catalogued as ‘anomalies’ by the detector and trigger an undesired alerting message. By contrast, in previous works on OCC-based FDSs, the ADLs included in the data subsets used for testing incorporate the same types of movements utilized for the configuration of the detector, which inherently minimizes the possibility of experiencing these false alarms.
In this paper, we thoroughly analyze these two issues. To this end, we systematically analyze the behavior of five basic types of anomaly detectors (with diverse hyperparameter configurations and input feature sets) when they are employed with nine different well-known datasets captured at the same body position (the waist). We also investigate whether the classification efficacy degrades when new types of ADLs (not considered for training) are used for testing.
The paper is organized as follows: after the introduction and analysis of the related works presented in this section, Section 2 describes the different aspects of the methodology followed to evaluate the classifiers. Section 3 presents and discusses the main results for the considered study cases. Finally, Section 4 recapitulates the main conclusions of the article.

2. Methods

2.1. Selection of the Datasets

To date, about 25 datasets have been released to benchmark detection algorithms for transportable FDSs (see [33] for a comprehensive review on this topic). These databases are formed by a set of numerical traces describing the signals captured by inertial sensors placed on one or several locations of the body. To the best of our knowledge, just one released dataset, provided by the FARSEEING project [34], publicly offers a very limited and unrepresentative number of traces captured from actual falls of older adults. In the other cases, the repositories are generated by recruiting a group of volunteers who systematically execute or emulate a series of predetermined ADLs or falls while transporting the corresponding sensor or sensors. For each movement, a trace (labeled as an ADL or a fall) is created.
Several studies [35,36,37,38,39] have shown that FDSs located on the waist outperform those placed on other body positions with a higher and more independent mobility (e.g., a limb), since the waist is adjacent to the center of gravity of the human body. Therefore, in order to set up a common reference framework under optimal conditions, we limit our analysis to those 15 repositories that offer inertial data measured at the waist (although some of them also contain measurements captured at other body positions). For the study, we also discard those datasets that do not provide a significant number of samples (fewer than 400) or that were collected with an accelerometer range of ±2 g, which is too small to properly characterize the abrupt acceleration peaks caused by falls. After applying these criteria, we selected the 9 datasets (DLR, DOFDA, Erciyes, FallAllD, IMUFD, KFall, SisFall, UMAFall, and UP-Fall) described in Table 2. This quantity clearly exceeds the number of benchmarking repositories typically considered in the related literature to assess the performance of fall detection algorithms (in fact, as confirmed in Table 1, most proposals are validated against a single dataset). The need to evaluate the classifiers with different repositories is critical if we consider the remarkable heterogeneity [33,40] that exists among the available datasets in terms of the typology of the emulated ADLs and falls, the strategies to generate the movements, the duration of the traces, the testbed environment, the selection of the volunteers, etc.

2.2. Compared One-Class Classifying Algorithms

As mentioned above, one-class classifiers constitute a particularization of binary supervised classification systems in which the detection algorithms are trained only with data of one class. After the classifier is trained on these one-class traces, data corresponding to a category different from that used during training can be detected as anomalies. Therefore, once the model of an OCC is developed, input patterns can be identified as anomalies when a certain parameter derived from the input signals (e.g., a distance) exceeds a predefined decision threshold.
In the case of FDSs, the concept of an anomaly fits well with that of a fall, which can be envisaged as an unexpected movement that presents atypical characteristics with regard to those of the common or majority class (ADLs). Thus, in our evaluation, the classifiers are trained exclusively with part of the ADL samples included in the datasets while they are tested with both the falls and the rest of the ADLs (those not employed during the training stage).
In order to thoroughly evaluate the feasibility of using an OCC as the core of an FDS, we analyze the performance of five well-known one-class classifiers [10]: an autoencoder, a Gaussian Mixture Model (GMM), a Parzen Probabilistic Neural Network (PPNN), a One-Class K-Nearest Neighbor (OC-KNN) classifier, and a One-Class Support Vector Machine (OC-SVM). All the classifiers were implemented and executed with Matlab scripts using the Statistics and Machine Learning Toolbox [48]. Table 3 summarizes the values and alternatives considered for the hyperparameters of these classifiers. Through a grid search, we evaluated the performance of the algorithms for the different combinations of these hyperparameters.
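As an illustration of this procedure, the sketch below (Python with synthetic data; the study itself used Matlab, and the feature values here are hypothetical) sweeps the OC-KNN hyperparameter grid of Table 3, scoring each test sample by its distance to the k-th closest training neighbor (cf. Table 4):

```python
from itertools import product

import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
train_adls = rng.normal(size=(400, 7))    # hypothetical ADL feature vectors
test_samples = rng.normal(size=(100, 7))  # traces to be scored

def oc_knn_scores(train_X, test_X, k, metric):
    """OC-KNN anomaly score: distance from each test sample to its k-th
    closest training neighbor (the decision variable of Table 4)."""
    nn = NearestNeighbors(n_neighbors=k, metric=metric).fit(train_X)
    dist, _ = nn.kneighbors(test_X)
    return dist[:, -1]

# Grid mirroring the OC-KNN rows of Table 3.
for metric, k in product(["euclidean", "minkowski", "chebyshev", "cosine"],
                         [5, 10, 50]):
    scores = oc_knn_scores(train_adls, test_samples, k, metric)
    # ...sweep decision thresholds over `scores` and keep the combination
    # with the best sensitivity/specificity trade-off
```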
As the decision threshold to detect an anomaly with each OCC, we employ the variables described in Table 4.

2.3. Feature Selection

In order to characterize the mobility samples and feed the machine learning classifiers, we compute a set of features derived from the raw signals collected by the inertial sensors. As all the repositories include the data captured by an accelerometer, which is the most commonly employed sensor in the literature on wearable FDSs, the features are derived from the triaxial acceleration measurements. Falls provoke sudden peaks of the acceleration magnitude when the body hits the ground. The Signal Magnitude Vector ($SMV_i$) for the i-th measurement is computed as:
$$SMV_i = \sqrt{A_{x_i}^2 + A_{y_i}^2 + A_{z_i}^2}$$
where $A_{x_i}$, $A_{y_i}$, and $A_{z_i}$ indicate the values of the triaxial components of the acceleration for each axis. For every movement trace (ADL or fall), the feature extraction exclusively focuses on a time interval of ±1 s around the sample where the maximum value of $SMV_i$ is identified, while the rest of the measurements in the sequence are not considered. The choice of this 2 s observation window (centered around the acceleration peak) is justified by the fact that an interval of between 1 and 2 s is a good trade-off between recognition speed and accuracy for recognizing most human activities [49]. In any case, the critical (impact) phase of a fall does not typically last longer than 0.5 s [50,51]. Thus, all the features are derived from the consecutive acceleration components collected in the interval $[i_o - N_W, i_o + N_W]$, where $i_o$ is the index of the sample in which the maximum acceleration module is located:
$$SMV_{i_o} = \max_{i} SMV_i, \quad i \in [1, N - N_W + 1)$$
where $N$ denotes the number of measurements in the trace (for each axis), while $N_W$ describes the number of samples captured during the observation window. $N_W$ can be straightforwardly calculated as:
$$N_W = f_s \cdot t_w$$
where $f_s$ is the sampling rate of the trace and $t_w$ is the total duration of the window (2 s).
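The following sketch (a Python rendering of the windowing just described; the original processing was done in Matlab) locates the SMV peak of a trace and extracts the ±1 s observation window from which the features are computed:

```python
import numpy as np

def extract_peak_window(acc_xyz, fs, tw=2.0):
    """Return the observation window of total duration `tw` (2 s, i.e., ±1 s)
    centered on the SMV peak. `acc_xyz` is an (N, 3) array holding the
    triaxial acceleration components of one movement trace."""
    smv = np.sqrt(np.sum(acc_xyz ** 2, axis=1))  # SMV_i for every sample
    i_o = int(np.argmax(smv))                    # index of the acceleration peak
    half = int(fs * tw / 2)                      # samples in each half-window
    lo, hi = max(0, i_o - half), min(len(smv), i_o + half + 1)
    return acc_xyz[lo:hi], smv[lo:hi]  # clipping at the trace edges is our choice
```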
As a proper selection of the input features is a crucial factor in the design of any machine learning method, we consider different alternative candidate feature sets.
Firstly, we employ a set of twelve statistical candidate features that are physically interpretable, as they entail a certain characterization of human dynamics. These features have been utilized by other works in the related literature on fall detection and activity recognition systems (refer, for example, to the comprehensive studies presented by Vallabh in [52] or by Xi in [53]). The symbols, labels (or labeling identifiers), and descriptions of these twelve features are presented in Table 5 (a more detailed formal description of these parameters is provided in [33]).
In order to select the most convenient combination of input features from these 12 candidate statistics, we performed a preliminary analysis of the effectiveness of these statistics when they are applied to the aforementioned datasets to discriminate falls and ADLs with the classifiers. For all the studies, all the features were z-score normalized before training and testing. After implementing all the possible permutations of the statistics to feed the detectors, the obtained results (not presented here for the sake of simplicity) revealed that the two combinations that yielded the best performance metrics (sensitivity and specificity) in the classifiers were the one using the seven features labeled as B, C, D, F, G, I, and K in Table 5 (the ‘BCDFGIK’ feature set) and the set that employed all 12 candidate features (the ‘ABCDEFGHIJKL’ feature set).
As the selection of these input feature sets may still seem arbitrary, we also consider another set of features obtained with the hctsa (Highly Comparative Time-Series Analysis) Matlab software package [54]. This software is capable of extracting thousands of heterogeneous features from a time-series dataset to produce an optimized low-dimensional representation of the data.
In our case, a set (HCTSA feature set) of 12 features has been selected according to the following procedure:
  • The SisFall [45] repository is selected as the baseline reference, as it is considered one of the most complete in terms of the types and quantity of movements and the number and typology of subjects.
  • The candidate features of the samples are obtained by using hctsa.
  • The performance resulting from the classification of the data is calculated by using each feature as the input of a Support Vector Machine classifier with a linear kernel and a k-fold analysis (with k = 10).
  • The tool analyzes the correlation between the features that led to the best results. The application is then programmed to divide these features into 12 different clusters, grouping correlated features into the same cluster. From each cluster, hctsa selects the most representative feature (the one closest to the center of the cluster). A rough sketch of this clustering step is given below.
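The last step can be approximated as follows (a rough Python analogue of the clustering idea, assuming a samples-by-features matrix; hctsa’s actual implementation differs and runs in Matlab):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def representative_features(X, n_clusters=12):
    """Group correlated feature columns of X into clusters and keep one
    representative per cluster (the member that is, on average, most
    correlated with the rest of its cluster)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))   # |correlation| between features
    dist = squareform(1.0 - corr, checks=False)   # correlated features are 'close'
    labels = fcluster(linkage(dist, method="average"),
                      n_clusters, criterion="maxclust")
    keep = []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        keep.append(members[np.argmax(corr[np.ix_(members, members)].mean(axis=1))])
    return sorted(keep)  # column indices of the 12 selected features
```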

2.4. Performance Metrics and Model Evaluation

For each combination of the hyperparameters, input feature set, and dataset, we trained an instance of each of the five contemplated OCCs with a certain number of ADLs and tested it with both ADLs and falls of the same repository. To assess the capability of the one-class classifiers to discriminate both categories, we employed two metrics universally used in the evaluation of binary classifiers: the sensitivity (Se), or recall, which is defined as the ratio of falls in the test subset that are properly recognized, and the specificity (Sp), which is defined as the proportion of test ADLs that are not misidentified as falls. Unlike other metrics (such as the accuracy or the F1 score), sensitivity and specificity are not affected if the data classes are unbalanced in the datasets. Once the model is trained, the classifier is tested with 2500 possible values of the detection threshold (between a minimum and a maximum that respectively guarantee the maximization of the sensitivity and the specificity). Through the estimation of Se and Sp for each value of the discrimination threshold, we compute the Receiver Operating Characteristic (ROC) curve, which represents the evolution of Se (true positive rate) against 1 − Sp (false positive rate). From the curve, we graphically calculate the AUC (Area Under the Curve), a metric commonly used to characterize the overall performance of binary classifiers. Additionally, as another global performance metric of the system, which describes the trade-off between an adequate recognition rate of falls (high sensitivity) and the absence of false alarms (high specificity), we also utilize the geometric mean of Se and Sp ($\sqrt{Se \cdot Sp}$), together with the values of Se and Sp, at the point of the ROC where the maximum of this statistic is found. The choice of this optimal cut-point on the ROC to select the corresponding decision threshold has also been proposed in works such as [55].
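A compact sketch of this threshold sweep (in Python; the 2500-point grid follows the text, while the score convention, higher = more anomalous, is our assumption) is given below:

```python
import numpy as np

def roc_and_best_cutpoint(fall_scores, adl_scores, n_thresholds=2500):
    """Sweep candidate decision thresholds over the anomaly scores, compute
    Se/Sp at each one, estimate the AUC, and return the threshold that
    maximizes the geometric mean sqrt(Se * Sp)."""
    lo = min(fall_scores.min(), adl_scores.min())
    hi = max(fall_scores.max(), adl_scores.max())
    thr = np.linspace(lo, hi, n_thresholds)
    se = np.array([(fall_scores > t).mean() for t in thr])  # true positive rate
    sp = np.array([(adl_scores <= t).mean() for t in thr])  # true negative rate
    # Se and 1-Sp are both nonincreasing in the threshold; reversing them gives
    # an ascending false positive rate for the trapezoidal AUC estimate.
    fpr, tpr = (1.0 - sp)[::-1], se[::-1]
    auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)
    best = int(np.argmax(np.sqrt(se * sp)))
    return auc, thr[best], se[best], sp[best]
```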
In order to minimize the impact of the choice of the data used for training and testing the models, we evaluated the classifiers by means of a k-fold cross-validation [56,57]. For that purpose, the ADL traces of all datasets were split into five partitions (k = 5). Thus, for each combination of OCC, hyperparameters, input feature set, and dataset, the classifier is independently trained and tested five times. In each iteration, one of the five partitions, together with all the falls in the corresponding database, is reserved to test the model, while the ADLs of the remaining four partitions are used for training. The performance metrics obtained with the test data for the five iterations (AUC, Se, and Sp for the threshold value that yields the highest value of $\sqrt{Se \cdot Sp}$) are averaged to characterize the performance of the classifier.
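The cross-validation loop can be summarized as follows (a Python sketch; `fit_and_score` stands for any hypothetical routine that trains one OCC on ADLs and returns its AUC, Se, and Sp on the test data):

```python
import numpy as np
from sklearn.model_selection import KFold

def kfold_occ_evaluation(adl_X, fall_X, fit_and_score, k=5):
    """Five-fold scheme used above: each OCC instance is trained on four ADL
    partitions and tested on the held-out ADL partition plus all the falls."""
    results = []
    kf = KFold(n_splits=k, shuffle=True, random_state=0)
    for train_idx, test_idx in kf.split(adl_X):
        results.append(fit_and_score(adl_X[train_idx], adl_X[test_idx], fall_X))
    return np.mean(results, axis=0)  # AUC, Se, and Sp averaged over the folds
```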

3. Results and Discussion

3.1. Study for the ‘Fair’ Case

As previously commented and indicated in Table 2, the datasets were generated by considering different predetermined types of ADLs and falls, which were executed by the experimental subjects. In our first analysis, we investigate the performance of the OCCs when the different typologies of ADLs are evenly (‘fairly’) distributed among the five subsets used for the five-fold cross-validation. Thus, we guarantee that all the types of ADL movements are represented in the subsets with which the anomaly detectors are trained.
The performance metrics obtained for the five algorithms and the nine datasets are presented in Table 6. Due to the high number of combinations that were evaluated, for each dataset and each type of OCC, the table only shows the combination of hyperparameters and input feature set (also indicated in the table) with which the highest value of the geometric mean of sensitivity and specificity ($\sqrt{Se \cdot Sp}$) was achieved. For each dataset, the row corresponding to the classifier with the best global metric is highlighted in bold. To give an insight into the confidence interval of the measurements, together with the mean value of the global metric $\sqrt{Se \cdot Sp}$, the table also includes in the last column (preceded by the sign ±) the standard deviation of this parameter obtained over the five tests of the corresponding k-fold validation of the classifier. To ease the visualization of the comparison of the algorithms, the particular results of the AUC and $\sqrt{Se \cdot Sp}$ are summarized in Table 7 and Table 8, respectively. The highest values are also emphasized in bold.
From the results, we can draw the following conclusions:
  • The best results are achieved by the OC-KNN classifier, which outperforms the rest of the detection methods for five out of the nine analyzed datasets (in terms of the geometric mean of sensitivity and specificity), while it presents the second- or third-best results for the remaining datasets.
  • The one-class SVM detector produces the best results for three datasets, while it offers the second-best behavior for five repositories. In any case, if we take into account the confidence interval that can be derived from the measurements, we can conclude that the differences between the behaviors of OC-KNN and OC-SVM are not statistically significant.
  • In most cases, the best performance is attained with the simplest input feature set (the seven features labeled as BCDFGIK and described in Table 5), which suggests that, if the features are conveniently selected, a parsimonious OCC architecture can be sufficient to produce efficient detection decisions.
  • The GMM, autoencoder, and, especially, PPNN classifiers offer a more variable and erratic behavior, as the quality of the classification strongly depends on the employed dataset. In several databases, the best achieved geometric mean of sensitivity and specificity is under 0.90.
  • For all the datasets, the OC-KNN classifier yields a specificity and a sensitivity of at least 0.9. In most cases, these metrics are both higher than 0.95. These results are in line with most of the supervised (two-class) machine learning methods that can be found in the related literature (see, for example, the surveys presented in [58,59,60,61,62,63]). This implies that, if the decision threshold is properly chosen, an OCC can behave as a two-class classifier without requiring the detector to be trained with falls. In a realistic use scenario, the final user of the detector (e.g., an older adult) could be monitored during his/her daily routines to generate a dataset of ADLs. This dataset could be used to train and personalize an FDS based on an OCC.

3.2. Study of the Benefits of Ensemble Learning

Ensemble methods offer a simple and efficient paradigm to boost the prediction capability of single machine learning methods based on the combined decision of multiple models [64]. In this subsection, we assess whether the aggregate knowledge reached by the models evaluated in the previous analysis can improve the individual performance of the classifiers. In particular, we re-calculate the detection decision when a simple majority voting of three classifiers is applied (a similar performance is achieved if a higher number of models is considered). In this case, for each dataset, we use as base learners the three combinations of hyperparameters, input feature sets, and OCCs with which the three highest global performance metrics (geometric mean of Se and Sp) were obtained. Thus, during the testing phase, a trace is identified as a fall if a majority of the decision classifiers (two or three) classify the movement as a fall.
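The voting rule itself is straightforward, as the following sketch shows (Python, with made-up per-trace decisions standing in for the outputs of the three best models):

```python
import numpy as np

def majority_vote(decisions):
    """Combine the binary decisions (1 = fall) of three base OCCs: a trace
    is declared a fall when at least two of the three models flag it."""
    decisions = np.asarray(decisions)  # shape: (3, n_traces)
    return decisions.sum(axis=0) >= 2

# Hypothetical decisions of the three best models over three test traces:
votes = [[1, 0, 1],
         [1, 0, 0],
         [0, 0, 1]]
print(majority_vote(votes))  # -> [ True False  True]
```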
The obtained results are presented in Table 9. For comparison purposes, the table also indicates the best results (extracted from Table 6) corresponding to the best discrimination ratio achieved by a single OCC. In the table, the metrics of the ensemble classifier are marked in bold when they improve those generated by the single learner. Conversely, the results are highlighted in italics when the majority voting underperforms the best single model.
As can be observed, the use of the ensemble improves the global performance metric in six out of the nine analyzed datasets (in several cases, a value of $\sqrt{Se \cdot Sp}$ close to 0.99 is attained), while for just one repository (DLR), the application of the voting technique reduces the effectiveness of the binary classification process.

3.3. Impact of the Typology of ADLs Employed in the Training Phase

As mentioned above, OCCs avoid the need to obtain (or generate) traces describing real or emulated falls, which are required to train supervised learning algorithms. On the other hand, the use of one-class classifiers can present difficulties related to a lower specificity of the system due to a greater number of false alarms or false positives, caused by ADLs that are not contemplated in the training dataset and are therefore identified as anomalies.
To determine the extent of this problem, we repeat the previous study of Section 3.1 when a certain typology of ADLs is removed from the training set and included in the testing subset. For this purpose, as already suggested in our previous studies [33,40], the ADL movements of all the repositories have been split into three categories, which are displayed in Table 10, depending on the physical effort required to perform them.
For each dataset (except for the DOFDA repository, which does not include sufficient traces of two categories), we generated three subsets of ADLs containing the traces of the corresponding categories. The best combination of hyperparameters and input feature set of each type of OCC obtained in Section 3.1 is trained and tested three times. In each experiment, each model is exclusively trained with the subsets of two categories and then tested with the falls and ADLs of the remaining category using the optimal decision threshold computed for the ‘fair’ case.
The results for all the analyzed datasets and the best performing OCC of each type are shown in Table 11, Table 12 and Table 13 for the cases in which the training sets do not include basic, standard, and sporting activities, respectively. The last column of each table (‘Loss’) indicates the difference between the global performance metric obtained with this segregation of the training and test subsets based on the categorization of the ADLs and the performance metric achieved with the ‘fair’ case (Table 6) in which traces of all the categories of ADLs are incorporated into the training subset. Consequently, a negative value of this parameter denotes a deterioration of the recognition capacity of the classifier.
As could be expected, the results show that the presence of new types of ADLs in the testing sets (not considered during the training phase) causes a strong degradation of the capability of the classifiers to discriminate falls from ADLs. This loss of effectiveness is particularly remarkable in those repositories (such as FallAllD) that encompass a greater number of types of ADLs.
In this regard, the poorest discrimination rate is achieved when the system is tested with sporting movements. In some datasets, the best results for this situation yield specificities below 80% (which implies that more than 20% of sporting actions are considered falls and would trigger a false alarm). The brusque mobility patterns induced by this category of movements cause the classifiers (trained with much less agitated activities) to misinterpret them as anomalies.
Paradoxically, the results also indicate that very basic and less energetic activities also result in false positives, as they too can be identified as ‘novelties’ if traces corresponding to low-motion movements are not included in the training subset. Nevertheless, these false alarms originated by ‘sedentary’ actions could most probably be avoided by a simple thresholding technique, so that a movement trace is inputted to the OCC only if the magnitude of the acceleration exceeds a certain value and a fall can be reasonably suspected (a sketch of this gating is given below).
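A sketch of this gating idea (in Python; the 1.5 g gate and the `occ_predict` callable are illustrative assumptions, not values from the study):

```python
import numpy as np

def gated_occ_decision(acc_xyz, features, occ_predict, smv_gate=1.5):
    """Pre-filter suggested above: the trace reaches the OCC only if its
    acceleration peak makes a fall plausible; quiet 'sedentary' traces are
    dismissed outright instead of being flagged as novelties."""
    peak = np.sqrt((acc_xyz ** 2).sum(axis=1)).max()
    if peak < smv_gate:
        return False               # low-motion ADL: never reported as a fall
    return occ_predict(features)   # otherwise defer to the one-class classifier
```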
Finally, the movements included in the standard category seem to be the typology of activities with the lowest impact on the effectiveness of the training. This can be explained by the fact that these activities represent an intermediate point of physical intensity between basic and sporting movements. Thus, training with movements of both lower and greater intensity (basic activities and sports, respectively) gives the classifiers enough information to avoid considering them anomalies. Yet, a relevant decay in the performance of certain OCCs is also perceived when this category is excluded from the training phase.

4. Conclusions

This work has assessed the effectiveness of utilizing one-class classifiers as the decision core of fall detection systems based on wearable inertial sensors. Unlike fully supervised methods, OCCs benefit from the fact that they can be trained exclusively with samples of a single class (conventional Activities of Daily Living), which avoids the need to obtain traces captured during falls to train the classifiers.
In particular, we have analyzed the performance of five well-known OCCs under different input feature sets and a wide selection of hyperparameters. In contrast with most studies in the literature, which base their analysis on the use of a single dataset, we have extended the study to nine public repositories.
The achieved results (with values of the geometric mean of sensitivity and specificity higher than 95%) have shown the capability of OCCs to discriminate falls from ADLs with a high accuracy if the choice of the decision threshold is optimized. This performance is comparable to that obtained with supervised systems in the literature. For almost all tests and datasets, the one-class KNN classifier stood out as the best (or second-best) detection algorithm, a conclusion that is coherent with other previous analyses in the related works. The study has also revealed that the use of simple ensemble learning methods (such as voting) may improve the hit rate of the detector if the decisions of several OCCs are simultaneously considered.
In any case, the analyses have illustrated the extreme vulnerability of these classifiers to the typology of the ADLs used for the training phase. Actions that involve rapid movements (such as sports) and even very basic activities (which do not require any physical effort) may be straightforwardly identified as anomalies if they are not considered in the patterns used for training. This problem, which could be alleviated by combining OCCs with other simple methods that avoid identifying certain typical ADLs as falls, forces a rethinking of the way in which one-class detectors are adjusted and evaluated. The results clearly show the importance of having a sufficiently varied set of samples for training. Likewise, in the test phase, and as a stress test of the system, the evaluation should consider the use of ADLs (not used for training) that entail agitated movements that may affect the decision of the classifier. Future studies should also focus on methodologies that automatically optimize the selection of the decision threshold.

Author Contributions

Conceptualization, E.C.; methodology, E.C. and J.A.S.-R.; software, J.A.S.-R.; validation, J.A.S.-R.; formal analysis, E.C. and J.A.S.-R.; investigation, E.C. and J.A.S.-R.; resources, E.C. and J.M.C.-G.; data curation, E.C. and J.A.S.-R.; writing—original draft preparation, E.C.; writing—review and editing, E.C., J.A.S.-R. and J.M.C.-G.; visualization, J.A.S.-R.; supervision, E.C. and J.M.C.-G.; project administration, E.C.; funding acquisition, E.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by FEDER Funds (under grant UMA18-FEDERJA-022), the Andalusian Regional Government (Junta de Andalucía, grant PAIDI P18-RT-1652), and Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the datasets employed in this work are publicly available. The URL to download the repositories can be found in the corresponding reference.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. World Health Organization. Falls: Key Facts. Available online: https://www.who.int/news-room/fact-sheets/detail/falls (accessed on 16 July 2021).
  2. World Health Organization. WHO Global Report on Falls Prevention in Older Age; WHO Press: Geneva, Switzerland, 2007.
  3. World Health Organization. Ageing and Health—Key Facts. Available online: http://www.who.int/mediacentre/factsheets/fs404/en/ (accessed on 21 July 2021).
  4. Lord, S.R.; Sherrington, C.; Menz, H.B.; Close, J.C.T. Falls in Older People: Risk Factors and Strategies for Prevention; Cambridge University Press: Cambridge, UK, 2007.
  5. Casilari, E.; Luque, R.; Morón, M. Analysis of Android device-based solutions for fall detection. Sensors 2015, 15, 17827–17894.
  6. Aziz, O.; Musngi, M.; Park, E.J.; Mori, G.; Robinovitch, S.N. A comparison of accuracy of fall detection algorithms (threshold-based vs. machine learning) using waist-mounted tri-axial accelerometer signals from a comprehensive set of falls and non-fall trials. Med. Biol. Eng. Comput. 2017, 55, 45–55.
  7. Klenk, J.; Becker, C.; Lieken, F.; Nicolai, S.; Maetzler, W.; Alt, W.; Zijlstra, W.; Hausdorff, J.M.; Van Lummel, R.C.; Chiari, L. Comparison of acceleration signals of simulated and real-world backward falls. Med. Eng. Phys. 2011, 33, 368–373.
  8. Bagalà, F.; Becker, C.; Cappello, A.; Chiari, L.; Aminian, K.; Hausdorff, J.M.; Zijlstra, W.; Klenk, J. Evaluation of accelerometer-based fall detection algorithms on real-world falls. PLoS ONE 2012, 7, e37062.
  9. Aziz, O.; Klenk, J.; Schwickert, L.; Chiari, L.; Becker, C.; Park, E.J.; Mori, G.; Robinovitch, S.N. Validation of accuracy of SVM-based fall detection system using real-world fall and non-fall datasets. PLoS ONE 2017, 12, e0180318.
  10. Khan, S.S.; Madden, M.G. One-class classification: Taxonomy of study and review of techniques. Knowl. Eng. Rev. 2014, 29, 345–374.
  11. Medrano, C.; Plaza, I.; Igual, R.; Sánchez, Á.; Castro, M. The effect of personalization on smartphone-based fall detectors. Sensors 2016, 16, 117.
  12. Viet, V.; Choi, D.-J. Fall detection with smart phone sensor. In Proceedings of the 3rd International Conference on Internet (ICONI 2011), Sepang, Malaysia, 15–19 December 2011; pp. 15–19.
  13. Lisowska, A.; Wheeler, G.; Inza, V.C.; Poole, I. An evaluation of supervised, novelty-based and hybrid approaches to fall detection using Silmee accelerometer data. In Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Santiago, Chile, 7–13 December 2015; pp. 402–408.
  14. Nho, Y.H.; Lim, J.G.; Kwon, D.S. Cluster-analysis-based user-adaptive fall detection using fusion of heart rate sensor and accelerometer in a wearable device. IEEE Access 2020, 8, 40389–40401.
  15. Medrano, C.; Igual, R.; García-Magariño, I.; Plaza, I.; Azuara, G. Combining novelty detectors to improve accelerometer-based fall detection. Med. Biol. Eng. Comput. 2017, 55, 1849–1858.
  16. Casilari, E.; Lora-Rivera, R.; García-Lagos, F. A study on the application of convolutional neural networks to fall detection evaluated with multiple public datasets. Sensors 2020, 20, 1466.
  17. Khan, S.S.; Hoey, J. Review of fall detection techniques: A data availability perspective. Med. Eng. Phys. 2017, 39, 12–22.
  18. Zhang, T.; Wang, J.; Liu, P.; Hou, J. Fall detection by embedding an accelerometer in cellphone and using KFD algorithm. Int. J. Comput. Sci. Netw. Secur. 2006, 6, 277–284.
  19. Zhang, T.; Wang, J.; Xu, L.; Liu, P. Fall detection by wearable sensor and one-class SVM algorithm. In Intelligent Computing in Signal Processing and Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2006; pp. 858–863.
  20. Yin, J.; Yang, Q.; Pan, J.J. Sensor-based abnormal human-activity detection. IEEE Trans. Knowl. Data Eng. 2008, 20, 1082–1090.
  21. Medrano, C.; Igual, R.; Plaza, I.; Castro, M. Detecting falls as novelties in acceleration patterns acquired with smartphones. PLoS ONE 2014, 9, e94811.
  22. Khan, S.S.; Karg, M.E.; Kulić, D.; Hoey, J. X-factor HMMs for detecting falls in the absence of fall-specific training data. In International Workshop on Ambient Assisted Living; Springer: Cham, Switzerland, 2014; Volume 8868, pp. 1–9.
  23. Khan, S.S.; Karg, M.E.; Kulić, D.; Hoey, J. Detecting falls with X-factor hidden Markov models. Appl. Soft Comput. 2017, 55, 168–177.
  24. Frank, K.; Vera Nadales, M.J.; Robertson, P.; Pfeifer, T. Bayesian recognition of motion related activities with inertial sensors. In Proceedings of the 12th ACM International Conference on Ubiquitous Computing—Adjunct, Copenhagen, Denmark, 26–29 September 2010; pp. 445–446.
  25. Vavoulas, G.; Pediaditis, M.; Spanakis, E.G.; Tsiknakis, M. The MobiFall dataset: An initial evaluation of fall detection algorithms using smartphones. In Proceedings of the IEEE 13th International Conference on Bioinformatics and Bioengineering (BIBE 2013), Chania, Greece, 10–13 November 2013; pp. 1–4.
  26. Yang, K.; Ahn, C.R.; Vuran, M.C.; Aria, S.S. Semi-supervised near-miss fall detection for ironworkers with a wearable inertial measurement unit. Autom. Constr. 2016, 68, 194–202.
  27. Khan, S.S.; Taati, B. Detecting unseen falls from wearable devices using channel-wise ensemble of autoencoders. Expert Syst. Appl. 2017, 87, 280–290.
  28. Ojetola, O.; Gaura, E.; Brusey, J. Data set for fall events and daily activities from inertial sensors. In Proceedings of the 6th ACM Multimedia Systems Conference (MMSys’15), Portland, OR, USA, 18–20 March 2015; pp. 243–248.
  29. Micucci, D.; Mobilio, M.; Napoletano, P.; Tisato, F. Falls as anomalies? An experimental evaluation using smartphone accelerometer data. J. Ambient Intell. Humaniz. Comput. 2017, 8, 87–99.
  30. Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. A public domain dataset for human activity recognition using smartphones. In Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2013), Bruges, Belgium, 24–26 April 2013; pp. 437–442.
  31. Lisowska, A.; O’Neil, A.; Poole, I. Cross-cohort evaluation of machine learning approaches to fall detection from accelerometer data. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018)—Volume 5: HEALTHINF, Funchal, Madeira, Portugal, 19–21 January 2018; pp. 77–82.
  32. Chen, L.; Li, R.; Zhang, H.; Tian, L.; Chen, N. Intelligent fall detection method based on accelerometer data from a wrist-worn smart watch. Measurement 2019, 140, 215–226.
  33. Casilari, E.; Santoyo-Ramón, J.A.; Cano-García, J.M. On the heterogeneity of existing repositories of movements intended for the evaluation of fall detection systems. J. Healthc. Eng. 2020, 2020, 6622285.
  34. Bourke, A.K.; Klenk, J.; Schwickert, L.; Aminian, K.; Ihlen, E.A.F.; Mellone, S.; Helbostad, J.L.; Chiari, L.; Becker, C. Fall detection algorithms for real-world falls harvested from lumbar sensors in the elderly population: A machine learning approach. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2016), Orlando, FL, USA, 16–20 August 2016; pp. 3712–3715.
  35. Gjoreski, H.; Luštrek, M.; Gams, M. Accelerometer placement for posture recognition and fall detection. In Proceedings of the 7th International Conference on Intelligent Environments (IE 2011), Nottingham, UK, 25–28 July 2011; pp. 47–54.
  36. Dai, J.; Bai, X.; Yang, Z.; Shen, Z.; Xuan, D. PerFallD: A pervasive fall detection system using mobile phones. In Proceedings of the 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), Mannheim, Germany, 29 March–2 April 2010; pp. 292–297.
  37. Kangas, M.; Konttila, A.; Lindgren, P.; Winblad, I.; Jämsä, T. Comparison of low-complexity fall detection algorithms for body attached accelerometers. Gait Posture 2008, 28, 285–291.
  38. Fang, S.-H.; Liang, Y.-C.; Chiu, K.-M. Developing a mobile phone-based fall detection system on Android platform. In Proceedings of the Computing, Communications and Applications Conference (ComComAp), Hong Kong, China, 21 February 2012; pp. 143–146.
  39. Ntanasis, P.; Pippa, E.; Özdemir, A.T.; Barshan, B.; Megalooikonomou, V. Investigation of sensor placement for accurate fall detection. In Proceedings of the International Conference on Wireless Mobile Communication and Healthcare (MobiHealth 2016), Milan, Italy, 14–16 November 2016; pp. 225–232.
  40. Casilari, E.; Santoyo-Ramón, J.A.; Cano-García, J.M. Analysis of public datasets for wearable fall detection systems. Sensors 2017, 17, 1513.
  41. Cotechini, V.; Belli, A.; Palma, L.; Morettini, M.; Burattini, L.; Pierleoni, P. A dataset for the development and optimization of fall detection algorithms based on wearable sensors. Data Br. 2019, 23, 103839.
  42. Özdemir, A.T.; Barshan, B. Detecting falls with wearable sensors using machine learning techniques. Sensors 2014, 14, 10691–10708.
  43. Saleh, M.; Abbas, M.; Le Jeannes, R.B. FallAllD: An open dataset of human falls and activities of daily living for classical and deep learning applications. IEEE Sens. J. 2021, 21, 1849–1858.
  44. Human Factors and Ergonomics Lab, Korea Advanced Institute of Science and Technology. KFall: A Comprehensive Motion Dataset to Detect Pre-Impact Fall for the Elderly Based on Wearable Inertial Sensors. Available online: https://sites.google.com/view/kfalldataset (accessed on 30 April 2021).
  45. Sucerquia, A.; López, J.D.; Vargas-Bonilla, J.F. SisFall: A fall and movement dataset. Sensors 2017, 17, 198.
  46. Casilari, E.; Santoyo-Ramón, J.A.; Cano-García, J.M. Analysis of a smartphone-based architecture with multiple mobility sensors for fall detection. PLoS ONE 2016, 11, e01680.
  47. Martínez-Villaseñor, L.; Ponce, H.; Brieva, J.; Moya-Albor, E.; Núñez-Martínez, J.; Peñafort-Asturiano, C. UP-Fall detection dataset: A multimodal approach. Sensors 2019, 19, 1988.
  48. MathWorks. Statistics and Machine Learning Toolbox—MATLAB. Available online: https://es.mathworks.com/products/statistics.html (accessed on 18 August 2021).
  49. Banos, O.; Galvez, J.-M.; Damas, M.; Pomares, H.; Rojas, I. Window size impact in human activity recognition. Sensors 2014, 14, 6474–6499.
  50. Becker, C.; Schwickert, L.; Mellone, S.; Bagalà, F.; Chiari, L.; Helbostad, J.L.; Zijlstra, W.; Aminian, K.; Bourke, A.; Todd, C.; et al. Proposal for a multiphase fall model based on real-world fall recordings with body-fixed sensors. Z. Gerontol. Geriatr. 2012, 45, 707–715.
  51. Noury, N.; Rumeau, P.; Bourke, A.K.; ÓLaighin, G.; Lundy, J.E. A proposal for the classification and evaluation of fall detectors. IRBM 2008, 29, 340–349.
  52. Vallabh, P.; Malekian, R. Fall detection monitoring systems: A comprehensive review. J. Ambient Intell. Humaniz. Comput. 2018, 9, 1809–1833.
  53. Xi, X.; Tang, M.; Miran, S.M.; Luo, Z. Evaluation of feature extraction and recognition for activity monitoring and fall detection based on wearable sEMG sensors. Sensors 2017, 17, 1229.
  54. Fulcher, B.D.; Jones, N.S. hctsa: A computational framework for automated time-series phenotyping using massive feature extraction. Cell Syst. 2017, 5, 527–531.
  55. Liu, X. Classification accuracy and cut point selection. Stat. Med. 2012, 31, 2676–2686.
  56. Rodríguez, J.D.; Pérez, A.; Lozano, J.A. Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 569–575.
  57. Wong, T.T. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognit. 2015, 48, 2839–2846.
  58. Delahoz, Y.S.; Labrador, M.A. Survey on fall detection and fall prevention using wearable and external sensors. Sensors 2014, 14, 19806–19842.
  59. Wang, X.; Ellul, J.; Azzopardi, G. Elderly fall detection systems: A literature survey. Front. Robot. AI 2020, 7, 71.
  60. Andò, B.; Baglio, S.; Castorina, S.; Crispino, R.; Marletta, V. Advanced solutions aimed at the monitoring of falls and human activities for the elderly population. Technologies 2019, 7, 59.
  61. Ren, L.; Peng, Y. Research of fall detection and fall prevention technologies: A systematic review. IEEE Access 2019, 7, 77702–77722.
  62. Islam, M.; Tayan, O.; Islam, R.; Islam, S.; Nooruddin, S.; Kabir, M.N.; Islam, R. Deep learning based systems developed for fall detection: A review. IEEE Access 2020, 8, 166117–166137.
  63. Broadley, R.W.; Klenk, J.; Thies, S.B.; Kenney, L.P.J.; Granat, M.H. Methods for the real-world evaluation of fall detection technology: A scoping review. Sensors 2018, 18, 2060.
  64. Polikar, R. Ensemble learning. In Ensemble Machine Learning; Springer: Boston, MA, USA, 2012; pp. 1–34.
Table 1. Works that have proposed and compared one-class classifiers to detect falls as anomalies.

| Ref. and Authors | Year | Type of Compared OCCs | Number of Features | Employed Sensors | Best Achieved Performance | Employed Datasets (Number of ADLs/Falls) |
|---|---|---|---|---|---|---|
| Zhang et al. [18,19] | 2006 | OC-SVM + KFD + k-NN, OC-SVM | 6 | Acc. | Se = 0.9703, Sp = 0.9521 | Unpublished dataset (676/418) |
| Yin et al. [20] ² | 2008 | OC-SVM + KNLR, OC-SVM + MLLR, OC-SVM | n.i. | Light, Temp., Mic., Acc., Mag. | AUC = 0.985, Se = 0.90, Sp = 0.93 | Unpublished dataset (431/112 near falls) |
| Viet and Choi [12] | 2011 | 1-SVM | 4 | Acc., Ori. | Se = 0.7699 | Unpublished dataset (n.i./226) |
| Medrano et al. [21] | 2014 | OC-KNN, OC-SVM, OC-KNN-sum, Kmeans + OC-KNN | 51 | Acc. | AUC = 0.957, Se = 0.929, Sp = 0.890 | tFall [21] (9883/1026) |
| Khan et al. [22,23] | 2014, 2017 | XHMM, HMM, OC-KNN, OC-SVM | 31 | Acc., Gyr. | Se = 0.893, Sp = 0.970 | DLR [24] (961/56); MobiFall [25] (342/288) |
| Lisowska et al. [13] | 2015 | RNN, OC-SVM, OC-KNN | 21 | Acc. | Se = 0.858, Sp = 0.853, AUC = 0.915 | Unpublished dataset (641/168) |
| Medrano et al. [11] | 2016 | OC-KNN, OC-SVM, LOF | 153 | Acc. | AUC = 0.9809, Se = 0.9541, Sp = 0.9484 | tFall [21] (9883/1026) |
| Yang et al. [26] ¹ | 2016 | OC-SVM | 38 | Acc., Gyr. | Se = 0.783, Accuracy = 0.852 | Unpublished dataset (n.i./252 near falls) |
| Medrano et al. [15] | 2017 | KDE | 4 | Acc. | Se = 0.986, Sp = 0.972 | tFall [21] (9883/1026) |
| Khan and Taati [27] | 2017 | Ensemble of AEs, OC-KNN, OC-SVM | 6 | Acc., Gyr. | $\sqrt{Se \cdot Sp}$ = 0.959 | DLR [24] (961/56); Cogent Labs [28] (1520/448) |
| Micucci et al. [29] | 2017 | OC-KNN, OC-SVM | 12, 51, 384 | Acc. | AUC = 0.997, Se = 0.996, Sp = 0.993 | tFall [21] (9883/1026); HAR database [30] (360/0) |
| Lisowska et al. [31] | 2018 | RNN, OC-SVM, OC-KNN | 21 | Acc. | AUC = 0.950 | Unpublished dataset (641/168); tFall [21] (9883/1026) |
| Chen et al. [32] | 2019 | Ensemble of AEs + OCCCH, OCCCH, OC-SVM | 3 windows of 500 samples | Acc. | Se = 0.9913, Sp = 0.9625 | Unpublished dataset (288/234) |
| Nho et al. [14] | 2020 | GMM | 22 | Acc., HR | Se = 0.9309, Sp = 0.8958 | Unpublished dataset (273/126) |

¹ The system is actually designed to detect “near-miss falls” (not falls). ² The system is actually designed to generically detect abnormal activities (not only falls). n.i.: not indicated by the authors in the article. Acronyms for the OCCs: AE (Autoencoder), GMM (Gaussian Mixture Model), HMM (Hidden Markov Model), KDE (Kernel Density Estimation), KFD (Kernel Fisher Discriminant), Kmeans (K-means clustering), KNLR (Kernel Non-Linear Regression), LOF (Local Outlier Factor), MLLR (Maximum Likelihood Linear Regression), OC-KNN (One-Class K-Nearest Neighbors), OC-KNN-sum (One-Class K-Nearest Neighbors with sum of the distances), OC-SVM (One-Class Support Vector Machine), OCCCH (One-Class Classification based on the Convex Hull), RNN (Replicator Neural Network), XHMM (‘X-Factor’ Hidden Markov Model). Acronyms for the employed sensors: Acc. (accelerometer), Gyr. (gyroscope), HR (heart rate monitor), Mag. (magnetometer), Mic. (microphone), Ori. (orientation angle sensor: pitch and roll), Temp. (temperature). Acronyms for the metrics: AUC (Area Under the Curve), Se (sensitivity), Sp (specificity).
Table 2. Basic data of the employed datasets.

| Dataset | Number of Subjects (Females/Males) | Number of Types of ADLs/Falls | Number of Samples (ADLs/Falls) | Duration of the Samples (s) | Captured Signals in Each Sensing Point ¹ | Number and Positions of the Sensing Points | Sampling Rate (Hz) |
|---|---|---|---|---|---|---|---|
| DLR [24] | 19 (8/11) | 15/1 | 1017 (961/56) | [0.27–864.33] | 3 (A, G, M) | 1: Waist (belt) | 100 |
| DOFDA [41] | 8 (2/6) | 5/13 | 432 (120/312) | [1.96–17.26] | 4 (A, G, O, M) | 1: Waist | 33 |
| Erciyes Univ. [42] | 17 (7/10) | 16/20 | 3302 (1476/1826) | [8.36–37.76] | 1 (A) | 6: Chest, Head, Ankle, Thigh, Wrist, Waist | 25 |
| FallAllD [43] | 15 (7/8) | 44/35 | 6605 (4883/1722) | 320 | 4 (A, G, M, B) | 3: Waist, Wrist, Chest (lanyard around the neck) | 238 (A, G); 80 (M); 10 (B) |
| IMUFD [6] | 10 (n.i.) | 8/7 | 600 (390/210) | [15–20.01] | 3 (A, G, M) | 7: Chest, Head, Left ankle, Left thigh, Right ankle, Right thigh, Waist | 128 |
| KFall [44] | 32 (0/32) | 21/15 | 5075 (2729/2346) | [2.03–40.86] | 3 (A, G, O) | 1: Waist (low back) | 100 |
| SisFall [45] | 38 (19/19) | 19/15 | 4505 (2707/1798) | [9.99–179.99] | 3 (A, A, G) | 1: Waist | 200 |
| UMAFall [46] | 19 (8/11) | 12/3 | 746 (538/208) | 15 (all samples) | 3 (A, G, M) | 5: Ankle, Chest, Thigh, Waist, Wrist | 100 (thigh); 20 (rest) |
| UP-Fall [47] | 17 (8/9) | 6/5 | 559 (304/255) | [9.409–59.979] | 2 (A, G) | 5: Ankle, Neck, Thigh (pocket), Waist, Wrist | Around 18 |
¹ A: Accelerometer, G: Gyroscope, M: Magnetometer, O: Orientation sensor, B: Barometer.
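Note that the sampling rates in Table 2 range from around 18 Hz up to 238 Hz, so any comparison across repositories requires bringing the traces to a common rate before computing features on fixed-length windows. The following minimal sketch (a generic illustration under our own assumptions, not necessarily the preprocessing applied in this study) resamples a triaxial trace by linear interpolation:

```python
import numpy as np
from scipy.interpolate import interp1d

def resample_trace(acc: np.ndarray, fs_in: float, fs_out: float = 20.0) -> np.ndarray:
    """Linearly resample an (N, 3) triaxial acceleration trace from fs_in to fs_out (Hz)."""
    t_in = np.arange(acc.shape[0]) / fs_in          # original time stamps (s)
    t_out = np.arange(0.0, t_in[-1], 1.0 / fs_out)  # uniform grid at the target rate
    return interp1d(t_in, acc, axis=0)(t_out)

# e.g., a SisFall trace captured at 200 Hz (Table 2) brought down to 20 Hz:
trace = np.random.randn(2000, 3)                    # placeholder for a real recording
print(resample_trace(trace, fs_in=200.0).shape)     # -> (200, 3)
```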
Table 3. Values and alternatives of the hyperparameters utilized for the evaluated models of the classifiers.

| One-Class Classifier | Hyperparameter | Value/Alternatives |
|---|---|---|
| Autoencoder | Number of hidden neurons | 6, 10, 12, 15 |
| | Encoder/decoder transfer function | Logistic sigmoid: f(z) = 1/(1 + e^(−z)) |
| | Number of epochs | 1000 |
| | Loss function | Mean squared error plus L2 and sparsity regularization |
| | Sparsity regularization coefficient | 1 |
| | L2 weight regularization coefficient | 0.001 |
| Gaussian Mixture Model (GMM) | Type of covariance matrix | Diagonal |
| | Number of components | 3, 5, 7 |
| Parzen Probabilistic Neural Networks (PPNN) | Window function | f(x) = e^(x−1) |
| One-Class K-Nearest Neighbors (OC-KNN) | Distance function | Euclidean, Minkowski, Chebychev, Cosine |
| | Number of neighbors | 5, 10, 50 |
| One-Class Support Vector Machine (OC-SVM) | Kernel functions | Linear, quadratic, cubic, medium Gaussian |
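As an illustration of Table 3, the following minimal Python sketch shows how this catalogue of models could be instantiated with scikit-learn. This re-creation is only indicative: the table does not tie the hyperparameters to a specific library, and some options (for instance, the sparsity regularization of the autoencoder) have no direct scikit-learn counterpart.

```python
from sklearn.svm import OneClassSVM
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import NearestNeighbors, KernelDensity
from sklearn.neural_network import MLPRegressor

oc_svm  = OneClassSVM(kernel="rbf")                 # "medium Gaussian"; linear/poly are also listed
gmm     = GaussianMixture(n_components=3, covariance_type="diag")
oc_knn  = NearestNeighbors(n_neighbors=5, metric="cosine")
parzen  = KernelDensity(kernel="gaussian")          # stand-in for the PPNN window function
autoenc = MLPRegressor(hidden_layer_sizes=(10,),    # 6/10/12/15 hidden neurons in Table 3
                       activation="logistic",       # logistic sigmoid transfer function
                       alpha=1e-3,                  # L2 weight regularization coefficient
                       max_iter=1000)               # number of epochs
# All five models are fitted on ADL feature vectors only, e.g.:
#   autoenc.fit(X_adl, X_adl)   # the autoencoder learns to reconstruct ADLs
#   gmm.fit(X_adl); oc_knn.fit(X_adl); parzen.fit(X_adl); oc_svm.fit(X_adl)
```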
Table 4. Decision thresholds employed to detect anomalies for the five considered OCCs.

| OCC | Description of the Threshold |
|---|---|
| Autoencoder | MSE (Mean Square Error) between the input and its reconstruction at the output |
| GMM | Negative log-likelihood of the Gaussian mixture model given the input data |
| PPNN | Score indicating the likelihood that a label comes from the training class |
| OC-KNN | Distance between the observation and its k-th nearest neighbor |
| OC-SVM | Score indicating the likelihood that a label comes from the training class |
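The decision rule shared by the five detectors can be summarized as follows: a sample is labeled as a fall when its anomaly score (as defined in Table 4) exceeds a threshold calibrated on the ADL-only training scores. The sketch below illustrates the idea; calibrating the threshold as a percentile of the training scores is our assumption for the example, not a procedure stated in Table 4.

```python
import numpy as np

def fit_threshold(train_scores: np.ndarray, percentile: float = 99.0) -> float:
    """Score above which a sample is declared anomalous (i.e., a fall)."""
    return np.percentile(train_scores, percentile)

def autoencoder_scores(model, X: np.ndarray) -> np.ndarray:
    """MSE between the input and its reconstruction, as in the first row of Table 4."""
    recon = model.predict(X)
    return np.mean((X - recon) ** 2, axis=1)

# Usage (with a model fitted on ADLs only):
#   thr = fit_threshold(autoencoder_scores(autoenc, X_adl_train))
#   is_fall = autoencoder_scores(autoenc, X_test) > thr
```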
Table 5. Values and alternatives of statistics analyzed to select the input feature set of the classifiers.

| ID | Symbol | Description |
|---|---|---|
| A | μ_SMV | Mean of the Signal Magnitude Vector (SMV) |
| B | A_ωdiff^max | Magnitude of the maximum variation of the acceleration components |
| C | σ_SMV | Standard deviation of the SMV |
| D | μ_θ | Mean rotation angle |
| E | μ_SMVdiff | Mean absolute difference between two consecutive samples of the acceleration module |
| F | μ_Ap | Mean of the acceleration components that are parallel to the floor plane |
| G | SMV_max | Peak (maximum) of the SMV, which describes the violence of the impact against the floor |
| H | SMV_min | “Valley” (minimum) of the SMV, which characterizes the free-fall phase |
| I | γ_SMV | Skewness of the SMV, which describes the symmetry of the distribution of the acceleration |
| J | SMA | Signal Magnitude Area |
| K | E | Sum of the energy estimated in the three axes during the observation interval |
| L | μ_R | Mean of the autocorrelation function of the acceleration magnitude captured during the observation interval |
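Several of these statistics can be computed directly from a window of triaxial accelerometry, as in the NumPy/SciPy sketch below. The formulas follow the textual descriptions in Table 5; the exact definitions employed in the study (for example, for the rotation angle or the floor-parallel components, which also involve the gyroscope or orientation signals) may differ in detail.

```python
import numpy as np
from scipy.stats import skew

def features(acc: np.ndarray) -> dict:
    """acc: (N, 3) window of triaxial accelerations."""
    smv = np.linalg.norm(acc, axis=1)                       # Signal Magnitude Vector
    return {
        "A_mean_SMV":      smv.mean(),
        "C_std_SMV":       smv.std(),
        "E_mean_SMV_diff": np.abs(np.diff(smv)).mean(),     # mean |diff| between consecutive samples
        "G_SMV_max":       smv.max(),                       # impact peak
        "H_SMV_min":       smv.min(),                       # free-fall "valley"
        "I_skew_SMV":      skew(smv),                       # symmetry of the distribution
        "J_SMA":           np.abs(acc).sum(axis=1).mean(),  # Signal Magnitude Area
        "K_energy":        (acc ** 2).sum(),                # energy summed over the three axes
    }

# e.g.: feats = features(np.random.randn(500, 3))
```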
Table 6. Performance metrics (AUC, Se, Sp, and √(Se·Sp)) for the best combination of hyperparameters of the classifiers when they are applied to the datasets under study.

| Dataset | Features | OCC | Kernel/Distance | Neighbors/Hidden Neurons/Components | AUC | Se | Sp | √(Se·Sp) (μ ± σ) |
|---|---|---|---|---|---|---|---|---|
| DLR | BCDFGIK | Autoencoder | Logistic sigmoid | 10 | 0.8976 | 0.9333 | 0.8748 | 0.9007 ± 0.0739 |
| | HCTSA | GMM | Diagonal | 7 | 0.9483 | 1.0000 | 0.8784 | 0.9371 ± 0.0204 |
| | HCTSA | OC-KNN | Cosine | 50 | 0.9460 | 0.9333 | 0.9298 | 0.9286 ± 0.0760 |
| | ABCDEFGHIJKL | PPNN | – | – | 0.5564 | 0.6667 | 0.5826 | 0.6165 ± 0.1283 |
| | HCTSA | OC-SVM | Medium Gaussian | – | 0.9068 | 0.9333 | 0.8481 | 0.8864 ± 0.0696 |
| DOFDA | BCDFGIK | Autoencoder | Logistic sigmoid | 12 | 0.9704 | 0.9638 | 0.9833 | 0.9733 ± 0.0201 |
| | BCDFGIK | GMM | Diagonal | 5 | 0.9762 | 0.9835 | 0.9833 | 0.9833 ± 0.0198 |
| | BCDFGIK | OC-KNN | Euclidean | 5 | 0.9727 | 0.9604 | 0.9833 | 0.9716 ± 0.0247 |
| | BCDFGIK | PPNN | – | – | 0.9934 | 0.9506 | 1.0000 | 0.9749 ± 0.0122 |
| | HCTSA | OC-SVM | Linear | – | 0.9775 | 0.9803 | 1.0000 | 0.9901 ± 0.0930 |
| Erciyes | ABCDEFGHIJKL | Autoencoder | Logistic sigmoid | 15 | 0.9795 | 0.9544 | 0.9436 | 0.9488 ± 0.0091 |
| | BCDFGIK | GMM | Diagonal | 3 | 0.9857 | 0.9648 | 0.9436 | 0.9541 ± 0.0106 |
| | BCDFGIK | OC-KNN | Cosine | 5 | 0.9951 | 0.9846 | 0.9782 | 0.9814 ± 0.0028 |
| | ABCDEFGHIJKL | PPNN | – | – | 0.9898 | 0.9616 | 0.9640 | 0.9627 ± 0.0061 |
| | ABCDEFGHIJKL | OC-SVM | Cubic | – | 0.9867 | 0.9654 | 0.9837 | 0.9745 ± 0.0042 |
| FallAllD | BCDFGIK | Autoencoder | Logistic sigmoid | 6 | 0.9070 | 0.8753 | 0.8159 | 0.8449 ± 0.0161 |
| | BCDFGIK | GMM | Diagonal | 7 | 0.9359 | 0.8581 | 0.8649 | 0.8613 ± 0.0149 |
| | BCDFGIK | OC-KNN | Cosine | 10 | 0.9649 | 0.9290 | 0.9062 | 0.9175 ± 0.0162 |
| | ABCDEFGHIJKL | PPNN | – | – | 0.8281 | 0.7699 | 0.7897 | 0.7793 ± 0.0196 |
| | BCDFGIK | OC-SVM | Linear | – | 0.9552 | 0.8903 | 0.9164 | 0.9029 ± 0.0227 |
| IMUFD | ABCDEFGHIJKL | Autoencoder | Logistic sigmoid | 15 | 0.9111 | 0.8238 | 0.8610 | 0.8419 ± 0.0269 |
| | BCDFGIK | GMM | Diagonal | 3 | 0.9491 | 0.8991 | 0.9159 | 0.9069 ± 0.0294 |
| | BCDFGIK | OC-KNN | Cosine | 5 | 0.9710 | 0.9712 | 0.9212 | 0.9458 ± 0.0184 |
| | BCDFGIK | PPNN | – | – | 0.9269 | 0.8227 | 0.9028 | 0.8608 ± 0.0294 |
| | BCDFGIK | OC-SVM | Linear | – | 0.9745 | 0.9668 | 0.9135 | 0.9393 ± 0.0109 |
| KFall | ABCDEFGHIJKL | Autoencoder | Logistic sigmoid | 12 | 0.9931 | 0.9727 | 0.9699 | 0.9713 ± 0.0059 |
| | BCDFGIK | GMM | Diagonal | 7 | 0.9875 | 0.9697 | 0.9506 | 0.9601 ± 0.0063 |
| | BCDFGIK | OC-KNN | Minkowski | 5 | 0.9976 | 0.9893 | 0.9895 | 0.9894 ± 0.0026 |
| | HCTSA | PPNN | – | – | 0.9906 | 0.9607 | 0.9450 | 0.9528 ± 0.0077 |
| | ABCDEFGHIJKL | OC-SVM | Linear | – | 0.9940 | 0.9705 | 0.9672 | 0.9688 ± 0.0098 |
| SisFall | BCDFGIK | Autoencoder | Logistic sigmoid | 6 | 0.9611 | 0.9304 | 0.8948 | 0.9124 ± 0.0054 |
| | BCDFGIK | GMM | Diagonal | 7 | 0.9483 | 0.9221 | 0.8418 | 0.8808 ± 0.0135 |
| | BCDFGIK | OC-KNN | Cosine | 5 | 0.9895 | 0.9521 | 0.9634 | 0.9578 ± 0.0061 |
| | HCTSA | PPNN | – | – | 0.9898 | 0.9638 | 0.9575 | 0.9606 ± 0.0045 |
| | HCTSA | OC-SVM | Linear | – | 0.9932 | 0.9583 | 0.9567 | 0.9575 ± 0.0073 |
| UMAFall | BCDFGIK | Autoencoder | Logistic sigmoid | 15 | 0.9802 | 0.9946 | 0.9416 | 0.9677 ± 0.0123 |
| | BCDFGIK | GMM | Diagonal | 5 | 0.9769 | 0.9629 | 0.9086 | 0.9353 ± 0.0164 |
| | BCDFGIK | OC-KNN | Cosine | 10 | 0.9881 | 0.9895 | 0.9670 | 0.9781 ± 0.0109 |
| | HCTSA | PPNN | – | – | 0.9710 | 0.9095 | 0.9593 | 0.9337 ± 0.0119 |
| | BCDFGIK | OC-SVM | Cubic | – | 0.9924 | 0.9895 | 0.9670 | 0.9781 ± 0.0144 |
| UP-Fall | BCDFGIK | Autoencoder | Logistic sigmoid | 6 | 0.9674 | 1.0000 | 0.9249 | 0.9616 ± 0.0152 |
| | BCDFGIK | GMM | Diagonal | 3 | 0.9680 | 0.9755 | 0.9052 | 0.9394 ± 0.0300 |
| | BCDFGIK | OC-KNN | Cosine | 10 | 0.9888 | 0.9918 | 0.9685 | 0.9800 ± 0.0070 |
| | ABCDEFGHIJKL | PPNN | – | – | 0.9709 | 0.9837 | 0.9605 | 0.9720 ± 0.0248 |
| | BCDFGIK | OC-SVM | Cubic | – | 0.9912 | 0.9918 | 0.9842 | 0.9880 ± 0.0110 |
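For reference, the metrics reported in Tables 6–8 can be computed from the binary decisions and the raw anomaly scores as in the following sketch (these are the standard definitions; `roc_auc_score` from scikit-learn is one possible implementation of the AUC):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def se_sp_gm(y_true: np.ndarray, y_pred: np.ndarray):
    """y_true/y_pred: 1 = fall, 0 = ADL (binary decisions after thresholding)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    se = tp / (tp + fn)                  # sensitivity: fraction of falls detected
    sp = tn / (tn + fp)                  # specificity: fraction of ADLs not flagged
    return se, sp, np.sqrt(se * sp)      # geometric mean reported in the tables

# The AUC is computed from the continuous scores rather than the decisions:
#   auc = roc_auc_score(y_true, anomaly_scores)
```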
Table 7. Obtained AUC (Area Under the Curve) of the ROC for the best combination of hyperparameters of the classifiers.

| Classifier | DLR | DOFDA | Erciyes | FallAllD | IMUFD | KFall | SisFall | UMAFall | UP-Fall |
|---|---|---|---|---|---|---|---|---|---|
| Autoencoder | 0.8976 | 0.9704 | 0.9795 | 0.9070 | 0.9111 | 0.9931 | 0.9611 | 0.9802 | 0.9674 |
| GMM | 0.9483 | 0.9762 | 0.9857 | 0.9359 | 0.9491 | 0.9875 | 0.9483 | 0.9769 | 0.9680 |
| OC-KNN | 0.9460 | 0.9727 | 0.9951 | 0.9649 | 0.9710 | 0.9976 | 0.9895 | 0.9881 | 0.9888 |
| PPNN | 0.5564 | 0.9934 | 0.9898 | 0.8281 | 0.9269 | 0.9906 | 0.9898 | 0.9710 | 0.9709 |
| OC-SVM | 0.9068 | 0.9775 | 0.9867 | 0.9552 | 0.9745 | 0.9940 | 0.9932 | 0.9924 | 0.9912 |
Table 8. Maximum obtained geometric mean of sensitivity and specificity (√(Se·Sp)) for the best combination of hyperparameters of the classifiers.

| Classifier | DLR | DOFDA | Erciyes | FallAllD | IMUFD | KFall | SisFall | UMAFall | UP-Fall |
|---|---|---|---|---|---|---|---|---|---|
| Autoencoder | 0.9007 | 0.9733 | 0.9488 | 0.8449 | 0.8449 | 0.9713 | 0.9124 | 0.9677 | 0.9616 |
| GMM | 0.9371 | 0.9833 | 0.9541 | 0.8613 | 0.9069 | 0.9601 | 0.8808 | 0.9353 | 0.9394 |
| OC-KNN | 0.9286 | 0.9716 | 0.9814 | 0.9175 | 0.9458 | 0.9894 | 0.9578 | 0.9781 | 0.9800 |
| PPNN | 0.6165 | 0.9749 | 0.9627 | 0.7793 | 0.8608 | 0.9528 | 0.9606 | 0.9337 | 0.9720 |
| OC-SVM | 0.8864 | 0.9901 | 0.9745 | 0.9029 | 0.9393 | 0.9688 | 0.9575 | 0.9781 | 0.9880 |
Table 9. Comparison of the performance metrics of the majority voting ensemble and those of the best single OCC.

| Dataset | Algorithm | Se | Sp | √(Se·Sp) |
|---|---|---|---|---|
| DLR | GMM (diagonal, 7 components) | 1.0000 | 0.8784 | 0.9371 |
| | Majority voting ensemble | 0.9333 | 0.9146 | 0.9215 |
| DOFDA | OC-SVM (linear kernel) | 0.9803 | 1.0000 | 0.9901 |
| | Majority voting ensemble | 0.9803 | 1.0000 | 0.9901 |
| Erciyes | OC-KNN (cosine distance, 5 neighbors) | 0.9846 | 0.9782 | 0.9814 |
| | Majority voting ensemble | 0.9852 | 0.9776 | 0.9814 |
| FallAllD | OC-KNN (cosine distance, 10 neighbors) | 0.9290 | 0.9062 | 0.9175 |
| | Majority voting ensemble | 0.9269 | 0.9223 | 0.9245 |
| IMUFD | OC-KNN (cosine distance, 5 neighbors) | 0.9712 | 0.9212 | 0.9458 |
| | Majority voting ensemble | 0.9763 | 0.9342 | 0.9550 |
| KFall | OC-KNN (Minkowski distance, 5 neighbors) | 0.9893 | 0.9895 | 0.9894 |
| | Majority voting ensemble | 0.9987 | 0.9985 | 0.9986 |
| SisFall | OC-KNN (cosine distance, 5 neighbors) | 0.9638 | 0.9575 | 0.9606 |
| | Majority voting ensemble | 0.9638 | 0.9627 | 0.9632 |
| UMAFall | OC-KNN (cosine distance, 10 neighbors) | 0.9895 | 0.9670 | 0.9781 |
| | Majority voting ensemble | 0.9895 | 0.9771 | 0.9832 |
| UP-Fall | OC-SVM (cubic kernel) | 0.9918 | 0.9842 | 0.9880 |
| | Majority voting ensemble | 0.9959 | 0.9841 | 0.9899 |
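A majority-voting ensemble of the kind compared in Table 9 can be sketched as follows: with five one-class classifiers, a sample is declared a fall when at least three of them flag it. This is our minimal illustration of the voting rule, not the study's exact implementation.

```python
import numpy as np

def majority_vote(votes: np.ndarray) -> np.ndarray:
    """votes: (n_classifiers, n_samples) array of 0/1 decisions (1 = fall)."""
    return (votes.sum(axis=0) >= (votes.shape[0] // 2 + 1)).astype(int)

# e.g., five classifiers voting on four samples:
votes = np.array([[1, 0, 1, 0],
                  [1, 0, 0, 0],
                  [1, 1, 1, 0],
                  [0, 0, 1, 0],
                  [1, 0, 1, 1]])
print(majority_vote(votes))  # -> [1 0 1 0]
```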
Table 10. Categorization criteria to divide the ADL movements into different types.

| ADL Category | Description | Examples |
|---|---|---|
| Basic movements | Simple movements of low intensity | Getting up from a bed or chair, sitting down, lying down, turning over while lying down, standing, clapping hands, etc. |
| Standard movements | Routines of daily life that require intermediate physical effort or a certain degree of mobility | Walking at a normal pace, climbing up/down stairs, squatting, picking up an object from the floor, etc. |
| Sporting movements | Activities that require a higher physical effort and brusque and/or repetitive movements | Running, jogging, hopping, walking fast, etc. |
Table 11. Results of the classifiers when they are tested with falls and basic activities and trained with the rest of ADL categories.

| Dataset | OCC | Se | Sp | √(Se·Sp) | Loss |
|---|---|---|---|---|---|
| DLR | Autoencoder | 0.8125 | 0.4299 | 0.5910 | −0.3097 |
| | GMM | 0.8125 | 0.3293 | 0.5172 | −0.4199 |
| | OC-KNN | 0.8125 | 0.7591 | 0.7854 | −0.1432 |
| | PPNN | 0.7500 | 0.0122 | 0.0959 | −0.5206 |
| | OC-SVM | 0.4375 | 0.0427 | 0.1367 | −0.7497 |
| Erciyes | Autoencoder | 0.9885 | 0.8457 | 0.9143 | −0.0345 |
| | GMM | 0.9104 | 0.9213 | 0.9159 | −0.0382 |
| | OC-KNN | 0.9632 | 0.9074 | 0.9349 | −0.0465 |
| | PPNN | 0.9764 | 0.8164 | 0.8928 | −0.0699 |
| | OC-SVM | 0.9863 | 0.8765 | 0.9298 | −0.0447 |
| FallAllD | Autoencoder | 0.9555 | 0.7978 | 0.8639 | +0.0190 |
| | GMM | 0.9140 | 0.8352 | 0.8737 | +0.0124 |
| | OC-KNN | 0.9269 | 0.8747 | 0.9004 | −0.0171 |
| | PPNN | 0.6688 | 0.7758 | 0.7203 | −0.0568 |
| | OC-SVM | 0.9290 | 0.6813 | 0.7956 | −0.1073 |
| IMUFD | Autoencoder | 0.9761 | 0.7387 | 0.8492 | +0.0073 |
| | GMM | 0.9522 | 0.5495 | 0.7234 | −0.1835 |
| | OC-KNN | 0.9569 | 0.8378 | 0.8954 | −0.0504 |
| | PPNN | 0.8852 | 0.8108 | 0.8472 | −0.0136 |
| | OC-SVM | 1.0000 | 0.7027 | 0.8383 | −0.1010 |
| KFall | Autoencoder | 0.9944 | 0.8690 | 0.9296 | −0.0417 |
| | GMM | 0.9851 | 0.8558 | 0.9181 | −0.0420 |
| | OC-KNN | 0.9624 | 0.9760 | 0.9692 | −0.0202 |
| | PPNN | 0.9018 | 0.9928 | 0.9462 | +0.0005 |
| | OC-SVM | 0.9953 | 0.9183 | 0.9560 | −0.0128 |
| SisFall | Autoencoder | 0.9076 | 0.5626 | 0.7146 | −0.1978 |
| | GMM | 0.9950 | 0.3705 | 0.6072 | −0.2736 |
| | OC-KNN | 0.9839 | 0.5073 | 0.7065 | −0.2513 |
| | PPNN | 0.8520 | 0.0825 | 0.2650 | −0.6956 |
| | OC-SVM | 0.9911 | 0.1574 | 0.3949 | −0.5626 |
| UMAFall | Autoencoder | 0.8670 | 0.8493 | 0.8581 | −0.1096 |
| | GMM | 0.9309 | 0.7226 | 0.8201 | −0.1152 |
| | OC-KNN | 0.9894 | 0.7397 | 0.8555 | −0.1226 |
| | PPNN | 0.9894 | 0.2226 | 0.4693 | −0.4644 |
| | OC-SVM | 0.9947 | 0.6130 | 0.7809 | −0.1972 |
| UP-Fall | Autoencoder | 0.9510 | 0.9881 | 0.9694 | +0.0078 |
| | GMM | 0.9633 | 0.9881 | 0.9756 | +0.0362 |
| | OC-KNN | 0.9796 | 1.0000 | 0.9897 | +0.0097 |
| | PPNN | 0.9714 | 0.9762 | 0.9738 | +0.0130 |
| | OC-SVM | 0.9918 | 1.0000 | 0.9959 | +0.0079 |
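The Loss column in Tables 11–13 can be read as the difference between the geometric mean obtained under this training/test split and the baseline value reported in Table 6 for the same dataset and classifier (nearly every row matches this reading exactly):

Loss = √(Se·Sp)|test − √(Se·Sp)|Table 6, e.g., for the autoencoder on DLR in Table 11: 0.5910 − 0.9007 = −0.3097.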
Table 12. Results of the classifiers when they are tested with falls and standard movements and trained with the rest of ADL categories.

| Dataset | OCC | Se | Sp | √(Se·Sp) | Loss |
|---|---|---|---|---|---|
| DLR | Autoencoder | 0.8125 | 0.9008 | 0.8555 | −0.0452 |
| | GMM | 0.9375 | 0.8855 | 0.9111 | −0.0260 |
| | OC-KNN | 0.8125 | 0.8168 | 0.8146 | −0.1140 |
| | PPNN | 0.6875 | 0.0458 | 0.1775 | −0.4390 |
| | OC-SVM | 0.7500 | 0.9313 | 0.8357 | −0.0507 |
| Erciyes | Autoencoder | 0.9665 | 0.9817 | 0.9741 | +0.0253 |
| | GMM | 0.9764 | 0.9854 | 0.9809 | +0.0268 |
| | OC-KNN | 0.9780 | 0.9689 | 0.9735 | −0.0079 |
| | PPNN | 0.9720 | 0.9689 | 0.9704 | +0.0077 |
| | OC-SVM | 0.9868 | 0.9634 | 0.9751 | +0.0006 |
| FallAllD | Autoencoder | 0.8258 | 0.8808 | 0.8528 | +0.0079 |
| | GMM | 0.8602 | 0.8808 | 0.8704 | +0.0091 |
| | OC-KNN | 0.8903 | 0.9621 | 0.9255 | +0.0080 |
| | PPNN | 0.6774 | 0.7832 | 0.7284 | −0.0487 |
| | OC-SVM | 0.8129 | 0.6667 | 0.7362 | −0.1667 |
| IMUFD | Autoencoder | 0.8134 | 0.7833 | 0.7982 | −0.0437 |
| | GMM | 0.9330 | 0.9667 | 0.9497 | +0.0428 |
| | OC-KNN | 0.9378 | 1.0000 | 0.9684 | +0.0226 |
| | PPNN | 0.7416 | 1.0000 | 0.8612 | +0.0004 |
| | OC-SVM | 0.8565 | 1.0000 | 0.9255 | −0.0138 |
| KFall | Autoencoder | 0.9983 | 0.9408 | 0.9691 | −0.0022 |
| | GMM | 0.9893 | 0.9723 | 0.9808 | +0.0207 |
| | OC-KNN | 0.9859 | 0.9781 | 0.9820 | −0.0074 |
| | PPNN | 0.9061 | 0.9494 | 0.9275 | −0.0182 |
| | OC-SVM | 0.9910 | 0.8177 | 0.9002 | −0.0686 |
| SisFall | Autoencoder | 0.8620 | 0.6754 | 0.7630 | −0.1494 |
| | GMM | 0.9182 | 0.5331 | 0.6996 | −0.1812 |
| | OC-KNN | 0.9482 | 0.7756 | 0.8576 | −0.1002 |
| | PPNN | 0.9727 | 0.7715 | 0.8663 | −0.0943 |
| | OC-SVM | 0.9488 | 0.8317 | 0.8883 | −0.0692 |
| UMAFall | Autoencoder | 0.9947 | 0.8936 | 0.9428 | −0.0249 |
| | GMM | 0.9894 | 0.9362 | 0.9624 | +0.0271 |
| | OC-KNN | 0.9787 | 0.9574 | 0.9680 | −0.0101 |
| | PPNN | 0.8723 | 1.0000 | 0.9340 | +0.0003 |
| | OC-SVM | 0.9574 | 1.0000 | 0.9785 | +0.0004 |
| UP-Fall | Autoencoder | 0.9714 | 0.9675 | 0.9695 | +0.0079 |
| | GMM | 1.0000 | 0.9431 | 0.9711 | +0.0317 |
| | OC-KNN | 1.0000 | 0.9512 | 0.9753 | −0.0047 |
| | PPNN | 1.0000 | 0.6423 | 0.8014 | −0.1594 |
| | OC-SVM | 1.0000 | 0.6829 | 0.8264 | −0.1616 |
Table 13. Results of the classifiers when they are tested with falls and sporting activities and trained with the rest of ADL categories (results for the IMUFD dataset are not included, as this repository does not include sporting movements).

| Dataset | OCC | Se | Sp | √(Se·Sp) | Loss |
|---|---|---|---|---|---|
| DLR | Autoencoder | 0.9375 | 0.0294 | 0.1661 | −0.7346 |
| | GMM | 0.9375 | 0.0294 | 0.1661 | −0.7710 |
| | OC-KNN | 1.0000 | 0.0735 | 0.2712 | −0.6574 |
| | PPNN | 0.7500 | 0.8529 | 0.7998 | +0.1833 |
| | OC-SVM | 0.3125 | 0.9706 | 0.5507 | −0.3357 |
| Erciyes | Autoencoder | 0.5231 | 0.6848 | 0.5985 | −0.3503 |
| | GMM | 0.9423 | 0.2609 | 0.4958 | −0.4583 |
| | OC-KNN | 0.9758 | 0.9348 | 0.9551 | −0.0263 |
| | PPNN | 0.9500 | 0.9891 | 0.9694 | +0.0067 |
| | OC-SVM | 0.9571 | 0.9457 | 0.9514 | −0.0231 |
| FallAllD | Autoencoder | 0.6796 | 0.4176 | 0.5327 | −0.3122 |
| | GMM | 0.7978 | 0.3235 | 0.5081 | −0.3532 |
| | OC-KNN | 0.9247 | 0.6294 | 0.7629 | −0.1546 |
| | PPNN | 0.8280 | 0.5235 | 0.6584 | −0.1187 |
| | OC-SVM | 0.8731 | 0.6765 | 0.7685 | −0.1344 |
| KFall | Autoencoder | 0.9377 | 0.4694 | 0.6635 | −0.3078 |
| | GMM | 0.5137 | 0.6332 | 0.5703 | −0.3898 |
| | OC-KNN | 0.9684 | 0.7686 | 0.8627 | −0.1267 |
| | PPNN | 0.9667 | 0.7162 | 0.8320 | −0.1137 |
| | OC-SVM | 0.9744 | 0.7358 | 0.8467 | −0.1221 |
| SisFall | Autoencoder | 0.7607 | 0.7254 | 0.7428 | −0.1696 |
| | GMM | 0.8564 | 0.3264 | 0.5287 | −0.3521 |
| | OC-KNN | 0.9171 | 0.5181 | 0.6893 | −0.2685 |
| | PPNN | 0.9176 | 0.9793 | 0.9480 | −0.0126 |
| | OC-SVM | 0.9338 | 0.9378 | 0.9358 | −0.0217 |
| UMAFall | Autoencoder | 0.9521 | 0.0000 | 0.0000 | −0.9677 |
| | GMM | 0.9096 | 0.0000 | 0.0000 | −0.9353 |
| | OC-KNN | 0.8989 | 0.5273 | 0.6885 | −0.2896 |
| | PPNN | 0.8989 | 0.6182 | 0.7455 | −0.1882 |
| | OC-SVM | 0.7287 | 0.8545 | 0.7891 | −0.1890 |
| UP-Fall | Autoencoder | 0.8667 | 0.0000 | 0.0000 | −0.9616 |
| | GMM | 0.7792 | 0.0000 | 0.0000 | −0.9394 |
| | OC-KNN | 0.9292 | 0.9565 | 0.9427 | −0.0373 |
| | PPNN | 0.9625 | 0.8913 | 0.9262 | −0.0346 |
| | OC-SVM | 0.8898 | 0.9565 | 0.9226 | −0.0654 |