1. Introduction
According to the World Health Organization (WHO), heart disease (HD), also known as cardiovascular disease (CVD), is one of the major causes of mortality in the world today [1]. The WHO estimated that 17.9 million people died from CVDs in 2019, accounting for 32% of all global deaths. Heart disease describes a series of conditions that affect the heart and, in turn, impair its ability to pump blood around the body normally [2]. However, cardiovascular or heart disease cannot be tracked without considering the heart rate (HR), which is one of the important measures of heart health. The HR is the number of times the heart’s chambers contract (squeeze) and relax to pump blood within a specified period (i.e., one minute); at rest, a normal heart beats approximately 60–80 times per minute [2]. The heart rate, however, is affected by the activities a human engages in; as a result, heart rate data are nonstationary in nature, which makes them difficult to model or forecast [3,4]. This unpredictability may be further complicated by behavioral risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol, which contribute to worse wellbeing and may even double the death risk of a CVD patient [4,5]. It is therefore important to detect cardiovascular disease as early as possible.
Recent advances in artificial intelligence are bringing a paradigm shift to healthcare, ranging from early disease detection and diagnosis to personalized treatment and prognosis evaluation [6,7,8,9,10]. The ongoing revolution in health and clinical examination procedures has continued to improve with the increasing adoption of wearable sensors [11]. For example, health monitoring systems now monitor a patient’s cardiovascular condition at home in order to provide appropriate recommendations to both patients and medical consultants [12]. The low cost and non-invasive nature of wearable devices has made it possible to record large quantities of physiological data, track medications, follow the recovery of post-operative patients and track sleep. This, in turn, enables real-time monitoring of vital signs and provides more timely data for analysis and earlier detection of disease or the risk of major health events.
This contribution of fast-growing IoT and wearable monitoring systems to the healthcare space has played a vital role in the early detection of a high heart rate to prevent the risk of cardiovascular disease progression. Early detection and diagnosis of cardiovascular disease are very important because the disease is easier to manage and treat at its early stages [1,4].
Several research works have used various techniques such as statistical models, machine learning models and historical data to measure various risk factors of several diseases. Recently, novel mechanical elements have been used as wearable sensors and actuators due to their incredibly small sizes. Accelerometers are sensors that are used to accurately monitor human activity by measuring external forces along a reference axis. Accelerometers can also be used to monitor heart rates [13]; they generate time-series streaming heart rate data that can be processed on a row-by-row basis by time progression. In recent research, [14] used an accelerometer to monitor several subjects and their 24 h HR, and shared their collection of raw data, including several other 24 h continuous psycho-physiological measurements, which enables investigation of possible relationships between the physical and psychological characteristics of people in daily life. Further, the combination of these data enables the development of tools that can predict the users’ well-being. However, the HR time-series dataset provided by the study needs to be analyzed in a consecutive and incremental way using a sliding time window approach. To assess the effectiveness of using accelerometer data to monitor and predict heart rates, we explored several data analytics and machine learning techniques in this study to make future HR predictions from the accelerometer’s univariate HR time-series dataset.
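As a rough illustration of the sliding time window idea mentioned above, the following minimal Python sketch turns a univariate HR series into (window, next value) pairs. The function name `sliding_windows`, the window length and the sample values are illustrative assumptions and are not taken from the study's code.

```python
from typing import List, Sequence, Tuple


def sliding_windows(series: Sequence[float], window: int) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive samples with the sample that follows it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    # Slide one step at a time; stop while a target value is still available.
    for start in range(len(series) - window):
        features = list(series[start:start + window])
        target = series[start + window]
        pairs.append((features, target))
    return pairs


# Hypothetical second-by-second heart rates (beats per minute).
hr = [72.0, 73.0, 75.0, 74.0, 76.0, 78.0]
pairs = sliding_windows(hr, window=3)
for features, target in pairs:
    print(features, "->", target)
```

Each pair could then serve as one training example for a regressor, with the window length playing the role of the sliding window duration.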
Over the last few decades, much research has been directed at understanding and predicting the future from time series data. In the literature, several linear approaches have been proposed for time series forecasting. Autoregressive integrated moving average (ARIMA) models have gained popularity as linear models over the past three decades [15] and, hence, have been widely applied to construct more accurate hybrid models in time series forecasting. ARIMA models have been applied for forecasting in many fields such as health, social, economic, engineering, foreign exchange and stock problems. A study by [16] used ARIMA to perform a spatial prediction of the COVID-19 epidemic to forecast the epidemiologic pattern in India. Further, ARIMA was used to forecast coronavirus disease in Indonesia in a study by [17]. An investigation on the effect of post-traumatic stress disorder (PTSD) on various factors including heart rate by [18] employed ARIMA models to analyze the heart rate data. An ARIMA model was used by [19] to capture the trend of pulse production in India and to predict pulse production from 2020 to 2029 in order to bridge the gap between supply and demand. Two time series models were employed to estimate the growth rate of glioblastoma in response to ionizing radiotherapy treatment in a comparative study presented by [20]. Their study showed that ARIMA performed better than the Holt method based on the mean square error (MSE) and MAPE values obtained. Ref. [21] applied an ARIMA time series model to forecast the future gold price in India to mitigate the risk in gold purchases. A study presented by [22] proposed a novel approach to improve an ARIMA model by applying a mean estimation error for time series forecasting. A novel hybridization of artificial neural networks (ANNs) and an ARIMA model was proposed by [15] to overcome limitations of ANNs. Their model produced more general and more accurate forecasts than traditional hybrid ARIMA–ANN models. However, ARIMA models perform best when the time series is stationary and the data are free from missing values, which may otherwise have to be imputed through advanced interpolation techniques [23,24]. Moreover, ARIMA is linear in nature and may not capture nonlinear behavior [25], and it is unreasonable to assume that a particular realization of a given time series is generated by a linear process. This limitation has led to the exploration of alternatives to statistical linear models: machine learning and deep learning.
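To make the stationarity requirement concrete, the sketch below applies first-order differencing, the operation behind the "integrated" part of ARIMA, to a series with a linear trend. It is a minimal illustration in plain Python; the helper name `difference` and the toy values are assumptions and are not part of the study.

```python
from typing import List, Sequence


def difference(series: Sequence[float], order: int = 1) -> List[float]:
    """Apply `order` rounds of first differencing: y[t] = x[t] - x[t-1]."""
    result = list(series)
    for _ in range(order):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Toy series with a steady upward trend (not real heart rate data).
trend = [60.0, 62.0, 64.0, 66.0, 68.0]
diffed = difference(trend)
print(diffed)  # [2.0, 2.0, 2.0, 2.0]: the trend is reduced to a constant increment
```

Differencing removes this kind of trend but does not address nonlinear structure, which is the limitation the paragraph above points to.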
Many studies have been conducted using machine learning and deep learning to examine time series data. A study by [26] explored the applicability of machine learning and the advantages of recurrent neural networks (RNNs) for pore-water pressure (PWP) time-series prediction. A comparative investigation between different deep learning models such as LSTM, BI-LSTM and CNN, using univariate and multivariate time-series data, was conducted by [27] for forecasting blood pressure and heart rate. The models were used to predict blood pressure (BP) and HR 30 min in advance, each as a univariate series, and to predict BP and HR jointly as a multivariate series. In work by [28], a novel hybrid machine learning technique was proposed to improve the accuracy in the prediction of cardiovascular disease. Their prediction model, a hybrid random forest with a linear model (HRFLM), produced an enhanced performance with an accuracy of 88.7%. A real-time prediction system for heart rate was proposed by [4] using deep learning and stream processing platforms with a heart rate time-series dataset extracted from Medical Information Mart for Intensive Care (MIMIC-II). Their proposed system consists of two phases, namely, an offline phase and an online phase. Different deep learning forecasting techniques were used to find the lowest mean square error in the offline phase. The best model developed in the offline phase was then used to predict the heart rate in advance in the online phase. In a telehealth system architecture developed by [29] for monitoring cardiovascular risk, a fuzzy inference system (FIS) was employed to predict the level of cardiovascular risk from vital parameters related to cardiovascular diseases, such as heart rate, respiration rate, blood oxygen saturation and color of lips, that were collected through a contact-less smart object. A machine learning approach was proposed to improve the accuracy of HR detection in naturalistic measurements in a study by [30]. A four-layer deep neural network, comprising two CNN layers and two LSTM layers, was used by [31] to model and predict heart rate. The proposed network was evaluated on the TROIKA dataset with 22 PPG records collected during various physical activities. The proposed system achieved an improved mean absolute error for heart rate prediction. A novel deep learning framework was developed by [32] for real-time heart rate estimation from facial video captured by an RGB camera. Ref. [33] proposed the use of an LSTM deep learning model for initial diagnosis of heart failure (HF). Their proposed model was compared with other baseline models such as multilayer perceptron (MLP), logistic regression, k-nearest neighbor (KNN) and support vector machine (SVM). The results show that the proposed model achieved the best accuracy compared to the other algorithms.
However, none of these methods is a universal model that is suitable for all circumstances. Approximating complex nonlinear problems with ARIMA models, or linear problems with machine learning models, may be inappropriate, as may applying either to problems that contain both linear and nonlinear correlation structures. Using hybrid models or combining several models has become a common practice to overcome the limitations of component models and improve the forecasting accuracy [15]. A study presented by [34] used two approaches for energy consumption forecasting: an autoregressive integrated moving average (ARIMA) model and a non-linear autoregressive neural network (NAR) model.
Hence, the limitations and the inapplicability of any single method for solving time-series prediction problems show a need to explore the effectiveness of these popular forecasting techniques in cardiovascular disease prediction using 24 h accelerometer-generated HR time-series recordings. To the best of our knowledge, none of the existing studies on heart rate prediction used the ARIMA model for predicting future heart rates. For this reason, in this paper, we employed the ARIMA model, regression models and a deep learning model for predicting heart rates. The data analytics methods included an autoregressive integrated moving average (ARIMA) model, linear regression, support vector regression (SVR), k-nearest neighbor (KNN) regressor, decision tree regressor, random forest regressor and a long short-term memory (LSTM) recurrent neural network algorithm. We compared the performances of these models by evaluating the root mean squared error (RMSE) and calculating the scatter index (SI) of each model against the different sliding windows.
Our experimental results indicate that the ARIMA model performs better in predicting future heart rates from univariate heart rate time-series data than the machine learning and deep learning models. Thus, our findings suggest that ARIMA is the more accurate model for predicting future heart rates.
3. Results
Our study used the autoregressive integrated moving average (ARIMA) model, linear regression, support vector regression (SVR), k-nearest neighbor (KNN) regressor, decision tree regressor, random forest regressor and long short-term memory (LSTM) recurrent neural network algorithm to predict future HR from univariate HR time-series data obtained from an Actigraph dataset of 22 healthy subjects. Each model was evaluated using the RMSE and SI for different sliding windows (30 s, 1 min, 3 min, 5 min, 10 min, 15 min, 30 min and 1 h), and the average RMSE and SI across all subjects were computed for each sliding window.
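For readers unfamiliar with the two metrics, the sketch below computes the RMSE and a scatter index in Python. It assumes the common definition of SI as the RMSE divided by the mean of the observed values, expressed as a percentage; this section does not restate the formula, so that definition, the function names and the sample values should all be treated as assumptions.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def scatter_index(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """SI in percent, assumed here to be RMSE / mean(observed) * 100."""
    mean_observed = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_observed * 100.0


# Hypothetical observed and predicted heart rates (beats per minute).
observed = [70.0, 72.0, 74.0, 76.0]
predicted = [71.0, 71.0, 75.0, 75.0]
print(round(rmse(observed, predicted), 2), round(scatter_index(observed, predicted), 2))
```

Under this definition, a perfect prediction gives an SI of 0%, and the SI grows as the error becomes large relative to the typical heart rate.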
We first evaluated the model performance using a sliding window of 30 s for prediction. The ARIMA and SVR models had the best SI scores of 0.00% and 0.29%, respectively, while the KNN regressor and LSTM performed the worst, with SI scores of 41.36% and 34.15%, respectively, as shown in Table 1.
In the 1 min sliding window experiment, the ARIMA and SVR models had the best SI scores of 0.00% and 0.29%, respectively, while the KNN regressor and LSTM models performed the worst, with SI scores of 41.36% and 34.15%, respectively, as shown in Table 2 below.
The ARIMA and linear regression models performed best for the 3 min sliding window, with SI scores of 1.38% and 1.76%, respectively. KNN and LSTM showed fair performance, with SI scores of 3.62% and 3.31%, respectively, as shown in Table 3 below.
Evaluation of the models for the 5 min sliding window showed that the ARIMA and linear regression models had the best performance, with SI scores of 1.57% and 1.80%, respectively, while the LSTM and SVR models had a fair performance, with SI scores of 3.27% and 3.21%, respectively, as shown in Table 4.
In the 10 min sliding window experiment, the ARIMA model and the linear regression model again performed the best, with SI values of 1.36% and 1.38%, respectively, and the SVR and LSTM models had a fair performance, with SI values of 2.68% and 2.36%, respectively, as shown in Table 5 below.
For the 15 min sliding window experiment, the ARIMA model and the linear regression model showed the best performance, with SI values of 1.33% and 1.44%, respectively, while the KNN and random forest models performed poorly, with SI values of 5.87% and 5.25%, respectively, as shown in Table 6.
When the experiment was carried out on a 30 min sliding window, the results showed that the ARIMA model and the linear regression model again performed the best, with SI values of 1.64% and 1.67%, respectively, and the LSTM and SVR models had a fair performance, with SI values of 2.33% and 2.17%, respectively, as shown in Table 7 below.
Finally, the models were also evaluated for the 1 h HR recording sliding window. The linear regression and ARIMA models again had the best performances, with SI scores of 1.63% and 1.17%, respectively, while the LSTM and KNN models had a fair performance, with SI scores of 3.04% and 2.10%, respectively, as shown in Table 8.
4. Discussion
In our study, we used the 24 h accelerometer-generated HR time-series research data provided by [14] for prediction. This may not be as efficient a way to capture HR data [50] as an electrocardiogram, which records more accurate HR data but is not suitable for everyday use [31]. However, the research dataset also captured the inter-beat interval (IBI) recordings, which we reconstructed to filter out ectopic heart beats from the accelerometer data, thus producing more reliable and accurate data.
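One simple way to picture this step is sketched below: IBI values are converted to instantaneous heart rate with HR = 60 / IBI (IBI in seconds), after discarding intervals that deviate sharply from the last accepted interval. The 20% threshold, the function names and the sample values are illustrative assumptions; this section does not specify the filtering rule that was actually used.

```python
from typing import List, Sequence


def filter_ectopic(ibi_seconds: Sequence[float], max_change: float = 0.2) -> List[float]:
    """Keep intervals within `max_change` (relative) of the last accepted interval.

    The first positive interval is accepted unconditionally (a simplification).
    """
    kept: List[float] = []
    for ibi in ibi_seconds:
        if ibi <= 0:
            continue  # skip physically impossible intervals
        if not kept or abs(ibi - kept[-1]) / kept[-1] <= max_change:
            kept.append(ibi)
    return kept


def ibi_to_hr(ibi_seconds: Sequence[float]) -> List[float]:
    """Convert inter-beat intervals in seconds to heart rate in beats per minute."""
    return [60.0 / ibi for ibi in ibi_seconds]


# Hypothetical IBI values; 0.40 s mimics a premature (ectopic) beat.
ibi = [0.80, 0.82, 0.40, 0.81, 0.79]
clean = filter_ectopic(ibi)
print([round(hr, 1) for hr in ibi_to_hr(clean)])
```

In this toy input the 0.40 s interval is dropped, so the resulting HR series no longer contains the spurious spike it would otherwise produce.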
The results of the study showed very close evaluation scores for the 30 s and 1 min sliding windows, which indicates very few HR fluctuations within a duration of 30 s; therefore, 30 s of HR recording is not sufficient to make predictions when a high degree of HR fluctuation is expected in the future.
A model with an SI score of less than 5% is considered a very good model for making predictions on our second-to-second HR time-series data; i.e., the closer a model’s SI is to 0%, the closer its performance is to 100%. To visualize this, we computed each model’s performance on a scale of 100% against each sliding window, as shown in Figure 1 below. Some model performances were on the negative scale, which indicates how far their SI values are from 0%.
Further, the study also showed that the ARIMA and linear regression models performed the best in all experiments, and the KNN, LSTM and random forest regressor models performed very poorly; the decision tree regressor model had average performance for the 30 s and 1 min windows. The SVR model also performed better in the first two experiments, i.e., the 30 s and 1 min windows; however, similar to the other models such as the KNN, decision tree regressor, random forest regressor and LSTM, its performance for the other experimental sliding windows was relatively better but unstable. However, our results also indicated that the RMSE and SI were the best in ultra-short (i.e., between 30 s and 4 min) sliding window durations. This is due to the fact that there is a decrease in bias towards the HR as a result of limited HR fluctuations, which is also a good parameter to measure the heart rate variability (HRV) [51].
A comparison of the results of each sliding window in this study with the results of the corresponding sliding window obtained in recent studies by [4] and [30] shows that some of the techniques we explored performed better than the techniques used in their approaches.