Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation

Ullah, Amin; Anwar, Syed Muhammad; Bilal, Muhammad; Mehmood, Raja Majid

doi:10.3390/rs12101685

¹ Software Engineering Department, University of Engineering and Technology Taxila, Punjab 47050, Pakistan

² Center for Research in Computer Vision Lab (CRCV Lab), College of Engineering and Computer Science, University of Central Florida (UCF), Orlando, FL 32816, USA

³ Computer and Electronics Systems Engineering, Hankuk University of Foreign Studies, Yongin-si 17035, Korea

⁴ Information and Communication Technology Department, School of Electrical and Computer Engineering, Xiamen University Malaysia, Sepang 43900, Malaysia

* Author to whom correspondence should be addressed.

Remote Sens. 2020, 12(10), 1685; https://0-doi-org.brum.beds.ac.uk/10.3390/rs12101685

Submission received: 28 March 2020 / Revised: 14 April 2020 / Accepted: 15 April 2020 / Published: 25 May 2020

(This article belongs to the Section Remote Sensing Image Processing)

Abstract

The electrocardiogram (ECG) is one of the most extensively employed signals used in the diagnosis and prediction of cardiovascular diseases (CVDs). The ECG signals can capture the heart’s rhythmic irregularities, commonly known as arrhythmias. A careful study of ECG signals is crucial for precise diagnoses of patients’ acute and chronic heart conditions. In this study, we propose a two-dimensional (2-D) convolutional neural network (CNN) model for the classification of ECG signals into eight classes; namely, normal beat, premature ventricular contraction beat, paced beat, right bundle branch block beat, left bundle branch block beat, atrial premature contraction beat, ventricular flutter wave beat, and ventricular escape beat. The one-dimensional ECG time series signals are transformed into 2-D spectrograms through short-time Fourier transform. The 2-D CNN model consisting of four convolutional layers and four pooling layers is designed for extracting robust features from the input spectrograms. Our proposed methodology is evaluated on a publicly available MIT-BIH arrhythmia dataset. We achieved a state-of-the-art average classification accuracy of 99.11%, which is better than those of recently reported results in classifying similar types of arrhythmias. The performance is significant in other indices as well, including sensitivity and specificity, which indicates the success of the proposed method.

Keywords: ECG signal; classification; arrhythmia; convolution neural network; deep learning

1. Introduction

Cardiovascular diseases (CVDs) are the leading cause of human death, with over 17 million people known to lose their lives annually due to CVDs [1]. According to the World Heart Federation, three-fourths of the total CVD deaths occur among the middle- and low-income segments of society [2]. A classification model that identifies CVDs at an early stage could effectively reduce the mortality rate by enabling timely treatment [3]. One of the common sources of CVDs is cardiac arrhythmia, where heartbeats deviate from their regular beating pattern. A normal heart rate varies with age, body size, activity, and emotions. In cases where the heartbeat feels too fast or slow, the condition is known as palpitations. An arrhythmia does not necessarily mean that the heart is beating too fast or slow; it indicates that the heart is following an irregular beating pattern. It could mean that the heart is beating too fast (tachycardia, more than 100 beats per minute (bpm)) or too slow (bradycardia, less than 60 bpm), skipping a beat, or, in extreme cases, cardiac arrest. Some other common types of abnormal heart rhythms include atrial fibrillation, atrial flutter, and ventricular fibrillation. These deviations can be classified into various subclasses that represent different types of cardiac arrhythmia. An accurate classification of these types could help in the diagnosis and treatment of heart disease patients. Automated detection of such patterns is therefore of great significance in clinical practice, particularly because recognizing the known characteristics of cardiac arrhythmia requires expert clinical knowledge.

Electrocardiogram (ECG) recordings are widely used for diagnosing and predicting cardiac arrhythmia and, in turn, heart disease. Towards this end, clinical experts might need to examine ECG recordings over a long period of time to detect cardiac arrhythmia. The ECG is a one-dimensional (1-D) signal representing a time series, which can be analyzed using machine learning techniques for automated detection of certain abnormalities. Recently, deep learning techniques have been developed, which provide significant performance in radiological image analysis [4,5]. Convolutional neural networks (CNNs) have recently been shown to work for multi-dimensional (1-D, 2-D, and in certain cases, 3-D) inputs but were initially developed for problems dealing with images represented as two-dimensional inputs [6]. For time series data, 1-D CNNs have been proposed but are less versatile than 2-D CNNs. Hence, representing the time series data in a 2-D format could benefit certain machine learning tasks [7,8]. For ECG signals, a 2-D transformation therefore has to be applied to make the time series suitable for deep learning methods that require 2-D images as input. The short-time Fourier transform (STFT) can convert a 1-D signal into a 2-D spectrogram and encapsulate the time and frequency information within a single matrix. The 2-D spectrogram is similar to hyper-spectral and multi-spectral images (MSI), which have diverse applications in remote sensing and clinical diagnosis, including spectral un-mixing, ground cover classification and matching, mineral exploration, medical image classification, change detection, synthetic material identification, target detection, activity recognition, and surveillance [9,10,11,12,13,14,15]. The 2-D matrix of spectrogram coefficients could be useful for extracting robust features for representation of a cardiac ECG signal [16]. This representation could allow the application of CNN architectures (designed to operate on 2-D inputs) for the development of automated systems related to CVDs.

1.1. Related Works

The ECG signal reveals abnormal conditions and malfunctions by recording the bio-electric potential variation of the human heart. Accurately detecting the clinical condition presented by an ECG signal is a challenging task [17]. Therefore, cardiologists need to accurately predict and identify the right kind of abnormal heartbeat ECG wave before recommending a particular treatment. This might require observing and analyzing ECG recordings that continue for hours (e.g., for patients in critical care). To overcome the challenge of visual and manual interpretation of the ECG signal, computer-aided diagnostic systems have been developed to identify such signals automatically [18]. Most of the research in this field has incorporated different machine learning (ML) techniques for the efficient identification and accurate examination of ECG signals [19,20]. ECG signal classification based on different approaches has been presented in the literature, including frequency analysis [21], artificial neural networks (ANNs) [22], heuristic-based methods [23], statistical methods [24], support vector machines (SVMs) [19], wavelet transform [25], filter banks [26], hidden Markov models [27], and mixture-of-expert methods [28]. An artificial neural network based method obtained an average accuracy of 90.6% for the classification of ECG waves into six classes [29]. Meanwhile, a feed-forward neural network was used as a classifier for the detection of four types of arrhythmia classes and achieved an average accuracy of 96.95% [30].

Machine learning is a subset of artificial intelligence used with high-end diagnostic tools [31,32,33,34] for the prediction and diagnosis of different types of illnesses [35]. Deep learning, as a subset of ML, has many applications in the prediction and prevention of fatal sicknesses, particularly CVDs. Different techniques of deep learning used for the analysis of bioinformatics signals have been presented in [31,36,37]. A recurrent neural network (RNN) [38] was used for feature extraction and achieved an average accuracy of 98.06% for detecting four types of arrhythmia. For the classification and extraction of features from a 1-D ECG signal, a 1-D convolutional neural network model was proposed [39] and yielded a classification accuracy of 96.72%. Another deeper 1-D CNN model was proposed for the classification of the ECG dataset [40] and obtained an average accuracy of 97.03%. In both instances, a large ECG dataset was used, but the ECG signals were represented as a 1-D time series. A nine-layer 2-D CNN model was applied for an automatic classification of five different heartbeat arrhythmia types achieving an accuracy of 94.03% [41].

1.2. Our Contributions

Conventional techniques might not achieve efficient results due to the inter-patient variability in ECG signals [42]. Additionally, the efficiency and accuracy of traditional methods could be negatively affected by the increasing size of data [36,43,44]. The techniques presented in the literature have been applied to smaller datasets; however, for the purpose of generalization, the performance should be tested on larger datasets. There are methods reported that use 2-D ECG signals [16,45]; however, to the best of our knowledge, there are no clear details on how the 1-D ECG signal is converted to 2-D images for use with 2-D CNN models. Most methods have been tested on only a few types of arrhythmia and must be evaluated on all major types of arrhythmia. It should be noted that the performance of methods developed for 1-D ECG signals can be further improved. Towards this end, the major contributions of our proposed work are:

Spectrograms (2-D images) are employed, which are generated from the 1-D ECG signal using STFT. In addition, data augmentation was used for the 2-D image representation of ECG signals.
A state-of-the-art performance was achieved in ECG arrhythmia classification by using the proposed CNN-based method with 2-D spectrograms as input.

The rest of the paper is organized as follows. The proposed algorithm is presented in detail in Section 2. The experiments conducted for the validation of the proposed scheme are presented in Section 3. Classification results are presented in Section 4, and conclusions in Section 5.

2. Proposed Scheme

A schematic representation of the proposed scheme is presented in Figure 1. The method consists of five steps, i.e., signal pre-processing, generation of 2-D images (spectrograms), augmentation of data, extraction of features from the data (using the CNN model), and its classification based on the extracted features. The details of these steps are presented in the following subsections.

2.1. Pre-Processing

The three primary forms of noise in the ECG signal are power line interference, baseline drift, and electromyographic noise [46]. The noise from the original ECG signal must be removed to ensure that a denoised ECG signal is obtained for further processing. We combined wavelet based thresholding and the reconstruction algorithm of wavelet decomposition to remove noise from the original ECG signal [47]. The wavelet thresholding was performed using,

\bar{ω}_{x, y} = \begin{cases} \operatorname{sgn}(ω_{x, y}) \left\{ |ω_{x, y}| - \frac{λ}{\exp^{3} [α (|ω_{x, y}| - λ) / λ]} \right\}, & |ω_{x, y}| \geq λ \\ 0, & |ω_{x, y}| < λ \end{cases}

(1)

where ω_{x, y} represents the wavelet coefficients, \bar{ω}_{x, y} represents the estimated wavelet coefficients after thresholding, x represents the scale, y represents the shift, λ represents the threshold, and α is a parameter whose value can be set arbitrarily. The wavelet thresholding reduced the electromyographic noise and power line noise interference. Moreover, the reconstruction algorithm of wavelet decomposition was used to remove the baseline drift noise from the noisy ECG signal.
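The thresholding rule in Equation (1) can be sketched in a few lines of NumPy. This is an illustration only, not the authors' code: the function name `improved_threshold` is hypothetical, and the term exp³[·] is read here as (exp[·])³, i.e., exp(3·α(|ω| − λ)/λ), which is an assumption about the notation. Under that reading the output is continuous at |ω| = λ and approaches ω for large coefficients.

```python
import numpy as np

def improved_threshold(w, lam, alpha=1.0):
    """Apply the thresholding rule of Equation (1) to wavelet coefficients.

    Assumption: exp^3[u] is interpreted as (exp(u))**3 = exp(3*u).
    """
    w = np.asarray(w, dtype=float)
    mag = np.abs(w)
    # Shrinkage term: equals lam at |w| == lam and decays towards 0 as |w| grows.
    shrink = lam / np.exp(3.0 * alpha * (mag - lam) / lam)
    kept = np.sign(w) * (mag - shrink)
    # Coefficients below the threshold are set to zero.
    return np.where(mag >= lam, kept, 0.0)

if __name__ == "__main__":
    coeffs = np.array([0.5, 1.0, -1.0, 3.0])
    print(improved_threshold(coeffs, lam=1.0, alpha=1.0))
```

In a full denoising pipeline, this function would be applied to the detail coefficients of a wavelet decomposition before reconstruction; the choice of wavelet, decomposition level, and λ is not specified in the text and would have to be set separately.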

2.2. Generation of 2-D Images

While 1-D CNN can be used for time series signals, the flexibility of such models is limited due to the use of 1-D kernels. On the other hand, 3-D CNNs require a large amount of training data and computational resources. In comparison, 2-D CNNs are more versatile since they use 2-D kernels and, hence, could provide representative features for time series data. Hence, for certain applications where sufficient data is available and for 1-D signals that can be represented in a 2-D format, using a 2-D CNN could be beneficial. Herein, for generating 2-D images to be used with the 2-D CNN model, the ECG signal was transformed into a 2-D representation. The 2-D time-frequency spectrograms were generated using the short-time Fourier transform. The ECG signal represents non-stationary data where the instantaneous frequency varies with time. Hence, such changes cannot be fully represented by just using information in the frequency domain. The STFT is a method derived from the discrete Fourier transform to analyze instantaneous frequency as well as the instantaneous amplitude of a localized wave with time-varying characteristics. In the analysis of a non-stationary signal, it is assumed that the signal is approximately stationary within the span of a temporal window of finite support. The 1-D ECG signals were converted into 2-D spectrogram images by applying STFT as follows,

X_{STFT} [m, n] = \sum_{k = 0}^{L - 1} x [k] \, g [k - m] \, e^{- j 2 π n k / L}

(2)

where L is the window length, x [k] is the input ECG signal, g [k - m] is the analysis window shifted to time index m, and n is the frequency index. The log values of X_{STFT} [m, n] are represented as spectrogram (256 × 256) images.
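The transformation in Equation (2) can be sketched with NumPy alone: the signal is cut into overlapping frames, each frame is multiplied by a window, and a discrete Fourier transform is taken per frame. The window type (Hann), window length (64 samples), and hop size (8 samples) below are assumptions chosen for illustration, since the text does not state them, and the log is applied to the magnitude with a small offset to avoid log(0). Resizing the result to 256 × 256 is not shown.

```python
import numpy as np

def stft_log_spectrogram(x, win_len=64, hop=8, eps=1e-10):
    """Return a log-magnitude spectrogram with shape (frequency, time).

    win_len, hop and the Hann window are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    g = np.hanning(win_len)                       # analysis window g
    starts = range(0, len(x) - win_len + 1, hop)  # window positions m
    frames = np.stack([x[s:s + win_len] * g for s in starts])
    spectrum = np.fft.rfft(frames, axis=1)        # DFT over each windowed frame
    return np.log(np.abs(spectrum) + eps).T

if __name__ == "__main__":
    fs = 360                                      # MIT-BIH sampling rate (Hz)
    t = np.arange(2 * fs) / fs
    tone = np.sin(2 * np.pi * 45.0 * t)           # synthetic test tone, not ECG data
    S = stft_log_spectrogram(tone)
    print(S.shape)
```

With a 64-sample window at 360 samples/s, the frequency bins are spaced 5.625 Hz apart, so the 45 Hz test tone falls on bin 8 in every frame.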

2.3. Data Augmentation

Another significant advantage of using 2-D CNN models is the flexibility it provides in terms of data augmentation. For 1-D ECG signals, data augmentation could change the meaning of the data and hence is not beneficial. However, with 2-D spectrograms, the CNN model can learn the data variations, and augmentation helps in increasing the amount of data available for training. The ECG data is highly imbalanced, where most of the instances represent the normal class. In this scenario, data augmentation can help when those classes that are underrepresented are augmented. For arrhythmia classification using ECG signals, augmenting training data manually could degrade the performance. Moreover, classification algorithms such as SVM, fast Fourier neural network, and tree-based algorithms, assume that the classification of a single image based representation of an ECG signal is always the same [48]. The proposed CNN model works on 2-D images of ECG signals as input data, which allows changing the image size with operations such as cropping. Such augmentation methods would add to the training data and hence would allow better training of the CNN model. Another important issue that arises when using small data with CNN based architectures is overfitting. Data augmentation is a way to deal with overfitting and allows better training of a CNN model. For imbalanced data, data augmentation can help in maintaining a balance between different classes. We have used the cropping method for the augmentation of seven classes of ECG beats; namely, premature ventricular contraction beat (PVC), paced beat (PAB), right bundle branch block beat (RBB), left bundle branch block beat (LBB), atrial premature contraction beat (APC), ventricular flutter wave (VFW), and ventricular escape beat (VEB). These are common types of cardiac arrhythmias and are considered in studies we have used for comparison (refer to the Discussion section). 
While other methods of augmentation are used, such as warping in image processing applications, the aim here is to augment classes that are under-represented. Towards this end, eight different cropping operations (left top, center top, right top, left center, center, right center, left bottom, center bottom, right bottom) were applied. As a result of cropping, we obtain multiple ECG spectrograms of reduced size (200 × 200), which are then resized to 256 × 256 images (using linear interpolation) before being fed into the CNN. This resulted in an eight times increase in the training data, which benefited the training process.
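The cropping-based augmentation described above can be sketched as follows. The code takes 200 × 200 crops of a 256 × 256 spectrogram at the grid positions named in the text (top, center, and bottom rows combined with left, center, and right columns) and resizes each crop back to 256 × 256 with separable linear interpolation. The helper names `resize_linear` and `grid_crops` are hypothetical, and the interpolation routine is a plain NumPy stand-in for whatever image library was actually used.

```python
import numpy as np

def resize_linear(img, out_h, out_w):
    """Resize a 2-D array with separable linear interpolation."""
    in_h, in_w = img.shape
    cols = np.linspace(0, in_w - 1, out_w)
    rows = np.linspace(0, in_h - 1, out_h)
    # Interpolate along the width of every row, then along the height of every column.
    tmp = np.stack([np.interp(cols, np.arange(in_w), r) for r in img])
    return np.stack([np.interp(rows, np.arange(in_h), c) for c in tmp.T], axis=1)

def grid_crops(img, crop=200, out=256):
    """Crop at the top/center/bottom x left/center/right positions and resize."""
    h, w = img.shape
    ys = [0, (h - crop) // 2, h - crop]
    xs = [0, (w - crop) // 2, w - crop]
    return [resize_linear(img[y:y + crop, x:x + crop], out, out)
            for y in ys for x in xs]

if __name__ == "__main__":
    spectrogram = np.random.default_rng(0).random((256, 256))  # placeholder image
    augmented = grid_crops(spectrogram)
    print(len(augmented), augmented[0].shape)
```

In practice the crops would be generated only for the under-represented classes, as the text describes, and the class label of the source spectrogram would be copied to each crop.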

2.4. Deep Neural Network

In this study, a CNN-based model is proposed for the automatic classification of arrhythmia using the ECG signal in a supervised manner. The ECG data used in the study have corresponding labels (ground truth) identifying the type of arrhythmia present. These labels were assigned by expert cardiologists and are used for supervised training. For each heartbeat segment, the arrhythmia class label was transferred to the corresponding spectrogram image representation. The first CNN-based algorithm, introduced in 1989 [49], was developed and used for the recognition of handwritten zip codes. Since then, multiple CNN models have been proposed for the classification of images, among which AlexNet [39] has achieved significant performance for a variety of images. Earlier feed-forward neural networks were not well suited to the automatic classification of 2-D images, since these methods do not take the local spatial information into account. However, with the development of CNN architectures and the use of nonlinear filters, spatially adjacent pixels can be correlated to extract local features from the 2-D image. In the 2-D convolution algorithm, the downsampling layer is highly desirable for extracting and filtering the spatial vicinity of the 2-D ECG images. For these reasons, the ECG signal was transformed into a 2-D representation, and a 2-D CNN algorithm was used for classification. Consequently, high accuracy was obtained in the automatic classification of the ECG waves. The details of the proposed CNN model are presented in Section 3.2.

3. Experiments

3.1. Dataset

The MIT-BIH arrhythmia dataset consists of 48 records, each with an approximate duration of 30 min, recorded with a two-channel ambulatory system between 1975 and 1979 [50]. Twenty-three recordings were selected at random from a set of 4000 long-term Holter recordings collected from a mixed population of inpatients (60%) and outpatients (40%). The remaining twenty-five recordings were chosen from the same set, with a focus on complex ventricular, junctional, and supra-ventricular arrhythmias. The recordings were digitized at 360 samples/s per channel with 11-bit resolution over a range of 10 mV. At least two cardiologists annotated each record, and their disagreements were resolved to obtain the computer-readable reference annotations. In total, approximately 110,000 annotations are documented in this database. The data is publicly available for download here: https://www.physionet.org/content/mitdb/1.0.0/.
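As a quick sanity check on the figures quoted above, the arithmetic below derives the approximate number of samples per channel in one record and the amplitude step implied by 11-bit quantization over 10 mV. The constants come from the text; because the 30 min duration is approximate, the sample count is approximate as well.

```python
# Figures quoted in the dataset description.
SAMPLING_RATE_HZ = 360   # samples per second per channel
RECORD_MINUTES = 30      # approximate record duration
ADC_BITS = 11            # quantization resolution
ADC_RANGE_MV = 10.0      # input range in millivolts

samples_per_channel = SAMPLING_RATE_HZ * RECORD_MINUTES * 60
adc_levels = 2 ** ADC_BITS
step_mv = ADC_RANGE_MV / adc_levels   # amplitude represented by one ADC level

print(samples_per_channel)            # samples per channel in a 30 min record
print(adc_levels, step_mv)            # quantization levels and step size in mV
```

Each record therefore holds roughly 648,000 samples per channel, and one quantization step corresponds to just under 5 µV.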

3.2. Deep Neural Network Parameters

The performance of the proposed CNN algorithm was compared with the AlexNet and VGGNet architectures [48] in terms of ECG arrhythmia classification. The regular normal beat (NOR) and seven types of cardiac arrhythmia (VFW, PVC, VEB, RBB, LBB, PAB, and APC) were selected from the MIT-BIH arrhythmia database. Although the data is annotated with eighteen different classes, some of the classes have extremely low representation. Moreover, the selected eight types are more commonly found (hence having acceptable representation in the ground truth data) and are also used by the methods we have evaluated for comparison. The architecture of the CNN model used in our experiments is shown in Figure 2. A detailed representation of the layers within the model is presented in Table 1. The model follows the CNN architecture with four 2-D convolutional layers. Each convolutional layer is followed by a pooling layer. The output layer is a softmax layer with eight neurons to give the final classification. A fully connected layer is used between the last pooling layer and the output layer and represents the features learned by the CNN model.
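To make the layer layout concrete, the sketch below traces feature-map shapes through a stack of four convolution and pooling stages followed by a fully connected layer and an eight-way softmax. Only the layer sequence, the 256 × 256 input, and the eight outputs come from the text. The filter counts, the 3 × 3 kernels, "same" padding, 2 × 2 pooling, and the width of the fully connected layer are hypothetical placeholders, because Table 1 is not reproduced here.

```python
def trace_cnn_shapes(input_hw=256, in_channels=1, filters=(32, 64, 128, 256),
                     kernel=3, fc_units=256, n_classes=8):
    """Trace output shapes and count weights for a conv/pool x4 + FC + softmax stack.

    All hyperparameters except the layer order and n_classes are assumptions.
    """
    size, channels = input_hw, in_channels
    layers, n_params = [], 0
    for f in filters:
        # 'same' padding keeps the spatial size; weights = k*k*in*out plus biases.
        n_params += (kernel * kernel * channels + 1) * f
        channels = f
        layers.append(("conv", size, size, channels))
        size //= 2                       # 2x2 pooling halves each spatial side
        layers.append(("pool", size, size, channels))
    flat = size * size * channels
    n_params += (flat + 1) * fc_units + (fc_units + 1) * n_classes
    layers.append(("fc", fc_units))
    layers.append(("softmax", n_classes))
    return layers, n_params

if __name__ == "__main__":
    layers, n_params = trace_cnn_shapes()
    for layer in layers:
        print(layer)
    print("trainable parameters:", n_params)
```

Under these assumed settings the spatial size shrinks from 256 to 16 across the four pooling stages, so most of the weights sit in the fully connected layer.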

3.3. Experimental Setup

The proposed CNN classifier was implemented in Python with the open source library TensorFlow [51], which was developed by Google for deep learning. Substantial computational power and training time were needed to train the CNN model. The experimental setup consisted of an eighth-generation ASUS server with 32 GB internal RAM, a 500 GB external SSD in addition to the internal hard drive, and an NVIDIA 1080 GPU with 11 GB memory. The 2-D spectral images were divided such that 70% of the data was used for training and 30% for testing. A 5-fold cross validation was used during the training process. The train/test splits were generated such that there was no overlap between the two splits.
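The data partitioning described above can be sketched with index arrays. The example shuffles the sample indices, holds out 30% for testing, and builds five cross-validation folds. Placing the folds inside the 70% training portion is an assumption, since the text does not say how the cross validation and the hold-out split interact; the function name `split_indices` and the fixed seed are also illustrative.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.7, n_folds=5, seed=0):
    """Return train indices, test indices, and (fit, validation) index pairs.

    Assumption: the 5 folds are drawn from the training portion only.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    train_idx, test_idx = order[:n_train], order[n_train:]
    folds = np.array_split(train_idx, n_folds)
    cv_pairs = []
    for i in range(n_folds):
        fit = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        cv_pairs.append((fit, folds[i]))
    return train_idx, test_idx, cv_pairs

if __name__ == "__main__":
    train_idx, test_idx, cv_pairs = split_indices(1000)
    print(len(train_idx), len(test_idx), len(cv_pairs))
```

Because the two index sets are slices of one permutation, no sample can appear in both the training and the test split.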

3.4. Cost Function

The cost function measures the error between the value estimated by the CNN model and the actual (desired) value. An optimizer function was used to minimize the error function. Different cost functions have been used in neural network theory. In our experiments, we used the cross-entropy function, which is given as,

C = - \frac{1}{n} \sum_{c = 1}^{N} [y_{c} \ln (a_{c}) + (1 - y_{c}) \ln (1 - a_{c})]

(3)

where C represents the cost that needs to be minimized, n is the number of training points, y is the expected or target value, N is the total number of classes, c is the class index, and a is the actual output of the network. Gradient-based optimization with a learning rate of 0.001 was used to reduce the cost function. Specifically, the Adam optimizer was used in the experiments for training the proposed CNN model, and it reached the optimal point in fewer iterations.
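A minimal NumPy sketch of the cost in Equation (3) is given below. It assumes that the 1/n factor averages over the n training points, with the sum over samples left implicit in the equation, that y is one-hot encoded, and that a holds the per-class network outputs. The clipping constant `eps` is an added numerical safeguard, not part of the equation.

```python
import numpy as np

def cross_entropy_cost(y, a, eps=1e-12):
    """Cross-entropy of Equation (3), averaged over training points.

    y: (n, N) one-hot targets; a: (n, N) network outputs in [0, 1].
    """
    y = np.asarray(y, dtype=float)
    a = np.clip(np.asarray(a, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    per_sample = -(y * np.log(a) + (1.0 - y) * np.log(1.0 - a)).sum(axis=1)
    return float(per_sample.mean())

if __name__ == "__main__":
    targets = [[1, 0, 0], [0, 1, 0]]
    outputs = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]]
    print(cross_entropy_cost(targets, outputs))
```

The cost approaches zero as the outputs approach the one-hot targets and grows without bound as a confident output moves towards the wrong class.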

3.5. Evaluation Parameters

Four evaluation metrics were used in this study, including accuracy, precision, sensitivity, and specificity. The accuracy for the multi-class problem was evaluated as,

A = \frac{1}{N} \sum_{c = 1}^{N} \frac{T_{P}^{c} + T_{N}^{c}}{T_{P}^{c} + T_{N}^{c} + F_{P}^{c} + F_{N}^{c}},

(4)

where T_{P} denotes the true positives, F_{P} represents the false positives, T_{N} represents the true negatives, and F_{N} represents the false negatives, c represents the class index, and N represents the total number of classes. The accuracy (A) represents the ratio of the correctly classified instances to that of the total number of instances. The precision (P) and sensitivity (Sen) were calculated as,

P = \frac{1}{N} \sum_{c = 1}^{N} \frac{T_{P}^{c}}{T_{P}^{c} + F_{P}^{c}},

(5)

Sen = \frac{1}{N} \sum_{c = 1}^{N} \frac{T_{P}^{c}}{T_{P}^{c} + F_{N}^{c}} .

(6)

The specificity (Sp), also known as the true negative rate, was calculated as,

Sp = \frac{1}{N} \sum_{c = 1}^{N} \frac{T_{N}^{c}}{T_{N}^{c} + F_{P}^{c}} .

(7)

The F1 score was calculated using the precision (P) and recall (Sen) as,

\mathrm{F1\ Score} = 2 \times \left( \frac{P \times Sen}{P + Sen} \right) .

(8)
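Equations (4) to (8) can be computed directly from a multi-class confusion matrix. The sketch below assumes that rows index the true class and columns the predicted class, and that every class is both present and predicted at least once so that no denominator is zero. The function name `macro_metrics` is illustrative.

```python
import numpy as np

def macro_metrics(cm):
    """Class-averaged accuracy, precision, sensitivity, specificity and F1.

    cm[i, j] counts samples of true class i predicted as class j (assumed layout).
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp          # predicted as class c but belonging elsewhere
    fn = cm.sum(axis=1) - tp          # belonging to class c but predicted elsewhere
    tn = cm.sum() - tp - fp - fn
    accuracy = np.mean((tp + tn) / (tp + tn + fp + fn))   # Equation (4)
    precision = np.mean(tp / (tp + fp))                   # Equation (5)
    sensitivity = np.mean(tp / (tp + fn))                 # Equation (6)
    specificity = np.mean(tn / (tn + fp))                 # Equation (7)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Equation (8)
    return {"accuracy": float(accuracy), "precision": float(precision),
            "sensitivity": float(sensitivity), "specificity": float(specificity),
            "f1": float(f1)}

if __name__ == "__main__":
    toy_cm = [[5, 1], [2, 4]]         # toy counts, not results from the paper
    print(macro_metrics(toy_cm))
```

Note that Equation (8) combines the class-averaged precision and sensitivity, which can differ from averaging per-class F1 scores.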

4. Classification Results and Discussion

4.1. Results

The two significant optimization parameters in the proposed 2-D CNN model are the learning rate and the batch size. These two parameters must be selected carefully to obtain the best accuracy in the automatic classification of arrhythmia using ECG signals. The proposed model was evaluated in different experiments with various values of these parameters. For small values of the learning rate (i.e., less than 0.0005), convergence was very slow. When the learning rate was large (i.e., greater than 0.001), convergence was faster, but irregular fluctuations were observed in the accuracy. Hence, we selected a learning rate of 0.001, as this value attained the best accuracy for the proposed model, as shown in Table 2.

Similar to the learning rate, the batch size of the data also greatly affected the behavior and accuracy of the model. When the batch size was chosen to be 1000, the accuracy of the system showed abnormally large fluctuations in terms of system convergence. When the batch size was set to 2000, the accuracy of the system increased but did not reach a stable state. When the batch size was further increased to 2800, the accuracy of the proposed model was the highest and reached a stable state. The results are summarized in Table 3.

A detailed performance comparison between the proposed 2-D CNN model and other CNN models (including VGGNet and AlexNet) is presented using confusion matrices for all eight classes. The diagonal elements show the correctly classified classes, whereas anything off diagonal represents an incorrect classification. For the 2-D ECG data used in experiments, results are presented for the VGGNet (Figure 3), AlexNet (Figure 4), and the proposed model (Figure 5). The average accuracy of these three models is presented by averaging the diagonal values.
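The averaging of diagonal values mentioned above can be sketched as follows. The example builds a confusion matrix from label arrays, normalizes each row so that the diagonal holds the fraction of each class that was classified correctly, and averages those fractions. Row normalization is an assumption about how the figures are presented, and both function names are illustrative.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix with true classes on rows and predicted classes on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def mean_diagonal_rate(cm):
    """Average of the diagonal after normalizing each row to sum to one."""
    cm = np.asarray(cm, dtype=float)
    row_norm = cm / cm.sum(axis=1, keepdims=True)
    return float(np.mean(np.diag(row_norm)))

if __name__ == "__main__":
    y_true = [0, 0, 1, 1, 2, 2]       # toy labels, not data from the paper
    y_pred = [0, 1, 1, 1, 2, 0]
    cm = confusion_matrix(y_true, y_pred, n_classes=3)
    print(cm)
    print(mean_diagonal_rate(cm))
```

Averaging row-normalized diagonals weights every class equally, so a rare class influences the result as much as the normal class does.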

4.2. Discussion

Table 4 compares the performance of the proposed CNN algorithm with other methods for arrhythmia classification using ECG signals. The terms ‘native’ and ‘augmented’ in Table 4 represent the training set without and with data augmentation, respectively. However, a direct comparison of our proposed CNN model with existing techniques may not be appropriate due to variations in the training and testing datasets, the size of the ECG dataset used for experiments, the architecture of the CNN models used, and the varying number of arrhythmia types used for classification. It should be noted that there are various methods that used 1-D data directly for the classification of arrhythmia [52,53,54,55,56,57,58]. Among these methods, 1-D CNN models have been proposed with a lower classification accuracy ([56]: 96.40% and [58]: 93.60%) when compared with the proposed model. We also used 1-D ECG signals as input to the CNN model used in experiments and achieved a classification accuracy of 97.80%. In recent years, 2-D CNN models have also been used, by converting the 1-D ECG signals to a 2-D representation, with noticeable performance [16]. Towards this end, the proposed model was based on a 2-D representation of the ECG data to efficiently apply 2-D CNN models and benefit from the flexibility of data augmentation in such methods.

The proposed 2-D CNN model attained better accuracy, sensitivity, and specificity (in eight-class classification) than the FFNN [59] model, which classified only four kinds of arrhythmia. It was observed that the VGGNet model performs worse than the proposed model, despite being a deeper network. One of the reasons for this observation could be the combination of the deeper architecture of VGGNet and the limited training data. These results indicate that the proposed CNN model has state-of-the-art accuracy for the automatic classification of arrhythmia, based on the comparison with different CNN-based algorithms. The varying performance among the compared CNN models is due to the difference in their architectures and the number of convolution filters used. In the proposed CNN model, we employed four convolutional layers, four downsampling (pooling) layers, and one fully connected layer. In the AlexNet model, six convolutional layers, three downsampling layers, and two fully connected layers were used, while the VGGNet model entailed ten convolutional layers, four downsampling layers, and two fully connected layers. Adding a convolutional or a downsampling layer to the architecture of a CNN model also increases the computational resources and the time needed for training and testing, and this is the main reason for using a carefully selected CNN model. Since we have a limited amount of data, deeper networks (such as DenseNet or ResNet) would be unlikely to perform well within the scope of this problem. The proposed model can be trained on other classes of arrhythmia, although we did not perform this analysis so that we can compare our work with published results that use a 2-D representation of ECG data.

We compared the proposed CNN-based model with recent techniques for the automatic classification of arrhythmia (Table 4), where the algorithm achieved 97.91% average sensitivity, 99.61% specificity, 99.11% average accuracy, and 98.59% positive predictive value (precision). These values indicate improved performance when compared with recent methods using 1-D and 2-D CNNs, given the same arrhythmia classification. The results also show that the proposed CNN algorithm has better results in terms of accuracy both with and without data augmentation. The proposed model has attained the highest sensitivity among all the compared CNN algorithms. It is pertinent to note that detecting these cardiac arrhythmias is a labor-intensive task, where a clinical expert needs to carefully observe recordings that can last for hours. With such automated methods, the artificially intelligent system could augment the performance of clinical experts by detecting these patterns and directing the observer to look more closely at regions of more significance. This would ultimately improve the clinical diagnosis and treatment of some of the major CVDs.

5. Conclusions

In this study, we proposed a 2-D CNN-based classification model for automatic classification of cardiac arrhythmias using ECG signals. An accurate taxonomy of ECG signals is extremely helpful in the prevention and diagnosis of CVDs. Deep CNN has proven useful in enhancing the accuracy of diagnosis algorithms in the fusion of medicine and modern machine learning technologies. The proposed CNN-based classification algorithm, using 2-D images, can classify eight kinds of arrhythmia, namely, NOR, VFW, PVC, VEB, RBB, LBB, PAB, and APC, and it achieved 97.91% average sensitivity, 99.61% specificity, 99.11% average accuracy, and 98.59% positive predictive value (precision). These results indicate that the prediction and classification of arrhythmia with 2-D ECG representation as spectrograms and the CNN model is a reliable operative technique in the diagnosis of CVDs. The proposed scheme can help experts diagnose CVDs by referring to the automated classification of ECG signals. The present research uses only a single-lead ECG signal. The effect of multiple lead ECG data to further improve experimental cases will be studied in future work.

Author Contributions

Conceptualization, A.U. and S.M.A.; Methodology, A.U., R.M.M., S.M.A.; Validation, A.U., R.M.M. and M.B.; Formal Analysis, A.U., S.M.A.; Writing-Original Draft Preparation, A.U., R.M.M., M.B., S.M.A.; Writing-Review & Editing, S.M.A., A.U.; Supervision, S.M.A.; Funding Acquisition, R.M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Xiamen University Malaysia Research Fund (XMUMRF) (Grant No. XMUMRF/2019-C3/IECE/0007).

Acknowledgments

The authors thank Ulas Bagci (Center for Research in Computer Vision (CRCV) Laboratory, University of Central Florida (UCF), Orlando, FL, USA) for his valuable advice. This work was supported by the ASRTD at University of Engineering and Technology, Taxila and Xiamen University Malaysia (XMUM).

Conflicts of Interest

The authors declare that there is no conflict of interest regarding this publication.

References

1. Mc Namara, K.; Alzubaidi, H.; Jackson, J.K. Cardiovascular disease as a leading cause of death: How are pharmacists getting involved? Integr. Pharm. Res. Pract. 2019, 8, 1.
2. Lackland, D.T.; Weber, M.A. Global burden of cardiovascular disease and stroke: Hypertension at the core. Can. J. Cardiol. 2015, 31, 569–571.
3. Mustaqeem, A.; Anwar, S.M.; Majid, M. A modular cluster based collaborative recommender system for cardiac patients. Artif. Intell. Med. 2020, 102, 101761.
4. Irmakci, I.; Anwar, S.M.; Torigian, D.A.; Bagci, U. Deep Learning for Musculoskeletal Image Analysis. arXiv 2020, arXiv:2003.00541.
5. Anwar, S.M.; Majid, M.; Qayyum, A.; Awais, M.; Alnowami, M.; Khan, M.K. Medical image analysis using convolutional neural networks: A review. J. Med. Syst. 2018, 42, 226.
6. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377.
7. Wu, Y.; Yang, F.; Liu, Y.; Zha, X.; Yuan, S. A comparison of 1-D and 2-D deep convolutional neural networks in ECG classification. arXiv 2018, arXiv:1810.07088.
8. Zhao, J.; Mao, X.; Chen, L. Speech emotion recognition using deep 1D & 2-D CNN LSTM networks. Biomed. Signal Process. Control 2019, 47, 312–323.
9. Ortega, S.; Fabelo, H.; Iakovidis, D.K.; Koulaouzidis, A.; Callico, G.M. Use of hyperspectral/multispectral imaging in gastroenterology. Shedding some–different–light into the dark. J. Clin. Med. 2019, 8, 36.
10. Feng, Y.-Z.; Sun, D.-W. Application of Hyperspectral Imaging in Food Safety Inspection and Control: A Review. Crit. Rev. Food Sci. Nutr. 2012, 52, 1039–1058.
11. Lorente, D.; Aleixos, N.; Gómez-Sanchis, J.; Cubero, S.; García-Navarrete, O.L.; Blasco, J. Recent Advances and Applications of Hyperspectral Imaging for Fruit and Vegetable Quality Assessment. Food Bioprocess Technol. 2011, 5, 1121–1142.
12. Tatzer, P.; Wolf, M.; Panner, T. Industrial application for inline material sorting using hyperspectral imaging in the NIR range. Real-Time Imaging 2005, 11, 99–107.
13. Kubik, M. Chapter 5 Hyperspectral Imaging: A New Technique for the Non-Invasive Study of Artworks. Phys. Tech. Study Art Archaeol. Cult. Herit. 2007, 2, 199–259.
14. Hassan, H.; Bashir, A.K.; Abbasi, R.; Ahmad, W.; Luo, B. Single image defocus estimation by modified gaussian function. Trans. Emerg. Telecommun. Technol. 2019, 30, 3611.
15. Ahmad, M.; Bashir, A.K.; Khan, A.M. Metric similarity regularizer to enhance pixel similarity performance for hyperspectral unmixing. Optik 2017, 140, 86–95.
16. Salem, M.; Taheri, S.; Yuan, J.S. ECG arrhythmia classification using transfer learning from 2-dimensional deep CNN features. In Proceedings of the 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS), Cleveland, OH, USA, 17–19 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–4.
17. Mustaqeem, A.; Anwar, S.M.; Khan, A.R.; Majid, M. A statistical analysis based recommender model for heart disease patients. Int. J. Med. Inform. 2017, 108, 134–145.
18. Anwar, S.M.; Gul, M.; Majid, M.; Alnowami, M. Arrhythmia Classification of ECG Signals Using Hybrid Features. Comput. Math. Methods Med. 2018.
19. Mustaqeem, A.; Anwar, S.M.; Majid, M. Multiclass classification of cardiac arrhythmia using improved feature selection and SVM invariants. Comput. Math. Methods Med. 2018.
20. Mustaqeem, A.; Anwar, S.M.; Majid, M.; Khan, A.R. Wrapper method for feature selection to classify cardiac arrhythmia. In Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju Island, Korea, 11–15 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 3656–3659.
21. Minami, K.I.; Nakajima, H.; Toyoshima, T. Real-time discrimination of ventricular tachyarrhythmia with Fourier-transform neural network. IEEE Trans. Biomed. Eng. 1999, 46, 179–185.
22. Coast, D.A.; Stern, R.M.; Cano, G.G.; Briller, S.A. An approach to cardiac arrhythmia analysis using hidden markov models. IEEE Trans. Biomed. Eng. 1990, 37, 826–836.
23. Osowski, S.; Hoai, L.T.; Markiewicz, T. Support vector machine based expert system for reliable heartbeat recognition. IEEE Trans. Biomed. Eng. 2004, 51, 582–589.
24. Willems, J.L.; Lesaffre, E. Comparison of multigroup logistic and linear discriminant ecg and vcg classification. J. Electrocardiol. 1987, 20, 83–92.
25. Hu, Y.H.; Tompkins, W.J.; Urrusti, J.L.; Afonso, V.X. Applications of artificial neural networks for ECG signal detection and classification. J. Electrocardiol. 1993, 26, 66–73.
26. Trahanias, P.; Skordalakis, E. Syntactic pattern recognition of the ECG. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 648–657.
27. Inan, O.T.; Giovangrandi, L.; Kovacs, G.T. Robust neural-network-based classification of premature ventricular contractions using wavelet transform and timing interval features. IEEE Trans. Biomed. Eng. 2006, 53, 2507–2515.
28. Hu, Y.H.; Palreddy, S.; Tompkins, W.J. A patient-adaptable ECG beat classifier using a mixture of experts approach. IEEE Trans. Biomed. Eng. 1997, 44, 891–900.
29. Dehan, L.; Guanggui, X.U.; Yuhua, Z.; Hosseini, H.G. Novel ECG diagnosis model based on multi-stage artificial neural networks. Chin. J. Sci. Instrum. 2008, 29, 27.
30. Ceylan, R.; Ozbay, Y. Comparison of FCM, PCA and WT techniques for classification ECG arrhythmias using artificial neural network. Expert Syst. Appl. 2007, 33, 286–295.
31. Polat, K.; Günes, S. Breast cancer diagnosis using least square support vector machine. Digit. Signal Process. 2007, 17, 694–701.
32. Dreiseitl, S.; Ohno-Machado, L.; Kittler, H.; Vinterbo, S.; Billhardt, H.; Binder, M. A comparison of machine learning methods for the diagnosis of pigmented skin lesions. J. Biomed. Inform. 2001, 34, 28–36.
33. Shafiq, M.; Yu, X.; Bashir, A.K.; Chaudhry, H.N.; Wang, D. A machine learning approach for feature selection traffic classification using security analysis. J. Supercomput. 2018, 74, 4867–4892.
34. Bashir, A.K.; Arul, R.; Basheer, S.; Raja, G.; Jayaraman, R.; Qureshi, N.M.F. An optimal multitier resource allocation of cloud RAN in 5G using machine learning. Trans. Emerg. Telecommun. Technol. 2019, 30, 3627.
35. Kononenko, I. Machine learning for medical diagnosis: History, state of the art and perspective. Artif. Intell. Med. 2001, 23, 89–109.
36. Ecar, A. Recommended practice for testing and reporting performance results of ventricular arrhythmia detection algorithms. Assoc. Adv. Med. Instrum. 1987, 69.
37. Huertas-Fernandez, I.; Garcia-Gomez, F.J.; Garcia-Solis, D.; Benitez-Rivero, S.; Marin-Oyaga, V.A.; Jesus, S.; Mir, P. Machine learning models for the differential diagnosis of vascular parkinsonism and Parkinson’s disease using [123 I] FP-CIT SPECT. Eur. J. Nucl. Med. Mol. Imaging 2015, 42, 112–119.
38. Salvatore, C.; Cerasa, A.; Battista, P.; Gilardi, M.C.; Quattrone, A.; Castiglioni, I. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer’s disease: A machine learning approach. Front. Neurosci. 2015, 9, 307.
39. Kiranyaz, S.; Ince, T.; Gabbouj, M. Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Trans. Biomed. Eng. 2015, 63, 664–675.
40. Rajpurkar, P.; Hannun, A.Y.; Haghpanahi, M.; Bourn, C.; Ng, A.Y. Cardiologist-level arrhythmia detection with convolutional neural networks. arXiv 2017, arXiv:1707.01836.
41. Acharya, U.R.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adam, M.; Gertych, A.; San Tan, R. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 2017, 89, 389–396.
42. Chen, J.; Valehi, A.; Razi, A. Smart Heart Monitoring: Early Prediction of Heart Problems Through Predictive Analysis of ECG Signals. IEEE Access 2019, 7, 120831–120839.
43. Lee, S.C. Using a translation-invariant neural network to diagnose heart arrhythmia. In Advances in Neural Information Processing Systems; Morgan Kaufmann: San Francisco, CA, USA, 1990; pp. 240–247.
44. De Chazal, P.; Reilly, R.B. A patient-adapting heartbeat classifier using ECG morphology and heartbeat interval features. IEEE Trans. Biomed. Eng. 2006, 53, 2535–2543.
45. Xiong, Z.; Stiles, M.K.; Zhao, J. Robust ECG signal classification for detection of atrial fibrillation using a novel neural network. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–4.
46. Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units (elus). arXiv 2015, arXiv:1511.07289.
47. Li, D.; Zhang, J.; Zhang, Q.; Wei, X. Classification of ECG signals based on 1D convolution neural network. In Proceedings of the 2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom), Dalian, China, 12–15 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
48. Jun, T.J.; Nguyen, H.M.; Kang, D.; Kim, D.; Kim, D.; Kim, Y.H. ECG arrhythmia classification using a 2-D convolutional neural network. arXiv 2018, arXiv:1804.06812.
49. Mohanty, M.D.; Mohanty, B.; Mohanty, M.N. R-peak detection using efficient technique for tachycardia detection. In Proceedings of the 2017 2nd International Conference on Man and Machine Interfacing (MAMI), Bhubaneswar, India, 21–23 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–5.
50. Moody, G.B.; Mark, R.G. The impact of the MIT-BIH arrhythmia database. IEEE Eng. Med. Biol. Mag. 2001, 20, 45–50.
51. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467.
52. Übeyli, E.D. Combining recurrent neural networks with eigenvector methods for classification of ECG beats. Digit. Signal Process. 2009, 19, 320–329.
53. Dutta, S.; Chatterjee, A.; Munshi, S. Correlation technique and least square support vector machine combine for frequency domain based ECG beat classification. Med. Eng. Phys. 2010, 32, 1161–1169.
54. Kumar, R.G.; Kumaraswamy, Y.S. Investigating cardiac arrhythmia in ECG using random forest classification. Int. J. Comput. Appl. 2012, 37, 31–34.
55. Park, J.; Lee, K.; Kang, K. Arrhythmia detection from heartbeat using k-nearest neighbor classifier. In Proceedings of the 2013 IEEE International Conference on Bioinformatics and Biomedicine, Shanghai, China, 18–21 December 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 15–22.
56. Ince, T.; Kiranyaz, S.; Eren, L.; Askar, M.; Gabbouj, M. Real-time motor fault detection by 1-D convolutional neural networks. IEEE Trans. Ind. Electron. 2016, 63, 7067–7075.
57. Izci, E.; Ozdemir, M.A.; Degirmenci, M.; Akan, A. Cardiac Arrhythmia Detection from 2D ECG Images by Using Deep Learning Technique. In Proceedings of the 2019 Medical Technologies Congress (TIPTEKNO), Selçuk, Turkey, 3–5 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–4.
58. Rajkumar, A.; Ganesan, M.; Lavanya, R. Arrhythmia classification on ECG using Deep Learning. In Proceedings of the 2019 5th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 15–16 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 365–369.
59. Guler, I.; Ubeylı, E.D. ECG beat classifier designed by combined neural network model. Pattern Recognit. 2005, 38, 199–208.
60. Yu, S.N.; Chou, K.T. Integration of independent component analysis and neural networks for ECG beat classification. Expert Syst. Appl. 2008, 34, 2841–2846.
61. Melgani, F.; Bazi, Y. Classification of electrocardiogram signals with support vector machines and particle swarm optimization. IEEE Trans. Inf. Technol. Biomed. 2008, 12, 667–677.

Figure 1. Complete procedure of electrocardiogram (ECG) signal classification.

Figure 2. The architecture of the proposed convolutional neural network (CNN) model.

Figure 3. Confusion matrix for VGGNet.

Figure 4. Confusion matrix for AlexNet.

Figure 5. Confusion matrix for the proposed 2-D CNN based classification model.
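
The sensitivity, specificity, precision, and accuracy figures quoted in this article are the kind of quantities that can be derived from confusion matrices such as those in Figures 3 to 5. The following is a minimal sketch of a one-vs-rest computation, assuming rows hold true labels and columns hold predicted labels. The function name `one_vs_rest_metrics` and the 3-class counts are hypothetical and are not taken from the figures; the averaging scheme used by the authors may differ.

```python
import numpy as np

def one_vs_rest_metrics(cm):
    """Per-class metrics from a square confusion matrix.

    Assumes rows are true classes and columns are predicted classes,
    and that every class has at least one true and one predicted sample.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)                 # correctly classified samples per class
    fn = cm.sum(axis=1) - tp         # samples of the class predicted as another class
    fp = cm.sum(axis=0) - tp         # samples of other classes predicted as this class
    tn = total - tp - fn - fp        # everything else
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / total,
    }

# Hypothetical 3-class counts, for illustration only.
toy_cm = [[50, 2, 1],
          [3, 40, 2],
          [0, 1, 30]]
metrics = one_vs_rest_metrics(toy_cm)

# Unweighted (macro) average over classes, one possible way to report a single number.
macro = {name: values.mean() for name, values in metrics.items()}
```

Because the one-vs-rest accuracy counts true negatives for every class, its average can be noticeably higher than the average sensitivity on the same matrix.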

Table 1. Details of the layers used in the proposed CNN model architecture.

Layers	Type	Filter Size	Stride	Number of Kernels	Input Size	Parameters
Layer 1	Conv2-D	3 × 3	1	64	256 × 256 × 1	576
Layer 2	Pooling	2 × 2	2	-	256 × 256 × 64	-
Layer 3	Conv2-D	3 × 3	1	128	128 × 128 × 64	73,728
Layer 4	Pooling	2 × 2	2	-	128 × 128 × 128	-
Layer 5	Conv2-D	3 × 3	1	256	64 × 64 × 128	294,912
Layer 6	Pooling	2 × 2	2	-	64 × 64 × 256	-
Layer 7	Conv2-D	3 × 3	1	512	32 × 32 × 256	1,179,648
Layer 8	Pooling	2 × 2	2	-	32 × 32 × 512	-
Layer 9	Fully Connected	-	-	4096	16 × 16 × 512	2,097,152
Layer 10	Output Layer	-	-	8	4096	32,776
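
The parameter counts in Table 1 can be checked with simple arithmetic. The sketch below is an illustration rather than the authors' code: the helper names `conv_params` and `dense_params` are hypothetical, and the bias settings are assumptions chosen because they reproduce the listed values. Under those assumptions, the convolutional counts equal weights only (no bias terms), the output layer equals weights plus eight biases, and the fully connected count equals 512 × 4096 rather than the product of the flattened 16 × 16 × 512 input and 4096 units. The reason for that last difference is not stated in this section.

```python
def conv_params(kernel, in_channels, out_channels, bias=False):
    """Weights of a square 2-D convolution, optionally with one bias per filter."""
    return kernel * kernel * in_channels * out_channels + (out_channels if bias else 0)

def dense_params(n_in, n_out, bias=False):
    """Weights of a fully connected layer, optionally with one bias per unit."""
    return n_in * n_out + (n_out if bias else 0)

# (input channels, number of kernels) for Layers 1, 3, 5, and 7 in Table 1.
conv_layers = [(1, 64), (64, 128), (128, 256), (256, 512)]
conv_counts = [conv_params(3, c_in, c_out) for c_in, c_out in conv_layers]

# Spatial size after four 2 x 2, stride-2 pooling layers, starting from 256 x 256.
size = 256
for _ in conv_layers:
    size //= 2

fc_listed = dense_params(512, 4096)                     # matches the Table 1 entry
fc_flattened = dense_params(size * size * 512, 4096)    # count for a fully flattened input
output_count = dense_params(4096, 8, bias=True)

print(conv_counts, size, fc_listed, fc_flattened, output_count)
```

The halving loop also reproduces the input sizes listed for each stage (256, 128, 64, 32, and finally 16), which is consistent with convolutions that preserve spatial size.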

Table 2. Batch sizes and average accuracy for a learning rate of 0.001.

Learning Rate	Batch Size	Average Accuracy (%)
0.001	2800	99.11
0.001	2000	98.96
0.001	1000	99.00
0.001	500	98.95
0.001	100	98.93

Table 3. Learning rate and average accuracy for a batch size of 2800.

Batch Size	Learning Rate	Average Accuracy (%)
2800	0.001	99.11
2800	0.005	98.84
2800	0.100	98.89
2800	0.200	98.91

Table 4. Comparison of the proposed model with state-of-the-art ECG classification techniques.

Model	Native/Augmentation	Classes	Accuracy %	Sensitivity %	Specificity %	Precision %	F1 Score
FFNN [59]		4	96.94	96.31	97.78	-	-
PNN [60]		8	98.71	-	99.65	-	-
SVM [61]		6	91.67	93.83	90.49	-	-
RNN [52]		4	98.06	98.15	97.78	-	-
LS-SVM [53]		3	95.82	86.16	99.17	97.01	0.91
RFT [54]		3	92.16	-	-	-	-
KNN [55]		17	97.00	96.60	95.80	-	-
1-D CNN [56]		5	96.40	68.80	99.50	79.20	0.73
AlexNet [48]	Augmented	8	98.85	97.08	99.62	98.59	0.97
AlexNet [48]	Native	8	98.81	96.81	99.68	98.63	0.97
VGGNet [48]	Augmented	8	98.63	96.93	99.37	97.86	0.97
VGGNet [48]	Native	8	98.77	97.26	99.43	98.08	0.97
2-D CNN [57]		5	97.42	-	-	-	-
1-D CNN [58]		7	93.60	-	-	-	-
Proposed (1-D)	Native	8	97.80	-	-	-	-
Proposed (2-D)	Augmented	8	99.11	97.91	99.61	98.58	0.98
Proposed (2-D)	Native	8	98.92	97.26	99.67	98.69	0.98
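
The F1 score column in Table 4 relates precision and sensitivity through their harmonic mean. As a rough check, the sketch below recomputes F1 from the aggregate precision and sensitivity of two rows. The function name `f1_from_percent` is hypothetical. If the reported scores were averaged per class before aggregation, which this section does not specify, values computed this way can differ slightly from the table.

```python
def f1_from_percent(precision_pct, sensitivity_pct):
    """Harmonic mean of precision and sensitivity, both given in percent."""
    p = precision_pct / 100.0
    r = sensitivity_pct / 100.0
    return 2 * p * r / (p + r)

# Proposed (2-D), augmented: precision 98.58 %, sensitivity 97.91 % (Table 4).
f1_proposed = f1_from_percent(98.58, 97.91)

# LS-SVM [53]: precision 97.01 %, sensitivity 86.16 % (Table 4).
f1_lssvm = f1_from_percent(97.01, 86.16)

print(f"{f1_proposed:.4f} {f1_lssvm:.4f}")
```

The harmonic mean is pulled toward the lower of the two inputs, which is why the LS-SVM row, with a large gap between precision and sensitivity, has an F1 score well below its precision.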

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ullah, A.; Anwar, S.M.; Bilal, M.; Mehmood, R.M. Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation. Remote Sens. 2020, 12, 1685. https://0-doi-org.brum.beds.ac.uk/10.3390/rs12101685

