Peer-Review Record

Statistical Model-Based Classification to Detect Patient-Specific Spike-and-Wave in EEG Signals

by Antonio Quintero-Rincón 1,2,*, Valeria Muro 2, Carlos D’Giano 2, Jorge Prendes 3 and Hadj Batatia 4
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 11 September 2020 / Revised: 25 October 2020 / Accepted: 27 October 2020 / Published: 29 October 2020
(This article belongs to the Special Issue Machine Learning for EEG Signal Processing)

Round 1

Reviewer 1 Report

Comments to the authors:

The authors aim to develop a classifier for spike-and-wave discharge (SWD) pattern detection. The proposed method fits the Morlet wavelet coefficients to a generalized Gaussian distribution (GGD) and uses the GGD parameters as features to build a k-nearest neighbors (kNN) classifier. The article is properly written and has a smooth logical flow. However, some parts of the methodology are unclear, which weakens the authors' conclusions. I have some remarks for the authors to improve their manuscript.
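For concreteness, here is a minimal sketch of how I read the proposed pipeline, assuming PyWavelets, SciPy, and scikit-learn; the segment length, scale range, and feature choices are placeholders of my own, not the authors' implementation:

```python
# Sketch of the pipeline as I read it -- not the authors' code.
# Assumed/hypothetical: the sampling rate, the scale range, the 1-3 Hz band
# selection on pseudo-frequencies, and the (scale, variance, median) features.
import numpy as np
import pywt                                   # PyWavelets
from scipy.stats import gennorm               # generalized Gaussian (generalized normal)
from sklearn.neighbors import KNeighborsClassifier

def segment_features(segment, fs=200.0):
    """GGD-based features for one 2-second, single-channel EEG segment."""
    scales = np.arange(32, 256)                                    # placeholder scales
    coeffs, freqs = pywt.cwt(segment, scales, 'morl',
                             sampling_period=1.0 / fs)             # Morlet CWT
    band = coeffs[(freqs >= 1.0) & (freqs <= 3.0)]                 # keep 1-3 Hz pseudo-frequencies
    beta, loc, scale = gennorm.fit(band.ravel())                   # fit a GGD to the coefficients
    return [scale, np.var(band), np.median(band)]                  # feature vector

# With X = [segment_features(s) for s in segments] and labels y (SWD / non-SWD):
# clf = KNeighborsClassifier(n_neighbors=10).fit(X, y)
```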

1. Some descriptions in the methodology are either unclear or missing. I have listed some of my questions below:
(1) I do not fully understand the dimension changes between each step of the proposed method. According to the methodology, first, the raw EEG signal (R^{N by M}) is divided into several 2-second segments (R^{sample points for 1 sec by M}) with 1-second overlap. Second, the Morlet transform is applied and the coefficients with pseudo-frequencies within 1-3 Hz are extracted (R^{some positive number by M}). The Morlet results of all the segments are concatenated into a coefficient matrix Ct (R^{# of coefficients in all segments by M}). Third, the method in ref. 46 is used to represent Ct by a GGD. Is the average energy calculated across channels first (R^{# of coefficients by 1}) and then fitted to a single GGD?
(2) In Section 4.2.3 (line 151), the authors use the same symbol to denote the feature vector for classification and the maximum-likelihood parameter vector (line 149). I suggest the authors use a different notation to prevent potential confusion. I also suggest the authors describe the features in the feature vector once again, since their definitions are not consistent (e.g., line 61, line 71).
(3) Following the comment above, the definition of the variance (line 151) is not clear. As I understand it, it is the variance in feature space, which is different from the variance of the coefficients (line 61).
(4) In the Morlet Wavelet section (line 143), the authors mention the relationship between wavelet scale and pseudo-frequency. I suggest the authors provide an example time/pseudo-frequency coefficient plot so that readers can understand the proposed idea more easily.
(5) Following the comment above, I suggest the authors illustrate their classifier-building pipeline so that readers can understand the proposed idea more easily.
Though the authors provide several references for detailed information, I suggest they describe their methodology in more detail so the manuscript is more reader-friendly.

2. The proposed method is similar to some of the authors' previous works (e.g., GGD in ref. 43, frequency transformation followed by GGD in ref. 3, kNN in ref. 19). Though the authors emphasize their contributions in the Discussion section (line 99), their results are not significantly better than others according to Table 1. I suggest the authors compare the performance with previous methods on the same database so the comparison is more meaningful. The authors mention that the proposed method is more precise than previous work (line 108); however, the details of the comparison are not shown in the manuscript.

3. In line 129, the authors mention that 10 SWD patterns were selected to be part of the database for each patient. Does this mean that the training signals for the classifier are 106 SWD and 106 non-SWD, plus an additional 10 subject-specific SWD? Are these 10 SWD patterns included in the 106 SWD signals?

4. In the Results section, the authors claim there is a clear discrimination between the SWD and non-SWD classes based on the results of Table 2 and Figure 1 (line 73). However, it seems to me that the data points largely overlap with each other. The authors also mention there is no statistical significance if one considers the scale parameter only (line 103). Hence, I suggest the authors combine the three subplots in Figure 1 into one 3D scatter plot. The 2D plots seem highly overlapped to me, and no intuitive distinction is shown. Since the authors state that the significant difference between the SWD and non-SWD classes emerges after including the variance and median as features, I expect there to be a clear hyperplane in the 3D scatter plot. Moreover, I suggest the authors describe the data in Table 2 by mean and variance so the underlying structure can be revealed.

5. Some of the statements could be strengthened if the authors provided references. I have listed below some statements that lack references and/or descriptions.
(1) In line 19, the authors state that a spike is characterized by high amplitude synchronization and polarity changes, without a reference.
(2) In line 68, the authors state that 10-NN is used, without explanation.
(3) In line 72, the authors introduce the term "feature vector" without a description.
(4) In line 170, the authors mention "incomplete waveforms" without a description.

Some minor suggestions:
1. There are some typos and incorrect sentences:
(1) In line 77, tree(three) parameters ...
(2) In line 146, as is(it) detects ...
(3) In line 151, then for a feature vector, (the probability of classifying in) each class is given by.
2. I suggest labeling classes 0 and 1 with their SWD/non-SWD meanings in Table 2 so readers can interpret the table more easily.
3. I suggest the authors enlarge the channel labels and time stamps in Figure 2(b).

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper aims at developing an SWD detection method based on a generalized Gaussian distribution and a k-nearest neighbors classifier. The method was trained with data from standard medical protocols.

The paper lacks a fundamental part: a direct comparison, on the same dataset, with other methods chosen as gold standards (as listed, for example, in Table 1) that have shown good performance.

Looking at Table 1, it seems that the proposed method does not outperform the t-location-scale distribution [ref 19] or cross-correlation [ref 40]. This raises questions about the actual usefulness of the method.

The results shown in Table 2 and Figure 1 are not convincing and do not help to understand the actual value of the method. It seems that class 0 (non-spike-and-wave) and class 1 (spike-and-wave) overlap in terms of statistical properties. Furthermore, some examples with real EEG signals would help the reader to better understand how the two classes differ.

The discussion must be more focused on the strengths of the method, especially with respect to the reference literature.

The paper is written in an "unusual" arrangement. This does not help the reader and is a bit confusing. I suggest using a more standard outline: Introduction, Methods, Results, Discussion, Conclusions.

Author Response

Response to Reviewer #2
The paper aims at developing an SWD detection method based on generalized Gaussian
distribution and k-nearest neighbors classifier. The method was trained with data from
standard medical protocols.
Reviewer: Remark 2.1)
The paper lacks a fundamental part: a direct comparison, on the same dataset, with other methods chosen as gold standards (as listed, for example, in Table 1) that have shown good performance.
Authors: Following this remark, results from other methods have been reported in the revised version (Results section, Table 6). A comparison with the proposed method is presented and discussed. Please refer to our answer to Remark 1.5 above.
Reviewer: Remark 2.2)
Looking at Table 1, it seems that the proposed method does not outperform the t-location-scale distribution [ref 19] or cross-correlation [ref 40]. This raises questions about the actual usefulness of the method.
Authors: The reviewer is right. Results have been discussed and compared, and the advantage of the proposed method has been explained. Please refer to our answer to Remark 1.5 above.
Reviewer: Remark 2.3)
The results shown in Table 2 and Figure 1 are not convincing and do not help to understand the actual value of the method. It seems that class 0 (non-spike-and-wave) and class 1 (spike-and-wave) overlap in terms of statistical properties. Furthermore, some examples with real EEG signals would help the reader to better understand how the two classes differ.
Authors: Section 3 has been rewritten to make the results clearer. In particular, we introduced statistical bounds to describe the features in the two classes (mean, standard deviation, and variance). We have also added a 3D scatter plot to show that it is possible to find a hyperplane that separates the data-points of the two classes. Examples of SWD signals have also been provided (Figure 1). Please see our answer to Remark 1.7 above.
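For illustration, a minimal sketch of how such a 3D scatter of three features could be produced with matplotlib; the feature arrays below are hypothetical placeholders, not the data reported in the paper:

```python
# Illustrative sketch only: hypothetical feature arrays standing in for the
# (GGD scale, variance, median) features of the two classes.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
swd     = rng.normal(loc=[1.0, 2.0, 0.5], scale=0.2, size=(106, 3))   # placeholder class 1
non_swd = rng.normal(loc=[0.5, 1.0, 0.1], scale=0.2, size=(106, 3))   # placeholder class 0

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(*swd.T,     label='SWD (class 1)')
ax.scatter(*non_swd.T, label='non-SWD (class 0)')
ax.set_xlabel('GGD scale'); ax.set_ylabel('variance'); ax.set_zlabel('median')
ax.legend()
plt.show()
```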
Reviewer: Remark 2.4)
The discussion must be more focused on the strengths of the method, especially with respect to the reference literature.
Authors: Following the reviewer's remark, the Discussion section has been entirely rewritten to focus on the advantages of the proposed method compared to existing methods.

Reviewer 3 Report

Dear authors,

This article is very difficult to read. The authors have to reconsider the whole structure of the paper. Presenting the results followed by the discussion section right after the introduction section is a very strange choice.
Please use a classical structure!
The methodology section needs to be further developed.
What do you mean by 10-kNN? Is it 10 clusters? And why 10?
You are using a self-supervised classifier. Can you describe this type of classifier?
Your data seem to be labeled; is this true? If so, why would you choose a self-supervised classifier, which is usually used for unlabeled data?
Your performances given in Table 1 seem to be significantly lower than those in the literature. What is the interest of your work?
The title of Figure 1 is too long! How should this figure be interpreted, and what is its interest? If it is to visualize the distribution of the data, why not use well-known visualization methods such as:

  • t-SNE:

L. van der Maaten and G. Hinton, “Visualizing data using t-sne,” Journal
of Machine Learning Research, vol. 9, no. 11, p. 2579–2605, 2008.

  • Variational AutoEncoder:

Deep Convolutional Variational Autoencoder as a 2D-Visualization Tool for Partial Discharge Source Classification in Hydrogenerators, December 2019, IEEE Access PP(99):1-1, DOI: 10.1109/ACCESS.2019.2962775

Deep Variational Autoencoder: An Efficient Tool for PHM Frameworks, May 2020, DOI: 10.1109/PHM-Besancon49106.2020.00046, Conference: 2020 Prognostics and Health Management Conference (PHM-Besançon)

Semi-Supervised Adversarial Variational Autoencoder, Mach. Learn. Knowl. Extr. 2020, 2(3), 361-378; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030020

Deep NN models are widely used in biomedical applications in general and for EEG signal processing and recognition in particular. You don't mention them at all. Why not? You can get inspiration from the paper below:

Deep Learning in the Biomedical Applications: Recent and Future Status, April 2019, Applied Sciences 9(8):1526, DOI: 10.3390/app9081526
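For reference, a minimal sketch of the kind of t-SNE projection suggested above, using scikit-learn; the feature matrix X and labels y below are synthetic placeholders, not the authors' data:

```python
# Sketch of a t-SNE projection of a feature matrix X (n_samples x n_features)
# with binary labels y. X and y are synthetic placeholders.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (106, 3)), rng.normal(1.0, 1.0, (106, 3))])
y = np.array([0] * 106 + [1] * 106)

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
for label, name in [(0, 'non-SWD'), (1, 'SWD')]:
    plt.scatter(*emb[y == label].T, label=name, s=10)
plt.legend(); plt.title('t-SNE projection (illustrative data)')
plt.show()
```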

Author Response

Reviewer: Remark 3.1)
1. The methodology section needs to be further developed.
2. What do you mean by 10-knn? Is it 10 clusters? And why 10?
3. You are using a self-supervised classifier. Can you describe this type of classifier?
Your data seems to be labeled? Is this true? If so why would you choose a self-
supervised classifier that is usually used for unsupervised data?
4. Your performances given in Table 1 seem to be significantly lower than those in the
literature. What is the interest of your work?
5. The title of figure 1 is too long! How to interpret this figure and what is its interest?
If it is to visualize the distribution of the data, why not use visualization methods
that are known : t-sne:
L. van der Maaten and G. Hinton, "Visualizing data using t-sne", Journal of
Machine Learning Research, vol. 9, no. 11, p. 2579-2605, 2008. Variational
AutoEncoder:
Deep Convolutional Variational Autoencoder as a 2D-Visualization Tool for
Partial Discharge Source Classification in Hydrogenerators, December 2019,
IEEE Access PP(99):1-1, DOI: 10.1109/ACCESS.2019.2962775
Authors: Following the reviewer's remark, we have revised the paper as follows:
1. The methodology section needs to be further developed: The methodology section has been rewritten and restructured to make the processing steps clear. A block diagram of the method has been added for this purpose.
2. What do you mean by 10-knn? Is it 10 clusters? And why 10?: We are sorry for the confusion. In k-NN one must choose the parameter k, which is the number of labelled data-points used to assign a label to an unknown point. An explanation of how to choose k for k-nearest neighbours has been provided at the end of Section 2.2.4: the parameter k is chosen based on √N, where N is the number of samples in the training dataset. The choice of k = 10 has been explained in the Discussion section: based on our rule for choosing k, the value would be √212 ≈ 14, but we found better performance by choosing k = 10 empirically. (A small illustrative sketch of this kind of check appears after this list.)
3. You are using a self-supervised classifier. Can you describe this type of classifier? Your data seems to be labeled? Is this true? If so why would you choose a self-supervised classifier that is usually used for unsupervised data?: We are sorry for the confusing term "self-supervised". It was not used to mean "unsupervised" but to hint at the fact that providing multiple data-points from the same patient would enforce self-learning. As this was misleading, the term has been removed from the paper.
4. Your performances given in Table 1 seem to be significantly lower than those in the literature. What is the interest of your work?: May we refer the reviewer to our answers to Remarks 1.5, 1.7 and 2.4 above, where the results and the advantages of the method are discussed.
5. The title of figure 1 is too long! How to interpret this figure and what is its interest?
If it is to visualize the distribution of the data, why not use visualization methods
that are known t-SNE:
• L. van der Maaten and G. Hinton, "Visualizing data using t-sne", Journal of
Machine Learning Research, vol. 9, no. 11, p. 2579-2605, 2008. Variational
AutoEncoder:
• Deep Convolutional Variational Autoencoder as a 2D-Visualization Tool for
Partial Discharge Source Classification in Hydrogenerators, December 2019,
IEEE Access PP(99):1-1, DOI: 10.1109/ACCESS.2019.2962775
We are grateful to the reviewer for this valuable remark and the interesting references. We tested the t-SNE algorithm and, in our case, did not find significant differences compared with the classical MATLAB scatter plot. So we have used 2D and 3D scatter plots to show the structure of the data. In addition, we provided more text in Section 3 (Results) and Section 4 (Discussion) to explain the results. The deep learning approach referred to by the reviewer is a very interesting way to handle the problem at hand. We have set out a perspective to explore this technique in the conclusion: future work will focus on other epileptic waveform patterns as well as on the extensive evaluation of the proposed approach and its comparison with other methods from the literature, both in humans and rodents; other techniques, such as visual data analysis with t-distributed stochastic neighbor embedding [51] and deep learning variational autoencoders [52], will be considered. In fact, we have already implemented some work with deep learning on EEG, but the findings have not been published yet.
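As a small illustration of the check mentioned in point 2 above, here is a minimal sketch comparing the √N heuristic with nearby values of k by cross-validated accuracy, using scikit-learn; X and y are synthetic placeholders, not the study data:

```python
# Sketch: compare the sqrt(N) heuristic for k against nearby values using
# cross-validated accuracy. X and y are synthetic placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (106, 3)), rng.normal(1.5, 1.0, (106, 3))])
y = np.array([0] * 106 + [1] * 106)

k_sqrt = int(np.sqrt(len(X)))                 # floor(sqrt(212)) = 14
for k in sorted({5, 10, k_sqrt, 20}):
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k = {k:2d}: mean CV accuracy = {acc:.3f}")
```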


Reviewer: Remark 3.2)
• Deep Variational Autoencoder: An Efficient Tool for PHM Frameworks, May 2020, DOI: 10.1109/PHM-Besancon49106.2020.00046, Conference: 2020 Prognostics and Health Management Conference (PHM-Besançon)
• Semi-Supervised Adversarial Variational Autoencoder, Mach. Learn. Knowl. Extr. 2020, 2(3), 361-378; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030020
Deep NN models are widely used in biomedical applications in general and for EEG signal processing and recognition in particular. You don't mention them at all. Why not? You can get inspiration from the paper below: Deep Learning in the Biomedical Applications: Recent and Future Status, April 2019, Applied Sciences 9(8):1526, DOI: 10.3390/app9081526
Authors: We fully agree with the reviewer that deep learning techniques are very interesting to consider for EEG signal processing and that the literature abounds with such work. As mentioned in our answer to the previous remark, we are conducting work on this topic and the findings will be published later. However, in this work we want to show that strong dimension reduction is possible with statistical data modelling and can lead to good machine learning results. We understand that this sort of work appears to be competing with the deep learning trend; however, that is not our aim. Building on our years-long statistical modelling approach, we would now like to investigate model-informed deep learning (which we agree is a challenging research problem).

Round 2

Reviewer 1 Report

The article is well written and there is no further confusion/ambiguity in the description. I appreciate the authors for their fine revision.

Reviewer 2 Report

The authors convincingly improved the manuscript following the reviewers' suggestions.

Reviewer 3 Report

The authors responded to all questions and remarks.
