Detecting Phase-Synchrony Connectivity Anomalies in EEG Signals. Application to Dyslexia Diagnosis

Formoso, Marco A.; Ortiz, Andrés; Martinez-Murcia, Francisco J.; Gallego, Nicolás; Luque, Juan L.

doi:10.3390/s21217061

¹ Communications Engineering Department, University of Málaga, 29071 Málaga, Spain

² Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), 18014 Granada, Spain

³ Department of Signal Theory, Networking and Communications, University of Granada, 18014 Granada, Spain

⁴ Department of Basic Psychology, University of Malaga, 29019 Málaga, Spain

* Author to whom correspondence should be addressed.

Sensors 2021, 21(21), 7061; https://0-doi-org.brum.beds.ac.uk/10.3390/s21217061

Submission received: 7 September 2021 / Revised: 21 October 2021 / Accepted: 22 October 2021 / Published: 25 October 2021

(This article belongs to the Special Issue Smart Computing Systems for Biomedical Signal Processing)

Abstract:

Objective dyslexia diagnosis is a challenging task, since traditional diagnosis methods are based on behavioural tests rather than on biological markers. Although these tests are used for dyslexia diagnosis in clinical practice, it is difficult to extract from them information about the brain processes involved in the different tasks and, therefore, to go deeper into the biological basis of the disorder. Thus, the use of biomarkers can contribute not only to the diagnosis but also to a better understanding of specific learning disorders such as dyslexia. In this work, we use Electroencephalography (EEG) signals to discover differences between control and dyslexic subjects using signal processing and artificial intelligence techniques. Specifically, we measure phase synchronization among channels to reveal the functional brain network activated during auditory processing. To explore the synchronicity patterns arising from low-level auditory processing, we used specific stimuli consisting of band-limited white noise, amplitude-modulated at different frequencies. The differential information contained in the functional (i.e., synchronization) network is processed by an anomaly detection system that addresses inter-subject variability through an outlier-detection method based on vector quantization. The results, obtained for 7-year-old children, show that the proposed method constitutes a useful tool for clinical use, with area under the ROC curve (AUC) values up to 0.95 in differential diagnosis tasks.

Keywords:

functional connectivity; EEG; anomaly detection; self-organizing maps; phase locking value; circular correlation

1. Introduction

Electroencephalography (EEG) is a non-invasive way to obtain information about brain activity. It has been widely used with different objectives, from the study of cortical brain activity related to cognitive development under specific tasks [1], to the exploration of the brain processes involved in neurological diseases such as Alzheimer’s disease [2,3], Parkinsonian syndromes [4,5], epileptic disorders [6,7,8], and psychiatric disorders such as schizophrenia [9]. Moreover, EEG has been widely used in experimental neuropsychology in the quest for answers about how the brain processes different stimuli and which cortical areas are involved. This is the case for learning difficulties, since their neural origin remains unknown. Furthermore, efforts to explain the brain processes involved in language processing pave the way for the early diagnosis of these disorders.

One of the most common learning disorders is Developmental Dyslexia (DD), a specific difficulty in the acquisition of reading and writing skills not related to mental age or inadequate schooling. DD may severely affect the self-esteem of affected children and may lead to school failure. Its prevalence is estimated at between 5% and 12% of the population [10], depending on the reading performance benchmark. DD diagnosis is usually carried out by behavioural tests that can only be performed by readers and whose results depend on the child's mood; moreover, issuing a diagnostic report requires a detailed analysis of the results by expert clinicians. On the other hand, the early diagnosis of DD plays a major role in the success of the specific intervention tasks developed to mitigate its consequences. Thus, diagnosis techniques based on the extraction of biological markers provide valuable and objective information for clinicians and pave the way for early diagnosis, facilitating the application of an intervention program in pre-readers [11]. In fact, neurophysiological signals constitute an objective way to evaluate neurological or behavioural disorders that are usually subjectively assessed by clinicians.

Brain functional connectivity refers to the coordinated activation of brain areas that arises when the subject performs a specific task or stays in a resting state. Indeed, some previous works have developed different methods to estimate functional connectivity by searching for patterns that characterize different neurological diseases such as Alzheimer’s or Parkinson’s disease using functional imaging techniques [12,13,14,15] or EEG [16,17], or to classify different sleep stages [18]. Moreover, brain connectivity has also been used in cognitive neuroscience to identify brain areas involved in language and learning tasks [19,20]. The use of EEG to explore functional connectivity provides opportunities due to its high temporal resolution, which allows researchers to measure the synchronization between brain areas. That synchronization can be interpreted as cooperation from the functional connectivity point of view, and can be inferred from the phase difference between EEG channels, from the statistical relationship between different EEG frequency bands (namely, Cross-Frequency Coupling), or by using techniques that estimate inter-channel causality [21].

In this line, some previous works used EEG and Magnetoencephalography (MEG) to explore the basis of DD using speech-based stimuli, under the assumption that DD is characterized by reduced awareness of speech units [22]. Hence, atypical neural entrainment at different rates may arise in affected subjects. Other works searching for the basis of speech coding hypothesize that DD originates from atypical synchronization in the right hemisphere to the lower EEG bands [23,24]. In [25], differences between dyslexic and control groups were found in the preferred phase of entrainment of the Delta band after presenting a visual and auditory stimulus to the child.

An example of EEG classification where an auditory stimulus is used can be found in [26]. In this work, Gibbon et al. presented either a drumbeat or a syllable to 8-week-old infants while recording the EEG. They were able to classify whether the EEG corresponded to the drumbeat or the syllable with an AUC of 0.875 for a convolutional neural network and an AUC of 0.86 for a support vector machine. Changing the task performed during the EEG recording from hearing to writing, Perera et al. [27] were able to detect dyslexic children using a support vector machine. They recorded EEGs while the subjects were doing a writing task (pen and paper) and a typing task (on a keyboard). Besides using all the channels at once to classify, they selected channels corresponding to several brain regions, obtaining different results for each one. Moreover, in [28], several stimuli at different frequencies were presented to the children. Then, temporal and spectral features from the EEGs at different frequencies were used to build the feature space, obtaining AUC values ranging from 0.69 to 0.89. On the other hand, [29] uses periodogram-based features. Specifically, the spectral density is obtained from the EEGs and then Principal Component Analysis is used to reduce the dimension of the spectral density vector. For classification, they use a support vector machine, comparing the performance obtained with the whole spectral density vector and with its reduced version. This method provided AUC values up to 0.75. In [30], power-based connectivity features were computed to train a neural network (denoising autoencoder) that generates features representing the dyslexic and control groups in a lower-dimensional space.

As shown in the works cited above, the use of EEG signals in search of descriptive patterns requires specific signal processing techniques due to their low signal-to-noise ratio (SNR) and the presence of multiple artefacts, such as ocular, muscular or cardiac ones. These have to be properly removed without affecting useful information. As a consequence, preprocessing methods such as Independent Component Analysis (ICA) are commonly used to remove known artefacts by a source separation process.

On the other hand, the feature extraction process has to provide descriptors that are discriminative enough to model the categories present in the dataset. These features can be based on time, frequency [31], or time-frequency [32] representations, which can also be used to derive complex relationships among brain areas. Depending on the specific descriptor, these relationships can be interpreted as functional connectivity, as in the case of this work. Connectivity measures can be derived from frequency and phase variations and from the power spectrum, and they can be split into two types depending on the method used to compute those relationships:

Methods that characterize the statistical relationships between electrodes but in the same frequency band. In this way, Spectral Coherence (SC) provides a way to measure the synchronization between channels, which may indicate a connection from the functional point of view between the neuron clusters in the two areas involved [33]. Other connectivity measures can be computed from the phase angle differences between channels over time [21].
Methods that characterize the statistical relationships between the activity in two channels at different frequency bands. The measure provided by these methods is commonly referred to as Cross-Frequency Coupling (CFC). Phase-Amplitude Coupling (PAC) is a representative and practical example of a CFC measure with neural and physical implications [34].

We use SC descriptors to estimate the connectivity between channels in the five EEG bands (Delta, Theta, Alpha, Beta, and Gamma) and then to extract discriminative features for the differential diagnosis of DD. These EEG signals were obtained using a novel method based on the application of auditory stimuli at different frequencies that activate the low-level processing network in the brain cortex. Moreover, an anomaly detection approach is implemented using a method that combines unsupervised learning by vector quantization and a Bayesian classifier. The proposed method works with relatively small databases, overcomes the class-imbalance problem and reduces overfitting.

The main aim of the work is twofold: firstly, it aims to demonstrate that low-level auditory processing produces different patterns in the brain networks involved. This connectivity is studied by means of phase synchronization between EEG channels. This methodology is not only relevant for diagnostic purposes but also for the study of the differences in the brain processes developed during auditory processing in controls and dyslexic children, paving the way to go deeper into the biological basis of dyslexia. Secondly, the overall classification pipeline provides a method that can detect dyslexic subjects in an objective way, only using EEG signals. This can be used as an effective tool for clinical practice. The main contributions of this work can be summarized as:

We use low-level auditory stimuli to study the brain processes involved in language processing, unlike previous works that use only speech-based stimuli [22].
Connectivity between brain areas is estimated by means of the phase synchronization between EEG channels, which is computed using the Circular Correlation.
An anomaly detection approach has been implemented using a method that combines unsupervised learning by vector quantization and a Bayesian classifier. The proposed method works with relatively small databases, overcomes the class-imbalance problem and reduces overfitting.

After this introduction, the rest of the paper is organized as follows. Section 2 describes the materials and methods used in this work, including the database and the acquisition protocols used to compose it. Section 3 presents the foundations of the methods used to measure the inter-channel phase synchrony. Then, Section 4 describes the proposed classification method, which considers the non-control samples as anomalies. Section 5 and Section 6 present the main results obtained with the proposed methodology and the discussion, respectively. Finally, conclusions are drawn in Section 7.

2. Materials and Methods

2.1. Data Acquisition

EEG data used in this work were provided by the Leeduca group at the University of Málaga. Control and experimental groups were extracted by a careful screening process from a cohort (N = 700) followed from 4 years of age to the second evaluation at 7 years in 20 different primary schools (Junta de Andalucía). This way, the socioeconomic status (SES) and a longitudinal dynamic evaluation of the subjects were available, plus ATLAS [35] (a self-report questionnaire on reading-writing difficulties for adults) family risk information and a complete report at 7 years old, which included standard assessment tasks. Comorbidities with other neurodevelopmental disorders such as Language Impairment (LI), Speech Sound Disorder (SSD), Attention Deficit Hyperactivity Disorder (ADHD), Autism, and other auditory or visual sensory deficit disorders were taken into account in the screening process, along with information about other relevant conditions that can affect reading achievement, such as immigration or bilingualism [36].

In the experiments, EEG signals were recorded by a Brainvision actiCHamp Plus with a 32-channel amplifier that allows a sampling rate of up to 500 Hz using active electrodes (actiCAP, Brain Products GmbH, Germany). EEG was recorded during 5-min sessions at a sampling rate of 500 Hz, while presenting an auditory stimulus to the subject. These auditory stimuli consisted of white noise 100% amplitude-modulated at 4.8 Hz and 16 Hz. These measurements were repeated twice. The stimuli were determined by expert linguistic psychologists who studied the main frequency components present in the voice, corresponding to syllables and phonemes. The study of the statistical distribution of these components yielded components of 4.8 Hz and 16 Hz for syllables and phonemes, respectively, in Spanish speakers.

Database descriptive statistics are shown in Table 1. Subjects in the database are right-handed, Spanish native speakers, and have neither hearing impairments nor vision problems.

The location of 32 electrodes in the 10–20 standardized system used in the experiments is shown in Figure 1.

2.2. Data Preprocessing

EEG signals were first preprocessed to remove artefacts related to eye-blinking and to impedance variations due to movements. Ocular artefacts were removed by source separation using Independent Component Analysis (ICA) [37], guided by the eye movements recorded by the EOG channel. Artefacts related to movement or to noise from unknown sources were eliminated by discarding the affected EEG segments. Then, all channels were referenced to the Cz channel.

After this first stage, the EEG channels were band-pass filtered to extract the information corresponding to the five EEG frequency bands (Delta, 1.5–4 Hz; Theta, 4–8 Hz; Alpha, 8–13 Hz; Beta, 13–30 Hz; and Gamma, 30–80 Hz), as shown in Figure 2. It is worth noting that, since we are interested in the phase of the signals, the filtering process must not introduce phase distortion. For this reason, Infinite Impulse Response (IIR) filters were not used. Instead, we used Finite Impulse Response (FIR) filters, which introduce a constant phase lag that can be corrected afterwards. More specifically, we used a two-way zero-phase-lag band-pass FIR least-squares filter, which compensates for the phase lag introduced while filtering by passing the signal forward and backward through the filter, achieving zero phase lag in the overall filtering process [38]. Low-pass filtering was applied with a cut-off frequency of 80 Hz. Additionally, a 50 Hz notch filter was used in the preprocessing stage to remove this frequency component.
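The forward-backward filtering described above can be sketched with SciPy as follows. This is only an illustration under stated assumptions: the filter length (`numtaps`) and the 1 Hz transition width are hypothetical choices that are not given in the text, and `bandpass_zero_phase` is a name introduced here. Only the band edges and the 500 Hz sampling rate come from the paper.

```python
import numpy as np
from scipy.signal import firls, filtfilt

FS = 500.0  # sampling rate in Hz (Section 2.1)

# Band edges in Hz, as listed in the text
BANDS = {
    "delta": (1.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 80.0),
}


def bandpass_zero_phase(x, low, high, fs=FS, numtaps=1001, transition=1.0):
    """Least-squares FIR band-pass applied forward and backward.

    numtaps (odd) and transition (Hz) are assumed values, not the paper's.
    """
    # Stop band, pass band, stop band; the gaps between them are
    # unconstrained transition regions.
    bands = [0.0, low - transition, low, high, high + transition, fs / 2.0]
    desired = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
    taps = firls(numtaps, bands, desired, fs=fs)
    # filtfilt runs the filter in both directions, cancelling the phase lag
    return filtfilt(taps, [1.0], x, axis=-1)
```

Because the filter is applied twice, its magnitude response is squared; the zero-phase property is what matters for the phase analysis that follows.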

3. Functional Connectivity from EEG Signals

In this section, we describe how the phase difference between EEG channels is used to infer connectivity, as well as the overall proposed method for classification.

3.1. Phase-Based Connectivity

One way to estimate the connectivity between two channels consists of analysing the phase difference of these channels. This has led to the development of different measures for phase coherence related to the synchronization of these channels. Since the phase of the signal extracted from an electrode varies over time, it is necessary to compute the phase,

ϕ_{i} (t)

for each channel i, namely the instantaneous phase. In this work, the instantaneous phase is obtained by means of the Hilbert transform computed from band-pass filtered signals.

Hilbert Filter

The Hilbert transform (HT) makes it possible to compute the analytic signal from a real one. The analytic signal is a complex-valued time series which has no negative frequency components. Thus, it is possible to compute the time-varying amplitude, phase and frequency from the analytic signal, also called instantaneous amplitude, phase and frequency, respectively. The HT of a signal

x (t)

is defined as

H [x (t)] = \frac{1}{π} \int_{-\infty}^{+\infty} \frac{x (τ)}{t - τ} d τ

(1)

and the analytic signal

z_{i} (t)

for a signal

x (t)

can be obtained as

z_{i} (t) = x_{i} (t) + j H {x_{i} (t)} = a (t) e^{j ϕ (t)}

(2)

From

z_{i} (t)

, it is straightforward to compute the instantaneous amplitude as

a (t) = \sqrt{\mathrm{Re}(z_{i} (t))^{2} + \mathrm{Im}(z_{i} (t))^{2}}

(3)

and the instantaneous, unwrapped phase is

ϕ (t) = \tan^{-1} \frac{\mathrm{Im}(z_{i} (t))}{\mathrm{Re}(z_{i} (t))}

(4)

This way, the method explained above provides the phase value at each time point, making it possible to estimate the synchronization between channels from phase differences. The HT has proven useful in the characterization of the EEG through the synchrony of its channels. For instance, in [40] it is used to detect changes in phase synchronization in epileptic subjects. The authors found evidence that epileptic seizures can be preceded by characteristic changes in synchronization. In [41], the HT is part of a two-step process called empirical mode decomposition [42]. The authors used this process as a feature extractor that is then employed to classify various sleep stages.

The implementation of this method can be found in the SciPy [43] library.
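As a minimal sketch of Equations (2) to (4), the analytic signal can be obtained with `scipy.signal.hilbert`, which returns the analytic signal itself rather than only the transform. The function name `instantaneous_amplitude_phase` is introduced here for illustration, and the input is assumed to be already band-pass filtered.

```python
import numpy as np
from scipy.signal import hilbert


def instantaneous_amplitude_phase(x):
    """Return instantaneous amplitude and phase of a real signal.

    x is assumed to be band-pass filtered; the last axis is time.
    """
    z = hilbert(x, axis=-1)   # analytic signal z(t) = x(t) + j H[x(t)]
    amplitude = np.abs(z)     # Equation (3)
    phase = np.angle(z)       # four-quadrant arctangent, wrapped to (-pi, pi]
    return amplitude, phase
```

Note that `np.angle` yields the wrapped phase; `np.unwrap(phase)` can be applied when a continuous phase is needed, although the circular measures used later work directly on wrapped values.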

3.2. Channel Synchronization by Pearson’s Circular Correlation

One method to estimate the connectivity between channels consists of measuring phase difference between EEG channels. One of the most popular phase-based connectivity measures that estimates the extent of synchronization between channels is the Phase Locking Value (PLV). PLV measures the average of phase differences between channels over time [44], and can be computed from the Hilbert-filtered signal, using the instantaneous phase values as:

PLV = \left| \frac{1}{n} \sum_{t = 1}^{n} e^{j (ϕ_{x t} - ϕ_{y t})} \right|

(5)

where

ϕ_{x t}

and

ϕ_{y t}

are the phase angles at time point t for the channels x and y, respectively, and n is the number of samples of the signal.
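Equation (5) translates almost directly into NumPy. The sketch below assumes that `phi_x` and `phi_y` are arrays of instantaneous phases in radians of equal length; the function name is illustrative.

```python
import numpy as np


def phase_locking_value(phi_x, phi_y):
    """PLV between two phase series (radians), following Equation (5)."""
    phi_x = np.asarray(phi_x, dtype=float)
    phi_y = np.asarray(phi_y, dtype=float)
    # Mean of unit vectors whose angle is the phase difference
    return np.abs(np.mean(np.exp(1j * (phi_x - phi_y))))
```

A constant phase offset between two channels gives a PLV of 1, whereas independent, uniformly distributed phases give a value that approaches 0 as the number of samples grows.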

Phase-based connectivity measures that rely on the phase difference have advantages over other synchronization measures. One of the most important is the ability to detect false connectivity due to volume conduction. This consists of two neighbouring electrodes capturing the signal from the same source, due to the spreading effect through the skull. Since PLV values due to volume conduction are clustered around zero, it will reflect a zero value in the connectivity matrix, indicating the absence of connection between the corresponding channels. However, PLV is a measure of the consistency of the (average) phase difference, but it does not imply information exchange between channels, which would imply covariance, and it could eventually provide spurious connections [45]. To overcome this, in this work we used Circular Correlation to estimate the phase synchrony. Coherence measures have some limitations, as indicated in [44]: on the one hand, since coherence is a measure of the linear covariance between two spectra, it is not recommended for non-stationary signals. In contrast, phase-locking-based measures do not require the signals to be stationary. On the other hand, coherence increases with amplitude covariance, which could bias the estimation of the interaction between channels.

As explained in [45], Circular Correlation measures how the phase variances of two channels co-vary: it determines whether, when one channel is slightly in advance of its expected phase at a given time, the phase in the other channel is also advanced (in this example, we refer to positive correlation). In the case of unrelated channels, the phase variance will not co-vary and the Circular Correlation will be zero. In contrast, since the PLV only measures the phase difference, it is likely to be poorer at discriminating between related and unrelated signals [44].

Since phase information is circular (in the range

[0, 2 π]

), we use circular statistics [46]. Specifically, the circular equivalent of Pearson’s correlation coefficient [47] can be defined as:

r_{circular} = \frac{\sum_{t = 1}^{n} \sin (ϕ_{x t} - \bar{ϕ_{x}}) \sin (ϕ_{y t} - \bar{ϕ_{y}})}{\sqrt{\sum_{t = 1}^{n} \sin^{2} (ϕ_{x t} - \bar{ϕ_{x}})} \sqrt{\sum_{t = 1}^{n} \sin^{2} (ϕ_{y t} - \bar{ϕ_{y}})}}

(6)

where

ϕ_{x t}

,

ϕ_{y t}

are the phase values of the channels x and y at time point t,

\bar{ϕ_{x}}

,

\bar{ϕ_{y}}

the circular means of the x and y channels, and n the number of samples of the signal. The circular mean of a channel i can be computed as:

\bar{ϕ_{i}} = \arctan \left( \frac{\sum_{t = 1}^{n} \sin (ϕ_{i t})}{\sum_{t = 1}^{n} \cos (ϕ_{i t})} \right)

(7)

According to the above definitions,

r_{circular}

measures the circular covariance of the differences between the observed phase and the expected phase (

\bar{ϕ_{i}}

). As a result, a relationship (i.e., synchronization) between channels is detected as a co-variation between the phase variances of the channels. On the other hand, out-of-sync channels will not co-vary and the circular covariance will take values close to zero. As in the case of Pearson’s correlation,

r_{circular}

takes values in the range [-1, 1].
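Equations (6) and (7) can be sketched in NumPy as shown below. The function names and the `(channels, samples)` array layout are assumptions made for this example. The circular mean uses the four-quadrant `arctan2` so that the mean direction is placed in the correct quadrant.

```python
import numpy as np


def circular_mean(phi):
    """Circular mean of a phase series (radians), Equation (7)."""
    return np.arctan2(np.sum(np.sin(phi)), np.sum(np.cos(phi)))


def circular_correlation(phi_x, phi_y):
    """Circular correlation coefficient between two phase series, Equation (6)."""
    sx = np.sin(phi_x - circular_mean(phi_x))
    sy = np.sin(phi_y - circular_mean(phi_y))
    return np.sum(sx * sy) / np.sqrt(np.sum(sx ** 2) * np.sum(sy ** 2))


def connectivity_matrix(phases):
    """Pairwise circular correlation for an array of shape (channels, samples)."""
    phases = np.asarray(phases, dtype=float)
    means = np.arctan2(np.sin(phases).sum(axis=1), np.cos(phases).sum(axis=1))
    dev = np.sin(phases - means[:, None])   # sine of the deviation from the mean
    num = dev @ dev.T                       # all pairwise numerators
    norm = np.sqrt(np.diag(num))
    return num / np.outer(norm, norm)
```

Applied to the band-filtered phases of one recording, `connectivity_matrix` gives a symmetric channels-by-channels matrix with ones on the diagonal, which is the kind of feature set the later sections build on.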

4. Diagnosing Dyslexia by Outlier Detection

The most usual way to deal with the classification of instances of a dataset consists of obtaining a discriminative model of the classes. However, this is not always possible for biomedical data: firstly, it is difficult to acquire a balanced dataset, given the prevalence of the disorder under study. Moreover, biomedical data acquisition is time-consuming and requires a careful screening process. On the other hand, depending on the experimental setup, the nature of the biomedical signals and the objective of the experiment, one (or both) of the groups under study may contain considerable variability. These issues can generate a biased model, hampering its generalization capabilities. An effective solution to this situation is to use an outlier detection method, modelling the most numerous class and identifying samples belonging to a different class as outliers (or anomaly samples). In this work, we propose the use of an outlier detection method based on a self-organizing map, which is trained with the aforementioned circular correlation features.

4.1. Outlier Detection Based on Self-Organizing Maps

The Self-Organizing Map (SOM) is a vector quantization algorithm that provides a low-dimensional projection of a high-dimensional feature space, facilitating the visualization of the data structure as well as the identification of clusters in the data. SOM-based works have been proposed in a wide range of fields. In medicine [48], a SOM was used to group molecular leukaemia samples using the genes as features, and it correctly clustered two types of cancer cells. In the computer vision field, Lawrence et al. [49] applied the SOM to reduce the dimensionality of an image as the first step of a face recognition pipeline. Betti et al. [50] designed a SOM-based fault prediction system for large power plants. An improved SOM algorithm was used by Cai et al. [51] to classify different sitting postures. In [52], more general experiments were carried out in order to first reduce the dimensionality of the data and then cluster it. Their results show that it is worth spending time training a SOM before attempting to cluster the data. Other fields such as network security have also taken advantage of the properties of SOMs. For instance, in [53], a SOM-based method computes the Bayesian activation of SOM units to detect anomalies in computer networks.

We use the MiniSom [54] package for Python to work with SOMs.

The SOM algorithm is briefly explained as follows. Let X

\subset R^{n}

be an n-dimensional dataset. The SOM is composed of d units, arranged in a 2-dimensional lattice and each represented by an n-dimensional model vector

ω_{i}

. For each input data instance

x

, the Best Matching Unit (BMU) is defined as the unit

ω_{i}

closest to

x

in terms of the Euclidean distance

∥ \cdot ∥

:

∥ ω_{i} - x ∥ \leq ∥ ω_{j} - x ∥, \forall j \neq i

(8)

Model vectors in a neighbourhood of the BMU are updated in each iteration, according to the Euclidean distance to the input data sample x:

ω_{i} (t + 1) = ω_{i} (t) + α (t) h_{i} (t) (x - ω_{i} (t))

(9)

where

α (t)

is the learning rate and

h_{i} (t)

is a function which defines the neighbourhood around the BMU

ω_{i}

, and i is a linear index that identifies the prototype vectors.

The usual way to progressively reduce

α (t)

with the number of iterations consists of applying an exponential decay rule [55]. Similarly, the neighbourhood of each SOM unit

h_{i}

is a Gaussian hat [55] whose width shrinks over time (iterations). In order to improve the convergence of the SOM, its prototypes were linearly initialized along the principal components of the training data. This way, each dimension of the prototypes was arranged proportionally to the corresponding principal component [55].
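To make the update rule of Equation (9) concrete, the following is a small NumPy sketch of an online SOM with exponentially decaying learning rate and Gaussian neighbourhood. It is not the package used in the paper, and several details are assumptions for illustration only: the 5 × 5 map, the decay constant, and the initialization from randomly chosen data samples instead of the linear principal-component initialization described above.

```python
import numpy as np


def train_som(data, rows=5, cols=5, n_iter=2000, alpha0=0.5, sigma0=2.0,
              decay=3.0, seed=0):
    """Train a small SOM and return its prototypes, shape (rows*cols, n_features).

    All hyperparameters are illustrative defaults, not the paper's values.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Lattice coordinates of each unit, indexed linearly
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)],
                    dtype=float)
    # Assumption: prototypes start at randomly drawn training samples
    weights = data[rng.integers(0, len(data), size=rows * cols)].copy()
    for t in range(n_iter):
        x = data[rng.integers(0, len(data))]
        # Best Matching Unit: closest prototype in Euclidean distance
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        factor = np.exp(-decay * t / n_iter)   # exponential decay
        alpha = alpha0 * factor                # learning rate alpha(t)
        sigma = sigma0 * factor                # neighbourhood width
        grid_d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))   # Gaussian neighbourhood
        weights += alpha * h[:, None] * (x - weights)
    return weights
```

In an anomaly-detection setting such as the one proposed here, `data` would contain only the feature vectors of the control class.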

According to the SOM definition shown above, a data sample x will be represented by the prototype vector of the corresponding BMU. The representation error, known as quantization error (QE), can be computed for each BMU as

qe = \frac{1}{n} \sum_{k = 1}^{n} ∥ ω_{i} - x_{k} ∥, \forall x_{k} \in RF_{i}

(10)

where

RF_{i}

is the receptive field of the BMU unit

ω_{i}

, defined as:

RF_{i} = \{ x_{k} \in X : ∥ x_{k} - ω_{i} ∥ \leq ∥ x_{k} - ω_{j} ∥, \forall j \neq i \}

(11)
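Equations (10) and (11) can be illustrated with a short NumPy sketch. The function names are hypothetical; `weights` stands for the trained prototypes, one per row, and `samples` for the feature vectors to be represented.

```python
import numpy as np


def bmu_and_qe(weights, samples):
    """Return, for each sample, the index of its BMU and its quantization error."""
    weights = np.asarray(weights, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # Distance from every sample to every prototype
    d = np.linalg.norm(samples[:, None, :] - weights[None, :, :], axis=2)
    bmu = np.argmin(d, axis=1)
    return bmu, d[np.arange(len(samples)), bmu]


def unit_quantization_error(weights, samples):
    """Mean QE over the receptive field of each unit (NaN for empty fields)."""
    bmu, qe = bmu_and_qe(weights, samples)
    out = np.full(len(weights), np.nan)
    for i in np.unique(bmu):
        out[i] = qe[bmu == i].mean()   # average over RF_i, Equation (10)
    return out
```

For example, with prototypes at (0, 0) and (10, 0), the samples (1, 0) and (0, 2) fall in the receptive field of the first unit with errors 1 and 2, so its mean quantization error is 1.5.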

4.1.1. Band Relevance Using Quantization Error Distribution

The QE distribution for control and dyslexic subjects can be computed using the values provided by Equation (10). Figure 3 shows the distribution of the QE obtained for each EEG band, clearly showing its discriminant capability, as it is possible to find a threshold in QE that separates the QE values provided by samples of different classes. Moreover, it is possible to take advantage of this result to select the most discriminant band, since the overlap between distributions can be seen as a measure of separability. In this case, the Kullback–Leibler divergence (

D_{K L}

) [56], defined as:

D_{K L} (P | Q) = \sum_{i} P (i) \log \frac{P (i)}{Q (i)}

(12)

is used to quantify the overlap between the P and Q distributions and, eventually, the discriminative capability of the corresponding quantization error for each data sample. Although

D_{K L}

is not a symmetric measure and cannot be interpreted as a distance, it can be symmetrised by computing a two-way measure as

\hat{D}_{K L} = 0.5 (D_{K L} (P | Q) + D_{K L} (Q | P))

.
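A minimal sketch of the symmetrised divergence is given below, using the standard logarithmic definition of the Kullback–Leibler divergence for discrete distributions. The inputs are assumed to be histograms of the QE values of each group computed over the same bins; the small constant `eps`, added to avoid division by zero in empty bins, is an assumption of this example.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """Discrete D_KL(P | Q) for two histograms defined over the same bins."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()   # normalise to probability distributions
    q = q / q.sum()
    return np.sum(p * np.log(p / q))


def symmetric_kl(p, q):
    """Two-way (symmetrised) Kullback-Leibler measure."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

In practice the two histograms could be obtained with `np.histogram` using a shared `bins` argument, so that both distributions are defined on the same support.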

According to this measure,

KL_{gamma} > KL_{beta} > KL_{theta} > KL_{alpha} > KL_{delta}

for the 4.8 Hz stimulus and

KL_{alpha} > KL_{beta} > KL_{theta} > KL_{gamma} > KL_{delta}

for the 16 Hz stimulus.

4.1.2. Uncertainty in SOM Units Activation

As explained above, SOM is a vector quantization algorithm that projects the high-dimensional input space onto a lower-dimensional (usually 2D) lattice. The trained map possesses interesting properties such as topology preservation and vector space interpolation between units. Since the BMU is computed as the nearest prototype according to the Euclidean distance, each data sample will have its BMU, regardless of its resemblance to the map prototypes. However, the reliability of the unit activation can be measured in terms of the quantization error. Thus, the prototype of a BMU producing a large quantization error for a data sample will barely represent that sample. In Figure 4, the activation of SOM units for all the available samples is shown, according to the corresponding mean quantization error.

As shown in Figure 4, dyslexia samples (considered as anomalies) produce a considerably higher quantization error than control samples.

4.2. Bayesian Anomaly Detection for SOM

Naïve Bayes classifiers [57] separate data into different classes by means of Bayes’ Theorem, along with the assumption of independence among predictors. The effect of this premise on the performance of the classifier has been evaluated and discussed in the literature, concluding that it is possible to outperform other classifiers even when the features are not completely independent [58,59,60]. As a matter of fact, different works take advantage of the benefits of Naïve Bayes classifiers for EEG signal classification. For instance, in [59], using a higher number of principal components (PC) to project the EEG data reduces neither the classification accuracy nor the AUC value in comparison to the performance obtained with only one PC, where feature independence is automatically satisfied. On the other hand, Naïve Bayes classifiers provide good performance with relatively small datasets while avoiding the problems derived from dimensionality, and without suffering from overfitting. This is especially important with relatively small datasets, where supervised classifiers easily overfit, decreasing their generalization capabilities [61].

As shown in the previous section, the anomaly detection method used here is based on the quantization error generated by abnormal patterns in comparison to the one produced by control patterns used to train the map. Since the prototypes only represent control samples, a higher quantization error will be obtained when representing an abnormal pattern using the BMU of the map. Then, it is necessary to compute the threshold for the quantization error to distinguish a normal sample from an anomaly. This threshold can be computed by a Naïve Bayes strategy as follows. Let

CN

be the Normal class and

DD

the Anomaly class. According to Bayes’ theorem, the probability that a new sample x belongs to the class c can be expressed as:

P (c | x) = \frac{P (x | c) P (c)}{P (x)}

(13)

where

P (c)

is the prior probability of the c class,

P (x | c)

is the probability of x conditioned to c, and

P (x)

is the marginal probability, computed as

P (x) = \frac{\# \text{ nearest samples to } x \text{ in terms of } qe}{\# \text{ total number of samples}}

(14)

Thus, it is possible to obtain

P (c | x)

and the most discriminating threshold can be computed as the value of QE for which

P (c | x) = 0

. The implementation used for the classifier can be found in the sklearn python package [62].
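The thresholding idea can be illustrated with a minimal sketch, assuming a one-feature Gaussian Naïve Bayes model fitted on quantization errors. The Gaussian likelihood, the QE values, the use of the sum of joint probabilities as evidence (rather than the neighbour-count estimate of Equation (14)) and the 0.5 posterior crossing taken as decision boundary are all assumptions of this example, and the function names are illustrative.

```python
from statistics import NormalDist, mean, stdev

def fit_gaussian_nb(qe_normal, qe_anomaly):
    """Fit per-class Gaussians and priors on a single feature (the QE)."""
    n_total = len(qe_normal) + len(qe_anomaly)
    return {
        "normal": (NormalDist(mean(qe_normal), stdev(qe_normal)),
                   len(qe_normal) / n_total),
        "anomaly": (NormalDist(mean(qe_anomaly), stdev(qe_anomaly)),
                    len(qe_anomaly) / n_total),
    }

def posterior_anomaly(model, qe):
    """P(anomaly | qe) from Bayes' theorem, as in Equation (13)."""
    joint = {c: dist.pdf(qe) * prior for c, (dist, prior) in model.items()}
    evidence = sum(joint.values())  # assumption: evidence as sum of joints
    return joint["anomaly"] / evidence

def decision_threshold(model, lo, hi, tol=1e-6):
    """Bisection for the QE where the anomaly posterior crosses 0.5.

    Assumes the posterior is below 0.5 at `lo` and above 0.5 at `hi`.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if posterior_anomaly(model, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical quantization errors (not taken from the study).
qe_controls = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0]
qe_dyslexic = [1.9, 2.2, 2.0, 2.4, 1.8]

model = fit_gaussian_nb(qe_controls, qe_dyslexic)
threshold = decision_threshold(model, mean(qe_controls), mean(qe_dyslexic))
print("threshold on QE:", round(threshold, 3))
```

With scikit-learn, the `GaussianNB` estimator would be the off-the-shelf counterpart of this hand-written version; which Naïve Bayes variant was actually used is not stated in the text.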

5. Results

This section presents the main results obtained with the methodology explained above. Since data corresponding to 4.8 and 16 Hz stimuli are stored in the EEG database, we carried out experiments to determine the discriminative power of the different stimuli. As explained in Section 3, we address the search for differential patterns from a functional connectivity point of view, using the circular correlation as a phase synchrony measure. In order to show the most significant differences in connectivity, a Mann–Whitney–Wilcoxon (distribution-agnostic) hypothesis test was performed for each band. Figure 5 shows the most significant connections as those providing p < 0.01 in the hypothesis test. Moreover, in order to remove possible spurious connections, we performed an FDR (False Discovery Rate) correction (alpha = 0.05) of the p-values. This also helps to identify significant connections, since the resulting matrices are sparser.
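As an illustration of the FDR step, the following sketch applies the Benjamini-Hochberg procedure to a list of p-values. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the p-values are made up for the example. In practice, each p-value could come from a per-connection test such as `scipy.stats.mannwhitneyu`.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null is rejected under BH-FDR."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Hypothetical p-values, one per electrode pair (illustrative only).
p_vals = [0.041, 0.001, 0.205, 0.008, 0.039, 0.042, 0.060, 0.074]
significant = benjamini_hochberg(p_vals, alpha=0.05)
print(significant)
```

Connections that do not survive the correction are set to zero in the connectivity matrix, which is why the corrected matrices are sparser.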

The methodology described in Section 4 to extract and classify features is applied separately for each EEG band and each stimulus, obtaining the results summarized in Table 2. This classification methodology is assessed by stratified k-fold cross-validation (k = 5) in order to estimate the generalization error.
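To make the validation scheme concrete, the sketch below builds stratified folds by hand, so that each fold keeps approximately the class proportions of the whole sample. The class sizes and the function name are hypothetical; a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
import random
from collections import defaultdict

def stratified_kfold_indices(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions kept per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    # Shuffle each class separately and deal its samples round-robin.
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in test_set]
        yield train_idx, sorted(test_idx)

# Hypothetical sample: 30 controls (0) and 10 experimental subjects (1).
labels = [0] * 30 + [1] * 10
folds = list(stratified_kfold_indices(labels, k=5, seed=0))
for train_idx, test_idx in folds:
    n_pos = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), n_pos)
```

Stratification matters with unbalanced classes, since a plain random split could leave a test fold with very few or no experimental subjects.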

Additionally, the results in Table 2 are graphically depicted in Figure 6, where the error bars correspond to the standard deviation obtained in the cross-validation process. Specifically, Figure 6 left shows the classification results for the 4.8 Hz stimulus, and Figure 6 right for the 16 Hz stimulus.

It is well known that the overall performance of a binary classifier can be assessed by exploring the Receiver Operating Characteristic (ROC) space, as it exposes the sensitivity-specificity trade-off across cut-off points. In other words, it shows the performance of the classifier in a compact form, regarding its capability of correctly identifying positive and negative samples. Relatedly, the AUC is a very useful metric that represents the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. Figure 7a,b show the ROC curves obtained for each band for the 4.8 Hz and 16 Hz stimuli, respectively, along with the corresponding AUC values computed for each curve.
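The probabilistic reading of the AUC can be checked directly by counting, over all positive-negative pairs, how often the positive sample gets the higher score (ties count as one half). The scores below are hypothetical and the function name is illustrative.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores (illustrative only).
pos_scores = [0.9, 0.8, 0.35]
neg_scores = [0.4, 0.3, 0.1]
auc = auc_pairwise(pos_scores, neg_scores)
print(round(auc, 3))
```

Here 8 of the 9 pairs are ordered correctly, so the AUC is 8/9.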

Statistical Significance

As usual in biomedical problems, the limited number of available samples requires performing statistical tests to ensure that the results are not biased by the classification stage (e.g., due to overfitting effects). In addition, it is necessary to check the probability that these results were obtained by chance. While these tests can be relatively relaxed when the database is large enough, they require special attention in real-world databases, where classes are not balanced and the sample size is small. In the case of experimental subjects, it is necessary to take into account the prevalence of the disorder under study; for dyslexia, it is about 7%, as indicated in the introduction.

The test used consists of generating a null distribution by calculating the accuracy of the classifier on 1000 permutations of the labels. This provides the distribution under the null hypothesis, which states the independence between features and labels, and allows calculating the probability of reproducing the classification results with shuffled labels. The empirical p-value is then computed as:

p\text{-value} = \frac{\#\,\text{permutations with accuracy higher than baseline}}{\text{number of permutations}}

(15)

The results of the permutation test are depicted in Figure 8, where the null distribution obtained by permuting the labels as indicated above is shown in blue, and the accuracy obtained for the non-permuted case is indicated as a vertical red line. A 5-fold stratified cross-validation is carried out at each permutation iteration, and the average of the results obtained over these 5 folds is taken for the corresponding iteration. Thus, Figure 8 shows the probability density of the classification accuracy. This is the most computationally demanding task, as the classifier has to be retrained at each of the 1000 permutations. On our equipment (Dual Intel(R) Xeon(R) CPU E5-2640 v4 (10 cores) @ 2.40 GHz, 128 GB of RAM) it took about one day to complete the permutations.
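The permutation procedure can be sketched as follows. To keep the example short, the cross-validated classifier is replaced by a toy scorer (a threshold at the midpoint of the class means, evaluated on the same data); this scorer, the data and the function names are assumptions of the example and not the pipeline used in this work. The p-value follows Equation (15), counting permutations whose accuracy is strictly higher than the baseline.

```python
import random

def midpoint_accuracy(x, y):
    """Toy scorer: accuracy of a threshold at the midpoint of the class means."""
    m0 = sum(v for v, c in zip(x, y) if c == 0) / y.count(0)
    m1 = sum(v for v, c in zip(x, y) if c == 1) / y.count(1)
    thr = (m0 + m1) / 2
    # Predict class 1 on the side of the threshold where its mean lies.
    pred = [int((v > thr) == (m1 > m0)) for v in x]
    return sum(p == c for p, c in zip(pred, y)) / len(y)

def permutation_p_value(x, y, score_fn, n_permutations=1000, seed=0):
    """Empirical p-value as in Equation (15), plus the baseline score."""
    rng = random.Random(seed)
    baseline = score_fn(x, y)
    higher = 0
    for _ in range(n_permutations):
        shuffled = y[:]
        rng.shuffle(shuffled)  # breaks any link between features and labels
        if score_fn(x, shuffled) > baseline:
            higher += 1
    return baseline, higher / n_permutations

# Hypothetical one-dimensional features (e.g., quantization errors).
qe = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0, 1.3, 0.95, 1.9, 2.2, 2.0, 2.4, 1.8, 2.1]
labels = [0] * 8 + [1] * 6
baseline, p_value = permutation_p_value(qe, labels, midpoint_accuracy)
print(baseline, p_value)
```

An empirical p-value of zero only means that none of the sampled permutations exceeded the baseline, so it is usually reported as smaller than one over the number of permutations.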

6. Discussion

This section discusses the methodology, the results obtained and the validation strategy, and compares them with previous works addressing the problem of dyslexia classification using biomedical signals.

The proposed method uses the quantization errors corresponding to SOM prototypes as features, which are subsequently used to train a Naïve Bayes classifier that computes the optimum separation threshold. The SOM was trained using all the connections, and the significance of the classification algorithm was assessed by 5-fold cross-validation and a permutation test with 1000 permutations. This provides evidence that the classification results are not obtained by chance and that actual differences are present in patterns belonging to controls and dyslexic subjects.
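The quantization error used as a feature can be sketched as the distance between a sample and its best matching unit. The prototypes below are hypothetical two-dimensional vectors standing in for a map trained on control samples (for instance with a SOM package such as MiniSom); the Euclidean distance is an assumption of this example.

```python
import math

def quantization_error(sample, prototypes):
    """Distance from a sample to its best matching unit (closest prototype)."""
    return min(math.dist(sample, w) for w in prototypes)

# Hypothetical prototypes learned from control samples only.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

qe_control_like = quantization_error((0.1, 0.1), prototypes)
qe_abnormal = quantization_error((3.0, 4.0), prototypes)
print(round(qe_control_like, 4), round(qe_abnormal, 4))
```

A sample far from every control prototype yields a large quantization error, which is the quantity later compared against the learned threshold.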

This validation strategy estimates the generalization error and bounds the probability that the results were obtained by chance. With this methodology, Figure 6 also provides a view of the discriminative capabilities of each EEG band. In other words, it can be seen as the influence of each stimulus on the neuronal oscillations at a specific frequency band. Thus, the Beta and Gamma bands provide the best results for the 4.8 Hz stimulus, while the best performance is obtained for the Alpha and Beta bands in the case of the 16 Hz stimulus. Detailed results in Table 2 show that an accuracy of 0.82 can be obtained with the 4.8 Hz stimulus, along with an AUC of 0.92. The AUC value can be interpreted as the probability of obtaining a higher score for a positive sample than for a negative one, so that 1 − AUC reflects the probability of misranking such a pair. In the case of the 16 Hz stimulus, the AUC rises up to 0.95, indicating that the most discriminative patterns were found for the 16 Hz stimulus in the Beta band.

Moreover, the connectivity matrices containing the synchronization measures between each pair of electrodes compose a connection graph that helps to identify the neuroanatomical regions involved in each case. Due to the significance level used and the additional FDR correction, sparse matrices are obtained. In this way, the method proposed in this work contributes as a tool to identify synchronized channels that can be related to connections between brain regions. Consequently, the circular correlation used to estimate the phase synchrony among channels can be effectively used to detect out-of-synchrony electrodes, which contribute here to define the outlier (or abnormal sample).
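As an illustration of how a connectivity matrix can be assembled from phase series, the sketch below uses the circular correlation coefficient in the Jammalamadaka-SenGupta form, which is assumed here to be the variant meant in the text. The phase series are synthetic; in practice they would be the instantaneous phases extracted for each channel and band. Channel names are taken from the text only as labels.

```python
import math

def circular_mean(phases):
    """Mean direction of a set of angles (radians)."""
    return math.atan2(sum(math.sin(p) for p in phases),
                      sum(math.cos(p) for p in phases))

def circular_correlation(alpha, beta):
    """Circular correlation coefficient between two phase series."""
    a_bar, b_bar = circular_mean(alpha), circular_mean(beta)
    sa = [math.sin(a - a_bar) for a in alpha]
    sb = [math.sin(b - b_bar) for b in beta]
    num = sum(u * v for u, v in zip(sa, sb))
    den = math.sqrt(sum(u * u for u in sa) * sum(v * v for v in sb))
    return num / den

def connectivity_matrix(phase_by_channel):
    """Pairwise circular correlation for every pair of channels."""
    names = list(phase_by_channel)
    return {(a, b): circular_correlation(phase_by_channel[a], phase_by_channel[b])
            for a in names for b in names}

# Synthetic phases: O2 is O1 with a constant lag, Oz is O1 mirrored.
o1 = [0.3 * math.sin(0.2 * k) for k in range(50)]
phases = {"O1": o1, "O2": [p + 1.0 for p in o1], "Oz": [-p for p in o1]}
conn = connectivity_matrix(phases)
print(round(conn[("O1", "O2")], 3), round(conn[("O1", "Oz")], 3))
```

A constant phase lag leaves the coefficient at 1, since the measure captures covariation of the phases around their mean directions rather than their absolute difference.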

The highest classification performance is obtained for the alpha, beta and gamma bands. The results, as demonstrated later, are highly significant, and therefore we can conclude that there exist connectivity anomalies in the dyslexic brain that the Bayesian classifier can detect. High-frequency patterns (beta, gamma) are relevant for interpreting segment-level representations in the brain [63]; therefore, anomalies in this segmentation would produce cascade effects that would lead to an incorrect representation of phonemes in the brain, as suggested by Virtala et al. [64], and to inaccurate subsequent predictive processes.

Looking at the results in more detail, Figure 5 depicts the significant inter-channel connections for each stimulus and band. First of all, we find a recurrent interaction between occipital electrodes, such as PO9, PO10, O1, O2 and Oz, throughout many bands. These electrodes are very specific to visual processing [65], and the differences found there could be attributed to differences in visual activity during the auditory stimuli. Many of the other electrodes have been proven to be relevant for current language models [63]. This is the case of the connection between F7/FC5 and P3/CP5, which corresponds to the so-called “dorsal stream” in the Hickok and Poeppel model, which maps acoustic speech signals to frontal lobe articulatory networks. This pattern is present in Theta and Alpha for both stimuli, and in Beta/Gamma for the 16 Hz stimulus. The fact that the 16 Hz stimulus shows additional differences in Beta/Gamma may be evidence of phonemic-rate stimuli triggering abnormal functioning of higher cognitive tasks.

The P3 and CP5 reception fields are both around the temporo-parietal junction, and differences in patterns might be due to differences in the placement of the electrodes. However, it is fair to assume that they represent the sensorimotor interface in the Hickok and Poeppel model. Similarly, F7 and FC5 might be capturing the activity of Broca’s area, which is considered an articulatory region in the Hickok and Poeppel model. Occipital-frontal connections such as PO9 to Fz and F4 might be due to the hypothesized conceptual networks, distributed throughout the cortex, that connect lexical and articulatory regions.

On the other hand, the right hemisphere displays fewer, but still relevant, anomalies detected by the analysis. There are significant anomalies at CP6-C4 in the alpha band for both stimuli, which may correspond to a connection between the right auditory cortex and the right sensorimotor cortex. This deficit has not been found anywhere else. However, it may be coherent with recent research that found atypical rhythmic entrainment in auditory and sensorimotor coupling at low rates [66]. Finally, two interesting differences are found in bilateral connections at the gamma band. First, the P4-F7 anomaly, which corresponds to the main result of Molinaro [22] for dyslexia. In that paper, the between the right auditory cortex (P4) Then, the P7-T8 deficit, which involves an anomaly in the connection between the right phonological network (T8) and the right lexical interface in the Hickok and Poeppel model [63].

Regarding the discriminative capability of the proposed method, it has been assessed using Receiver Operating Characteristic (ROC) curves, which provide a clearer view of the trade-off between sensitivity and specificity. Hence, ROC curves for all bands are shown in Figure 7 for (a) the 4.8 Hz stimulus and (b) the 16 Hz stimulus. In both cases a good sensitivity-specificity balance is achieved.
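The sensitivity-specificity trade-off behind a ROC curve can be reproduced by sweeping a decision threshold over the classifier scores. The sketch below uses hypothetical scores and labels; sensitivity is the true positive rate and specificity is one minus the false positive rate.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by thresholding at each distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_auc(points):
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical scores; label 1 marks the experimental (positive) class.
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
curve = roc_points(scores, labels)
area = trapezoid_auc(curve)
for fpr, tpr in curve:
    print(f"sensitivity={tpr:.2f} specificity={1 - fpr:.2f}")
```

Each point on the curve is one possible operating point, so the choice of threshold fixes the balance between missed cases and false alarms.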

The number of available EEG studies using auditory stimuli for dyslexia diagnosis is limited, and they are usually focused on exploratory analysis [22,24], in the quest for the biological basis of dyslexia, which is an open research question. Dyslexia classification using biomedical signals is also a current research trend, as opposed to the traditional method based only on behavioural tests. Works using structural imaging [67], MEG [68] and EEG [27,28,29] signals are shown in Table 3. As in [68], the classification problem with biomedical signals is usually addressed using different features extracted from MEG signals, obtaining accuracy values up to 0.93. However, it requires a 255-channel MEG acquisition system. At the same time, MEG data is typically harder to obtain than EEG and more computationally demanding to work with. On the other hand, there are works that use EEG signals for dyslexia classification, obtaining accuracy values up to 0.80 [27]. However, these works usually rely on reading and writing tasks, which depend on the subjects’ acquired skills and limit the diagnosis age. The methodology used in this work, based on the analysis of the responses to simple auditory stimuli, overcomes these limitations, makes the test simpler and avoids the possible bias introduced by interactive stimuli or specific task design. In addition, the proposed method contributes to the knowledge of the neural basis of dyslexia by showing differential functional connectivity patterns when comparing controls and dyslexic subjects. This has been shown as connectivity matrices to clearly expose the most significant connections. Additionally, our classification results are similar to those obtained with MEG signals, but can be obtained with EEG equipment (which can even be portable), facilitating the development of the tasks and their application to young children. This is especially important for our future work, which will continue this research with pre-readers.

7. Conclusions

In this work, we present a method to detect out-of-synchrony EEG channels using cross frequency coupling measures. This measure is derived from the instantaneous phase of the signal acquired from each electrode. In this way, cross frequency coupling techniques provide a way to compute features related to the synchronization between electrodes, taking into account the frequency band in which the synchronization takes place. Our proposed method uses the circular correlation of the unwrapped phase to evaluate phase locking between channels, which is further used to construct a connectivity matrix indicating those electrodes that are synchronized at each frequency band. Differential connectivity patterns demonstrate that low-level auditory stimuli generate different interactions between brain areas in controls and dyslexic children, revealing differences in auditory processing from a connectivity point of view. On the other hand, the low SNR of EEG signals and the inherent differences among subjects introduce a high variability among EEG patterns, especially in experimental subjects. This has been addressed by an anomaly detection approach, modelling only patterns from the control subjects to further detect patterns of experimental subjects as outliers. The proposed anomaly detection system is based on vector quantization using a Self-Organizing Map, by means of the quantization error associated with each BMU. This allows measuring differences in the quantization error provided by the prototypes representing control and experimental samples. Additionally, the threshold in the quantization error that optimally separates the control and experimental patterns is computed using a Bayesian method to constitute a Naïve Bayes classifier. The results obtained show sensitivity values up to 95% with specificity values up to 83% (AUC up to 0.95), demonstrating the discriminative capabilities of the proposed method to detect out-of-synchrony channels. Thus, it clearly differentiates connectivity patterns of dyslexic subjects from those of controls, which paves the way to construct an effective and objective computer-aided diagnosis tool.

It is worth noting that this study is based on non-speech, non-interactive stimuli, whose properties are derived from the Spanish language (related to the different language segmentation tasks developed in the brain during language processing). Thus, the study is limited to Spanish speakers and cannot be generalized to other languages. Moreover, the subjects in this study (7-year-old children) were all Spanish speakers from schools of Junta de Andalucía (Andalucía region in the south of Spain). Therefore, the extent of the results is limited by the diversity of the sample.

As future work, we plan to use high-density EEG to search for additional low-level interactions among brain areas that could be interesting from an exploratory point of view.

Author Contributions

Conceptualization, A.O. and J.L.L.; methodology A.O., M.A.F. and F.J.M.-M.; software, M.A.F. and N.G.; validation, A.O., M.A.F. and N.G.; writing-review and editing, M.A.F., A.O. and F.J.M.-M.; supervision, A.O. and F.J.M.-M.; project administration, A.O.; funding acquisition, A.O. and J.L.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by projects PGC2018-098813-B-C32 (Spanish “Ministerio de Ciencia, Innovación y Universidades”) and UMA20-FEDERJA-086 (Consejería de economía y conocimiento, Junta de Andalucía) and by European Regional Development Funds (ERDF). Marco A. Formoso Grant PRE2019-087350 funded by MCIN/AEI/10.13039/501100011033 by “ESF Investing in your future”.

Institutional Review Board Statement

The study was carried out with the understanding and written consent of each child’s legal guardian and in the presence thereof, and was approved by the Medical Ethical Committee of the Malaga University (ref. 16-2020-H) and according to the dispositions of the World Medical Association Declaration of Helsinki.

Informed Consent Statement

Data used in this work was provided by the Leeduca research group at the University of Málaga, and obtained as a result of experiments carried out with the understanding and written consent of each child’s legal guardian and in the presence thereof, and was approved by the Medical Ethical Committee of the Málaga University (ref. CEUMA 16-2020-H) and according to the dispositions of the World Medical Association Declaration of Helsinki. It was also supported by the Education Office of the regional government of Andalusia (Spain), which granted our researchers permission to carry out the study in different public schools.

Acknowledgments

This work was supported by projects PGC2018-098813-B-C32 (Spanish “Ministerio de Ciencia, Innovación y Universidades”), by European Regional Development Funds (ERDF), and the BioSiP (TIC-251) research group. Work by F.J.M.-M. was supported by the MICINN “Juan de la Cierva-Incorporación” Fellowship. We also thank the Leeduca research group and Junta de Andalucía for the data supplied and the support. Funding for open access charge: Spanish “Ministerio de Ciencia, Innovación y Universidades”, and European Regional Development Funds (ERDF).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

EEG	Electroencephalography
DD	Developmental Dyslexia
MEG	Magnetoencephalography
SNR	Signal to Noise Ratio
ICA	Independent Component Analysis
SC	Spectral Coherence
CFC	Cross Frequency Coupling
PAC	Phase Amplitude Coupling
LI	Language Impairment
SSD	Speech Sound Disorder
ADHD	Attention Deficit Hyperactivity Disorder
IIR	Infinite Impulse Response
FIR	Finite Impulse Response
HT	Hilbert Transform
PLV	Phase Locking Value
SOM	Self-Organizing Map
BMU	Best Matching Unit
ROC	Receiver Operating Characteristic
AUC	Area Under ROC Curve

References

1. Bell, M.A.; Cuevas, K. Using EEG to Study Cognitive Development: Issues and Practices. J. Cogn. Dev. 2012, 13, 281–294.
2. Mammone, N.; Bonanno, L.; Salvo, S.D.; Marino, S.; Bramanti, P.; Bramanti, A.; Morabito, F.C. Permutation Disalignment Index as an Indirect, EEG-Based, Measure of Brain Connectivity in MCI and AD Patients. Int. J. Neural Syst. 2017, 27, 1750020.
3. Mirzaei, G.; Adeli, A.; Adeli, H. Imaging and machine learning techniques for diagnosis of Alzheimer’s disease. Rev. Neurosci. 2016, 27, 857–870.
4. Gálvez, G.; Recuero, M.; Canuet, L.; Del-Pozo, F. Short-Term Effects of Binaural Beats on EEG Power, Functional Connectivity, Cognition, Gait and Anxiety in Parkinson’s Disease. Int. J. Neural Syst. 2018, 28, 1750055.
5. Sushkova, O.S.; Morozov, A.A.; Gabova, A.V.; Karabanov, A.V.; Illarioshkin, S.N. A Statistical Method for Exploratory Data Analysis Based on 2D and 3D Area under Curve Diagrams: Parkinson’s Disease Investigation. Sensors 2021, 21, 4700.
6. Adeli, H.; Zhou, Z.; Dadmehr, N. Analysis of EEG records in an epileptic patient using wavelet transform. J. Neurosci. Methods 2003, 123, 69–87.
7. Smith, S.J.M. EEG in the diagnosis, classification, and management of patients with epilepsy. J. Neurol. Neurosurg. Psychiatry 2005, 76, ii2–ii7.
8. Aileni, R.M.; Pasca, S.; Florescu, A. EEG-Brain Activity Monitoring and Predictive Analysis of Signals Using Artificial Neural Networks. Sensors 2020, 20, 3346.
9. Li, Y.; Tong, S.; Liu, D.; Gai, Y.; Wang, X.; Wang, J.; Qiu, Y.; Zhu, Y. Abnormal EEG complexity in patients with schizophrenia and depression. Clin. Neurophysiol. 2008, 119, 1232–1241.
10. Peterson, R.; Pennington, B. Developmental Dyslexia. Lancet 2012, 379, 1997–2007.
11. Thompson, P.A.; Hulme, C.; Nash, H.M.; Gooch, D.; Hayiou-Thomas, E.; Snowling, M.J. Developmental dyslexia: Predicting individual risk. J. Child Psychol. Psychiatry 2015, 56, 976–987.
12. Braun, U.; Muldoon, S.; Bassett, D. On Human Brain Networks in Health and Disease. eLS 2015, 1–9.
13. Munilla, J.; Ortiz, A.; Górriz, J.M.; Ramírez, J. Construction and Analysis of Weighted Brain Networks from SICE for the Study of Alzheimer’s Disease. Front. Neuroinform. 2017, 11, 19.
14. Ortiz, A.; Munilla, J.; Górriz, J.M.; Ramírez, J. Ensembles of Deep Learning Architectures for the Early Diagnosis of the Alzheimer’s Disease. Int. J. Neural Syst. 2016, 26, 1650025.
15. Popering, L.V.; Tahmassebi, A.; Meyer-Baese, U.; Dyrba, M.; Munilla, J.; Ortiz, A.; Meyer-Baese, A. Identifying the diffusion source of dementia spreading in structural brain networks. In Medical Imaging 2021: Biomedical Applications in Molecular, Structural, and Functional Imaging; Gimi, B.S., Krol, A., Eds.; International Society for Optics and Photonics, SPIE: Cardiff, UK, 2021; Volume 11600, pp. 58–63.
16. Chaturvedi, M.; Bogaarts, J.G.; Kozak (Cozac), V.V.; Hatz, F.; Gschwandtner, U.; Meyer, A.; Fuhr, P.; Roth, V. Phase lag index and spectral power as QEEG features for identification of patients with mild cognitive impairment in Parkinson’s disease. Clin. Neurophysiol. 2019, 130, 1937–1944.
17. Hata, M.; Kazui, H.; Tanaka, T.; Ishii, R.; Canuet, L.; Pascual-Marqui, R.D.; Aoki, Y.; Ikeda, S.; Kanemoto, H.; Yoshiyama, K.; et al. Functional connectivity assessed by resting state EEG correlates with cognitive decline of Alzheimer’s disease—An eLORETA study. Clin. Neurophysiol. 2016, 127, 1269–1278.
18. Huang, H.; Zhang, J.; Zhu, L.; Tang, J.; Lin, G.; Kong, W.; Lei, X.; Zhu, L. EEG-Based Sleep Staging Analysis with Functional Connectivity. Sensors 2021, 21, 1988.
19. Daianu, M.; Jahanshad, N.; Nir, T.; Toga, A.; Jack, C.; Weiner, M.; Thompson, P. Breakdown of Brain Connectivity Between Normal Aging and Alzheimer’s Disease: A Structural k-Core Network Analysis. Brain Connect. 2013, 3, 407–422.
20. Romeo, R.R.; Segaran, J.; Leonard, J.A.; Robinson, S.T.; West, M.R.; Mackey, A.P.; Yendiki, A.; Rowe, M.L.; Gabrieli, J.D. Language Exposure Relates to Structural Neural Connectivity in Childhood. J. Neurosci. 2018, 38, 7870–7877.
21. Cohen, M.X. Analyzing Neural Time Series Data: Theory and Practice; MIT Press: Cambridge, MA, USA, 2014.
22. Molinaro, N.; Lizarazu, M.; Lallier, M.; Bourguignon, M.; Carreiras, M. Out-of-synchrony speech entrainment in developmental dyslexia. Hum. Brain Mapp. 2016, 37, 2767–2783.
23. Flanagan, S.; Goswami, U. The role of phase synchronisation between low frequency amplitude modulations in child phonology and morphology speech tasks. J. Acoust. Soc. Am. 2018, 143, 1366–1375.
24. Di Liberto, G.; Peter, V.; Kalashnikova, M.; Goswami, U.; Burnham, D.; Lalor, E. Atypical cortical entrainment to speech in the right hemisphere underpins phonemic deficits in dyslexia. NeuroImage 2018, 175, 70–79.
25. Power, A.J.; Mead, N.; Barnes, L.; Goswami, U. Neural entrainment to rhythmic speech in children with developmental dyslexia. Front. Hum. Neurosci. 2013, 7, 777.
26. Gibbon, S.; Attaheri, A.; Choisdealbha, Á.N.; Rocha, S.; Brusini, P.; Mead, N.; Boutris, P.; Olawole-Scott, H.; Ahmed, H.; Flanagan, S.; et al. Machine learning accurately classifies neural responses to rhythmic speech vs. non-speech from 8-week-old infant EEG. Brain Lang. 2021, 220, 104968.
27. Perera, H.; Shiratuddin, M.F.; Wong, K.W.; Fullarton, K. EEG signal analysis of writing and typing between adults with dyslexia and normal controls. Int. J. Interact. Multimed. Artif. Intell. 2018, 5, 62.
28. Ortiz, A.; López, P.; Luque, J.L.; Martínez-Murcia, F.J.; Aquino-Britez, D.; Ortega, J. An anomaly detection approach for dyslexia diagnosis using EEG signals. In Proceedings of the International Work—Conference on the Interplay between Natural and Artificial Computation, Almería, Spain, 3–7 June 2019; Springer: London, UK, 2019; pp. 369–378.
29. Martínez-Murcia, F.J.; Ortiz, A.; Morales-Ortega, R.; López, P.; Luque, J.L.; Castillo-Barnes, D.; Segovia, F.; Illan, I.A.; Ortega, J.; Ramirez, J.; et al. Periodogram connectivity of EEG signals for the detection of dyslexia. In Proceedings of the International Work—Conference on the Interplay between Natural and Artificial Computation, Almería, Spain, 3–7 June 2019; Springer: London, UK, 2019; pp. 350–359.
30. Martinez-Murcia, F.J.; Ortiz, A.; Gorriz, J.M.; Ramirez, J.; Lopez-Abarejo, P.J.; Lopez-Zamora, M.; Luque, J.L. EEG Connectivity Analysis Using Denoising Autoencoders for the Detection of Dyslexia. Int. J. Neural Syst. 2020, 30, 2050037.
31. Riaz, F.; Hassan, A.; Rehman, S.; Niazi, I.K.; Dremstrup, K. EMD-Based Temporal and Spectral Features for the Classification of EEG Signals Using Supervised Learning. IEEE Trans. Neural Syst. Rehabil. Eng. 2016, 24, 28–35.
32. Boashash, B. Chapter 16—Time-Frequency Methodologies in Neurosciences. In Time-Frequency Signal Analysis and Processing, 2nd ed.; Academic Press: Oxford, UK, 2016; pp. 915–966.
33. Unde, S.A.; Shriram, R. Coherence Analysis of EEG Signal Using Power Spectral Density. In Proceedings of the 2014 Fourth International Conference on Communication Systems and Network Technologies, Bhopal, India, 7–9 April 2014; pp. 871–874.
34. Munia, T.T.K.; Aviyente, S. Time-Frequency Based Phase-Amplitude Coupling Measure For Neuronal Oscillations. Sci. Rep. 2019, 9, 1–15.
35. Giménez, A.; Luque, J.L.; López-Zamora, M.; Fernández-Navas, M. A self-report questionnaire on reading-writing difficulties for adults. [Autoinforme de Trastornos Lectores para AdultoS (ATLAS)]. An. Psicol./Ann. Psychol. 2015, 31, 109–119.
36. De Vos, A.; Vanvooren, S.; Vanderauwera, J.; Ghesquière, P.; Wouters, J. A longitudinal study investigating neural processing of speech envelope modulation rates in children with (a family risk for) dyslexia. Cortex 2017, 93, 206–219.
37. Li, R.; Principe, J.C. Blinking Artifact Removal in Cognitive EEG Data Using ICA. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, USA, 30 August–3 September 2006; pp. 5273–5276.
38. Robertson, D.; Dowling, J. Design and responses of Butterworth and critically damped digital filters. J. Electromyogr. Kinesiol. Off. J. Int. Soc. Electrophysiol. Kinesiol. 2004, 13, 569–573.
39. Jiménez-Bravo, M.; Marrero, V.; Benítez-Burraco, A. An oscillopathic approach to developmental dyslexia: From genes to speech processing. Behav. Brain Res. 2017, 329, 84–95.
40. Mormann, F.; Lehnertz, K.; David, P.; Elger, C.E. Mean phase coherence as a measure for phase synchronization and its application to the EEG of epilepsy patients. Phys. D Nonlinear Phenom. 2000, 144, 358–369.
41. Fraiwan, L.; Lweesy, K.; Khasawneh, N.; Wenz, H.; Dickhaus, H. Automated sleep stage identification system based on time–frequency analysis of a single EEG channel and random forest classifier. Comput. Methods Programs Biomed. 2012, 108, 10–19.
42. Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. London Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995.
43. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17, 261–272.
44. Lachaux, J.P.; Rodriguez, E.; Martinerie, J.; Varela, F.J. Measuring phase synchrony in brain signals. Hum. Brain Mapp. 1999, 8, 194–208.
45. Burgess, A. On the Interpretation of Synchronization in EEG Hyperscanning Studies: A Cautionary Note. Front. Hum. Neurosci. 2013, 7, 881.
46. Rothmaler, K.; Ivanova, G. Circular Correlation Coefficients versus the Phase-Locking-Value. Biomed. Tech. Biomed. Eng. 2013, 58.
47. Jammalamadaka, S.R.; SenGupta, A. Topics in Circular Statistics, 1st ed.; World Scientific: Singapore, 2016.
48. Golub, T.R.; Slonim, D.K.; Tamayo, P.; Huard, C.; Gaasenbeek, M.; Mesirov, J.P.; Coller, H.; Loh, M.L.; Downing, J.R.; Caligiuri, M.A.; et al. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 1999, 286, 531–537.
49. Lawrence, S.; Giles, C.L.; Tsoi, A.C.; Back, A.D. Face recognition: A convolutional neural-network approach. IEEE Trans. Neural Netw. 1997, 8, 98–113.
50. Betti, A.; Tucci, M.; Crisostomi, E.; Piazzi, A.; Barmada, S.; Thomopulos, D. Fault Prediction and Early-Detection in Large PV Power Plants Based on Self-Organizing Maps. Sensors 2021, 21, 1687.
51. Cai, W.; Zhao, D.; Zhang, M.; Xu, Y.; Li, Z. Improved Self-Organizing Map-Based Unsupervised Learning Algorithm for Sitting Posture Recognition System. Sensors 2021, 21, 6246.
52. Vesanto, J.; Alhoniemi, E. Clustering of the self-organizing map. IEEE Trans. Neural Netw. 2000, 11, 586–600.
53. De la Hoz, E.; De La Hoz, E.; Ortiz, A.; Ortega, J.; Prieto, B. PCA filtering and probabilistic SOM for network intrusion detection. Neurocomputing 2015, 164, 71–81.
54. Vettigli, G. MiniSom: Minimalistic and NumPy-Based Implementation of the Self Organizing Map. Available online: https://github.com/JustGlowing/minisom/ (accessed on 10 October 2021).
55. Kohonen, T. Self-Organizing Maps; Springer: London, UK, 2001.
56. Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
57. John, G.; Langley, P. Estimating Continuous Distributions in Bayesian Classifiers. arXiv 2013, arXiv:1302.4964.
58. Bhattacharyya, S.; Khasnobish, A.; Konar, A.; Tibarewala, D.N.; Nagar, A.K. Performance analysis of left/right hand movement classification from EEG signal by intelligent algorithms. In Proceedings of the 2011 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain (CCMB), Paris, France, 11–15 April 2011; pp. 1–8.
59. Čukić, M.; Stokić, M.; Simic, S.; Pokrajac, D. The successful discrimination of depression from EEG could be attributed to proper feature extraction and not to a particular classification method. Cogn. Neurodyn. 2020, 14, 443–455.
60. Siuly, S.; Wang, H.; Zhang, Y. Detection of motor imagery EEG signals employing Naïve Bayes based learning process. Measurement 2016, 86, 148–158.
61. Duin, R.P. Classifiers in almost empty spaces. In Proceedings of the 15th International Conference on Pattern Recognition, ICPR-2000, Barcelona, Spain, 3–7 September 2000; Volume 2, pp. 1–7.
62. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
63. Hickok, G.; Poeppel, D. The Cortical Organization of Speech Processing. Nat. Rev. Neurosci. 2007, 8, 393–402.
64. Virtala, P.; Talola, S.; Partanen, E.; Kujala, T. Poor neural and perceptual phoneme discrimination during acoustic variation in dyslexia. Sci. Rep. 2020, 10, 1–11.
65. Giehl, J.; Noury, N.; Siegel, M. Dissociating harmonic and non-harmonic phase-amplitude coupling in the human brain. NeuroImage 2021, 227, 117648.
66. Colling, L.J.; Noble, H.L.; Goswami, U. Neural entrainment and sensorimotor synchronization to the beat in children with developmental dyslexia: An EEG study. Front. Neurosci. 2017, 11, 360.
67. Tamboer, P.; Vorst, H.; Ghebreab, S.; Scholte, H. Machine learning and dyslexia: Classification of individual structural neuro-imaging scans of students with and without dyslexia. Neuroimage Clin. 2016, 11, 508–514.
68. Dimitriadis, S.I.; Simos, P.G.; Fletcher, J.M.; Papanicolaou, A.C. Aberrant resting-state functional brain networks in dyslexia: Symbolic mutual information analysis of neuromagnetic signals. Int. J. Psychophysiol. 2018, 126, 20–29.

Figure 1. Electrode montage in the extended 10–20 system used in the experiments. All 32 channels plus GND. Cz is used as reference.

Figure 2. Five main electrophysiological (EEG) frequency bands [39].

Figure 3. Quantization error distribution for control and dyslexic subjects. (a) 4.8 Hz stimulus; (b) 16 Hz stimulus.

Figure 4. SOM map activation for each data sample, according to the corresponding mean quantization error. Example for EEG Delta band using 4.8 Hz stimulus. (a) Control (b) Dyslexia.

Figure 5. Statistically significant connections for meaningful bands. The axes represent the channel names according to the extended 10–20 EEG montage. Only connections with FDR-corrected p < 0.01 are shown. (a) 4.8 Hz stimulus; (b) 16 Hz stimulus.

Figure 6. Classification results for 4.8 Hz (left) and 16 Hz (right).

Figure 7. ROC Curves corresponding to the classification results for (a) 4.8 Hz and (b) 16 Hz stimulus.

Figure 8. Permutation test results used to assess the statistical significance of the classification results. The blue distribution is the null distribution and the red line marks the accuracy obtained with the true labels. (a) 4.8 Hz stimulus; (b) 16 Hz stimulus.
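
The permutation test summarized in Figure 8 can be illustrated with a minimal sketch. It is a simplified stand-in, not the authors' procedure: it shuffles the labels against a fixed set of predictions instead of retraining the classifier on each permutation, and the function name `permutation_p_value`, the synthetic labels (sized like the groups in Table 1) and the number of permutations are all assumptions made for illustration.

```python
import random


def permutation_p_value(y_true, y_pred, n_permutations=1000, seed=0):
    """Return (observed accuracy, null accuracies, p-value).

    Simplification (assumption): predictions are kept fixed and only the
    labels are shuffled; a full test would refit the model each time.
    """
    rng = random.Random(seed)

    def accuracy(labels):
        return sum(int(a == b) for a, b in zip(labels, y_pred)) / len(labels)

    observed = accuracy(y_true)
    shuffled = list(y_true)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)            # break the label/prediction link
        null.append(accuracy(shuffled))  # accuracy under the null hypothesis

    # Add-one correction so the estimated p-value is never exactly zero.
    exceed = sum(1 for a in null if a >= observed)
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, null, p_value


# Synthetic example: 32 controls (0) and 16 dyslexic subjects (1),
# with five hypothetical misclassifications.
y_true = [0] * 32 + [1] * 16
y_pred = list(y_true)
for i in (0, 1, 2, 40, 41):
    y_pred[i] = 1 - y_pred[i]

observed, null, p_value = permutation_p_value(y_true, y_pred)
print(f"observed accuracy = {observed:.3f}, p = {p_value:.4f}")
```

In the figure's terms, the histogram of `null` plays the role of the blue distribution and `observed` the role of the red line.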

Table 1. Database. Age range: 88–100 months (t(1) = −1.4, p > 0.05).

Group	Male/Female	Mean Age (Months)	Observations
Control	17/15	94.1 ± 3.3	No reported reading or spelling difficulties
Dyslexia	7/9	95.6 ± 2.9	Formal diagnosis by a clinician expert

Table 2. Summary of the best classification results obtained for the different EEG bands. Bands providing the best accuracy values are highlighted in bold.

Stimulus	Band	Accuracy	Sensitivity	Specificity	AUC
4.8 Hz	Delta	0.74 ± 0.03	0.81 ± 0.03	0.65 ± 0.06	0.81 ± 0.05
	Theta	0.85 ± 0.04	0.91 ± 0.03	0.80 ± 0.08	0.88 ± 0.05
	Alpha	0.76 ± 0.04	0.82 ± 0.02	0.70 ± 0.09	0.85 ± 0.08
	Beta	0.86 ± 0.03	0.92 ± 0.02	0.78 ± 0.06	0.92 ± 0.08
	Gamma	0.83 ± 0.02	0.87 ± 0.02	0.76 ± 0.04	0.83 ± 0.05
16 Hz	Delta	0.73 ± 0.04	0.78 ± 0.03	0.67 ± 0.08	0.86 ± 0.07
	Theta	0.76 ± 0.04	0.79 ± 0.02	0.71 ± 0.08	0.91 ± 0.05
	Alpha	0.90 ± 0.02	0.93 ± 0.01	0.82 ± 0.04	0.93 ± 0.09
	Beta	0.90 ± 0.02	0.93 ± 0.02	0.86 ± 0.03	0.95 ± 0.09
	Gamma	0.79 ± 0.01	0.82 ± 0.02	0.74 ± 0.03	0.93 ± 0.10
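
For reference, the four metrics reported in Table 2 can be computed as in the sketch below. It uses their conventional definitions and toy data; taking the dyslexia group as the positive class (label 1), the 0.5 decision threshold and the helper names `binary_metrics` and `auc` are assumptions for illustration, not details taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for labels in {0, 1}.

    Assumption: 1 is the positive class (here, dyslexia).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a random positive scores higher than a
    random negative, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: anomaly scores for 3 positive and 5 negative subjects.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.6, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # hypothetical threshold

metrics = binary_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Accuracy, sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance over all thresholds, which is why a band can rank differently on AUC than on accuracy.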

Table 3. Classification results for various studies related to dyslexia classification. Standard deviation is indicated in each case. (*) data not reported in the source.

Method	Channels	Acq. Time	Accuracy	Sensitivity	Specificity	AUC
MRI + SVC [67]	T1-MRI	*	0.8 ± *	0.82 ± *	0.78 ± *	*
MEG + SVC + GC [68]	253	3 min	0.63 ± 4.13	0.64 ± 4.01	0.65 ± 4.15	*
MEG + SVC + GE [68]	253	3 min	0.94 ± 1.78	0.93 ± 1.39	0.93 ± 2.32	*
MEG + SVC + CI [68]	253	3 min	0.80 ± 1.14	0.80 ± 1.41	0.79 ± 2.17	*
MEG + SVC + wIFCG [68]	253	3 min	0.97 ± 1.89	0.96 ± 1.89	0.95 ± 1.98	*
EEG + SVC [27] (Writing Task)	32	1 min	0.59 ± *	0.64 ± *	0.53 ± *	*
EEG + SVC [27] (Typing Task)	32	1 min	0.78 ± *	0.88 ± *	0.66 ± *	*
EEG + OCSVC [28]	32	5 min	0.71 ± *	0.53 ± *	0.78 ± *	0.79 ± *
EEG + DAE [30]	32	5 min	0.56 ± *	0.76 ± *	0.66 ± *	0.74 ± *
Proposed	32	5 min	0.90 ± 0.02	0.93 ± 0.02	0.86 ± 0.03	0.95 ± 0.09

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

