Entropy in Data Analysis

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Signal and Data Analysis".

Deadline for manuscript submissions: closed (28 February 2021) | Viewed by 29074

Special Issue Editors

Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
Interests: biomedical signal processing; nonlinear analysis; brain–computer interface; machine learning
Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
Interests: Alzheimer's disease; biomedical signal processing; cardiovascular dynamics; fractal physiology; healthy aging; sleep
Computational NeuroEngineering Lab, University of Florida, Gainesville, FL 32611, USA
Interests: information theoretic learning; kernel methods; adaptive signal processing; brain machine interfaces

Special Issue Information

Dear Colleagues,

Entropy is a powerful nonlinear metric widely used to assess the dynamical characteristics of data. A number of methods, such as sample entropy, fuzzy entropy, permutation entropy, distribution entropy, and dispersion entropy, have been introduced to quantify the irregularity or uncertainty of signals and images. Their multiscale extensions have been developed to quantify the complexity of data across the multiple time scales inherent in such signals and images. For a better understanding of the underlying signal-generating system, multivariate multiscale entropy methods have also been proposed to account for the temporal and spatial domains simultaneously.
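
As a rough illustration of these ideas, the following minimal Python sketch computes sample entropy and a mean-based multiscale extension. The function names, the defaults m = 2 and r = 0.2, and the choice to fix the tolerance from the standard deviation of the original series are assumptions made for this example; they are not taken from this Special Issue.

```python
import math


def sample_entropy(x, m, tol):
    """Sample entropy of x for embedding dimension m and absolute tolerance tol."""
    n = len(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev (maximum) distance between two templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: undefined for this length and tolerance
    return -math.log(a / b)


def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    return [sum(x[k * scale:(k + 1) * scale]) / scale for k in range(len(x) // scale)]


def multiscale_entropy(x, max_scale=3, m=2, r=0.2):
    """Sample entropy of the coarse-grained series at scales 1..max_scale."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    tol = r * sd  # assumption: tolerance fixed from the original series
    return [sample_entropy(coarse_grain(x, s), m, tol) for s in range(1, max_scale + 1)]
```

The pairwise comparison is quadratic in the series length, so this sketch is only practical for short signals.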

These entropy approaches have been used in a wide range of real-world applications, from neuroscience and biomedical engineering to mechanical and financial studies. In particular, they have been successfully applied to physiological signals, such as electrocardiograms (ECG), electroencephalograms (EEG), electromyograms (EMG), electrooculograms (EOG), gait fluctuations, and respiratory signals, to aid the diagnosis of diseases such as Alzheimer’s disease, Parkinson’s disease, ALS, and ataxia.

The main goal of this Special Issue is to disseminate new and original research based on entropy analyses that supports a better understanding of physiology and data-generating mechanisms, the early diagnosis of disorders or diseases, treatment monitoring, and the planning of healthcare strategies needed to prevent certain pathologies. Another goal is to address practical challenges in using these entropy-based approaches, such as the effects of various types of noise, quantization, data length, and parameter tuning. This Special Issue also seeks contributions on signal analysis based on correntropy, mutual information, divergences, and related measures that can capture higher-order statistics and the information content of signals.

Potential topics include but are not limited to the following:

  • Spectral, sample, fuzzy, permutation, distribution, dispersion, and fluctuation dispersion entropies;
  • Kullback–Leibler divergence (relative entropy), correntropy, and causality analysis;
  • Analysis of physiological signals at multiple temporal, frequency, and spatial scales (ECG, EEG, EMG, MEG, etc.);
  • Underlying mechanisms behind entropy-based results for physiological data, to improve our understanding of disease diagnosis/pathogenesis/progression;
  • Psychophysiological signals (physical/mental/emotional analysis), especially in newborns and the elderly;
  • Univariate and multivariate multiscale entropy and complexity loss theory for diagnosing diseases and monitoring treatments;
  • Complexity loss theory in different diseases and disorders, especially dementia, epilepsy, and sleep disorders;
  • Practical considerations: data length, embedding dimension, time delay, noise power, and signal modality characterization for health;
  • Two- and three-dimensional entropy methods for image analysis;
  • Mechanical and financial applications of entropy methods.
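
Two of the listed measures can be sketched in a few lines of Python. The example below is only an illustration: the function names and the defaults (embedding dimension `order=3`, time delay `delay=1`) are hypothetical, and the divergence assumes that `q` is nonzero wherever `p` is nonzero.

```python
import math
from collections import Counter


def permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] from ordinal patterns."""
    patterns = Counter()
    span = (order - 1) * delay
    for i in range(len(x) - span):
        window = [x[i + k * delay] for k in range(order)]
        # Ordinal pattern: the indices that would sort the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    total = sum(patterns.values())
    h = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return h / math.log(math.factorial(order))


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The embedding dimension and time delay here are the same tuning parameters named under the practical considerations in the list, and short series leave most ordinal patterns unobserved when `order` is large.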

Prof. Jose C. Principe
Dr. Hamed Azami
Dr. Peng Li
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (11 papers)

Research

17 pages, 1677 KiB  
Article
Bearing Fault Diagnosis Using Refined Composite Generalized Multiscale Dispersion Entropy-Based Skewness and Variance and Multiclass FCM-ANFIS
by Mostafa Rostaghi, Mohammad Mahdi Khatibi, Mohammad Reza Ashory and Hamed Azami
Entropy 2021, 23(11), 1510; https://0-doi-org.brum.beds.ac.uk/10.3390/e23111510 - 14 Nov 2021
Cited by 15 | Viewed by 2102
Abstract
Bearing vibration signals typically have nonlinear components due to their interaction and coupling effects, friction, damping, and nonlinear stiffness. Bearing faults affect the signal complexity at various scales. Hence, measuring signal complexity at different scales is helpful for diagnosing bearing faults. Numerous studies have investigated multiscale algorithms; nevertheless, multiscale algorithms using the first moment lose important complexity data. Accordingly, generalized multiscale algorithms have been recently introduced. The present research examined the use of refined composite generalized multiscale dispersion entropy (RCGMDispEn) based on the second moment (variance) and third moment (skewness) along with refined composite multiscale dispersion entropy (RCMDispEn) in bearing fault diagnosis. Moreover, multiclass FCM-ANFIS, which is a combination of adaptive network-based fuzzy inference systems (ANFIS), was developed to improve the efficiency of rotating machinery fault classification. According to the results, it is recommended that generalized multiscale algorithms based on variance and skewness be examined for diagnosis, along with multiscale algorithms, and be used to achieve an improvement in the results. The simultaneous usage of the multiscale algorithm and generalized multiscale algorithms improved the results in all three real datasets used in this study.
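
The contrast between first-moment and higher-moment coarse-graining described in this abstract can be sketched as follows. This is a hypothetical illustration only: the function name, the non-overlapping windows, and the use of population variance and skewness are assumptions, and the refined composite procedure and dispersion entropy itself are not implemented here.

```python
def generalized_coarse_grain(x, scale, moment="mean"):
    """Summarize consecutive non-overlapping windows of length `scale`
    by their mean, variance, or skewness."""
    out = []
    for k in range(len(x) // scale):
        w = x[k * scale:(k + 1) * scale]
        mu = sum(w) / scale
        var = sum((v - mu) ** 2 for v in w) / scale  # population variance
        if moment == "mean":
            out.append(mu)
        elif moment == "variance":
            out.append(var)
        elif moment == "skewness":
            # Population skewness; taken as 0 for a constant window (assumption).
            third = sum((v - mu) ** 3 for v in w) / scale
            out.append(third / var ** 1.5 if var > 0 else 0.0)
        else:
            raise ValueError(f"unknown moment: {moment}")
    return out
```

An entropy measure would then be applied to each coarse-grained series in place of the original signal.
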
(This article belongs to the Special Issue Entropy in Data Analysis)

19 pages, 4078 KiB  
Article
Discrimination of Patients with Varying Degrees of Coronary Artery Stenosis by ECG and PCG Signals Based on Entropy
by Huan Zhang, Xinpei Wang, Changchun Liu, Yuanyang Li, Yuanyuan Liu, Yu Jiao, Tongtong Liu, Huiwen Dong and Jikuo Wang
Entropy 2021, 23(7), 823; https://0-doi-org.brum.beds.ac.uk/10.3390/e23070823 - 28 Jun 2021
Cited by 5 | Viewed by 2020
Abstract
Coronary heart disease (CHD) is the leading cause of cardiovascular death. This study aimed to propose an effective method for mining cardiac mechano-electric coupling information and to evaluate its ability to distinguish patients with varying degrees of coronary artery stenosis (VDCAS). Five minutes of electrocardiogram and phonocardiogram signals were collected synchronously from 191 VDCAS patients to construct heartbeat interval (RRI)–systolic time interval (STI), RRI–diastolic time interval (DTI), HR-corrected QT interval (QTcI)–STI, QTcI–DTI, Tpeak–Tend interval (TpeI)–STI, TpeI–DTI, Tpe/QT interval (Tpe/QTI)–STI, and Tpe/QTI–DTI series. Then, the cross sample entropy (XSampEn), cross fuzzy entropy (XFuzzyEn), joint distribution entropy (JDistEn), magnitude-squared coherence function, cross power spectral density, and mutual information were applied to evaluate the coupling of the series. Subsequently, support vector machine recursive feature elimination and XGBoost were utilized for feature selection and classification, respectively. Results showed that the joint analysis of XSampEn, XFuzzyEn, and JDistEn had the best ability to distinguish patients with VDCAS. The classification accuracies for the severe CHD versus mild-to-moderate CHD groups, the severe CHD versus chest pain and normal coronary angiography (CPNCA) groups, and the mild-to-moderate CHD versus CPNCA groups were 0.8043, 0.7659, and 0.7500, respectively. The study indicates that the joint analysis of XSampEn, XFuzzyEn, and JDistEn can effectively capture the cardiac mechano-electric coupling information of patients with VDCAS, which can provide valuable information for clinicians to diagnose CHD.

19 pages, 2344 KiB  
Article
Detection of Coronary Artery Disease Using Multi-Domain Feature Fusion of Multi-Channel Heart Sound Signals
by Tongtong Liu, Peng Li, Yuanyuan Liu, Huan Zhang, Yuanyang Li, Yu Jiao, Changchun Liu, Chandan Karmakar, Xiaohong Liang, Mengli Ren and Xinpei Wang
Entropy 2021, 23(6), 642; https://0-doi-org.brum.beds.ac.uk/10.3390/e23060642 - 21 May 2021
Cited by 13 | Viewed by 2416
Abstract
Heart sound signals reflect valuable information about heart condition. Previous studies have suggested that the information contained in single-channel heart sound signals can be used to detect coronary artery disease (CAD). However, the accuracy based on a single-channel heart sound signal is not satisfactory. This paper proposed a method based on multi-domain feature fusion of multi-channel heart sound signals, in which entropy features and cross entropy features are also included. A total of 36 subjects were enrolled in the data collection, including 21 CAD patients and 15 non-CAD subjects. For each subject, five-channel heart sound signals were recorded synchronously for 5 min. After data segmentation and quality evaluation, 553 samples were left in the CAD group and 438 samples in the non-CAD group. The time-domain, frequency-domain, entropy, and cross entropy features were extracted. After feature selection, the optimal feature set was fed into the support vector machine for classification. The results showed that from single-channel to multi-channel, the classification accuracy increased from 78.75% to 86.70%. After adding entropy features and cross entropy features, the classification accuracy continued to increase to 90.92%. The study indicated that the method based on multi-domain feature fusion of multi-channel heart sound signals could provide more information for CAD detection, and entropy features and cross entropy features played an important role in it.

12 pages, 1077 KiB  
Article
Short-Term Effect of Percutaneous Coronary Intervention on Heart Rate Variability in Patients with Coronary Artery Disease
by Chang Yan, Changchun Liu, Lianke Yao, Xinpei Wang, Jikuo Wang and Peng Li
Entropy 2021, 23(5), 540; https://0-doi-org.brum.beds.ac.uk/10.3390/e23050540 - 28 Apr 2021
Cited by 5 | Viewed by 1752
Abstract
Myocardial ischemia in patients with coronary artery disease (CAD) leads to imbalanced autonomic control that increases the risk of morbidity and mortality. To systematically examine how autonomic function responds to percutaneous coronary intervention (PCI) treatment, we analyzed data of 27 CAD patients who had been admitted for PCI in this pilot study. For each patient, five-minute resting electrocardiogram (ECG) signals were collected before and after the PCI procedure. The time intervals between ECG collection and PCI were both within 24 h. To assess autonomic function, normal sinus RR intervals were extracted and were analyzed quantitatively using traditional linear time- and frequency-domain measures [i.e., standard deviation of the normal-normal intervals (SDNN), the root mean square of successive differences (RMSSD), powers of low frequency (LF) and high frequency (HF) components, LF/HF] and nonlinear entropy measures [i.e., sample entropy (SampEn), distribution entropy (DistEn), and conditional entropy (CE)], as well as graphical metrics derived from Poincaré plot [i.e., Porta’s index (PI), Guzik’s index (GI), slope index (SI) and area index (AI)]. Results showed that after PCI, AI and PI decreased significantly (p < 0.002 and 0.015, respectively) with effect sizes of 0.88 and 0.70 as measured by the Cohen’s d statistic. These changes were independent of sex. The results suggest that graphical AI and PI metrics derived from the Poincaré plot of short-term ECG may have potential for sensing the beneficial effect of PCI on cardiovascular autonomic control. Further studies with larger sample sizes are warranted to verify these observations.

21 pages, 2661 KiB  
Article
Brain Dynamics Altered by Photic Stimulation in Patients with Alzheimer’s Disease and Mild Cognitive Impairment
by Wei-Yang Yu, Intan Low, Chien Chen, Jong-Ling Fuh and Li-Fen Chen
Entropy 2021, 23(4), 427; https://0-doi-org.brum.beds.ac.uk/10.3390/e23040427 - 04 Apr 2021
Cited by 5 | Viewed by 2638
Abstract
Individuals with mild cognitive impairment (MCI) are at high risk of developing Alzheimer’s disease (AD). Repetitive photic stimulation (PS) is commonly used in routine electroencephalogram (EEG) examinations for rapid assessment of perceptual functioning. This study aimed to evaluate neural oscillatory responses and nonlinear brain dynamics under the effects of PS in patients with mild AD, moderate AD, severe AD, and MCI, as well as healthy elderly controls (HC). EEG power ratios during PS were estimated as an index of oscillatory responses. Multiscale sample entropy (MSE) was estimated as an index of brain dynamics before, during, and after PS. During PS, EEG harmonic responses were lower and MSE values were higher in the AD subgroups than in HC and MCI groups. PS-induced changes in EEG complexity were less pronounced in the AD subgroups than in HC and MCI groups. Brain dynamics revealed a “transitional change” between MCI and mild AD. Our findings suggest a deficiency in brain adaptability in AD patients, which hinders their ability to adapt to repetitive perceptual stimulation. This study highlights the importance of combining spectral and nonlinear dynamical analysis when seeking to unravel perceptual functioning and brain adaptability in the various stages of neurodegenerative diseases.

11 pages, 1750 KiB  
Article
Gait Stability Measurement by Using Average Entropy
by Han-Ping Huang, Chang Francis Hsu, Yi-Chih Mao, Long Hsu and Sien Chi
Entropy 2021, 23(4), 412; https://0-doi-org.brum.beds.ac.uk/10.3390/e23040412 - 31 Mar 2021
Cited by 3 | Viewed by 1791
Abstract
Gait stability has been measured by using many entropy-based methods. However, the relation between the entropy values and gait stability is worth further investigation. A previous study reported that average entropy (AE), a measure of disorder, could measure static standing postural stability better than multiscale entropy and entropy of entropy (EoE), two measures of complexity. This study tested the validity of AE in gait stability measurement from the viewpoint of disorder. For comparison, five other disorder measures, the EoE, and two traditional metrics were used to measure the degrees of disorder and complexity of 10 step interval (SPI) and 79 stride interval (SI) time series. As a result, every one of the 10 participants exhibited a relatively high AE value of the SPI when walking with eyes closed and a relatively low AE value when walking with eyes open. Most of the AE values of the SI of the 53 diseased subjects were greater than those of the 26 healthy subjects. The maximal overall accuracy of AE in differentiating healthy from diseased subjects was 91.1%. Similar features also exist in those five disorder measurements but not in the EoE values. Nevertheless, the EoE versus AE plot of the SI also exhibits an inverted U relation, consistent with the hypothesis for physiologic signals.

19 pages, 2166 KiB  
Article
An Entropy Metric for Regular Grammar Classification and Learning with Recurrent Neural Networks
by Kaixuan Zhang, Qinglong Wang and C. Lee Giles
Entropy 2021, 23(1), 127; https://0-doi-org.brum.beds.ac.uk/10.3390/e23010127 - 19 Jan 2021
Viewed by 2658
Abstract
Recently, there has been a resurgence of formal language theory in deep learning research. However, most research has focused on the more practical problems of attempting to represent symbolic knowledge by machine learning. In contrast, there has been limited research on exploring the fundamental connection between them. To obtain a better understanding of the internal structures of regular grammars and their corresponding complexity, we focus on categorizing regular grammars by using both theoretical analysis and empirical evidence. Specifically, motivated by the concentric ring representation, we relaxed the original order information and introduced an entropy metric for describing the complexity of different regular grammars. Based on the entropy metric, we categorized regular grammars into three disjoint subclasses: the polynomial, exponential and proportional classes. In addition, several classification theorems are provided for different representations of regular grammars. Our analysis was validated by examining the process of learning grammars with multiple recurrent neural networks. Our results show that, as expected, more complex grammars are generally more difficult to learn.

9 pages, 727 KiB  
Article
Bivariate Entropy Analysis of Electrocardiographic RR–QT Time Series
by Bo Shi, Mohammod Abdul Motin, Xinpei Wang, Chandan Karmakar and Peng Li
Entropy 2020, 22(12), 1439; https://0-doi-org.brum.beds.ac.uk/10.3390/e22121439 - 20 Dec 2020
Cited by 3 | Viewed by 2171
Abstract
QT interval variability (QTV) and heart rate variability (HRV) are both accepted biomarkers for cardiovascular events. QTV characterizes the variations in ventricular depolarization and repolarization. It is a predominant element of HRV. However, QTV is also believed to accept direct inputs from the upstream control system. How QTV varies along with HRV is yet to be elucidated. We studied the dynamic relationship of QTV and HRV during different physiological conditions from resting, to cycling, and to recovering. We applied several entropy-based measures to examine their bivariate relationships, including cross sample entropy (XSampEn), cross fuzzy entropy (XFuzzyEn), cross conditional entropy (XCE), and joint distribution entropy (JDistEn). Results showed no statistically significant differences in XSampEn, XFuzzyEn, and XCE across different physiological states. Interestingly, JDistEn demonstrated significant decreases during cycling as compared with that during the resting state. In addition, JDistEn showed a progressively recovering trend from cycling to the first 3 min of recovery, and further to the second 3 min of recovery. It appeared to have fully recovered to its resting-state level during the second 3 min of the recovery phase. The results suggest that there is a certain nonlinear temporal relationship between QTV and HRV, and that JDistEn could help unravel this nuanced property.

15 pages, 621 KiB  
Article
Modified Distribution Entropy as a Complexity Measure of Heart Rate Variability (HRV) Signal
by Radhagayathri Udhayakumar, Chandan Karmakar, Peng Li, Xinpei Wang and Marimuthu Palaniswami
Entropy 2020, 22(10), 1077; https://0-doi-org.brum.beds.ac.uk/10.3390/e22101077 - 24 Sep 2020
Cited by 4 | Viewed by 2651
Abstract
The complexity of a heart rate variability (HRV) signal is considered an important nonlinear feature to detect cardiac abnormalities. This work aims at explaining the physiological meaning of a recently developed complexity measurement method, namely, distribution entropy (DistEn), in the context of HRV signal analysis. We thereby propose modified distribution entropy (mDistEn) to remove the physiological discrepancy involved in the computation of DistEn. The proposed method generates a distance matrix that is devoid of over-exerted multi-lag signal changes. Restricted element selection in the distance matrix makes “mDistEn” a computationally inexpensive and physiologically more relevant complexity measure in comparison to DistEn.

13 pages, 2227 KiB  
Article
Multiscale Entropy Analysis: Application to Cardio-Respiratory Coupling
by Mirjana M. Platiša, Nikola N. Radovanović, Aleksandar Kalauzi, Goran Milašinović and Siniša U. Pavlović
Entropy 2020, 22(9), 1042; https://0-doi-org.brum.beds.ac.uk/10.3390/e22091042 - 18 Sep 2020
Cited by 10 | Viewed by 4250
Abstract
It is known that in pathological conditions, physiological systems develop changes in the multiscale properties of physiological signals. However, in real life, little is known about how changes in the function of one of two coupled physiological systems induce changes in the function of the other one, especially in their multiscale behavior. Hence, in this work we aimed to examine the complexity of cardio-respiratory coupled systems control using multiscale entropy (MSE) analysis of cardiac intervals MSE (RR), respiratory time series MSE (Resp), and synchrony of these rhythms by cross multiscale entropy (CMSE) analysis, in heart failure (HF) patients and healthy subjects. We analyzed 20 min of synchronously recorded RR intervals and respiratory signal during relaxation in the supine position in 42 heart failure patients and 14 healthy control subjects. The heart failure group was divided into three subgroups, according to the RR interval time series characteristics (atrial fibrillation (HFAF), sinus rhythm (HFSin), and sinus rhythm with ventricular extrasystoles (HFVES)). Compared with healthy control subjects, alterations in respiratory signal properties were observed in patients from the HFSin and HFVES groups. Further, mean MSE curves of RR intervals and respiratory signal were not statistically different only in the HFSin group (p = 0.43). The level of synchrony between these time series was significantly higher in HFSin and HFVES patients than in control subjects and HFAF patients (p < 0.01). In conclusion, depending on the specific pathologies, primary alterations in the regularity of cardiac rhythm resulted in changes in the regularity of the respiratory rhythm, as well as in the level of their asynchrony.

30 pages, 3163 KiB  
Article
Representational Rényi Heterogeneity
by Abraham Nunes, Martin Alda, Timothy Bardouille and Thomas Trappenberg
Entropy 2020, 22(4), 417; https://0-doi-org.brum.beds.ac.uk/10.3390/e22040417 - 07 Apr 2020
Cited by 5 | Viewed by 3141
Abstract
A discrete system’s heterogeneity is measured by the Rényi heterogeneity family of indices (also known as Hill numbers or Hannah–Kay indices), whose units are the numbers equivalent. Unfortunately, numbers equivalent heterogeneity measures for non-categorical data require a priori (A) categorical partitioning and (B) pairwise distance measurement on the observable data space, thereby precluding application to problems with ill-defined categories or where semantically relevant features must be learned as abstractions from some data. We thus introduce representational Rényi heterogeneity (RRH), which transforms an observable domain onto a latent space upon which the Rényi heterogeneity is both tractable and semantically relevant. This method requires neither a priori binning nor definition of a distance function on the observable space. We show that RRH can generalize existing biodiversity and economic equality indices. Compared with existing indices on a beta-mixture distribution, we show that RRH responds more appropriately to changes in mixture component separation and weighting. Finally, we demonstrate the measurement of RRH in a set of natural images, with respect to abstract representations learned by a deep neural network. The RRH approach will further enable heterogeneity measurement in disciplines whose data do not easily conform to the assumptions of existing indices.
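
For the purely categorical case mentioned at the start of this abstract, the Hill number of order q can be sketched as below. The function name is hypothetical, and the sketch covers only the standard discrete index, not the representational method the paper introduces.

```python
import math


def renyi_heterogeneity(p, q):
    """Hill number (numbers equivalent) of order q for a discrete distribution p."""
    p = [pi for pi in p if pi > 0]  # zero-probability categories do not contribute
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))
```

For a uniform distribution over n categories the value is n at every order, which is what the "numbers equivalent" unit expresses.
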