Big Data Analytics and Information Science for Business and Biomedical Applications II

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Signal and Data Analysis".

Deadline for manuscript submissions: closed (15 December 2021) | Viewed by 18673

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Prof. Dr. S. Ejaz Ahmed
Guest Editor
Department of Mathematics and Statistics, Brock University, St. Catharines, ON L2S 3A1, Canada
Interests: model selection; post-estimation and prediction; shrinkage and empirical Bayes; Bayesian data analysis; machine learning; business; information science; statistical genetics; image analysis

Dr. Farouk Nathoo
Guest Editor
Department of Mathematics and Statistics, University of Victoria, Victoria, BC V8W 3P4, Canada
Interests: Bayesian methods; statistical computing; spatial statistics; high-dimensional data; statistical modeling; neuroimaging statistics

Special Issue Information

Dear Colleagues,

In today’s data-centric world, there is a host of buzzwords appearing everywhere in digital and print media. We encounter data in every walk of life, and the information it contains can be used to improve society, business, health, and medicine. This presents a substantial opportunity for analytically and objectively minded researchers. Making sense of data and extracting meaningful information from it may not be an easy task. The rapid growth in the size and scope of datasets in a host of disciplines has created the need for innovative statistical strategies for analyzing and visualizing such data.

Biomedical researchers worldwide have produced an enormous trove of digital data, including genetic variants genotyped or sequenced at genome-wide scales, gene expression measured under different experimental conditions, biomedical imaging data (including neuroimaging data), electronic medical records (EMR) of patients, and much more.

The rise of ‘Big Data’ will not only deepen our understanding of complex human traits and diseases, but will also shed light on disease prevention, diagnosis, and treatment. Comprehensive analysis of Big Data in genomics and neuroimaging calls for statistically rigorous methods. Various statistical methods have been developed to accommodate the features of genomic studies as well as studies examining the function and structure of the brain, and the corresponding statistical theory has been developed alongside them.

Alongside biomedical applications, there has been a tremendous increase in interest in the use of Big Data for business and financial applications. Financial time series analysis and prediction problems present many challenges for the development of statistical methodology and computational strategies for streaming data.

The analysis of Big Data in biomedical as well as business and financial research has drawn much attention from researchers worldwide. This Special Issue aims to provide a platform for the deep discussion of novel statistical methods developed for the analysis of Big Data in these areas. Both applied and theoretical contributions to these areas will be showcased.

The contributions to this Special Issue will present new and original research in statistical methods and applications in biomedical and business research. Contributions can take either an applied or a theoretical perspective and may address a range of statistical problems, with special emphasis on data analytics and statistical methodology. Manuscripts summarizing the current state of the art on these topics are welcome.

Prof. Dr. S. Ejaz Ahmed
Dr. Farouk Nathoo
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (9 papers)

Research

23 pages, 396 KiB  
Article
Nonparametric Causal Structure Learning in High Dimensions
by Shubhadeep Chakraborty and Ali Shojaie
Entropy 2022, 24(3), 351; https://0-doi-org.brum.beds.ac.uk/10.3390/e24030351 - 28 Feb 2022
Cited by 1 | Viewed by 1928
Abstract
The PC and FCI algorithms are popular constraint-based methods for learning the structure of directed acyclic graphs (DAGs) in the absence and presence of latent and selection variables, respectively. These algorithms (and their order-independent variants, PC-stable and FCI-stable) have been shown to be consistent for learning sparse high-dimensional DAGs based on partial correlations. However, inferring conditional independences from partial correlations is valid if the data are jointly Gaussian or generated from a linear structural equation model, an assumption that may be violated in many applications. To broaden the scope of high-dimensional causal structure learning, we propose nonparametric variants of the PC-stable and FCI-stable algorithms that employ the conditional distance covariance (CdCov) to test for conditional independence relationships. As the key theoretical contribution, we prove that the high-dimensional consistency of the PC-stable and FCI-stable algorithms carries over to general distributions over DAGs when we implement CdCov-based nonparametric tests for conditional independence. Numerical studies demonstrate that our proposed algorithms perform nearly as well as PC-stable and FCI-stable for Gaussian distributions, and offer advantages in non-Gaussian graphical models.

13 pages, 1821 KiB  
Article
Transfer-Learning-Based Approach for the Diagnosis of Lung Diseases from Chest X-ray Images
by Rong Fan and Shengrong Bu
Entropy 2022, 24(3), 313; https://0-doi-org.brum.beds.ac.uk/10.3390/e24030313 - 22 Feb 2022
Cited by 7 | Viewed by 1810
Abstract
Using chest X-ray images is one of the least expensive and easiest ways to diagnose patients who suffer from lung diseases such as pneumonia and bronchitis. Inspired by existing work, a deep learning model is proposed to classify chest X-ray images into 14 lung-related pathological conditions. However, small datasets are not sufficient to train the deep learning model. Two methods were used to tackle this: (1) transfer learning based on two pretrained neural networks, DenseNet and ResNet, was employed; (2) data were preprocessed, including checking for data leakage, handling class imbalance, and performing data augmentation, before being fed to the neural network. The proposed model was evaluated according to the classification accuracy and receiver operating characteristic (ROC) curves, and was visualized using class activation maps. DenseNet121 and ResNet50 were used in the simulations, and the results showed that the model trained with DenseNet121 had better accuracy than that trained with ResNet50.

11 pages, 311 KiB  
Article
Associations between Longitudinal Gestational Weight Gain and Scalar Infant Birth Weight: A Bayesian Joint Modeling Approach
by Matthew Pietrosanu, Linglong Kong, Yan Yuan, Rhonda C. Bell, Nicole Letourneau and Bei Jiang
Entropy 2022, 24(2), 232; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020232 - 02 Feb 2022
Viewed by 1360
Abstract
Despite the importance of maternal gestational weight gain, it is not yet conclusively understood how weight gain during different stages of pregnancy influences health outcomes for either mother or child. We partially attribute this to differences in, and the validity of, statistical methods for the analysis of longitudinal and scalar outcome data. In this paper, we propose a Bayesian joint regression model that estimates and uses trajectory parameters as predictors of a scalar response. Our model remedies notable issues with traditional linear regression approaches found in the clinical literature. In particular, our methodology accommodates nonprospective designs by correcting for bias in self-reported prestudy measures; truly accommodates sparse longitudinal observations and short-term variation without data aggregation or precomputation; and is more robust to the choice of model changepoints. We demonstrate these advantages through a real-world application to the Alberta Pregnancy Outcomes and Nutrition (APrON) dataset and a comparison to a linear regression approach from the clinical literature. Our methods extend naturally to other maternal and infant outcomes as well as to areas of research that employ similarly structured data.

24 pages, 477 KiB  
Article
Multivariate Functional Kernel Machine Regression and Sparse Functional Feature Selection
by Joseph Naiman and Peter Xuekun Song
Entropy 2022, 24(2), 203; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020203 - 28 Jan 2022
Cited by 1 | Viewed by 1848
Abstract
Motivated by mobile devices that record data at a high frequency, we propose a new methodological framework for analyzing a semi-parametric regression model that allows us to study a nonlinear relationship between a scalar response and multiple functional predictors in the presence of scalar covariates. Utilizing functional principal component analysis (FPCA) and the least-squares kernel machine method (LSKM), we are able to substantially extend the framework of semi-parametric regression models of scalar responses on scalar predictors by allowing multiple functional predictors to enter the nonlinear model. Regularization is established for feature selection in the setting of reproducing kernel Hilbert spaces. Our method simultaneously performs model fitting and variable selection on functional features. For the implementation, we propose an effective algorithm to solve the related optimization problems, in which iterations alternate between linear mixed-effects models and a variable selection method (e.g., sparse group lasso). We show algorithmic convergence results and theoretical guarantees for the proposed methodology. We illustrate its performance through simulation experiments and an analysis of accelerometer data.

14 pages, 4504 KiB  
Article
Comparative Analysis of Social Support in Online Health Communities Using a Word Co-Occurrence Network Analysis Approach
by Mengque Liu, Xia Zou, Jiyin Chen and Shuangge Ma
Entropy 2022, 24(2), 174; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020174 - 25 Jan 2022
Cited by 2 | Viewed by 2314
Abstract
Online health communities (OHCs) have become a major source of social support for people with health problems. Members of OHCs interact online with others facing similar health problems and receive multiple types of social support, including but not limited to informational support, emotional support, and companionship. The aim of this study is to examine the differences in social support communication among people with different types of cancers. A novel approach is developed to better understand the types of social support embedded in OHC posts. Our approach, based on word co-occurrence network analysis, preserves the semantic structures of the texts. Information extraction from the semantic structures is supported by the interplay of quantitative and qualitative analyses of the network structures. Our analysis shows that significant differences in social support exist across cancer types, and evidence for differences across diseases in terms of communication preferences and language use is also identified. Overall, this study can establish a new avenue for extracting and analyzing information, so as to inform social support for clinical care.

29 pages, 1264 KiB  
Article
Improved Dividend Estimation from Intraday Quotes
by Pontus Söderbäck, Jörgen Blomvall and Martin Singull
Entropy 2022, 24(1), 95; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010095 - 07 Jan 2022
Cited by 1 | Viewed by 1890
Abstract
Liquid financial markets, such as the options market of the S&P 500 index, create vast amounts of data every day, i.e., so-called intraday data. However, these highly granular data are often reduced to a single time point when used to estimate financial quantities. This under-utilization of the data may reduce the quality of the estimates. In this paper, we study the impacts on estimation quality when using intraday data to estimate dividends. The methodology is based on earlier linear regression (ordinary least squares) estimates, which have been adapted to intraday data. Further, the method is also generalized in two aspects. First, the dividends are expressed as present values of future dividends rather than dividend yields. Second, to account for heteroscedasticity, the estimation methodology was formulated as a weighted least squares problem, where the weights are determined from the market data. This method is compared with a traditional method on out-of-sample S&P 500 European options market data. The results show that estimates based on intraday data have, with statistical significance, a higher quality than the corresponding single-time estimates. Additionally, the two generalizations of the methodology are shown to improve the estimation quality further.

24 pages, 583 KiB  
Article
Sparse Estimation Strategies in Linear Mixed Effect Models for High-Dimensional Data Application
by Eugene A. Opoku, Syed Ejaz Ahmed and Farouk S. Nathoo
Entropy 2021, 23(10), 1348; https://0-doi-org.brum.beds.ac.uk/10.3390/e23101348 - 15 Oct 2021
Cited by 2 | Viewed by 1424
Abstract
In a host of business applications and biomedical and epidemiological studies, multicollinearity among predictor variables is a frequent issue in longitudinal data analysis for linear mixed models (LMM). We consider an efficient estimation strategy for high-dimensional data applications, where the dimensions of the parameters are larger than the number of observations. In this paper, we are interested in estimating the fixed effects parameters of the LMM when it is assumed that some prior information is available in the form of linear restrictions on the parameters. We propose the pretest and shrinkage estimation strategies using the ridge full model as the base estimator. We establish the asymptotic distributional bias and risks of the suggested estimators and investigate their relative performance with respect to the ridge full model estimator. Furthermore, we compare the numerical performance of the LASSO-type estimators with the pretest and shrinkage ridge estimators. The methodology is investigated using simulation studies and then demonstrated on an application exploring how effective brain connectivity in the default mode network (DMN) may be related to genetics within the context of Alzheimer’s disease.

24 pages, 3412 KiB  
Article
Edge-Preserving Denoising of Image Sequences
by Fan Yi and Peihua Qiu
Entropy 2021, 23(10), 1332; https://0-doi-org.brum.beds.ac.uk/10.3390/e23101332 - 12 Oct 2021
Viewed by 1473
Abstract
To monitor the Earth’s surface, the satellite of the NASA Landsat program constantly provides us with image sequences of any region on the Earth over time. These image sequences give us a unique resource to study the Earth’s surface, changes in the Earth’s resources over time, and their implications in agriculture, geology, forestry, and more. Besides the natural sciences, image sequences are also commonly used in functional magnetic resonance imaging (fMRI) in medical studies for understanding the functioning of brains and other organs. In practice, observed images almost always contain noise and other contaminations. For a reliable subsequent image analysis, it is important to remove such contaminations in advance. This paper focuses on image sequence denoising, which has not yet been well discussed in the literature. To this end, an edge-preserving image denoising procedure is suggested. The suggested method is based on a jump-preserving local smoothing procedure, in which the bandwidths are chosen such that the possible spatio-temporal correlations in the observed image intensities are accommodated properly. Both theoretical arguments and numerical studies show that this method works well in the various cases considered.

Review

18 pages, 1750 KiB  
Review
Functional Connectivity Methods and Their Applications in fMRI Data
by Yasaman Shahhosseini and Michelle F. Miranda
Entropy 2022, 24(3), 390; https://0-doi-org.brum.beds.ac.uk/10.3390/e24030390 - 11 Mar 2022
Cited by 9 | Viewed by 3692
Abstract
The availability of powerful non-invasive neuroimaging techniques has given rise to various studies that aim to map the human brain. These studies focus not only on finding brain activation signatures but also on understanding the overall organization of functional communication in the brain network. Based on the principle that distinct brain regions are functionally connected and continuously share information with each other, various approaches to finding these functional networks have been proposed in the literature. In this paper, we present an overview of the most common methods to estimate and characterize functional connectivity in fMRI data. We illustrate these methodologies with resting-state functional MRI data from the Human Connectome Project, providing details of their implementation and insights on the interpretation of the results. We aim to guide researchers who are new to the field of neuroimaging by providing the necessary tools to estimate and characterize brain circuitry.