Computational Approaches for Data Inspection in Biomedicine

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematical Biology".

Deadline for manuscript submissions: closed (31 October 2022) | Viewed by 23133

Special Issue Editors


Guest Editor
Department of Mathematics, University of Bari Aldo Moro, Via E. Orabona 4, I-70125 Bari, Italy
Interests: numerical analysis

Guest Editor
Department of Mathematics, University of Bari Aldo Moro, 70125 Bari, Italy
Interests: numerical analysis; bioinformatics; data analysis; dimensionality reduction

Special Issue Information

Dear Colleagues,

In recent decades, the rapid development of high-performance technologies has produced explosive growth in digitized medical data, including radiology images, omics data, laboratory test results, and medical and personal statistics. These data need to be processed and studied to extract information that helps to better understand the mechanisms of pathogenesis of complex diseases and, potentially, to improve care and outcomes for patients through predictive analytics. The analysis of biomedical data often requires the construction of unified frameworks that combine machine learning, statistical, mathematical, and computational methods to provide insights into the biological task under study. The main aim of this Special Issue is to introduce and discuss major problems in the preprocessing, analysis, and interpretation of biomedical data; to review the state of the art of mathematical and computational approaches for biomedical applications; and to explore current and emerging algorithms and techniques able to unravel patterns, associations, and correlations in large biomedical datasets.

Topics of interest include, but are not limited to:

- Low-rank methods for biomedical data;

- Feature extraction and selection algorithms in biomedical data;

- Anomaly detection methods for biomedical applications;

- Clustering algorithms applied to omics and biomedical data;

- Bioimaging for omics data;

- Optimization and machine learning algorithms for biomedical data.

Prof. Dr. Nicoletta Del Buono
Dr. Flavia Esposito
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (8 papers)

Research

18 pages, 1362 KiB  
Article
A Two-Step Data Normalization Approach for Improving Classification Accuracy in the Medical Diagnosis Domain
by Ivan Izonin, Roman Tkachenko, Nataliya Shakhovska, Bohdan Ilchyshyn and Krishna Kant Singh
Mathematics 2022, 10(11), 1942; https://doi.org/10.3390/math10111942 - 06 Jun 2022
Cited by 22 | Viewed by 3586
Abstract
Data normalization is a data preprocessing task and one of the first to be performed during intellectual analysis, particularly in the case of tabular data. The importance of its implementation is determined by the need to reduce the sensitivity of the artificial intelligence model to the values of the features in the dataset to increase the studied model’s adequacy. This paper focuses on the problem of effectively preprocessing data to improve the accuracy of intellectual analysis in the case of performing medical diagnostic tasks. We developed a new two-step method for data normalization of numerical medical datasets. It is based on the possibility of considering both the interdependencies between the features of each observation from the dataset and their absolute values to improve the accuracy when performing medical data mining tasks. We describe and substantiate each step of the algorithmic implementation of the method. We also visualize the results of the proposed method. The proposed method was modeled using six different machine learning methods based on decision trees when performing binary and multiclass classification tasks. We used six real-world, freely available medical datasets with different numbers of vectors, attributes, and classes to conduct experiments. A comparison between the effectiveness of the developed method and that of five existing data normalization methods was carried out. It was experimentally established that the developed method increases the accuracy of the Decision Tree and Extra Trees Classifier by 1–5% in the case of performing the binary classification task and the accuracy of the Bagging, Decision Tree, and Extra Trees Classifier by 1–6% in the case of performing the multiclass classification task. Increasing the accuracy of these classifiers only by using the new data normalization method satisfies all the prerequisites for its application in practice when performing various medical data mining tasks.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)

25 pages, 1653 KiB  
Article
COSMONET: An R Package for Survival Analysis Using Screening-Network Methods
by Antonella Iuliano, Annalisa Occhipinti, Claudia Angelini, Italia De Feis and Pietro Liò
Mathematics 2021, 9(24), 3262; https://doi.org/10.3390/math9243262 - 15 Dec 2021
Cited by 4 | Viewed by 3609
Abstract
Identifying relevant genomic features that can act as prognostic markers for building predictive survival models is one of the central themes in medical research, affecting the future of personalized medicine and omics technologies. However, the high dimension of genome-wide omic data, the strong correlation among the features, and the low sample size significantly increase the complexity of cancer survival analysis, demanding the development of specific statistical methods and software. Here, we present a novel R package, COSMONET (COx Survival Methods based On NETworks), that provides a complete workflow from the pre-processing of omics data to the selection of gene signatures and prediction of survival outcomes. In particular, COSMONET implements (i) three different screening approaches to reduce the initial dimension of the data from a high-dimensional space p to a moderate scale d, (ii) a network-penalized Cox regression algorithm to identify the gene signature, (iii) several approaches to determine an optimal cut-off on the prognostic index (PI) to separate high- and low-risk patients, and (iv) a prediction step for patients’ risk class based on the evaluation of PIs. Moreover, COSMONET provides functions for data pre-processing, visualization, survival prediction, and gene enrichment analysis. We illustrate COSMONET through a step-by-step R vignette using two cancer datasets.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)

16 pages, 677 KiB  
Article
Modeling and Simulation of a miRNA Regulatory Network of the PTEN Gene
by Gionmattia Carancini, Margherita Carletti and Giulia Spaletta
Mathematics 2021, 9(15), 1803; https://doi.org/10.3390/math9151803 - 30 Jul 2021
Cited by 2 | Viewed by 1867
Abstract
The PTEN onco-suppressor gene is likely to play an important role in the onset of brain cancer, namely glioblastoma multiforme. Consequently, the PTEN regulatory network, involving microRNAs and competitive endogenous RNAs, becomes a crucial tool for understanding the mechanism related to low levels of expression in cancer patients. This paper introduces a novel model for the regulation of PTEN whose solution is approximated by a high-dimensional system of ordinary differential equations under the assumption that the Law of Mass Action applies. Extensive numerical simulations are presented that mirror parts of the biological subtext that lies behind various alterations. Given the complexity of processes involved in the acquisition of empirical data, initial conditions and reaction rates were inferred from the literature. Despite this, the proposed model is shown to be capable of capturing biologically reasonable behaviors of inter-species interactions, thus representing a positive result, which encourages pursuing the possibility of experimenting on data hopefully provided by omics disciplines.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)
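
Editorial illustration (not taken from the paper): the abstract above mentions a system of ordinary differential equations derived under the Law of Mass Action. The sketch below shows what that assumption means for a single reversible binding reaction, miRNA + mRNA <-> complex, integrated with forward Euler. The reaction, the species names, the rate constants k_on and k_off, and the initial amounts are all hypothetical choices made for this example; the paper's own network is much larger, and its equations are not reproduced here.

```python
def simulate_binding(mirna0, mrna0, k_on, k_off, dt=0.001, steps=10000):
    """Forward-Euler integration of miRNA + mRNA <-> complex under mass action.

    All species, rates, and initial amounts are hypothetical illustration values.
    """
    mirna, mrna, cplx = mirna0, mrna0, 0.0
    for _ in range(steps):
        # Mass action: the binding rate is proportional to the product of the
        # reactant amounts; the unbinding rate is proportional to the complex.
        net_rate = k_on * mirna * mrna - k_off * cplx
        mirna -= net_rate * dt
        mrna -= net_rate * dt
        cplx += net_rate * dt
    return mirna, mrna, cplx


mirna, mrna, cplx = simulate_binding(mirna0=1.0, mrna0=2.0, k_on=1.0, k_off=0.5)
print(f"free miRNA={mirna:.4f}, free mRNA={mrna:.4f}, complex={cplx:.4f}")
```

Because each binding event consumes one molecule of each reactant, the totals miRNA + complex and mRNA + complex stay constant throughout the run, which gives a quick sanity check on any integrator used for this kind of model.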

27 pages, 5406 KiB  
Article
A Fast and Effective Method to Identify Relevant Sets of Variables in Complex Systems
by Gianluca D’Addese, Martina Casari, Roberto Serra and Marco Villani
Mathematics 2021, 9(9), 1022; https://doi.org/10.3390/math9091022 - 30 Apr 2021
Cited by 4 | Viewed by 1676
Abstract
In many complex systems one observes the formation of medium-level structures, whose detection could allow a high-level description of the dynamical organization of the system itself, and thus to its better understanding. We have developed in the past a powerful method to achieve this goal, which however requires a heavy computational cost in several real-world cases. In this work we introduce a modified version of our approach, which reduces the computational burden. The design of the new algorithm allowed the realization of an original suite of methods able to work simultaneously at the micro level (that of the binary relationships of the single variables) and at meso level (the identification of dynamically relevant groups). We apply this suite to a particularly relevant case, in which we look for the dynamic organization of a gene regulatory network when it is subject to knock-outs. The approach combines information theory, graph analysis, and an iterated sieving algorithm in order to describe rather complex situations. Its application allowed to derive some general observations on the dynamical organization of gene regulatory networks, and to observe interesting characteristics in an experimental case.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)

17 pages, 407 KiB  
Article
A Review on Initialization Methods for Nonnegative Matrix Factorization: Towards Omics Data Experiments
by Flavia Esposito
Mathematics 2021, 9(9), 1006; https://doi.org/10.3390/math9091006 - 29 Apr 2021
Cited by 15 | Viewed by 3188
Abstract
Nonnegative Matrix Factorization (NMF) has acquired a relevant role in the panorama of knowledge extraction, thanks to the peculiarity that non-negativity applies to both bases and weights, which allows meaningful interpretations and is consistent with the natural human part-based learning process. Nevertheless, most NMF algorithms are iterative, so initialization methods affect convergence behaviour, the quality of the final solution, and NMF performance in terms of the residual of the cost function. Studies on the impact of NMF initialization techniques have been conducted for text or image datasets, but very few considerations can be found in the literature when biological datasets are studied, even though NMFs have largely demonstrated their usefulness in better understanding biological mechanisms with omic datasets. This paper aims to present the state-of-the-art on NMF initialization schemes along with some initial considerations on the impact of initialization methods when microarrays (a simple instance of omic data) are evaluated with NMF mechanisms. Using a series of measures to qualitatively examine the biological information extracted by a given NMF scheme, it preliminary appears that some information (e.g., represented by genes) can be extracted regardless of the initialization scheme used.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)
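
Editorial illustration (not taken from the paper): the abstract above notes that most NMF algorithms are iterative, so the starting point influences the final residual. The sketch below makes this concrete with the classical multiplicative update rules for the Frobenius-norm cost, started from several random initializations. The synthetic matrix, the rank, the seeds, and the helper names `random_init` and `nmf_multiplicative` are assumptions introduced for this example. Plain random initialization stands in here for the broader family of schemes the review surveys, and no result from the paper is reproduced.

```python
import numpy as np


def nmf_multiplicative(X, W0, H0, iters=200, eps=1e-12):
    """Multiplicative updates for min ||X - W H||_F subject to W, H >= 0."""
    W, H = W0.copy(), H0.copy()
    for _ in range(iters):
        # Each update rescales the factor elementwise, so nonnegativity is kept.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def random_init(X, rank, seed):
    """Hypothetical baseline: uniform random factors scaled to the data mean."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(X.mean() / rank)
    W0 = scale * rng.random((X.shape[0], rank))
    H0 = scale * rng.random((rank, X.shape[1]))
    return W0, H0


# Synthetic nonnegative matrix standing in for expression data (30 x 12, rank 4).
data_rng = np.random.default_rng(0)
X = data_rng.random((30, 4)) @ data_rng.random((4, 12))

results = {}
for seed in (1, 2, 3):
    W0, H0 = random_init(X, rank=4, seed=seed)
    W, H = nmf_multiplicative(X, W0, H0)
    initial = np.linalg.norm(X - W0 @ H0)
    final = np.linalg.norm(X - W @ H)
    results[seed] = (initial, final, W, H)
    print(f"seed={seed}: initial residual={initial:.4f}, final residual={final:.4f}")
```

Running the loop with different seeds typically yields different final residuals after the same number of iterations, which is the sensitivity to initialization that the abstract describes.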

26 pages, 6047 KiB  
Article
A New Ensemble Method for Detecting Anomalies in Gene Expression Matrices
by Laura Selicato, Flavia Esposito, Grazia Gargano, Maria Carmela Vegliante, Giuseppina Opinto, Gian Maria Zaccaria, Sabino Ciavarella, Attilio Guarini and Nicoletta Del Buono
Mathematics 2021, 9(8), 882; https://doi.org/10.3390/math9080882 - 16 Apr 2021
Cited by 13 | Viewed by 2364
Abstract
One of the main problems in the analysis of real data is often related to the presence of anomalies. Namely, anomalous cases can both spoil the resulting analysis and contain valuable information at the same time. In both cases, the ability to detect these occurrences is very important. In the biomedical field, a correct identification of outliers could allow the development of new biological hypotheses that are not considered when looking at experimental biological data. In this work, we address the problem of detecting outliers in gene expression data, focusing on microarray analysis. We propose an ensemble approach for detecting anomalies in gene expression matrices based on the use of Hierarchical Clustering and Robust Principal Component Analysis, which allows us to derive a novel pseudo-mathematical classification of anomalies.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)

17 pages, 18482 KiB  
Article
Emotion Recognition and Regulation Based on Stacked Sparse Auto-Encoder Network and Personalized Reconfigurable Music
by Yinsheng Li and Wei Zheng
Mathematics 2021, 9(6), 593; https://doi.org/10.3390/math9060593 - 10 Mar 2021
Cited by 9 | Viewed by 2434
Abstract
Music can regulate and improve the emotions of the brain. Traditional emotional regulation approaches often adopt complete music. As is well-known, complete music may vary in pitch, volume, and other ups and downs. An individual’s emotions may also adopt multiple states, and music preference varies from person to person. Therefore, traditional music regulation methods have problems, such as long duration, variable emotional states, and poor adaptability. In view of these problems, we use different music processing methods and stacked sparse auto-encoder neural networks to identify and regulate the emotional state of the brain in this paper. We construct a multi-channel EEG sensor network, divide brainwave signals and the corresponding music separately, and build a personalized reconfigurable music-EEG library. The 17 features in the EEG signal are extracted as joint features, and the stacked sparse auto-encoder neural network is used to classify the emotions, in order to establish a music emotion evaluation index. According to the goal of emotional regulation, music fragments are selected from the personalized reconfigurable music-EEG library, then reconstructed and combined for emotional adjustment. The results show that, compared with complete music, the reconfigurable combined music was less time-consuming for emotional regulation (76.29% less), and the number of irrelevant emotional states was reduced by 69.92%. In terms of adaptability to different participants, the reconfigurable music improved the recognition rate of emotional states by 31.32%.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)

15 pages, 509 KiB  
Article
A Proposal of Quantum-Inspired Machine Learning for Medical Purposes: An Application Case
by Domenico Pomarico, Annarita Fanizzi, Nicola Amoroso, Roberto Bellotti, Albino Biafora, Samantha Bove, Vittorio Didonna, Daniele La Forgia, Maria Irene Pastena, Pasquale Tamborra, Alfredo Zito, Vito Lorusso and Raffaella Massafra
Mathematics 2021, 9(4), 410; https://doi.org/10.3390/math9040410 - 19 Feb 2021
Cited by 7 | Viewed by 2729
Abstract
Learning tasks are implemented via mappings of the sampled data set, including both the classical and the quantum framework. Biomedical data characterizing complex diseases such as cancer typically require an algorithmic support for clinical decisions, especially for early stage tumors that typify breast cancer patients, which are still controllable in a therapeutic and surgical way. Our case study consists of the prediction during the pre-operative stage of lymph node metastasis in breast cancer patients resulting in a negative diagnosis after clinical and radiological exams. The classifier adopted to establish a baseline is characterized by the result invariance for the order permutation of the input features, and it exploits stratifications in the training procedure. The quantum one mimics support vector machine mapping in a high-dimensional feature space, yielded by encoding into qubits, while being characterized by complexity. Feature selection is exploited to study the performances associated with a low number of features, thus implemented in a feasible time. Wide variations in sensitivity and specificity are observed in the selected optimal classifiers during cross-validations for both classification system types, with an easier detection of negative or positive cases depending on the choice between the two training schemes. Clinical practice is still far from being reached, even if the flexible structure of quantum-inspired classifier circuits guarantees further developments to rule interactions among features: this preliminary study is solely intended to provide an overview of the particular tree tensor network scheme in a simplified version adopting just product states, as well as to introduce typical machine learning procedures consisting of feature selection and classifier performance evaluation.
(This article belongs to the Special Issue Computational Approaches for Data Inspection in Biomedicine)