Complex Networks, Bio-Molecular Systems, and Machine Learning

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Molecular Informatics".

Deadline for manuscript submissions: closed (28 February 2022) | Viewed by 17127

Special Issue Editor

Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Biscay, Spain
Interests: complex networks; bio-molecular systems; machine learning; cheminformatics; bioinformatics

Special Issue Information

Dear Colleagues,

Artificial Intelligence/Machine Learning (AI/ML) and Complex Networks algorithms are both important tools for the computational study of molecular systems. AI/ML methods include Artificial Neural Networks (ANN), Support Vector Machines (SVM), Random Forests (RF), Genetic Algorithms (GA), and Deep Learning architectures such as Deep Neural Networks (DNN), Deep Belief Networks (DBN), Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN). We can use structural parameters, molecular descriptors, experimental conditions, chemometric measurements, etc. as input to train these AI/ML algorithms. As a result, we can obtain predictive models for Drug Discovery, Vaccine Design, Nanotechnology, etc.

On the other hand, complex networks are very useful for the study of bio-molecular systems. In fact, we can use them to represent complex structure-function patterns in molecular bio-systems. This includes, but is not limited to, the structure of chemical compounds, synthetic reaction routes, proteins, polymers, viral structures, RNA secondary structures, etc. The method is highly flexible, so we can also represent larger systems such as metabolic pathways, protein interaction networks (PINs), gene regulatory networks, the brain cortex, ecosystems, the internet, markets, social networks, etc. In this approach, it is common to represent the parts of the system (atoms, amino acids, monomers, proteins, reactions, neurons, organisms, etc.) as nodes and the structure-function relationships among them (chemical bonds, hydrogen bonds, reactions, activation, co-expression, etc.) as edges or links. This opens the way to studying complex bio-molecular systems with graph and complex network theory. Consequently, we can calculate multiple graph invariants (numeric parameters) that quantify the complex structure of these systems. The scope includes software/algorithms for network representation and for the study of distributions, emergent properties, transport phenomena, multiplex networks, dynamic system properties, etc. In addition, though it is not mandatory, we can also train AI/ML algorithms using the numerical parameters of complex networks and bio-molecular systems as input in order to predict structure-function relationships and, consequently, the properties of these systems.
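
As a minimal sketch of the node-and-edge representation described above, the following Python snippet encodes the carbon skeletons of two small molecules as undirected graphs and computes two simple graph invariants: the degree sequence and the Wiener index (the sum of shortest-path distances over all node pairs). The molecules, atom labels, and function names are illustrative assumptions rather than part of this call, and the code assumes a connected graph.

```python
from collections import deque


def build_graph(edges):
    """Return an adjacency-set dictionary for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degree_sequence(adj):
    """Node degrees sorted from highest to lowest."""
    return sorted((len(neighbours) for neighbours in adj.values()), reverse=True)


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered node pairs.

    Assumes the graph is connected; each pair is counted twice by the
    per-node sums, hence the division by two.
    """
    return sum(sum(shortest_path_lengths(adj, u).values()) for u in adj) // 2


# Hypothetical examples: carbon skeletons only (hydrogens omitted).
butane = build_graph([("C1", "C2"), ("C2", "C3"), ("C3", "C4")])
isobutane = build_graph([("C1", "C2"), ("C2", "C3"), ("C2", "C4")])

print(degree_sequence(butane), wiener_index(butane))
print(degree_sequence(isobutane), wiener_index(isobutane))
```

The two isomers have the same number of nodes and edges but different invariants, which is the kind of numeric parameter that could then serve as input to an AI/ML model.
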
This framework opens the door to the development of new methods, algorithms, databases, and software for the study of complex bio-molecular systems using AI/ML and/or Complex Networks algorithms. Consequently, the topic of this issue is “Complex Networks, Bio-Molecular Systems, and Machine Learning.” Authors are welcome to submit papers using AI/ML algorithms alone, and we also welcome papers using Complex Networks algorithms only. Above all, we especially welcome papers combining both the AI/ML and Complex Networks areas. Accepted papers will be published in the International Journal of Molecular Sciences (IJMS), which is an open access journal published by MDPI (https://0-www-mdpi-com.brum.beds.ac.uk/journal/ijms). Authors can also opt to publish short online communications or posters about their papers in the MOL2NET International Conference Series on Multidisciplinary Sciences, 2020. The conference has multiple workshops/sessions at universities in the USA, Europe, China, India, Brazil, etc. The conference is published on the Sciforum platform, supported by MDPI. MOL2NET 2020 link: https://mol2net-06.sciforum.net/.

Prof. Dr. Humberto González-Díaz
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Complex Networks
  • Bio-molecular systems
  • Protein Interaction Networks (PIN)
  • Metabolic Pathway networks
  • Brain Networks
  • Social, Financial, and Legal Networks
  • Machine Learning
  • Bioinformatics
  • Cheminformatics and Drug Discovery
  • Graph theory
  • Artificial Neural Networks
  • Support Vector Machines
  • Deep Learning

Published Papers (8 papers)

Research

13 pages, 2267 KiB  
Article
Identification of D Modification Sites Using a Random Forest Model Based on Nucleotide Chemical Properties
by Huan Zhu, Chun-Yan Ao, Yi-Jie Ding, Hong-Xia Hao and Liang Yu
Int. J. Mol. Sci. 2022, 23(6), 3044; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms23063044 - 11 Mar 2022
Cited by 4 | Viewed by 1582
Abstract
Dihydrouridine (D) is an abundant post-transcriptional modification present in transfer RNA from eukaryotes, bacteria, and archaea. D has contributed to treatments for cancerous diseases. Therefore, the precise detection of D modification sites can enable further understanding of its functional roles. Traditional experimental techniques to identify D are laborious and time-consuming. In addition, there are few computational tools for such analysis. In this study, we utilized eleven sequence-derived feature extraction methods and implemented five popular machine learning algorithms to identify an optimal model. During data preprocessing, data were partitioned for training and testing. Oversampling was also adopted to reduce the effect of the imbalance between positive and negative samples. The best-performing model was obtained through a combination of random forest and nucleotide chemical property modeling. The optimized model presented high sensitivity and specificity values of 0.9688 and 0.9706 in independent tests, respectively. Our proposed model surpassed published tools in independent tests. Furthermore, a series of validations across several aspects was conducted in order to demonstrate the robustness and reliability of our model.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)
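
The abstract above mentions oversampling to offset class imbalance and reports sensitivity and specificity. The sketch below illustrates both ideas in plain Python. It assumes simple random duplication of minority-class samples, since the abstract does not state which oversampling scheme the authors used, and all function names and data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced.

    Assumption: binary labels (1 = positive, 0 = negative), both present.
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    if len(pos) < len(neg):
        minority, minority_label, deficit = pos, 1, len(neg) - len(pos)
    else:
        minority, minority_label, deficit = neg, 0, len(pos) - len(neg)
    extra = [rng.choice(minority) for _ in range(deficit)]
    return list(samples) + extra, list(labels) + [minority_label] * deficit


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes at least one positive and one negative in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical training set: one positive site, four negative sites.
train_x = ["s1", "s2", "s3", "s4", "s5"]
train_y = [1, 0, 0, 0, 0]
balanced_x, balanced_y = random_oversample(train_x, train_y)

# Hypothetical predictions on an independent test set.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1])
print(len(balanced_x), round(sens, 4), round(spec, 4))
```

Oversampling is applied to the training partition only, so that duplicated samples do not leak into the independent test used for the reported metrics.
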

13 pages, 22342 KiB  
Article
Rapid Discrimination of Neuromyelitis Optica Spectrum Disorder and Multiple Sclerosis Using Machine Learning on Infrared Spectra of Sera
by Youssef El Khoury, Marie Gebelin, Jérôme de Sèze, Christine Patte-Mensah, Gilles Marcou, Alexandre Varnek, Ayikoé-Guy Mensah-Nyagan, Petra Hellwig and Nicolas Collongues
Int. J. Mol. Sci. 2022, 23(5), 2791; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms23052791 - 03 Mar 2022
Cited by 5 | Viewed by 2387
Abstract
Neuromyelitis optica spectrum disorder (NMOSD) and multiple sclerosis (MS) are both autoimmune inflammatory and demyelinating diseases of the central nervous system. NMOSD is a highly disabling disease and rapid introduction of the appropriate treatment at the acute phase is crucial to prevent sequelae. Specific criteria were established in 2015 and provide keys to distinguish NMOSD and MS. One of the most reliable criteria for NMOSD diagnosis is detection in the patient’s serum of an antibody that attacks the water channel aquaporin-4 (AQP-4). Another target in NMOSD is myelin oligodendrocyte glycoprotein (MOG), delineating a new spectrum of diseases called MOG-associated diseases. Lastly, patients with NMOSD can be negative for both AQP-4 and MOG antibodies. At disease onset, NMOSD symptoms are very similar to MS symptoms from a clinical and radiological perspective. Thus, at first episode, given the urgency of starting the anti-inflammatory treatment, there is an unmet need to differentiate NMOSD subtypes from MS. Here, we used Fourier transform infrared spectroscopy in combination with a machine learning algorithm with the aim of distinguishing the infrared signatures of sera of a first episode of NMOSD from those of a first episode of relapsing-remitting MS, as well as from those of healthy subjects and patients with chronic inflammatory demyelinating polyneuropathy. Our results showed that NMOSD patients were distinguished from MS patients and healthy subjects with a sensitivity of 100% and a specificity of 100%. We also discuss the distinction between the different NMOSD serostatuses. The coupling of infrared spectroscopy of sera to machine learning is a promising cost-effective, rapid and reliable differential diagnosis tool capable of helping to gain valuable time in patients’ treatment.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)

19 pages, 3398 KiB  
Article
A Hybrid Machine Learning and Network Analysis Approach Reveals Two Parkinson’s Disease Subtypes from 115 RNA-Seq Post-Mortem Brain Samples
by Andrea Termine, Carlo Fabrizio, Claudia Strafella, Valerio Caputo, Laura Petrosini, Carlo Caltagirone, Raffaella Cascella and Emiliano Giardina
Int. J. Mol. Sci. 2022, 23(5), 2557; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms23052557 - 25 Feb 2022
Cited by 3 | Viewed by 2659
Abstract
Precision medicine emphasizes fine-grained diagnostics, taking individual variability into account to enhance treatment effectiveness. Parkinson’s disease (PD) heterogeneity among individuals proves the existence of disease subtypes, so subgrouping patients is vital for better understanding disease mechanisms and designing precise treatment. The purpose of this study was to identify PD subtypes using RNA-Seq data in a combined pipeline including unsupervised machine learning, bioinformatics, and network analysis. Two hundred and ten post-mortem brain RNA-Seq samples from PD (n = 115) and normal controls (NCs, n = 95) were obtained with systematic data retrieval following PRISMA statements and a fully data-driven clustering pipeline was performed to identify PD subtypes. Bioinformatics and network analyses were performed to characterize the disease mechanisms of the identified PD subtypes and to identify target genes for drug repurposing. Two PD clusters were identified and 42 DEGs were found (p adjusted ≤ 0.01). PD clusters had significantly different gene network structures (p < 0.0001) and phenotype-specific disease mechanisms, highlighting the differential involvement of the Wnt/β-catenin pathway regulating adult neurogenesis. NEUROD1 was identified as a key regulator of gene networks and ISX9 and PD98059 were identified as NEUROD1-interacting compounds with disease-modifying potential, reducing the effects of dopaminergic neurodegeneration. This hybrid data analysis approach could enable precision medicine applications by providing insights for the identification and characterization of pathological subtypes. This workflow has proven useful on PD brain RNA-Seq, but its application to other neurodegenerative diseases is encouraged.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)

20 pages, 5437 KiB  
Article
IFPTML Mapping of Drug Graphs with Protein and Chromosome Structural Networks vs. Pre-Clinical Assay Information for Discovery of Antimalarial Compounds
by Viviana Quevedo-Tumailli, Bernabe Ortega-Tenezaca and Humberto González-Díaz
Int. J. Mol. Sci. 2021, 22(23), 13066; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms222313066 - 02 Dec 2021
Cited by 2 | Viewed by 1668
Abstract
Parasite species of the genus Plasmodium cause malaria, which remains a major global health problem due to parasite resistance to available antimalarial drugs and increasing treatment costs. Consequently, computational prediction of new antimalarial compounds with novel targets in the proteome of Plasmodium sp. is a very important goal for the pharmaceutical industry. We can expect that the success of a pre-clinical assay depends on the conditions of the assay per se, the chemical structure of the drug, the structure of the target protein, as well as on factors governing the expression of this protein in the proteome, such as gene (deoxyribonucleic acid, DNA) sequence and/or chromosome structure. However, there are no reports of computational models that consider all these factors simultaneously. Some of the difficulties for this kind of analysis are the dispersion of data across different datasets, the high heterogeneity of data, etc. In this work, we analyzed three databases, ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information Genome Data Viewer), to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential antimalarial compounds, including numeric descriptors (variables) for the structure of compounds as well as a large amount of information about the conditions of assays. The NCBI-GDV and UniProt datasets include the sequences of genes and proteins and their functions. In addition, we created two partitions (cassayj = caj and cdataj = cdj) of categorical variables from the ChEMBL dataset. These partitions contain variables that encode information about the experimental conditions of preclinical assays (caj) or about the nature and quality of data (cdj). These categorical variables include information about 22 parameters of biological activity (ca0), 28 target proteins (ca1), 9 organisms of assay (ca2), etc. We also created another partition (cprotj = cpj) including categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes (cp0), 10 chromosomes (cp1), gene orientation (cp2), and 31 protein functions (cp3). We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to fuse all this information (from the three databases) and train a predictive model. Shannon’s entropy measure Shk (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes on the same information scale. Perturbation Theory Operators (PTOs) in the form of Moving Average (MA) operators were used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC model showed the best performance, with Sensitivity Sn(%) = 83.6/85.1 and Specificity Sp(%) = 89.8/89.7 for training/validation sets, respectively. This model could become a useful tool for the optimization of preclinical assays of new antimalarial compounds vs. different proteins in the proteome of Plasmodium.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)
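
The two numerical ingredients named in the abstract above, a Shannon entropy measure and a moving-average perturbation operator, can be sketched as follows. This is only an illustration under stated assumptions: the entropy is computed for a generic probability distribution, and the operator is taken to be the deviation of each value from the mean of all cases sharing the same categorical condition. The abstract does not give the exact definitions used in the paper, and the function names and data are hypothetical.

```python
import math
from collections import defaultdict


def shannon_entropy(probabilities):
    """Shannon entropy -sum(p * ln p) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def moving_average_deviations(values, conditions):
    """Deviation of each value from the mean of its condition subset.

    Assumed operator form: delta_i = value_i - mean(values with the same
    categorical condition as case i).
    """
    groups = defaultdict(list)
    for value, condition in zip(values, conditions):
        groups[condition].append(value)
    means = {c: sum(vs) / len(vs) for c, vs in groups.items()}
    return [value - means[condition] for value, condition in zip(values, conditions)]


# Hypothetical entropy values for three compounds tested under two assay conditions.
entropies = [1.0, 3.0, 10.0]
assay_conditions = ["IC50", "IC50", "Ki"]
print(moving_average_deviations(entropies, assay_conditions))
print(shannon_entropy([0.5, 0.5]))
```

Expressing each structural variable as a deviation from its condition-specific average lets cases measured under different assay conditions be compared on one scale before a classifier is trained.
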

27 pages, 31786 KiB  
Article
The Crosstalk between SARS-CoV-2 Infection and the RAA System in Essential Hypertension—Analyses Using Systems Approach
by Dorota Formanowicz, Kaja Gutowska, Bartłomiej Szawulak and Piotr Formanowicz
Int. J. Mol. Sci. 2021, 22(19), 10518; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms221910518 - 29 Sep 2021
Cited by 3 | Viewed by 1841
Abstract
The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), responsible for the coronavirus disease of 2019 (COVID-19) pandemic, has affected and continues to affect millions of people across the world. Patients with essential arterial hypertension and renal complications are at particular risk of a fatal course of this infection. In our study, we have modeled the selected processes in a patient with essential hypertension and chronic kidney disease (CKD) suffering from COVID-19, emphasizing the function of the renin-angiotensin-aldosterone (RAA) system. The model has been built in the language of Petri net theory. Using the systems approach, we have analyzed how COVID-19 may affect the studied organism, and we have checked whether the administration of selected anti-hypertensive drugs (angiotensin-converting enzyme inhibitors (ACEIs) and/or angiotensin receptor blockers (ARBs)) may impact the severity of the infection. In addition, we have assessed whether these drugs effectively lower blood pressure in the case of SARS-CoV-2 infection affecting essential hypertensive patients. Our research has shown that neither the ACEIs nor the ARBs worsen the course of the infection. However, when assessing the treatment of hypertension in the active SARS-CoV-2 infection, we have observed that ARBs might not effectively reduce blood pressure; they may even have a slightly opposite effect. On the other hand, we have confirmed the effectiveness of arterial hypertension treatment in patients receiving ACEIs. Moreover, we have found that the simultaneous use of ARBs and ACEIs averages the effects of taking both drugs, thus leading to only a slight decrease in blood pressure. We are far from suggesting that ARBs are ineffective in all hypertensive patients with COVID-19, but we have shown that research in this area should still be continued.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)

18 pages, 3855 KiB  
Article
Hybrid Deep Learning Based on a Heterogeneous Network Profile for Functional Annotations of Plasmodium falciparum Genes
by Apichat Suratanee and Kitiporn Plaimas
Int. J. Mol. Sci. 2021, 22(18), 10019; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms221810019 - 16 Sep 2021
Cited by 5 | Viewed by 1752
Abstract
Functional annotation of genes of unknown function reveals unidentified functions that can enhance our understanding of complex genome communications. A common approach for inferring gene function involves the ortholog-based method. However, genetic data alone are often not enough to provide information for function annotation. Thus, integrating other sources of data can potentially increase the possibility of retrieving annotations. Network-based methods are efficient techniques for exploring interactions among genes and can be used for functional inference. In this study, we present an analysis framework for inferring the functions of Plasmodium falciparum genes based on connection profiles in a heterogeneous network between human and Plasmodium falciparum proteins. These profiles were fed into a hybrid deep learning algorithm to predict the orthologs of genes of unknown function. The results show high performance of the model’s predictions, with an AUC of 0.89. One hundred and twenty-one predicted pairs with high prediction scores were selected for inferring the functions using statistical enrichment analysis. Using this method, PF3D7_1248700 and PF3D7_0401800 were found to be involved with muscle contraction and striated muscle tissue development, while PF3D7_1303800 and PF3D7_1201000 were found to be related to protein dephosphorylation. In conclusion, combining a heterogeneous network and a hybrid deep learning technique can allow us to identify unknown gene functions of malaria parasites. This approach is general and can be applied to other diseases, thereby advancing the field of biomedical science.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)

21 pages, 8779 KiB  
Article
Probabilistic Critical Controllability Analysis of Protein Interaction Networks Integrating Normal Brain Ageing Gene Expression Profiles
by Eimi Yamaguchi, Tatsuya Akutsu and Jose C. Nacher
Int. J. Mol. Sci. 2021, 22(18), 9891; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms22189891 - 13 Sep 2021
Viewed by 1447
Abstract
Recently, network controllability studies have proposed several frameworks for the control of large complex biological networks using a small number of life molecules. However, age-related changes in the brain have not been investigated from a controllability perspective. In this study, we compiled the gene expression profiles of four normal brain regions from individuals aged 20–99 years and generated dynamic probabilistic protein networks across their lifespan. We developed a new algorithm that efficiently identified critical proteins in probabilistic complex networks, in the context of a minimum dominating set controllability model. The results showed that the identified critical proteins were significantly enriched with well-known ageing genes collected from the GenAge database. In particular, the enrichment observed in replicative and premature senescence biological processes with critical proteins for male samples in the hippocampal region led to the identification of possible new ageing gene candidates.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)
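
To make the minimum dominating set (MDS) idea in the abstract above concrete, here is a small brute-force sketch on a deterministic toy network. It assumes that a node is "critical" when it belongs to every minimum dominating set, and it ignores the edge probabilities that the paper's algorithm handles. The exhaustive search is only practical for very small graphs, and the networks and names are hypothetical.

```python
from itertools import combinations


def is_dominating(adj, subset):
    """True if every node is in `subset` or adjacent to a node in `subset`."""
    covered = set(subset)
    for node in subset:
        covered |= adj[node]
    return covered == set(adj)


def minimum_dominating_sets(adj):
    """All dominating sets of the smallest possible size (exhaustive search)."""
    nodes = sorted(adj)
    for size in range(1, len(nodes) + 1):
        found = [set(c) for c in combinations(nodes, size) if is_dominating(adj, c)]
        if found:
            return found
    return []


def critical_nodes(adj):
    """Nodes present in every minimum dominating set (assumed definition)."""
    sets = minimum_dominating_sets(adj)
    return set.intersection(*sets) if sets else set()


# Hypothetical hub-and-spoke network: P1 interacts with P2, P3, and P4.
star = {"P1": {"P2", "P3", "P4"}, "P2": {"P1"}, "P3": {"P1"}, "P4": {"P1"}}
# Hypothetical linear chain: a - b - c - d.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(critical_nodes(star))
print(critical_nodes(chain))
```

In the star network the hub is the only minimum dominating set and is therefore critical, while the chain has several alternative minimum dominating sets and no node appears in all of them.
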

12 pages, 1994 KiB  
Article
Molecular Topology for the Search of New Anti-MRSA Compounds
by Jose I. Bueso-Bordils, Pedro A. Alemán-López, Rafael Martín-Algarra, Maria J. Duart, Antonio Falcó and Gerardo M. Antón-Fos
Int. J. Mol. Sci. 2021, 22(11), 5823; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms22115823 - 29 May 2021
Cited by 2 | Viewed by 1786
Abstract
The variability of methicillin-resistant Staphylococcus aureus (MRSA), its rapid adaptive response against environmental changes, and its continued acquisition of antibiotic resistance determinants have made it commonplace in hospitals, where it causes the problem of multidrug resistance. In this study, we used molecular topology to develop several discriminant equations capable of classifying compounds according to their anti-MRSA activity. Topological indices were used as structural descriptors and their relationship with anti-MRSA activity was determined by applying linear discriminant analysis (LDA) on a group of quinolones and quinolone-like compounds. Four extra equations were constructed, named DFMRSA1, DFMRSA2, DFMRSA3 and DFMRSA4 (DFMRSA was built in a previous study), all with good statistical parameters, such as Fisher–Snedecor F (>68 in all cases), Wilks’ lambda (<0.13 in all cases), and percentage of correct classification (>94% in all cases), which allows a reliable extrapolated prediction of antibacterial activity for any organic compound. The results obtained clearly reveal the high efficiency of combining molecular topology with LDA for the prediction of anti-MRSA activity.
(This article belongs to the Special Issue Complex Networks, Bio-Molecular Systems, and Machine Learning)