Inference of Gene Regulatory Networks Using Randomized Algorithms

A special issue of Computation (ISSN 2079-3197). This special issue belongs to the section "Computational Biology".

Deadline for manuscript submissions: closed (28 February 2021)

Special Issue Editors

Dr. Michael Banf
EducatedGuess.ai, 57000 Siegen, Germany
Interests: randomized algorithms; machine learning; computational biology

Dr. Thomas Hartwig
Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829 Cologne, Germany
Interests: transcriptional regulation; transcriptome analysis; bioinformatics; plant hormones; plant genomics; genetic engineering; maize (epi)genetics; drought stress

Special Issue Information

Dear Colleagues,

Gene regulation describes the complex mechanism of inducing or repressing the expression of a gene. This highly dynamic process allows an organism to respond to a variety of environmental stimuli, as well as to control cell growth and differentiation during development. Gene regulation is orchestrated by a vast number of molecules, including transcription factors and cofactors, chromatin regulators, as well as other epigenetic mechanisms, and it has been shown that transcriptional misregulation, e.g., caused by mutations in regulatory sequences, is responsible for a plethora of diseases, including cancer, developmental or neurological disorders.

As a consequence, decoding the architecture of gene regulatory networks has become one of the most important tasks in modern (computational) biology. At the same time, next-generation sequencing has revolutionized transcriptome analysis, providing large-scale quantification of gene expression at single-cell resolution, as well as the identification of novel genes and noncoding RNAs at unprecedented levels.

Thus, to advance our understanding of the mechanisms involved in the transcriptional apparatus, we need scalable approaches that can deal with the increasing number of large-scale, high-resolution biological datasets. In particular, such approaches should be capable of efficiently integrating and exploiting the biological heterogeneity of these datasets (different data types, experimental treatments, developmental stages, cell types, and even organisms) in order to best infer the underlying regulatory networks, often in the absence of sufficient ground truth data for model training. With respect to scalability, randomized approaches have proven to be a promising alternative to deterministic methods in computational biology (and beyond). To give just two examples, one of the top performers in benchmarks on gene regulatory network inference from gene expression data is based on random forest regression, and randomized dimensionality reduction techniques, such as randomized PCA, have been successfully employed to efficiently analyze genome-wide single-nucleotide polymorphism datasets.

This Special Issue aims to promote randomized methods that will help to improve scalable gene regulatory network inference from a plethora of heterogeneous datasets. Papers are solicited on all areas directly related to these topics, including but not limited to:

  • Randomized algorithms for scalable gene regulatory network inference;
  • Module detection and clustering in gene regulatory networks;
  • Cis-regulatory variant detection;
  • (Large-scale) heterogeneous biological data integration;
  • Randomized low-rank matrix approximation and dimensionality reduction;
  • Feature extraction and latent factor identification;
  • (Deep) randomized neural networks and reservoir computing;
  • Sampling and Markov chain Monte Carlo methods.

Dr. Michael Banf
Dr. Thomas Hartwig
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Computation is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Randomized algorithms
  • Gene regulatory network inference
  • Heterogeneous biological data integration
  • Feature extraction
  • Markov chain Monte Carlo
  • Randomized neural networks

Published Papers (2 papers)

Editorial

28 pages, 1461 KiB  
Editorial
The Reasonable Effectiveness of Randomness in Scalable and Integrative Gene Regulatory Network Inference and Beyond
by Michael Banf and Thomas Hartwig
Computation 2021, 9(12), 146; https://doi.org/10.3390/computation9120146 - 20 Dec 2021
Abstract
Gene regulation is orchestrated by a vast number of molecules, including transcription factors and co-factors, chromatin regulators, as well as epigenetic mechanisms, and it has been shown that transcriptional misregulation, e.g., caused by mutations in regulatory sequences, is responsible for a plethora of diseases, including cancer, developmental or neurological disorders. As a consequence, decoding the architecture of gene regulatory networks has become one of the most important tasks in modern (computational) biology. However, to advance our understanding of the mechanisms involved in the transcriptional apparatus, we need scalable approaches that can deal with the increasing number of large-scale, high-resolution, biological datasets. In particular, such approaches need to be capable of efficiently integrating and exploiting the biological and technological heterogeneity of such datasets in order to best infer the underlying, highly dynamic regulatory networks, often in the absence of sufficient ground truth data for model training or testing. With respect to scalability, randomized approaches have proven to be a promising alternative to deterministic methods in computational biology. As an example, one of the top performing algorithms in a community challenge on gene regulatory network inference from transcriptomic data is based on a random forest regression model. In this concise survey, we aim to highlight how randomized methods may serve as a highly valuable tool, in particular, with increasing amounts of large-scale, biological experiments and datasets being collected. Given the complexity and interdisciplinary nature of the gene regulatory network inference problem, we hope our survey may be helpful to both computational and biological scientists.
It is our aim to provide a starting point for a dialogue about the concepts, benefits, and caveats of the toolbox of randomized methods, since unravelling the intricate web of highly dynamic, regulatory events will be one fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases.
(This article belongs to the Special Issue Inference of Gene Regulatory Networks Using Randomized Algorithms)

Research

12 pages, 2302 KiB  
Article
XGRN: Reconstruction of Biological Networks Based on Boosted Trees Regression
by Georgios N. Dimitrakopoulos
Computation 2021, 9(4), 48; https://doi.org/10.3390/computation9040048 - 20 Apr 2021
Abstract
In Systems Biology, the complex relationships between different entities in the cells are modeled and analyzed using networks. Towards this aim, a rich variety of gene regulatory network (GRN) inference algorithms has been developed in recent years. However, most algorithms rely solely on gene expression data to reconstruct the network. Due to possible expression profile similarity, predictions can contain connections between biologically unrelated genes. Therefore, previously known biological information should also be considered by computational methods to obtain more consistent results, such as experimentally validated interactions between transcription factors and target genes. In this work, we propose XGBoost for gene regulatory networks (XGRN), a supervised algorithm, which combines gene expression data with previously known interactions for GRN inference. The key idea of our method is to train a regression model for each known interaction of the network and then utilize this model to predict new interactions. The regression is performed by XGBoost, a state-of-the-art algorithm using an ensemble of decision trees. In detail, XGRN learns a regression model based on gene expression of the two interactors and then provides predictions using as input the gene expression of other candidate interactors. Application on benchmark datasets and a real large single-cell RNA-Seq experiment resulted in high performance compared to other unsupervised and supervised methods, demonstrating the ability of XGRN to provide reliable predictions.
(This article belongs to the Special Issue Inference of Gene Regulatory Networks Using Randomized Algorithms)