Bioinformatic Analysis for Rare Diseases

A special issue of Genes (ISSN 2073-4425). This special issue belongs to the section "Technologies and Resources for Genetics".

Deadline for manuscript submissions: closed (15 June 2019) | Viewed by 50071

Special Issue Editors

Department of Population Health Sciences, Augusta University, Augusta, GA 30912-4900, USA
Interests: statistical genetics; genomics; bioinformatics; population genetics; data science
1. The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
2. Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Interests: human genomics; disease genomics; bioinformatics; computational biology; machine learning; next-generation sequencing; mutation prediction

Special Issue Information

Dear colleagues,

Rare diseases, especially Mendelian and monogenic ones, have played a critical role in elucidating the genetic basis of human diseases. The development of modern techniques such as next-generation sequencing (NGS), combined with increases in computing power and advances in bioinformatic software, has enabled broader-scale research into rare diseases to better understand their biological mechanisms, their genomic and proteomic basis, environmental influences, and the combination of these contributing factors. As a result, especially in light of incomplete clinical penetrance, some rare diseases are increasingly viewed as complex diseases. Large volumes of biological data at various levels have accumulated exponentially over the last decade, including NGS (whole-genome/exome sequencing, RNA-seq, DNA methylation), proteomics, and metabolomics data. These large and complex datasets pose challenges for bioinformatic analyses, especially in interpreting and understanding the biological mechanisms underlying diseases.

In this Special Issue, we will focus on recent developments in bioinformatic analysis approaches, computational tools, algorithms, software, and resources for rare diseases. We encourage submissions covering a broad range of bioinformatic approaches applied to rare diseases: from novel methods to databases, servers, pipelines, integration of several data types, modeling and systems biology, and biological discoveries made by applying these tools. We welcome reviews, research articles, short communications, and concept papers.

Dr. Yuval Itan
Prof. Hongyan Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Genes is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

We are happy to offer a 15% discount from our APC to all planned contributions. Please contact and inform [email protected] in advance for this purpose. 

Keywords

  • rare diseases
  • genomic and proteomic basis
  • whole genome sequencing
  • whole exome sequencing
  • bioinformatics
  • pipelines, software, and resources
  • haplotype analysis
  • linkage disequilibrium
  • single nucleotide polymorphism (SNP)
  • variant discovery
  • genome-wide association studies (GWAS)
  • post-GWAS
  • human rare variants
  • gene function discovery

Published Papers (8 papers)

Research

25 pages, 1207 KiB  
Article
An Improved Phenotype-Driven Tool for Rare Mendelian Variant Prioritization: Benchmarking Exomiser on Real Patient Whole-Exome Data
by Valentina Cipriani, Nikolas Pontikos, Gavin Arno, Panagiotis I. Sergouniotis, Eva Lenassi, Penpitcha Thawong, Daniel Danis, Michel Michaelides, Andrew R. Webster, Anthony T. Moore, Peter N. Robinson, Julius O.B. Jacobsen and Damian Smedley
Genes 2020, 11(4), 460; https://0-doi-org.brum.beds.ac.uk/10.3390/genes11040460 - 23 Apr 2020
Cited by 35 | Viewed by 8235
Abstract
Next-generation sequencing has revolutionized rare disease diagnostics, but many patients remain without a molecular diagnosis, particularly because many candidate variants usually survive despite strict filtering. Exomiser was launched in 2014 as a Java tool that performs an integrative analysis of patients’ sequencing data and their phenotypes encoded with Human Phenotype Ontology (HPO) terms. It prioritizes variants by leveraging information on variant frequency, predicted pathogenicity, and gene-phenotype associations derived from human diseases, model organisms, and protein–protein interactions. Early published releases of Exomiser were able to prioritize disease-causative variants as top candidates in up to 97% of simulated whole-exomes. The real patient datasets tested and published so far are very limited in size. Here, we present the latest Exomiser version 12.0.1 with many new features. We assessed the performance using a set of 134 whole-exomes from patients with a range of rare retinal diseases and known molecular diagnosis. Using default settings, Exomiser ranked the correctly diagnosed variants as the top candidate in 74% of the dataset and within the top 5 in 94%; not using the patients’ HPO profiles (i.e., variant-only analysis) decreased the performance to 3% and 27%, respectively. In conclusion, Exomiser is an effective support tool for rare Mendelian phenotype-driven variant prioritization.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)
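
As an illustration of how "top candidate" and "top 5" percentages of the kind reported in the abstract above can be computed, here is a minimal Python sketch. It is not Exomiser code and does not reproduce the article's data: the function name `top_k_rate`, the case identifiers, and the ranks are all hypothetical, and the sketch assumes each case has at most one known causal variant whose rank is already available.

```python
from typing import Mapping, Optional


def top_k_rate(ranks: Mapping[str, Optional[int]], k: int) -> float:
    """Return the fraction of cases whose known causal variant ranks within the top k.

    `ranks` maps a case identifier to the 1-based rank of the causal variant,
    or to None when the variant was filtered out and never ranked.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ranks:
        raise ValueError("no cases supplied")
    hits = sum(1 for rank in ranks.values() if rank is not None and rank <= k)
    return hits / len(ranks)


# Hypothetical ranks for five made-up cases (not data from the article).
example_ranks = {
    "case_A": 1,
    "case_B": 3,
    "case_C": 1,
    "case_D": None,  # causal variant removed by filtering
    "case_E": 7,
}

top1 = top_k_rate(example_ranks, 1)  # 2 of 5 cases
top5 = top_k_rate(example_ranks, 5)  # 3 of 5 cases
print(f"top 1: {top1:.0%}, top 5: {top5:.0%}")
```

Counting unranked cases as misses, rather than dropping them, keeps the denominator equal to the full cohort, which is one reasonable reading of percentages reported "of the dataset".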

5 pages, 608 KiB  
Communication
TGStools: A Bioinformatics Suit to Facilitate Transcriptome Analysis of Long Reads from Third Generation Sequencing Platform
by Danze Chen, Qianqian Zhao, Leiming Jiang, Shuaiyuan Liao, Zhigang Meng and Jianzhen Xu
Genes 2019, 10(7), 519; https://0-doi-org.brum.beds.ac.uk/10.3390/genes10070519 - 10 Jul 2019
Cited by 1 | Viewed by 2722
Abstract
Recent analyses show that transcriptome sequencing can be used as a diagnostic tool for rare Mendelian diseases. Third-generation sequencing detects long reads of thousands of base pairs de novo, greatly expanding isoform discovery and the identification of novel long noncoding RNAs. In this study, we developed TGStools, a bioinformatics suite that facilitates routine tasks in transcriptome analysis, such as characterizing full-length transcripts, detecting shifted types of alternative splicing, and identifying long noncoding RNAs (lncRNAs). It also prioritizes transcripts with a visualization framework that automatically integrates rich annotation with known genomic features. TGStools is a Python package freely available on GitHub.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)

16 pages, 1603 KiB  
Article
Computational Methods for Detection of Differentially Methylated Regions Using Kernel Distance and Scan Statistics
by Faith Dunbar, Hongyan Xu, Duchwan Ryu, Santu Ghosh, Huidong Shi and Varghese George
Genes 2019, 10(4), 298; https://0-doi-org.brum.beds.ac.uk/10.3390/genes10040298 - 12 Apr 2019
Cited by 2 | Viewed by 2548
Abstract
Motivation: Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases. Results: We propose two different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations among CpG sites in the DMRs. The first approach is a nonparametric method using a kernel distance statistic and the second one is a likelihood-based method using a binomial spatial scan statistic. The kernel distance method uses the kernel function, while the binomial scan statistic approach uses a mixed-effects model to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)

17 pages, 4489 KiB  
Article
Inferring Drug-Protein–Side Effect Relationships from Biomedical Text
by Min Song, Seung Han Baek, Go Eun Heo and Jeong-Hoon Lee
Genes 2019, 10(2), 159; https://0-doi-org.brum.beds.ac.uk/10.3390/genes10020159 - 19 Feb 2019
Cited by 7 | Viewed by 3125
Abstract
Background: Although there are many studies of drugs and their side effects, the underlying mechanisms of these side effects are not well understood. It is also difficult to understand the specific pathways between drugs and side effects. Objective: The present study seeks to construct putative paths between drugs and their side effects by applying text-mining techniques to free text of biomedical studies, and to develop ranking metrics that could identify the most-likely paths. Materials and Methods: We extracted three types of relationships (drug-protein, protein-protein, and protein–side effect) from biomedical texts by using text mining and predefined relation-extraction rules. Based on the extracted relationships, we constructed whole drug-protein–side effect paths. For each path, we calculated its ranking score by a new ranking function that combines corpus- and ontology-based semantic similarity as well as co-occurrence frequency. Results: We extracted 13 plausible biomedical paths connecting drugs and their side effects from cancer-related abstracts in the PubMed database. The top 20 paths were examined, and the proposed ranking function outperformed the other methods tested, including co-occurrence, COALS, and UMLS, as measured by P@5 to P@20. In addition, we confirmed that the paths are novel hypotheses that are worth investigating further. Discussion: The risk of side effects has been an important issue for the US Food and Drug Administration (FDA). However, the causes and mechanisms of such side effects have not been fully elucidated. This study extends previous research on understanding drug side effects by using various techniques such as Named Entity Recognition (NER), Relation Extraction (RE), and semantic similarity. Conclusion: It is not easy to reveal the biomedical mechanisms of side effects due to a huge number of possible paths. However, we automatically generated predictable paths using the proposed approach, which could provide meaningful information to biomedical researchers to generate plausible hypotheses for the understanding of such mechanisms.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)
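
The abstract above compares ranking functions by precision at k (P@k). As a reminder of what that metric measures, here is a minimal Python sketch under stated assumptions: the function name `precision_at_k`, the path labels, and the relevance judgments are hypothetical and are not taken from the article, and the sketch uses the standard definition in which the number of relevant items among the first k is divided by k.

```python
from typing import Collection, Sequence


def precision_at_k(ranked: Sequence[str], relevant: Collection[str], k: int) -> float:
    """Return P@k: the share of the first k ranked items judged relevant.

    Uses the standard definition (hits in the top k divided by k), so a
    ranking shorter than k is penalized for the missing positions.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


# Hypothetical ranked paths and relevance judgments (not data from the article).
ranked_paths = ["path_1", "path_2", "path_3", "path_4", "path_5", "path_6"]
relevant_paths = {"path_1", "path_3", "path_6"}

p_at_5 = precision_at_k(ranked_paths, relevant_paths, 5)  # 2 relevant in top 5
print(f"P@5 = {p_at_5:.2f}")
```

Two ranking functions can then be compared by computing P@k for the same set of relevance judgments at several cut-offs, for example k = 5, 10, 15, and 20.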

Review

19 pages, 1767 KiB  
Review
Genetic Modifiers and Rare Mendelian Disease
by K. M. Tahsin Hassan Rahit and Maja Tarailo-Graovac
Genes 2020, 11(3), 239; https://0-doi-org.brum.beds.ac.uk/10.3390/genes11030239 - 25 Feb 2020
Cited by 85 | Viewed by 10727
Abstract
Despite advances in high-throughput sequencing that have revolutionized the discovery of gene defects in rare Mendelian diseases, there are still gaps in translating individual genome variation to observed phenotypic outcomes. While we continue to improve genomics approaches to identify primary disease-causing variants, it is evident that no genetic variant acts alone. In other words, some other variants in the genome (genetic modifiers) may alleviate (suppress) or exacerbate (enhance) the severity of the disease, resulting in the variability of phenotypic outcomes. Thus, to truly understand the disease, we need to consider how the disease-causing variants interact with the rest of the genome in an individual. Here, we review the current state-of-the-field in the identification of genetic modifiers in rare Mendelian diseases and discuss the potential for future approaches that could bridge the existing gap.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)

24 pages, 375 KiB  
Review
Artificial Intelligence (AI) in Rare Diseases: Is the Future Brighter?
by Sandra Brasil, Carlota Pascoal, Rita Francisco, Vanessa dos Reis Ferreira, Paula A. Videira and Gonçalo Valadão
Genes 2019, 10(12), 978; https://0-doi-org.brum.beds.ac.uk/10.3390/genes10120978 - 27 Nov 2019
Cited by 62 | Viewed by 9674
Abstract
The amount of data collected and managed in (bio)medicine is ever-increasing. Thus, there is a need to rapidly and efficiently collect, analyze, and characterize all this information. Artificial intelligence (AI), with an emphasis on deep learning, holds great promise in this area and is already being successfully applied to basic research, diagnosis, drug discovery, and clinical trials. Rare diseases (RDs), which are severely underrepresented in basic and clinical research, can particularly benefit from AI technologies. Of the more than 7000 RDs described worldwide, only 5% have a treatment. The ability of AI technologies to integrate and analyze data from different sources (e.g., multi-omics, patient registries, and so on) can be used to overcome RDs’ challenges (e.g., low diagnostic rates, reduced number of patients, geographical dispersion, and so on). Ultimately, RDs’ AI-mediated knowledge could significantly boost therapy development. Presently, there are AI approaches being used in RDs and this review aims to collect and summarize these advances. A section dedicated to congenital disorders of glycosylation (CDG), a particular group of orphan RDs that can serve as a potential study model for other common diseases and RDs, has also been included.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)

15 pages, 662 KiB  
Review
Biological Network Approaches and Applications in Rare Disease Studies
by Peng Zhang and Yuval Itan
Genes 2019, 10(10), 797; https://0-doi-org.brum.beds.ac.uk/10.3390/genes10100797 - 12 Oct 2019
Cited by 23 | Viewed by 4468
Abstract
Network biology has the capability to integrate, represent, interpret, and model complex biological systems by collectively accommodating biological omics data, biological interactions and associations, graph theory, statistical measures, and visualizations. Biological networks have recently been shown to be very useful for studies that decipher biological mechanisms and disease etiologies and for studies that predict therapeutic responses, at both the molecular and system levels. In this review, we briefly summarize the general framework of biological network studies, including data resources, network construction methods, statistical measures, network topological properties, and visualization tools. We also introduce several recent biological network applications and methods for the studies of rare diseases.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)

18 pages, 1475 KiB  
Review
Uncovering Missing Heritability in Rare Diseases
by Tatiana Maroilley and Maja Tarailo-Graovac
Genes 2019, 10(4), 275; https://0-doi-org.brum.beds.ac.uk/10.3390/genes10040275 - 04 Apr 2019
Cited by 35 | Viewed by 6449
Abstract
The problem of ‘missing heritability’ affects both common and rare diseases, hindering discovery, diagnosis, and patient care. The ‘missing heritability’ concept has been mainly associated with common and complex diseases where promising modern technological advances, like genome-wide association studies (GWAS), were unable to uncover the complete genetic mechanism of the disease/trait. Although rare diseases (RDs) have low prevalence individually, collectively they are common. Furthermore, multi-level genetic and phenotypic complexity, when combined with the individual rarity of these conditions, poses an important challenge in the quest to identify causative genetic changes in RD patients. In recent years, high-throughput sequencing has accelerated discovery and diagnosis in RDs. However, despite the several-fold increase (from ~10% using traditional to ~40% using genome-wide genetic testing) in finding genetic causes of these diseases in RD patients, the majority of RDs, as is the case in common diseases, are also facing the ‘missing heritability’ problem. This review outlines the key role of high-throughput sequencing in uncovering genetics behind RDs, with a particular focus on genome sequencing. We review current advances and challenges of sequencing technologies, bioinformatics approaches, and resources.
(This article belongs to the Special Issue Bioinformatic Analysis for Rare Diseases)
