Bioinformatics and Computational Approaches in Viral Genomics and Evolution

A special issue of Viruses (ISSN 1999-4915).

Deadline for manuscript submissions: closed (31 August 2020) | Viewed by 47727

Special Issue Editors


Guest Editor
The Westmead Institute for Medical Research and the University of Sydney, Sydney, Australia
Interests: virology; virus evolution; RNA viruses; phylogenetics; bioinformatics; pathogen discovery; metagenomics; RNA-seq

Guest Editor
The University of Melbourne and the Peter Doherty Institute for Infection and Immunity, Parkville, Australia
Interests: phylodynamics; phylogenetics; infectious disease epidemiology; molecular evolution

Guest Editor
Centre for Infection and Immunity Studies, School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
Interests: virus evolution; metagenomics; meta-transcriptomics; macroevolution; pathogen discovery; virology

Special Issue Information

Dear Colleagues,

In recent years, bioinformatics and computational methods have become a core component of virus research, underpinning the fields of virus genomics and evolution. The scale and complexity of analyses have grown along with major advances in genetic sequencing and, consequently, have required a paradigm shift with regard to the statistical and computational requirements of viral genomic analysis. For example, in phylogenetics, new methods have been developed to estimate trees with large numbers of taxa (>10,000 sequences) and to infer complex transmission dynamics using machine learning techniques. Similarly, integrative phylodynamic methods can combine key phenotypic “traits” such as sampling location, case counts, and time with virus genetic data to obtain new insights into the epidemic spread of important pathogens such as influenza virus, Zika virus, and HIV. From a macro-evolution perspective, the use of genomic- and metagenomic-based approaches has expanded our knowledge of the diversity and evolutionary history of the entire virosphere, providing new insight into many old questions such as virus origin, genome evolution, evolution time scales, and virus–host interactions.

The purpose of this Special Issue is to bring together a series of articles (both reviews and original research) related to the development and application of novel sequencing and analytical approaches to better understand the discovery, transmission, evolution, and molecular epidemiology of viruses across all hosts.

Dr. John-Sebastian Eden
Dr. Sebastián Duchêne
Prof. Dr. Mang Shi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Viruses is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • bioinformatics
  • computational biology
  • phylogenetics
  • virus evolution
  • virus genomics
  • metagenomics
  • macroevolution
  • virus–host interactions

Published Papers (14 papers)


Research

11 pages, 8345 KiB  
Article
An Amplicon-Based Approach for the Whole-Genome Sequencing of Human Metapneumovirus
by Rachel L. Tulloch, Jen Kok, Ian Carter, Dominic E. Dwyer and John-Sebastian Eden
Viruses 2021, 13(3), 499; https://0-doi-org.brum.beds.ac.uk/10.3390/v13030499 - 18 Mar 2021
Cited by 9 | Viewed by 2508
Abstract
Human metapneumovirus (HMPV) is an important cause of upper and lower respiratory tract disease in individuals of all ages. It is estimated that most individuals will be infected by HMPV by the age of five years old. Despite this burden of disease, there remain caveats in our knowledge of global genetic diversity due to a lack of HMPV sequencing, particularly at the whole-genome scale. The purpose of this study was to create a simple and robust approach for HMPV whole-genome sequencing to be used for genomic epidemiological studies. To design our assay, all available HMPV full-length genome sequences were downloaded from the National Center for Biotechnology Information (NCBI) GenBank database and used to design four primer sets to amplify long, overlapping amplicons spanning the viral genome and, importantly, specific to all known HMPV subtypes. These amplicons were then pooled and sequenced on an Illumina iSeq 100 (Illumina, San Diego, CA, USA); however, the approach is suitable to other common sequencing platforms. We demonstrate the utility of this method using a representative subset of clinical samples and examine these sequences using a phylogenetic approach. Here we present an amplicon-based method for the whole-genome sequencing of HMPV from clinical extracts that can be used to better inform genomic studies of HMPV epidemiology and evolution. Full article

18 pages, 1403 KiB  
Article
Characterizing and Evaluating the Zoonotic Potential of Novel Viruses Discovered in Vampire Bats
by Laura M. Bergner, Nardus Mollentze, Richard J. Orton, Carlos Tello, Alice Broos, Roman Biek and Daniel G. Streicker
Viruses 2021, 13(2), 252; https://0-doi-org.brum.beds.ac.uk/10.3390/v13020252 - 06 Feb 2021
Cited by 44 | Viewed by 6134
Abstract
The contemporary surge in metagenomic sequencing has transformed knowledge of viral diversity in wildlife. However, evaluating which newly discovered viruses pose sufficient risk of infecting humans to merit detailed laboratory characterization and surveillance remains largely speculative. Machine learning algorithms have been developed to address this imbalance by ranking the relative likelihood of human infection based on viral genome sequences, but are not yet routinely applied to viruses at the time of their discovery. Here, we characterized viral genomes detected through metagenomic sequencing of feces and saliva from common vampire bats (Desmodus rotundus) and used these data as a case study in evaluating zoonotic potential using molecular sequencing data. Of 58 detected viral families, including 17 which infect mammals, the only known zoonosis detected was rabies virus; however, additional genomes were detected from the families Hepeviridae, Coronaviridae, Reoviridae, Astroviridae and Picornaviridae, all of which contain human-infecting species. In phylogenetic analyses, novel vampire bat viruses most frequently grouped with other bat viruses that are not currently known to infect humans. In agreement, machine learning models built from only phylogenetic information ranked all novel viruses similarly, yielding little insight into zoonotic potential. In contrast, genome composition-based machine learning models estimated different levels of zoonotic potential, even for closely related viruses, categorizing one out of four detected hepeviruses and two out of three picornaviruses as having high priority for further research. We highlight the value of evaluating zoonotic potential beyond ad hoc consideration of phylogeny and provide surveillance recommendations for novel viruses in a wildlife host which has frequent contact with humans and domestic animals. Full article

12 pages, 3693 KiB  
Article
The Impacts of Low Diversity Sequence Data on Phylodynamic Inference during an Emerging Epidemic
by Anthony Lam and Sebastian Duchene
Viruses 2021, 13(1), 79; https://0-doi-org.brum.beds.ac.uk/10.3390/v13010079 - 08 Jan 2021
Cited by 3 | Viewed by 2360
Abstract
Phylodynamic inference is a pivotal tool in understanding transmission dynamics of viral outbreaks. These analyses are strongly guided by the input of an epidemiological model as well as sequence data that must contain sufficient intersequence variability in order to be informative. These criteria, however, may not be met during the early stages of an outbreak. Here we investigate the impact of low diversity sequence data on phylodynamic inference using the birth–death and coalescent exponential models. Through our simulation study, estimating the molecular evolutionary rate required enough sequence diversity and is an essential first step for any phylodynamic inference. Following this, the birth–death model outperforms the coalescent exponential model in estimating epidemiological parameters, when faced with low diversity sequence data due to explicitly exploiting the sampling times. In contrast, the coalescent model requires additional samples and therefore variability in sequence data before accurate estimates can be obtained. These findings were also supported through our empirical data analyses of an Australian and a New Zealand cluster outbreaks of SARS-CoV-2. Overall, the birth–death model is more robust when applied to datasets with low sequence diversity given sampling is specified and this should be considered for future viral outbreak investigations. Full article

19 pages, 2261 KiB  
Article
Genomic Diversity and Evolution of Quasispecies in Newcastle Disease Virus Infections
by Archana Jadhav, Lele Zhao, Weiwei Liu, Chan Ding, Venugopal Nair, Sebastian E. Ramos-Onsins and Luca Ferretti
Viruses 2020, 12(11), 1305; https://0-doi-org.brum.beds.ac.uk/10.3390/v12111305 - 14 Nov 2020
Cited by 7 | Viewed by 3002
Abstract
Newcastle disease virus (NDV) infections are well known to harbour quasispecies, due to the error-prone nature of the RNA polymerase. Quasispecies variants in the fusion cleavage site of the virus are known to significantly change its virulence. However, little is known about the genomic patterns of diversity and selection in NDV viral swarms. We analyse deep sequencing data from in vitro and in vivo NDV infections to uncover the genomic patterns of diversity and the signatures of selection within NDV swarms. Variants in viruses from in vitro samples are mostly localised in non-coding regions and 3′ and 5′ untranslated regions (3′UTRs or 5′UTRs), while in vivo samples contain an order of magnitude more variants. We find different patterns of genomic divergence and diversity among NDV genotypes, as well as differences in the genomic distribution of intra-host variants among in vitro and in vivo infections of the same strain. The frequency spectrum shows clear signatures of intra-host purifying selection in vivo on the matrix protein (M) coding gene and positive or diversifying selection on nucleocapsid (NP) and haemagglutinin-neuraminidase (HN). The comparison between within-host polymorphisms and phylogenetic divergence reveals complex patterns of selective pressure on the NDV genome at between- and within-host level. The M sequence is strongly constrained both between and within hosts, fusion protein (F) coding gene is under intra-host positive selection, and NP and HN show contrasting patterns: HN RNA sequence is positively selected between hosts while its protein sequence is positively selected within hosts, and NP is under intra-host positive selection at the RNA level and negative selection at the protein level. Full article

10 pages, 1924 KiB  
Article
Patterns of RNA Editing in Newcastle Disease Virus Infections
by Archana Jadhav, Lele Zhao, Alice Ledda, Weiwei Liu, Chan Ding, Venugopal Nair and Luca Ferretti
Viruses 2020, 12(11), 1249; https://0-doi-org.brum.beds.ac.uk/10.3390/v12111249 - 02 Nov 2020
Cited by 11 | Viewed by 2434
Abstract
The expression of accessory non-structural proteins V and W in Newcastle disease virus (NDV) infections depends on RNA editing. These proteins are derived from frameshifts of the sequence coding for the P protein via co-transcriptional insertion of one or two guanines in the mRNA. However, a larger number of guanines can be inserted with lower frequencies. We analysed data from deep RNA sequencing of samples from in vitro and in vivo NDV infections to uncover the patterns of mRNA editing in NDV. The distribution of insertions is well described by a simple Markov model of polymerase stuttering, providing strong quantitative confirmation of the molecular process hypothesised by Kolakofsky and collaborators three decades ago. Our results suggest that the probability that the NDV polymerase would stutter is about 0.45 initially, and 0.3 for further subsequent insertions. The latter probability is approximately independent of the number of previous insertions, the host cell, and viral strain. However, in LaSota infections, we also observe deviations from the predicted V/W ratio of about 3:1 according to this model, which could be attributed to deviations from this stuttering model or to further mechanisms downregulating the abundance of W protein. Full article
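The stuttering probabilities quoted in this abstract can be turned into a small worked example. The sketch below is one reading of the abstract, not the authors' code: it assumes the polymerase inserts a first guanine with probability 0.45 and each further guanine with probability 0.3, and that the number of inserted guanines modulo 3 determines the reading frame (0 for P, 1 for V, 2 for W). The function names and the truncation at 12 insertions are illustrative choices.

```python
def insertion_distribution(p_first=0.45, p_next=0.3, max_insertions=12):
    """Return P(k inserted guanines) for k = 0..max_insertions.

    Assumed model: the polymerase stutters once with probability p_first,
    then keeps stuttering with probability p_next after every insertion.
    """
    probs = [1.0 - p_first]  # no insertion at all
    for k in range(1, max_insertions + 1):
        # first stutter, (k - 1) further stutters, then a stop
        probs.append(p_first * p_next ** (k - 1) * (1.0 - p_next))
    return probs


def frame_fractions(probs):
    """Sum insertion probabilities by reading frame (assumed k mod 3 mapping)."""
    fractions = {"P": 0.0, "V": 0.0, "W": 0.0}
    for k, prob in enumerate(probs):
        fractions["PVW"[k % 3]] += prob
    return fractions


probs = insertion_distribution()
fractions = frame_fractions(probs)
v_to_w = fractions["V"] / fractions["W"]
print(f"P={fractions['P']:.3f} V={fractions['V']:.3f} W={fractions['W']:.3f}")
print(f"V/W ratio = {v_to_w:.2f}")
```

Under these assumptions the V/W ratio equals 1/0.3, which is in line with the "about 3:1" prediction mentioned in the abstract; the deviations reported for LaSota infections would show up as a departure from this value.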

20 pages, 2545 KiB  
Article
A Systematic Evaluation of High-Throughput Sequencing Approaches to Identify Low-Frequency Single Nucleotide Variants in Viral Populations
by David J. King, Graham Freimanis, Lidia Lasecka-Dykes, Amin Asfor, Paolo Ribeca, Ryan Waters, Donald P. King and Emma Laing
Viruses 2020, 12(10), 1187; https://0-doi-org.brum.beds.ac.uk/10.3390/v12101187 - 20 Oct 2020
Cited by 9 | Viewed by 3361
Abstract
High-throughput sequencing technologies such as those provided by Illumina are an efficient way to understand sequence variation within viral populations. However, challenges exist in distinguishing process-introduced error from biological variance, which significantly impacts our ability to identify sub-consensus single-nucleotide variants (SNVs). Here we have taken a systematic approach to evaluate laboratory and bioinformatic pipelines to accurately identify low-frequency SNVs in viral populations. Artificial DNA and RNA “populations” were created by introducing known SNVs at predetermined frequencies into template nucleic acid before being sequenced on an Illumina MiSeq platform. These were used to assess the effects of abundance and starting input material type, technical replicates, read length and quality, short-read aligner, and percentage frequency thresholds on the ability to accurately call variants. Analyses revealed that the abundance and type of input nucleic acid had the greatest impact on the accuracy of SNV calling as measured by a micro-averaged Matthews correlation coefficient score, with DNA and high RNA inputs (10⁷ copies) allowing for variants to be called at a 0.2% frequency. Reduced input RNA (10⁵ copies) required more technical replicates to maintain accuracy, while low RNA inputs (10³ copies) suffered from consensus-level errors. Base errors at specific motifs that recurred in all technical replicates were also identified; these can be excluded to further increase SNV calling accuracy. These findings indicate that samples with low RNA inputs should be excluded for SNV calling and reinforce the importance of optimising the technical and bioinformatics steps in pipelines that are used to accurately identify sequence variants. Full article
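For readers unfamiliar with the accuracy measure named in this abstract, the following is a minimal sketch of a micro-averaged Matthews correlation coefficient. It assumes that "micro-averaged" means pooling the confusion-matrix counts across samples before applying the standard MCC formula; the function name and the counts are hypothetical and are not taken from the paper.

```python
import math


def micro_averaged_mcc(confusions):
    """Pool (tp, fp, fn, tn) counts across samples, then compute one MCC.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    tp = sum(c[0] for c in confusions)
    fp = sum(c[1] for c in confusions)
    fn = sum(c[2] for c in confusions)
    tn = sum(c[3] for c in confusions)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical SNV-calling results for two technical replicates:
# (true positives, false positives, false negatives, true negatives)
replicates = [(8, 2, 1, 89), (9, 1, 2, 88)]
score = micro_averaged_mcc(replicates)
print(f"micro-averaged MCC = {score:.3f}")
```

MCC is a reasonable choice when true variant sites are far rarer than invariant sites, because it accounts for all four cells of the confusion matrix rather than only the positive calls.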

11 pages, 1868 KiB  
Article
A Mutation Network Method for Transmission Analysis of Human Influenza H3N2
by Chi Zhang, Yinghan Wang, Cai Chen, Haoyu Long, Junbo Bai, Jinfeng Zeng, Zicheng Cao, Bing Zhang, Wei Shen, Feng Tang, Shiwen Liang, Caijun Sun, Yuelong Shu and Xiangjun Du
Viruses 2020, 12(10), 1125; https://0-doi-org.brum.beds.ac.uk/10.3390/v12101125 - 03 Oct 2020
Cited by 3 | Viewed by 3120
Abstract
Characterizing the spatial transmission pattern is critical for better surveillance and control of human influenza. Here, we propose a mutation network framework that utilizes network theory to study the transmission of human influenza H3N2. On the basis of the mutation network, the transmission analysis captured the circulation pattern from a global simulation of human influenza H3N2. Furthermore, this method was applied to explore, in detail, the transmission patterns within Europe, the United States, and China, revealing the regional spread of human influenza H3N2. The mutation network framework proposed here could facilitate the understanding, surveillance, and control of other infectious diseases. Full article

16 pages, 2006 KiB  
Article
Genetic Diversity Analysis of Coxsackievirus A8 Circulating in China and Worldwide Reveals a Highly Divergent Genotype
by Yang Song, Dongyan Wang, Yong Zhang, Zhenzhi Han, Jinbo Xiao, Huanhuan Lu, Dongmei Yan, Tianjiao Ji, Qian Yang, Shuangli Zhu and Wenbo Xu
Viruses 2020, 12(10), 1061; https://0-doi-org.brum.beds.ac.uk/10.3390/v12101061 - 23 Sep 2020
Cited by 5 | Viewed by 2388
Abstract
Coxsackievirus A8 (CV-A8) is one of the pathogens associated with hand, foot and mouth disease (HFMD) and herpangina (HA), occasionally leading to severe neurological disorders such as acute flaccid paralysis (AFP). Only one study aimed at CV-A8 has been published to date, and only 12 whole-genome sequences are publicly available. In this study, complete genome sequences from 11 CV-A8 strains isolated from HFMD patients in extensive regions from China between 2013 and 2018 were determined, and all sequences from GenBank were retrieved. A phylogenetic analysis based on a total of 34 complete VP1 sequences of CV-A8 revealed five genotypes: A, B, C, D and E. The newly emerging genotype E presented a high phylogenetic divergence compared with the other genotypes and was composed of the majority of the strains sequenced in this study. Markov chain Monte Carlo (MCMC) analysis revealed that genotype E has been evolving for nearly a century and somehow arose in approximately 2010. The Bayesian skyline plot showed that the population size of CV-A8 has experienced three dynamic fluctuations since 2001. Amino acid residues 100N, 103Y, 240T and 241V of VP1, which were embedded in the potential capsid loops of genotype E, might enhance genotype E adaptation to the human hosts. The CV-A8 whole genomes displayed significant intra-genotypic genetic diversity in the non-capsid region, and a total of six recombinant lineages were detected. The Chinese viruses from genotype E might have emerged recently from recombining with European CV-A6 strains. CV-A8 is a less important HFMD pathogen, and the capsid gene diversity and non-capsid recombination variety observed in CV-A8 strains indicated that the constant generation of deleterious genomes and a constant selection pressure against these deleterious mutations is still ongoing within CV-A8 quasispecies. It is possible that CV-A8 could become an important pathogen in the HFMD spectrum in the future. Further surveillance of CV-A8 is greatly needed. Full article

13 pages, 1561 KiB  
Article
Evolutionary Study of the Crassphage Virus at Gene Level
by Alessandro Rossi, Laura Treu, Stefano Toppo, Henrike Zschach, Stefano Campanaro and Bas E. Dutilh
Viruses 2020, 12(9), 1035; https://0-doi-org.brum.beds.ac.uk/10.3390/v12091035 - 17 Sep 2020
Cited by 8 | Viewed by 3628
Abstract
crAss-like viruses are a putative family of bacteriophages recently discovered. The eponym of the clade, crAssphage, is an enteric bacteriophage estimated to be present in at least half of the human population and it constitutes up to 90% of the sequences in some human fecal viral metagenomic datasets. We focused on the evolutionary dynamics of the genes encoded on the crAssphage genome. By investigating the conservation of the genes, a consistent variation in the evolutionary rates across the different functional groups was found. Gene duplications in crAss-like genomes were detected. By exploring the differences among the functional categories of the genes, we confirmed that the genes encoding capsid proteins were the most ubiquitous, despite their overall low sequence conservation. It was possible to identify a core of proteins whose evolutionary trees strongly correlate with each other, suggesting their genetic interaction. This group includes the capsid proteins, which are thus established as extremely suitable for rebuilding the phylogenetic tree of this viral clade. A negative correlation between the ubiquity and the conservation of viral protein sequences was shown. Together, this study provides an in-depth picture of the evolution of different genes in crAss-like viruses. Full article

18 pages, 3259 KiB  
Article
The Diversity and Distribution of Viruses Associated with Culex annulirostris Mosquitoes from the Kimberley Region of Western Australia
by Simon H. Williams, Avram Levy, Rachel A. Yates, Nilusha Somaweera, Peter J. Neville, Jay Nicholson, Michael D. A. Lindsay, John S. Mackenzie, Komal Jain, Allison Imrie, David W. Smith and W. Ian Lipkin
Viruses 2020, 12(7), 717; https://0-doi-org.brum.beds.ac.uk/10.3390/v12070717 - 02 Jul 2020
Cited by 15 | Viewed by 3002
Abstract
Metagenomics revealed an impressive breadth of previously unrecognized viruses. Here, we report the virome of the Culex annulirostris Skuse mosquito, an important vector of pathogenic arboviruses in Australia. Mosquitoes were collected from three sites in the Kimberley region of Western Australia. Unbiased high-throughput sequencing (HTS) revealed the presence of 16 novel viral sequences that share less than 90% identity with known viruses. None were closely related to pathogenic arboviruses. Viruses were distributed unevenly across sites, indicating a heterogeneous Cx. annulirostris virome. Polymerase chain reaction assays confirmed HTS data and identified marked variation between the virus prevalence identified at each site. Full article

10 pages, 1654 KiB  
Article
Bacsnp: Using Single Nucleotide Polymorphism (SNP) Specificities and Frequencies to Identify Genotype Composition in Baculoviruses
by Jörg T. Wennmann, Jiangbin Fan and Johannes A. Jehle
Viruses 2020, 12(6), 625; https://0-doi-org.brum.beds.ac.uk/10.3390/v12060625 - 09 Jun 2020
Cited by 8 | Viewed by 2861
Abstract
Natural isolates of baculoviruses (as well as other dsDNA viruses) generally consist of homogenous or heterogenous populations of genotypes. The number and positions of single nucleotide polymorphisms (SNPs) from sequencing data are often used as suitable markers to study their genotypic composition. Identifying and assigning the specificities and frequencies of SNPs from high-throughput genome sequencing data can be very challenging, especially when comparing between several sequenced isolates or samples. In this study, the new tool “bacsnp”, written in the R programming language, was developed as a downstream process, enabling the detection of SNP specificities across several virus isolates. The basis of this analysis is the use of a common, closely related reference to which the sequencing reads of an isolate are mapped. Thereby, the specificities of SNPs are linked and their frequencies can be used to analyze the genetic composition across the sequenced isolate. Here, the downstream process and analysis of detected SNP positions is demonstrated on the example of three baculovirus isolates showing the fast and reliable detection of a mixed sequenced sample. Full article

24 pages, 6139 KiB  
Article
Drug Resistance Prediction Using Deep Learning Techniques on HIV-1 Sequence Data
by Margaret C. Steiner, Keylie M. Gibson and Keith A. Crandall
Viruses 2020, 12(5), 560; https://0-doi-org.brum.beds.ac.uk/10.3390/v12050560 - 19 May 2020
Cited by 29 | Viewed by 5664
Abstract
The fast replication rate and lack of repair mechanisms of human immunodeficiency virus (HIV) contribute to its high mutation frequency, with some mutations resulting in the evolution of resistance to antiretroviral therapies (ART). As such, studying HIV drug resistance allows for real-time evaluation of evolutionary mechanisms. Characterizing the biological process of drug resistance is also critically important for sustained effectiveness of ART. Investigating the link between “black box” deep learning methods applied to this problem and evolutionary principles governing drug resistance has been overlooked to date. Here, we utilized publicly available HIV-1 sequence data and drug resistance assay results for 18 ART drugs to evaluate the performance of three architectures (multilayer perceptron, bidirectional recurrent neural network, and convolutional neural network) for drug resistance prediction, jointly with biological analysis. We identified convolutional neural networks as the best performing architecture and displayed a correspondence between the importance of biologically relevant features in the classifier and overall performance. Our results suggest that the high classification performance of deep learning models is indeed dependent on drug resistance mutations (DRMs). These models heavily weighted several features that are not known DRM locations, indicating the utility of model interpretability to address causal relationships in viral genotype-phenotype data. Full article

14 pages, 1745 KiB  
Article
Genomic Analyses of Potential Novel Recombinant Human Adenovirus C in Brazil
by Roozbeh Tahmasebi, Antonio Charlys da Costa, Kaelan Tardy, Rory J. Tinker, Flavio Augusto de Padua Milagres, Rafael Brustulin, Maria da Aparecida Rodrigues Teles, Rogério Togisaki das Chagas, Cassia Vitória de Deus Alves Soares, Aripuana Sakurada Aranha Watanabe, Cecilia Salete Alencar, Fabiola Villanova, Xutao Deng, Eric Delwart, Adriana Luchs, Élcio Leal and Ester Cerdeira Sabino
Viruses 2020, 12(5), 508; https://0-doi-org.brum.beds.ac.uk/10.3390/v12050508 - 04 May 2020
Cited by 8 | Viewed by 2778
Abstract
Human Adenovirus species C (HAdV-C) is the most common etiologic agent of respiratory disease. In the present study, we characterized the nearly full-length genome of one potential new HAdV-C recombinant strain constituted by Penton and Fiber proteins belonging to type 89 and a chimeric Hexon protein of types 1 and 89. By using viral metagenomics techniques, we screened, in the states of Tocantins and Pará, Northern and North regions of Brazil, from 2010 to 2016, 251 fecal samples of children between 0.5 and 2.5 years old. These children were presenting acute diarrhea not associated with common pathogens (i.e., rotavirus, norovirus). We identified two HAdV-C strains in two distinct patients. Phylogenetic analysis performed using all complete genomes available at GenBank database indicated that one strain (HAdV-C BR-245) belonged to type 1. The phylogenetic analysis also indicated that the second strain (HAdV-C BR-211) was located at the base of the clade formed by the new HAdV-C strains of type 89. Recombination analysis revealed that strain HAdV-C BR-211 is a chimera in which the variable regions of Hexon gene combined HAdV-C1 and HAdV-C89 sequences. Therefore, HAdV-C BR-211 strain possesses a genomic backbone of type HAdV-C89 and a unique insertion of HAdV-C1 in the Hexon sequence. Recombination may be an important driving force in HAdV-C diversity and evolution. Studies employing complete genomic sequencing on circulating HAdV-C strains in Brazil are needed to understand the clinical significance of the presented data. Full article

7 pages, 2099 KiB  
Communication
A Novel Hepe-Like Virus from Farmed Giant Freshwater Prawn Macrobrachium rosenbergii
by Xuan Dong, Tao Hu, Qingyuan Liu, Chen Li, Yani Sun, Yiting Wang, Weifeng Shi, Qin Zhao and Jie Huang
Viruses 2020, 12(3), 323; https://0-doi-org.brum.beds.ac.uk/10.3390/v12030323 - 17 Mar 2020
Cited by 14 | Viewed by 3207
Abstract
The family Hepeviridae includes several positive-stranded RNA viruses, which infect a wide range of mammalian species, chicken, and trout. However, few hepatitis E viruses (HEVs) have been characterized from invertebrates. In this study, a hepevirus, tentatively named Crustacea hepe-like virus 1 (CHEV1), from the economically important crustacean, the giant freshwater prawn Macrobrachium rosenbergii, was characterized. The complete genome consisted of 7750 nucleotides and had a similar structure to known hepatitis E virus genomes. Phylogenetic analyses suggested it might be a novel hepe-like virus within the family Hepeviridae. To our knowledge, this is the first hepe-like virus characterized from crustaceans. Full article