Advanced Strategies and Tools for Metabolomics Data Analysis, Metabolite Annotation and Identification

A special issue of Metabolites (ISSN 2218-1989). This special issue belongs to the section "Bioinformatics and Data Analysis".

Deadline for manuscript submissions: closed (30 June 2022) | Viewed by 22505

Special Issue Editors


Dr. J. Rafael Montenegro-Burke
Guest Editor
Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
Interests: mass spectrometry; metabolomics; lipidomics; bioinformatics; systems biology

Dr. Xavier Domingo-Almenara
Guest Editor
Principal Investigator, EURECAT, Technology Centre of Catalonia, 08005 Barcelona, Spain
Interests: metabolomics; bioinformatics; computational biology; mass spectrometry

Special Issue Information

Dear Colleagues,

Metabolomics, the newest ‘omics’ discipline, focuses on the interrogation of metabolites in complex biological systems, with the goal of identifying correlations between dysregulated pathways and specific biological processes and diseases. However, the rate at which metabolomics data can now be acquired has grown exponentially and has surpassed our ability to turn these data fully into interpretable clinical or biological information. In particular, revealing the identity of the underlying metabolites in biological samples is an imposing bottleneck because of the vast number, broad chemical diversity, and complexity of the metabolome. Additionally, elucidating the mechanisms by which these metabolites interact and modulate disease is crucial for identifying bioactive metabolites and designing targeted drugs and personalized therapies. Computational tools to expedite these processes are urgently needed.

In this context, advanced data analysis and metabolite annotation strategies have the potential to overcome the current limitations in metabolomics workflows. In this Special Issue of Metabolites, “Advanced Strategies and Tools for Metabolomics Data Analysis, Metabolite Annotation, and Identification”, we welcome studies on the development of new computational tools and strategies for applications including, but not limited to, metabolite and lipid annotation and identification, systems biology, and mass spectrometry or nuclear magnetic resonance data analysis.

Dr. J. Rafael Montenegro-Burke
Dr. Xavier Domingo-Almenara
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Metabolites is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • metabolomics data analysis
  • metabolite annotation and identification
  • systems biology
  • signal processing
  • untargeted metabolomics
  • lipidomics
  • mass spectrometry 
  • nuclear magnetic resonance

Published Papers (6 papers)


Research

16 pages, 1312 KiB  
Article
Margaritaria nobilis L.F. (Phyllanthaceae): Ethnopharmacology and Application of Computational Tools in the Annotation of Bioactive Molecules
by Johan Carlos C. Santiago, Carlos Alberto B. Albuquerque, Abraão de Jesus B. Muribeca, Paulo Roberto C. Sá, Sônia das Graças Santa R. Pamplona, Consuelo Yumiko Y. e Silva, Paula Cardoso Ribera, Enéas de Andrade Fontes-Júnior and Milton Nascimento da Silva
Metabolites 2022, 12(8), 681; https://0-doi-org.brum.beds.ac.uk/10.3390/metabo12080681 - 25 Jul 2022
Cited by 6 | Viewed by 2174
Abstract
Margaritaria nobilis is a shrubby species widely distributed in Brazil from the Amazon to the Atlantic Rainforest. Its bark and fruit are used in the Peruvian Amazon for disinfecting abscesses and as a tonic in pregnancy, respectively, and its leaves are used to treat cancer symptoms. From analyses via UHPLC-MS/MS, we sought to determine the chemical profile of the ethanolic extract of M. nobilis leaves by means of putative analyses supported by computational tools and spectral libraries. Thus, it was possible to annotate 44 compounds, of which 12 are phenolic acid derivatives, 16 are O-glycosylated flavonoids and 16 hydrolysable tannins. Among the flavonoids, although they are known, except for kaempferol, which has already been isolated from this species, the other flavonoids (10, 14, 15, 21, 24–26, 28–30, 33–35, 40 and 41) are being reported for the first time in the genus. Among the hydrolysable tannins, six ellagitannins present the HHDP group (6, 19, 22, 31, 38 and 43), one presents the DHHDP group (5), and four contain oxidatively modified congeners (12, 20, 37 and 39). Through the annotation of these compounds, we hope to contribute to the improved chemosystematics knowledge of the genus. Furthermore, supported by a metric review of the literature, we observed that many of the compounds reported here are congeners of authentically bioactive compounds. Thus, we believe that this work may help in understanding future pharmacological activities. Full article

13 pages, 9965 KiB  
Article
Discovery of Synergistic Drug Combinations for Colorectal Cancer Driven by Tumor Barcode Derived from Metabolomics “Big Data”
by Bo Lv, Ruijie Xu, Xinrui Xing, Chuyao Liao, Zunjian Zhang, Pei Zhang and Fengguo Xu
Metabolites 2022, 12(6), 494; https://0-doi-org.brum.beds.ac.uk/10.3390/metabo12060494 - 30 May 2022
Cited by 1 | Viewed by 3096
Abstract
The accumulation of cancer metabolomics data in the past decade provides exceptional opportunities for deeper investigations into cancer metabolism. However, integrating a large amount of heterogeneous metabolomics data to draw a full picture of the metabolic reprogramming and to discover oncometabolites of certain cancers remains challenging. In this study, a tumor barcode constructed based upon existing metabolomics “big data” using the Bayesian vote-counting method is proposed to identify oncometabolites in colorectal cancer (CRC). Specifically, a panel of oncometabolites of CRC was generated from 39 clinical studies with 3202 blood samples (1332 CRC vs. 1870 controls) and 990 tissue samples (495 CRC vs. 495 controls). Next, an oncometabolite-protein network was constructed by combining the tumor barcode and its involved proteins/enzymes. The effect of anti-cancer drugs or drug combinations was then mapped into this network by the random walk with restart process. Utilizing this network, potential Irinotecan (CPT-11)-sensitizing agents for CRC treatment were discovered by random forest and Xgboost. Finally, a compound named MK-2206 was highlighted and its synergy with CPT-11 was validated on two CRC cell lines. To summarize, we demonstrate in the present study that the metabolomics “big data”-based tumor barcodes and the subsequent network analyses are potentially useful for drug combination discovery or drug repositioning. Full article
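
The random walk with restart step mentioned in this abstract can be illustrated with a minimal, generic sketch. Everything below is an assumption made for illustration: the toy network, the node names, and the restart probability of 0.3 are hypothetical and are not taken from the study, and the authors' implementation may differ.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that, at each step,
    jumps back to a seed node with probability `restart` and otherwise
    moves to a uniformly chosen neighbor of its current node."""
    nodes = sorted(adjacency)
    # The walker starts (and restarts) uniformly over the seed nodes.
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        nxt = {n: restart * start[n] for n in nodes}
        for node in nodes:
            neighbors = adjacency[node]
            if not neighbors:
                continue  # isolated node: nothing to propagate
            share = (1.0 - restart) * prob[node] / len(neighbors)
            for neighbor in neighbors:
                nxt[neighbor] += share
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical undirected drug/protein/metabolite network (illustration only).
network = {
    "drug": ["enzyme_A"],
    "enzyme_A": ["drug", "metabolite_1", "metabolite_2"],
    "metabolite_1": ["enzyme_A", "enzyme_B"],
    "metabolite_2": ["enzyme_A"],
    "enzyme_B": ["metabolite_1"],
}

scores = random_walk_with_restart(network, seeds={"drug"})
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.4f}")
```

Nodes closer to the seed receive higher scores, which is why this kind of propagation is often used to rank network nodes by their proximity to a perturbation such as a drug target.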

13 pages, 4741 KiB  
Article
A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R
by Johannes Rainer, Andrea Vicini, Liesa Salzer, Jan Stanstrup, Josep M. Badia, Steffen Neumann, Michael A. Stravs, Vinicius Verri Hernandes, Laurent Gatto, Sebastian Gibb and Michael Witting
Metabolites 2022, 12(2), 173; https://0-doi-org.brum.beds.ac.uk/10.3390/metabo12020173 - 11 Feb 2022
Cited by 32 | Viewed by 8398
Abstract
Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics experiments have become increasingly popular because of the wide range of metabolites that can be analyzed and the possibility to measure novel compounds. LC-MS instrumentation and analysis conditions can differ substantially among laboratories and experiments, thus resulting in non-standardized datasets demanding customized annotation workflows. We present an ecosystem of R packages, centered around the MetaboCoreUtils, MetaboAnnotation and CompoundDb packages that together provide a modular infrastructure for the annotation of untargeted metabolomics data. Initial annotation can be performed based on MS1 properties such as m/z and retention times, followed by an MS2-based annotation in which experimental fragment spectra are compared against a reference library. Such reference databases can be created and managed with the CompoundDb package. The ecosystem supports data from a variety of formats, including, but not limited to, MSP, MGF, mzML, mzXML, netCDF as well as MassBank text files and SQL databases. Through its highly customizable functionality, the presented infrastructure allows to build reproducible annotation workflows tailored for and adapted to most untargeted LC-MS-based datasets. All core functionality, which supports base R data types, is exported, also facilitating its re-use in other R packages. Finally, all packages are thoroughly unit-tested and documented and are available on GitHub and through Bioconductor. Full article
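
The MS1-based annotation step described in this abstract (matching features to reference compounds by m/z) can be sketched as follows. The packages themselves are written in R; this Python sketch only illustrates the general idea of matching within a ppm tolerance and does not reproduce the MetaboAnnotation or CompoundDb API. The function names, the 10 ppm tolerance, the feature identifiers, and the approximate [M+H]+ reference values are assumptions chosen for illustration.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of observed_mz relative to reference_mz, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_by_mz(features, references, tolerance_ppm=10.0):
    """Return (feature_id, compound_name, ppm_error) for every pair whose
    absolute mass error is within tolerance_ppm."""
    matches = []
    for feature_id, observed_mz in features.items():
        for name, reference_mz in references.items():
            error = ppm_error(observed_mz, reference_mz)
            if abs(error) <= tolerance_ppm:
                matches.append((feature_id, name, round(error, 2)))
    return matches


# Approximate [M+H]+ reference m/z values, for illustration only.
references = {"caffeine": 195.0877, "glucose": 181.0707}
# Hypothetical measured features (feature id -> m/z).
features = {"FT001": 195.0880, "FT002": 181.0800, "FT003": 300.1000}

hits = match_by_mz(features, references)
print(hits)
```

In this toy data only FT001 falls within 10 ppm of a reference value. In practice, retention time would typically be added as a second matching criterion, as the abstract indicates.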

12 pages, 859 KiB  
Article
Automated Recommendation of Research Keywords from PubMed That Suggest the Molecular Mechanism Associated with Biomarker Metabolites
by Shinji Kanazawa, Satoshi Shimizu, Shigeki Kajihara, Norio Mukai, Junko Iida and Fumio Matsuda
Metabolites 2022, 12(2), 133; https://0-doi-org.brum.beds.ac.uk/10.3390/metabo12020133 - 01 Feb 2022
Cited by 1 | Viewed by 2081
Abstract
Metabolomics can help identify candidate biomarker metabolites whose levels are altered in response to disease development or drug administration. However, assessment of the underlying molecular mechanism is challenging considering it depends on the researcher’s knowledge. This study reports a novel method for the automated recommendation of keywords known in the literature that may be overlooked by researchers. The proposed method aided in the identification of Medical Subject Headings (MeSH) terms in PubMed using MeSH co-occurrence data. The intended users are biocurators who have identified specific biomarker metabolites from a metabolomics study and would like to identify literature-reported molecular mechanisms that are associated with both the metabolite and their research area of interest. The proposed method finds MeSH terms that co-occur with a MeSH term of the candidate biomarker metabolite as well as a MeSH term of a researcher’s known keyword, such as the name of a disease. The connectivity score S was determined using association analysis. Pilot analyses demonstrated that, while the biological significance of the obtained MeSH terms could not be guaranteed, the developed method can be useful for finding keywords to further investigate molecular mechanisms in association with candidate biomarker molecules. Full article

19 pages, 4128 KiB  
Article
Evaluating the Accuracy of the QCEIMS Approach for Computational Prediction of Electron Ionization Mass Spectra of Purines and Pyrimidines
by Jesi Lee, Tobias Kind, Dean Joseph Tantillo, Lee-Ping Wang and Oliver Fiehn
Metabolites 2022, 12(1), 68; https://0-doi-org.brum.beds.ac.uk/10.3390/metabo12010068 - 12 Jan 2022
Cited by 4 | Viewed by 2439
Abstract
Mass spectrometry is the most commonly used method for compound annotation in metabolomics. However, most mass spectra in untargeted assays cannot be annotated with specific compound structures because reference mass spectral libraries are far smaller than the complement of known molecules. Theoretically predicted mass spectra might be used as a substitute for experimental spectra especially for compounds that are not commercially available. For example, the Quantum Chemistry Electron Ionization Mass Spectra (QCEIMS) method can predict 70 eV electron ionization mass spectra from any given input molecular structure. In this work, we investigated the accuracy of QCEIMS predictions of electron ionization (EI) mass spectra for 80 purine and pyrimidine derivatives in comparison to experimental data in the NIST 17 database. Similarity scores between every pair of predicted and experimental spectra revealed that 45% of the compounds were found as the correct top hit when QCEIMS predicted spectra were matched against the NIST17 library of >267,000 EI spectra, and 74% of the compounds were found within the top 10 hits. We then investigated the impact of matching, missing, and additional fragment ions in predicted EI mass spectra versus ion abundances in MS similarity scores. We further include detailed studies of fragmentation pathways such as retro Diels–Alder reactions to predict neutral losses of (iso)cyanic acid, hydrogen cyanide, or cyanamide in the mass spectra of purines and pyrimidines. We describe how trends in prediction accuracy correlate with the chemistry of the input compounds to better understand how mechanisms of QCEIMS predictions could be improved in future developments. We conclude that QCEIMS is useful for generating large-scale predicted mass spectral libraries for identification of compounds that are absent from experimental libraries and that are not commercially available. Full article
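
To make the idea of a similarity score between a predicted and an experimental spectrum concrete, here is a minimal sketch using cosine similarity on nominal-mass bins. This is an assumed, commonly used scoring scheme; the abstract does not state which score the study used, and the peak lists below are invented for illustration.

```python
import math


def bin_spectrum(peaks):
    """Sum the intensities of (m/z, intensity) peaks into nominal-mass bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b):
    """Cosine of the angle between two binned spectra (0 = no shared ions,
    1 = identical relative abundances)."""
    a, b = bin_spectrum(peaks_a), bin_spectrum(peaks_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented peak lists (m/z, relative intensity), for illustration only.
predicted = [(28.0, 20.0), (53.0, 40.0), (80.0, 100.0)]
experimental = [(28.0, 25.0), (53.0, 35.0), (80.0, 100.0), (81.0, 10.0)]

score = cosine_similarity(predicted, experimental)
print(f"cosine similarity: {score:.3f}")
```

Both effects discussed in the abstract show up in such a score: an ion that is missing from one spectrum (here m/z 81) contributes nothing to the numerator, and differences in ion abundance for shared ions lower the score as well.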

13 pages, 4282 KiB  
Article
Opening the Random Forest Black Box of the Metabolome by the Application of Surrogate Minimal Depth
by Soeren Wenck, Marina Creydt, Jule Hansen, Florian Gärber, Markus Fischer and Stephan Seifert
Metabolites 2022, 12(1), 5; https://0-doi-org.brum.beds.ac.uk/10.3390/metabo12010005 - 21 Dec 2021
Cited by 8 | Viewed by 2928
Abstract
For the untargeted analysis of the metabolome of biological samples with liquid chromatography–mass spectrometry (LC-MS), high-dimensional data sets containing many different metabolites are obtained. Since the utilization of these complex data is challenging, different machine learning approaches have been developed. Those methods are usually applied as black box classification tools, and detailed information about class differences that result from the complex interplay of the metabolites are not obtained. Here, we demonstrate that this information is accessible by the application of random forest (RF) approaches and especially by surrogate minimal depth (SMD) that is applied to metabolomics data for the first time. We show this by the selection of important features and the evaluation of their mutual impact on the multi-level classification of white asparagus regarding provenance and biological identity. SMD enables the identification of multiple features from the same metabolites and reveals meaningful biological relations, proving its high potential for the comprehensive utilization of high-dimensional metabolomics data. Full article
