The Machine Learning Applications in the Discovery of New Bioactive Molecules

A special issue of Molecules (ISSN 1420-3049). This special issue belongs to the section "Medicinal Chemistry".

Deadline for manuscript submissions: closed (31 May 2022) | Viewed by 35976

Special Issue Editors

Dr. Sabina Podlewska

Guest Editor

Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
Interests: computer-aided drug design; docking; machine learning; homology modeling; QSAR

Dr. Rita Guedes

Guest Editor

Research Institute for Medicines and Pharmaceutical Sciences (iMed.UL), Faculty of Pharmacy, University of Lisbon, Av. Prof. Gama Pinto, 1649-019 Lisbon, Portugal
Interests: computational medicinal chemistry; design of new drugs; anti-infectious agents; anti-cancer agents; in silico methods; virtual screening; molecular docking; de novo design; homology modelling; pharmacophore modelling; molecular dynamics; Monte Carlo and quantum chemistry

Dr. Stanisław Jastrzębski

Guest Editor

Molecule.one, Jagiellonian University, Kraków, Poland
Interests: Deep learning; Drug discovery; Optimization

Special Issue Information

Dear Colleagues,

Various computational approaches support the development of new biologically active substances at all of its stages. Among them, machine learning (ML) methods are gaining popularity owing to their high predictive power and their ability to handle large amounts of data in a relatively short time. ML-based tools not only assist in the search for new ligands with a particular activity profile, but also help to predict and optimize physicochemical and pharmacokinetic properties and to avoid side effects. In addition, ML takes part in the enumeration of compound libraries covering desired activity and property profiles via the application of deep learning methods.

The present Special Issue aims to cover all aspects of ML-based tool applications in computer-aided drug design, from ligand-based approaches (prediction of both activity and physicochemical/ADMET properties), through assistance in structure-based protocols (e.g., post-processing of docking results), to the generation of new ligands (e.g., with the use of deep learning). Manuscripts presenting experimentally verified methods are particularly welcome, but researchers focusing on theoretical studies are also cordially invited to contribute to the issue.

Dr. Sabina Podlewska
Dr. Rita Guedes
Dr. Stanisław Jastrzębski
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Molecules is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Machine learning
Deep learning
Computer-aided drug design
Ligand-based approaches
Structure-based approaches
In silico compound profiling
Virtual screening
ADMET properties evaluation

Published Papers (14 papers)

Research

28 pages, 10164 KiB

Open Access | Feature Paper | Article

Harnessing Protein-Ligand Interaction Fingerprints to Predict New Scaffolds of RIPK1 Inhibitors

by Natália Aniceto, Vanda Marques, Joana D. Amaral, Patrícia A. Serra, Rui Moreira, Cecília M. P. Rodrigues and Rita C. Guedes

Molecules 2022, 27(15), 4718; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules27154718 - 23 Jul 2022

Cited by 1 | Viewed by 1877

Abstract

Necroptosis has emerged as an exciting target in oncological, inflammatory, neurodegenerative, and autoimmune diseases, in addition to acute ischemic injuries. It is known to play a role in the innate immune response, as well as in the antiviral cellular response. Here we devised a concerted in silico and experimental framework to identify novel inhibitors of RIPK1, a key necroptosis factor. We propose the first in silico model for the prediction of new RIPK1 inhibitor scaffolds by combining docking and machine learning methodologies. Through the analysis of patterns in docking results, we derived two rules, where rule #1 consisted of a four-residue signature filter, and rule #2 consisted of a six-residue similarity filter based on docking calculations. These were used in consensus with a machine learning QSAR model built on data collated from ChEMBL, the literature, patents, and PubChem. The models predicted actives with >90%, 92%, and 96.4% precision, respectively. As a proof-of-concept, we selected 50 compounds from the ChemBridge database, using a consensus of both molecular docking and machine learning methods, and tested them in a phenotypic necroptosis assay and a biochemical RIPK1 inhibition assay. A total of 7 of the 47 tested compounds demonstrated around 20–25% inhibition of RIPK1’s kinase activity but, more importantly, these compounds were found to occupy new areas of chemical space. Although no strong actives were found, they could be candidates for further optimization, particularly because they have new scaffolds. In conclusion, this screening method may prove valuable for future screening efforts as it allows for the exploration of new areas of chemical space in a very fast and inexpensive manner, therefore providing efficient starting points amenable to further hit-optimization campaigns.

(This article belongs to the Special Issue The Machine Learning Applications in the Discovery of New Bioactive Molecules)

14 pages, 944 KiB

Open Access | Article

Using Big Data Analytics to “Back Engineer” Protein Conformational Selection Mechanisms

by Shivangi Gupta, Jerome Baudry and Vineetha Menon

Molecules 2022, 27(8), 2509; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules27082509 - 13 Apr 2022

Cited by 2 | Viewed by 1872

Abstract

In living cells, proteins bind small molecules (or “ligands”) through a “conformational selection” mechanism, where a subset of protein structures is capable of binding the small molecules well while most other protein structures are not. The present work uses machine learning approaches to identify, in a very large number of protein:ligand complexes, which protein properties are associated with the capacity to bind small molecules. To do so, we calculate 40 physicochemical properties for about 1.5 million protein conformations (ligand and protein conformations). This work describes a machine learning approach to identify the unique physicochemical descriptors of a protein that maximize the prediction rate of potential protein molecular conformations for the test case proteins ADORA2A (Adenosine A2a Receptor), ADRB2 (Adrenoceptor Beta 2) and OPRK1 (Opioid Receptor Kappa 1). We find that adequate machine learning techniques can increase by an order of magnitude the identification of “binding protein conformations” in an otherwise very large ensemble of protein conformations, compared to random selection of protein conformations. This opens the door to the systematic identification of such “binding conformations” for proteins and provides a big data approach to the conformational selection mechanism.

9 pages, 1897 KiB

Open Access | Article

Prediction of Antibacterial Peptides against Propionibacterium acnes from the Peptidomes of Achatina fulica Mucus Fractions

by Suwapitch Chalongkulasak, Teerasak E-kobon and Pramote Chumnanpuen

Molecules 2022, 27(7), 2290; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules27072290 - 31 Mar 2022

Cited by 7 | Viewed by 2640

Abstract

Acne vulgaris is a common skin disease mainly caused by the Gram-positive pathogenic bacterium, Propionibacterium acnes. This bacterium stimulates the inflammation process in human sebaceous glands. The giant African snail (Achatina fulica) is an alien species that rapidly reproduces and seriously damages agricultural products in Thailand. Several studies have reported medical and pharmaceutical benefits of this snail's mucus peptides and proteins. This study aimed to predict in silico multifunctional bioactive peptides from the A. fulica mucus peptidome using bioinformatic tools for the determination of antimicrobial (iAMPpred), anti-biofilm (dPABBs), cytotoxic (ToxinPred) and cell-membrane-penetrating (CPPpred) peptides. Three candidate peptides with the highest predictive scores were selected and re-designed/modified to improve the required activities. Structural and physicochemical properties of six anti-P. acnes (APA) peptide candidates were analyzed using the PEP–FOLD3 program and the four previous tools. All candidates had a random coil structure and were named APAP-1 ori, APAP-2 ori, APAP-3 ori, APAP-1 mod, APAP-2 mod, and APAP-3 mod. To validate the APA activity, these peptide candidates were synthesized and tested against six isolates of P. acnes. The modified APA peptides showed high APA activity on three isolates. Therefore, our biomimetic mucus peptides could be useful for preventing acne vulgaris and could be further examined for other activities important to medical and pharmaceutical applications.

12 pages, 2054 KiB

Open Access | Article

Computational Drug Repurposing Based on a Recommendation System and Drug–Drug Functional Pathway Similarity

by Mengting Shao, Leiming Jiang, Zhigang Meng and Jianzhen Xu

Molecules 2022, 27(4), 1404; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules27041404 - 18 Feb 2022

Cited by 7 | Viewed by 2813

Abstract

Drug repurposing identifies new clinical indications for existing drugs. It can be used to overcome common problems associated with cancers, such as heterogeneity and resistance to established therapies, by rapidly adapting known drugs for new treatment. In this study, we utilized a recommendation system learning model to prioritize candidate cancer drugs. We designed a drug–drug pathway functional similarity by integrating multiple genetic and epigenetic alterations such as gene expression, copy number variation (CNV), and DNA methylation. When compared with other similarities, such as SMILES chemical structures and drug targets based on the protein–protein interaction network, our approach provided more interpretable models capturing drug response mechanisms. Furthermore, our approach can achieve comparable accuracy when evaluated against other learning models on large public datasets (CCLE and GDSC). A case study of Erlotinib and OSI-906 (Linsitinib) indicated that they act synergistically to reduce the growth rate of tumors, offering an alternative targeted therapy option for patients. Taken together, our computational method characterized drug response from the viewpoint of a multi-omics pathway and systematically predicted candidate cancer drugs with similar therapeutic effects.

24 pages, 2792 KiB

Open Access | Article

Investigating the Role of Obesity in Prostate Cancer and Identifying Biomarkers for Drug Discovery: Systems Biology and Deep Learning Approaches

by Shan-Ju Yeh, Yun-Chen Chung and Bor-Sen Chen

Molecules 2022, 27(3), 900; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules27030900 - 28 Jan 2022

Cited by 5 | Viewed by 2534

Abstract

Prostate cancer (PCa) is the second most frequently diagnosed cancer in men and is viewed as the fifth leading cause of cancer death among men worldwide. The body mass index (BMI) is taken as a vital criterion to elucidate the association between obesity and PCa. In this study, systematic methods are employed to investigate how obesity influences the noncutaneous malignancies of PCa. By comparing the core signaling pathways of lean and obese patients with PCa, we are able to investigate the relationships between obesity and pathogenic mechanisms and identify significant biomarkers as drug targets for drug discovery. Regarding drug design specifications, we take drug–target interaction, drug regulation ability, and drug toxicity into account. One deep neural network (DNN)-based drug–target interaction (DTI) model is trained in advance for predicting drug candidates based on the identified biomarkers. By applying the DNN-based DTI model and considering the drug design specifications, we suggest two potential multiple-molecule drugs to prevent PCa (covering lean and obese PCa) and obesity-specific PCa, respectively. The proposed multiple-molecule drugs (apigenin, digoxin, and orlistat) not only help to prevent PCa, suppressing malignant metastasis, but also result in lower production of fatty acids and cholesterol, especially for obesity-specific PCa.

22 pages, 745 KiB

Open Access | Article

Novel Big Data-Driven Machine Learning Models for Drug Discovery Application

by Vishnu Sripriya Akondi, Vineetha Menon, Jerome Baudry and Jana Whittle

Molecules 2022, 27(3), 594; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules27030594 - 18 Jan 2022

Cited by 10 | Viewed by 3435

Abstract

Most contemporary drug discovery projects start with a ‘hit discovery’ phase where small chemicals are identified that have the capacity to interact, in a chemical sense, with a protein target involved in a given disease. To assist and accelerate this initial drug discovery process, ‘virtual docking calculations’ are routinely performed, where computational models of proteins and computational models of small chemicals are evaluated for their capacities to bind together. In cutting-edge, contemporary implementations of this process, several conformations of protein targets are independently assayed in parallel ‘ensemble docking’ calculations. Some of these protein conformations, a minority of them, will be capable of binding many chemicals, while other protein conformations, the majority of them, will not be able to do so. The fact that only some of the conformations accessible to a protein will be ‘selected’ by chemicals is known as the ‘conformational selection’ process in biology. This work describes a machine learning approach to characterize and identify the properties of protein conformations that will be selected by (i.e., bind to) chemicals and are classified as potential binding drug candidates, unlike the remaining non-binding protein conformations. This work also addresses the class imbalance problem through advanced machine learning techniques that maximize the prediction rate of potential protein molecular conformations for the test case proteins ADORA2A (Adenosine A2a Receptor) and OPRK1 (Opioid Receptor Kappa 1), and subsequently reduces the failure rates and hastens the drug discovery process.

12 pages, 2308 KiB

Open Access | Article

BiLSTM-5mC: A Bidirectional Long Short-Term Memory-Based Approach for Predicting 5-Methylcytosine Sites in Genome-Wide DNA Promoters

by Xin Cheng, Jun Wang, Qianyue Li and Taigang Liu

Molecules 2021, 26(24), 7414; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26247414 - 07 Dec 2021

Cited by 15 | Viewed by 2523

Abstract

An important cause of cancer proliferation is the change in DNA methylation patterns, characterized by the localized hypermethylation of the promoters of tumor-suppressor genes together with an overall decrease in the level of 5-methylcytosine (5mC). Therefore, identifying the 5mC sites in the promoters is a critical step towards further understanding the diverse functions of DNA methylation in genetic diseases such as cancers and aging. However, most wet-lab experimental techniques for detecting 5mC sites are time-consuming and laborious. In this study, we proposed a deep learning-based approach, called BiLSTM-5mC, for accurately identifying 5mC sites in genome-wide DNA promoters. First, we randomly divided the negative samples into 11 subsets of equal size, each of which forms a balanced subset when combined with the same number of positive samples. Then, two types of feature vectors, encoded by the one-hot method and the nucleotide property and frequency (NPF) method, were fed into a bidirectional long short-term memory (BiLSTM) network and a fully connected layer to train the 22 submodels. Finally, the outputs of these models were integrated to predict 5mC sites by using the majority vote strategy. Our experimental results demonstrated that BiLSTM-5mC outperformed existing methods based on the same independent dataset.
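
The data handling described in this abstract (splitting the negatives into 11 parts, one-hot encoding, and majority voting over submodel outputs) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the random seed, the handling of unknown bases, and the tie-break rule are all assumptions made for illustration, and the BiLSTM submodels themselves are not reproduced, only the voting over their 0/1 outputs.

```python
import random
from collections import Counter


def balanced_subsets(positives, negatives, n_subsets=11, seed=0):
    """Shuffle the negatives, split them into n_subsets near-equal parts,
    and pair each part with the full positive set (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    # Striding yields parts whose sizes differ by at most one.
    parts = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [(list(positives), part) for part in parts]


def one_hot(sequence, alphabet="ACGT"):
    """Encode a DNA string as a flat 0/1 vector, four positions per base.
    Bases outside the alphabet (e.g. N) become all zeros (an assumption)."""
    index = {base: i for i, base in enumerate(alphabet)}
    vector = []
    for base in sequence:
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        vector.extend(row)
    return vector


def majority_vote(votes, tie_label=0):
    """Return the more frequent 0/1 label among the submodel votes.
    With an even number of voters a tie is possible; falling back to
    tie_label is an assumption, not something the abstract specifies."""
    counts = Counter(votes)
    if counts[1] == counts[0]:
        return tie_label
    return 1 if counts[1] > counts[0] else 0
```

With 10 positives and 110 negatives, `balanced_subsets` returns 11 pairs, each holding all 10 positives and 10 distinct negatives; a list of 22 submodel votes can then be passed to `majority_vote`.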

27 pages, 14001 KiB

Open Access | Article

Insights into the Ligand Binding to Bromodomain-Containing Protein 9 (BRD9): A Guide to the Selection of Potential Binders by Computational Methods

by Simona De Vita, Maria Giovanna Chini, Giuseppe Bifulco and Gianluigi Lauro

Molecules 2021, 26(23), 7192; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26237192 - 27 Nov 2021

Cited by 20 | Viewed by 1631

Abstract

The estimation of the binding of a set of molecules against BRD9 protein was carried out through an in silico molecular dynamics-driven exhaustive analysis to guide the identification of potential novel ligands. Starting from eight crystal structures of this protein co-complexed with known binders and one apo form, we conducted an exhaustive molecular docking/molecular dynamics (MD) investigation. To balance accuracy and an affordable calculation time, the systems were simulated for 100 ns in explicit solvent. Moreover, one complex was simulated for 1 µs to assess the influence of simulation time on the results. A set of MD-derived parameters was computed and compared with molecular docking-derived and experimental data. MM-GBSA and the per-residue interaction energy emerged as the main indicators of a good interaction between the specific binder and the protein counterpart. To assess the performance of the proposed analysis workflow, we tested six molecules featuring different binding affinities for BRD9, obtaining promising outcomes. Further insights were reported to highlight the influence of the starting structure on the evolution of the molecular dynamics simulations. The data confirmed that a ranking of BRD9 binders using key parameters arising from molecular dynamics is advisable to discard poor ligands before moving on with the synthesis and the biological tests.

12 pages, 2123 KiB

Open Access | Article

Modeling Structure–Activity Relationship of AMPK Activation

by Jürgen Drewe, Ernst Küsters, Felix Hammann, Matthias Kreuter, Philipp Boss and Verena Schöning

Molecules 2021, 26(21), 6508; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26216508 - 28 Oct 2021

Cited by 3 | Viewed by 2222

Abstract

The adenosine monophosphate activated protein kinase (AMPK) is critical in the regulation of important cellular functions such as lipid, glucose, and protein metabolism; mitochondrial biogenesis and autophagy; and cellular growth. In many diseases, such as metabolic syndrome, obesity, diabetes, and also cancer, activation of AMPK is beneficial. Therefore, there is growing interest in AMPK activators that act either by direct action on the enzyme itself or by indirect activation of upstream regulators. Many natural compounds have been described that activate AMPK indirectly. These compounds are usually contained in mixtures with a variety of structurally different other compounds, which in turn can also alter the activity of AMPK via one or more pathways. For these compounds, experiments are complicated, since the required pure substances are often not yet isolated and/or therefore not sufficiently available. Therefore, our goal was to develop a screening tool that could handle the profound heterogeneity in activation pathways of AMPK. Since machine learning algorithms can model complex (unknown) relationships and patterns, some of these methods (random forest, support vector machines, stochastic gradient boosting, logistic regression, and deep neural network) were applied and validated using a database comprising 904 activating and 799 neutral or inhibiting compounds identified through an extensive PubMed literature search and the PubChem BioAssay database. All models showed unexpectedly high classification accuracy in training and, more importantly, in predicting the unseen test data. These models are therefore suitable tools for rapid in silico screening of established substances or multicomponent mixtures and can be used to identify compounds of interest for further testing.

13 pages, 2475 KiB

Open Access | Article

Virtual Screening for Biomimetic Anti-Cancer Peptides from Cordyceps militaris Putative Pepsinized Peptidome and Validation on Colon Cancer Cell Line

by Jarinyagon Chantawannakul, Paninnuch Chatpattanasiri, Vichugorn Wattayagorn, Mesayamas Kongsema, Tipanart Noikaew and Pramote Chumnanpuen

Molecules 2021, 26(19), 5767; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26195767 - 23 Sep 2021

Cited by 5 | Viewed by 2711

Abstract

Colorectal cancer is one of the leading causes of cancer-related death in Thailand and many other countries. The standard practice for curing this cancer is surgery with an adjuvant chemotherapy treatment. However, the unfavorable side effects of chemotherapeutic drugs are undeniable. Recently, protein hydrolysates and anticancer peptides have become popular alternative options for colon cancer treatment. Therefore, we aimed to screen and select the anticancer peptide candidates from the in silico pepsin hydrolysate of a Cordyceps militaris (CM) proteome using machine-learning-based prediction servers for anticancer prediction, i.e., AntiCP, iACP, and MLACP. The selected CM-anticancer peptide candidates could be an alternative treatment or co-treatment agent for colorectal cancer, reducing the use of chemotherapeutic drugs. To confirm the anticancer properties, an in vitro assay was performed with “CM-biomimetic peptides” on the non-metastatic colon cancer cell line (HT-29). According to the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay results from peptide candidate treatments at 0–400 µM, the IC₅₀ of the CM-biomimetic peptide with no toxicity and with cancer-cell-penetrating ability, the original C. militaris biomimetic peptide (C-ori), against the HT-29 cell line was 114.9 µM at 72 hours. The effects of C-ori compared to doxorubicin, a conventional chemotherapeutic drug for colon cancer treatment, and the combination effects of both the CM-anticancer peptide and doxorubicin were observed. The results showed that C-ori increased the overall efficiency in the combination treatment with doxorubicin. According to the acridine orange/propidium iodide (AO/PI) staining assay, C-ori can induce apoptosis in HT-29 cells significantly, as confirmed by chromatin condensation, membrane blebbing, apoptotic bodies, and late apoptosis, which were observed under a fluorescence microscope.

10 pages, 3998 KiB

Open Access | Article

K-Nearest Neighbor and Random Forest-Based Prediction of Putative Tyrosinase Inhibitory Peptides of Abalone Haliotis diversicolor

by Sasikarn Kongsompong, Teerasak E-kobon and Pramote Chumnanpuen

Molecules 2021, 26(12), 3671; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26123671 - 16 Jun 2021

Cited by 9 | Viewed by 2515

Abstract

Skin pigment disorders are common cosmetic and medical problems. Many known compounds inhibit the key melanin-producing enzyme, tyrosinase, but their use is limited due to side effects. Naturally derived peptides also display tyrosinase inhibition. Abalone is a good source of peptides, and abalone proteins have been used widely in pharmaceutical and cosmetic products, but not for melanin inhibition. This study aimed to predict putative tyrosinase inhibitory peptides (TIPs) from abalone, Haliotis diversicolor, using k-nearest neighbor (kNN) and random forest (RF) algorithms. The kNN and RF predictors were trained and tested against 133 peptides with known anti-tyrosinase properties, with 97% and 99% accuracy, respectively. The kNN predictor suggested 1075 putative TIPs and the RF predictor suggested six. Two helical peptides were predicted by both methods and showed possible interaction with the predicted structure of mushroom tyrosinase, similar to those of the known TIPs. These two peptides had arginine and aromatic amino acids, which were common to the known TIPs, suggesting non-competitive inhibition of tyrosinase. Therefore, the first version of the TIP predictors could suggest a reasonable number of TIP candidates for further experiments. More experimental data will be important for improving the performance of these predictors, and they can be extended to discover more TIPs from other organisms. The confirmation of TIPs in abalone will be a new commercial opportunity for abalone farmers and industry.
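
As a rough illustration of the kNN step mentioned in this abstract, the following Python sketch classifies a peptide by majority label among its k nearest training peptides. The abstract does not state which features were used, so the amino-acid composition features, the Euclidean distance, k = 3, and the toy sequences in the usage note are assumptions for illustration only, not the authors' predictor.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(peptide):
    """Fraction of each standard amino acid in a non-empty peptide
    (an assumed feature set; the abstract does not name the features)."""
    counts = Counter(peptide)
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


def knn_predict(train, query, k=3):
    """train is a list of (peptide, label) pairs. Returns the majority
    label among the k training peptides closest to the query in
    composition space (Euclidean distance)."""
    q = composition(query)
    ranked = sorted(train, key=lambda item: math.dist(composition(item[0]), q))
    labels = [label for _, label in ranked[:k]]
    return Counter(labels).most_common(1)[0][0]
```

For example, with a made-up training set such as `[("RRWW", 1), ("RWRW", 1), ("WRRF", 1), ("GGAA", 0), ("AGAG", 0), ("GAGS", 0)]`, the query `"RWWR"` is assigned label 1 and `"GGGA"` label 0. Candidates proposed by two predictors can be intersected with ordinary set operations to obtain the peptides predicted by both.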

18 pages, 4019 KiB

Open Access | Article

Unveiling Putative Functions of Mucus Proteins and Their Tryptic Peptides in Seven Gastropod Species Using Comparative Proteomics and Machine Learning-Based Bioinformatics Predictions

by Viroj Tachapuripunya, Sittiruk Roytrakul, Pramote Chumnanpuen and Teerasak E-kobon

Molecules 2021, 26(11), 3475; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26113475 - 07 Jun 2021

Cited by 15 | Viewed by 3729

Abstract

Gastropods are among the most diverse animals. Gastropod mucus contains several glycoproteins and peptides that vary by species and habitat. Bioactive peptides from gastropod mucus have been identified in only a few species. Therefore, using biochemical, mass spectrometric, and bioinformatics approaches, this study aimed to comprehensively identify putative bioactive peptides from the mucus proteomes of seven commonly found or commercially valuable gastropods. The mucus was collected in triplicate samples, and the proteins were separated by 1D-SDS-PAGE before tryptic digestion and peptide identification by nano LC-MS/MS. The mucus peptides were subsequently compared with R scripts. A total of 2818 different peptides constituting 1634 proteins from the mucus samples were identified, and 1218 of these peptides (43%) were core peptides found in the mucus of all examined species. Clustering and correspondence analyses of 1600 variable peptides showed unique mucus peptide patterns for each species. High-throughput k-nearest neighbor and random forest-based prediction programs were developed with more than 95% average accuracy and could identify 11 functional categories of putative bioactive peptides and 268 peptides (9.5%) with at least five to seven bioactive properties. Antihypertensive, drug-delivering, and antiparasitic peptides were predominant. These peptides provide an understanding of gastropod mucus, and the putative bioactive peptides are expected to be experimentally validated for further medical, pharmaceutical, and cosmetic applications.

13 pages, 1922 KiB

Open Access | Article

iT4SE-EP: Accurate Identification of Bacterial Type IV Secreted Effectors by Exploring Evolutionary Features from Two PSI-BLAST Profiles

by Haitao Han, Chenchen Ding, Xin Cheng, Xiuzhi Sang and Taigang Liu

Molecules 2021, 26(9), 2487; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26092487 - 24 Apr 2021

Cited by 2 | Viewed by 1824

Abstract

Many Gram-negative bacteria use type IV secretion systems to deliver effector molecules to a wide range of target cells. These substrate proteins, which are called type IV secreted effectors (T4SEs), manipulate host cell processes during infection, often resulting in severe diseases or even death of the host. Therefore, identification of putative T4SEs has become a very active research topic in bioinformatics due to its vital role in understanding host-pathogen interactions. PSI-BLAST profiles have been experimentally validated to provide important and discriminatory evolutionary information for various protein classification tasks. In the present study, an accurate computational predictor termed iT4SE-EP was developed for identifying T4SEs by extracting evolutionary features from the position-specific scoring matrix and the position-specific frequency matrix profiles. First, four types of encoding strategies were designed to transform protein sequences into fixed-length feature vectors based on the two profiles. Then, a feature selection technique based on the random forest algorithm was utilized to reduce redundant or irrelevant features without much loss of information. Finally, the optimal features were input into a support vector machine classifier to carry out the prediction of T4SEs. Our experimental results demonstrated that iT4SE-EP outperformed most existing methods based on the independent dataset test.

12 pages, 933 KiB

Open Access | Article

Fast Identification of Adverse Drug Reactions (ADRs) of Digestive and Nervous Systems of Organic Drugs by In Silico Models

by Meimei Chen, Zhaoyang Yang, Yuxing Gao and Candong Li

Molecules 2021, 26(4), 930; https://0-doi-org.brum.beds.ac.uk/10.3390/molecules26040930 - 10 Feb 2021

Cited by 3 | Viewed by 2240

Abstract

This study aimed to discover co-occurrences of adverse drug reactions (ADRs) and derive models of the most frequent items of ADRs based on the SIDER database, which included 1430 marketed drugs and 5868 ADRs. First, common ADRs of organic drugs were manually reclassified according to side effects in the human system, followed by an association rule analysis, which found that ADRs of the digestive and nervous systems often occurred at the same time with a good association rule. Then, three algorithms, linear discriminant analysis (LDA), support vector machine (SVM) and deep learning, were used to derive models of ADRs of the digestive and nervous systems based on 497 organic monomer drugs and to identify key structural features in defining these ADRs. The statistical results indicated that these kinds of QSAR models were good tools for screening ADRs of the digestive and nervous systems, giving ROC AUC values of 81.5%, 98.9%, 91.5%, 69.5%, 78.4% and 78.8%, respectively. Then, these models were applied to investigate ADRs of 1536 organic compounds in phase four and with zero rule-of-five (RO5) violations from the ChEMBL database. Based on the consensus ADR predictions of the models, 58.1% and 42.6% of compounds were predicted to cause these two ADRs, respectively, indicating the significance of initial assessment of ADRs in early drug discovery.
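
The association rule analysis mentioned in this abstract is usually summarized with three standard measures: support, confidence, and lift. The Python sketch below computes them for a single rule over per-drug sets of ADR categories. It is a generic illustration, not the authors' pipeline; the function name and the toy records in the usage note are hypothetical and are not taken from SIDER.

```python
def rule_metrics(records, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent.
    records is a non-empty list of sets, one set of ADR category labels
    per drug."""
    n = len(records)
    n_a = sum(1 for r in records if antecedent in r)
    n_c = sum(1 for r in records if consequent in r)
    n_both = sum(1 for r in records if antecedent in r and consequent in r)
    support = n_both / n                          # P(A and C)
    confidence = n_both / n_a if n_a else 0.0     # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0 # P(C | A) / P(C)
    return support, confidence, lift
```

With five made-up records, `[{"digestive", "nervous"}, {"digestive", "nervous"}, {"digestive"}, {"nervous", "skin"}, {"skin"}]`, the rule digestive -> nervous has support 2/5, confidence 2/3, and lift 10/9; a lift above 1 indicates that the two categories co-occur more often than expected if they were independent.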
