Machine Intelligence in Single-Cell Data Analysis: Resources, Challenges and Perspectives

A special issue of Biomolecules (ISSN 2218-273X). This special issue belongs to the section "Bioinformatics and Systems Biology".

Deadline for manuscript submissions: closed (20 July 2022) | Viewed by 7099

Special Issue Editors

School of Statistics and Data Science, Nankai University, Tianjin 300071, China
Interests: structural bioinformatics; statistical genomics; transcriptomics; intrinsically disordered proteins; single cell omics
Special Issues, Collections and Topics in MDPI journals

E-Mail Website
Guest Editor
Department of Information Theory and Data Science, School of Mathematical Sciences, Nankai University, Tianjin 300071, China
Interests: single cell omics; structural bioinformatics; intrinsically disordered proteins; protein-ligand interactions

Special Issue Information

Dear Colleagues,

Single-cell sequencing is one of the most exciting biological technologies of the past decade. It is a powerful tool to investigate the genome, epigenome and transcriptome profile at the single-cell level. The information from individual cells allows the quantitative and characterization of cellular heterogeneity, and also give us a novel lens for a deeper understanding of biology and disease.

However, the measurements at the single-cell resolution are always high-dimensional, sparse and noisy, imposing new demands on statistical and computational methods, especially machine intelligence for single-cell data analysis and the integration of multi-omics data.

To address the challenges in this fast-evolving filed, we would like to invite the latest articles related to single-cell analytical techniques, bioinformatics methods and databases.

Prof. Dr. Gang Hu
Prof. Dr. Kui Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biomolecules is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • single-cell sequencing
  • machine intelligence
  • spatial transcriptomics
  • multi-omics

Published Papers (3 papers)

Order results
Result details
Select all
Export citation of selected articles as:

Research

13 pages, 16819 KiB  
Article
Automatic Cell Type Annotation Using Marker Genes for Single-Cell RNA Sequencing Data
by Yu Chen and Shuqin Zhang
Biomolecules 2022, 12(10), 1539; https://0-doi-org.brum.beds.ac.uk/10.3390/biom12101539 - 21 Oct 2022
Cited by 3 | Viewed by 2402
Abstract
Recent advancement in single-cell RNA sequencing (scRNA-seq) technology is gaining more and more attention. Cell type annotation plays an essential role in scRNA-seq data analysis. Several computational methods have been proposed for automatic annotation. Traditional cell type annotation is to first cluster the [...] Read more.
Recent advancement in single-cell RNA sequencing (scRNA-seq) technology is gaining more and more attention. Cell type annotation plays an essential role in scRNA-seq data analysis. Several computational methods have been proposed for automatic annotation. Traditional cell type annotation is to first cluster the cells using unsupervised learning methods based on the gene expression profiles, then to label the clusters using the aggregated cluster-level expression profiles and the marker genes’ information. Such procedure relies heavily on the clustering results. As the purity of clusters cannot be guaranteed, false detection of cluster features may lead to wrong annotations. In this paper, we improve this procedure and propose an Automatic Cell type Annotation Method (ACAM). ACAM delineates a clear framework to conduct automatic cell annotation through representative cluster identification, representative cluster annotation using marker genes, and the remaining cells’ classification. Experiments on seven real datasets show the better performance of ACAM compared to six well-known cell type annotation methods. Full article
Show Figures

Figure 1

15 pages, 1518 KiB  
Article
Detecting Fear-Memory-Related Genes from Neuronal scRNA-seq Data by Diverse Distributions and Bhattacharyya Distance
by Shaoqiang Zhang, Linjuan Xie, Yaxuan Cui, Benjamin R. Carone and Yong Chen
Biomolecules 2022, 12(8), 1130; https://0-doi-org.brum.beds.ac.uk/10.3390/biom12081130 - 17 Aug 2022
Cited by 3 | Viewed by 1799
Abstract
The detection of differentially expressed genes (DEGs) is one of most important computational challenges in the analysis of single-cell RNA sequencing (scRNA-seq) data. However, due to the high heterogeneity and dropout noise inherent in scRNAseq data, challenges in detecting DEGs exist when using [...] Read more.
The detection of differentially expressed genes (DEGs) is one of most important computational challenges in the analysis of single-cell RNA sequencing (scRNA-seq) data. However, due to the high heterogeneity and dropout noise inherent in scRNAseq data, challenges in detecting DEGs exist when using a single distribution of gene expression levels, leaving much room to improve the precision and robustness of current DEG detection methods. Here, we propose the use of a new method, DEGman, which utilizes several possible diverse distributions in combination with Bhattacharyya distance. DEGman can automatically select the best-fitting distributions of gene expression levels, and then detect DEGs by permutation testing of Bhattacharyya distances of the selected distributions from two cell groups. Compared with several popular DEG analysis tools on both large-scale simulation data and real scRNA-seq data, DEGman shows an overall improvement in the balance of sensitivity and precision. We applied DEGman to scRNA-seq data of TRAP; Ai14 mouse neurons to detect fear-memory-related genes that are significantly differentially expressed in neurons with and without fear memory. DEGman detected well-known fear-memory-related genes and many novel candidates. Interestingly, we found 25 DEGs in common in five neuron clusters that are functionally enriched for synaptic vesicles, indicating that the coupled dynamics of synaptic vesicles across in neurons plays a critical role in remote memory formation. The proposed method leverages the advantage of the use of diverse distributions in DEG analysis, exhibiting better performance in analyzing composite scRNA-seq datasets in real applications. Full article
Show Figures

Figure 1

16 pages, 1956 KiB  
Article
scEpiLock: A Weakly Supervised Learning Framework for cis-Regulatory Element Localization and Variant Impact Quantification for Single-Cell Epigenetic Data
by Yanwen Gong, Shushrruth Sai Srinivasan, Ruiyi Zhang, Kai Kessenbrock and Jing Zhang
Biomolecules 2022, 12(7), 874; https://0-doi-org.brum.beds.ac.uk/10.3390/biom12070874 - 23 Jun 2022
Cited by 1 | Viewed by 2372
Abstract
Recent advances in single-cell transposase-accessible chromatin using a sequencing assay (scATAC-seq) allow cellular heterogeneity dissection and regulatory landscape reconstruction with an unprecedented resolution. However, compared to bulk-sequencing, its ultra-high missingness remarkably reduces usable reads in each cell type, resulting in broader, fuzzier peak [...] Read more.
Recent advances in single-cell transposase-accessible chromatin using a sequencing assay (scATAC-seq) allow cellular heterogeneity dissection and regulatory landscape reconstruction with an unprecedented resolution. However, compared to bulk-sequencing, its ultra-high missingness remarkably reduces usable reads in each cell type, resulting in broader, fuzzier peak boundary definitions and limiting our ability to pinpoint functional regions and interpret variant impacts precisely. We propose a weakly supervised learning method, scEpiLock, to directly identify core functional regions from coarse peak labels and quantify variant impacts in a cell-type-specific manner. First, scEpiLock uses a multi-label classifier to predict chromatin accessibility via a deep convolutional neural network. Then, its weakly supervised object detection module further refines the peak boundary definition using gradient-weighted class activation mapping (Grad-CAM). Finally, scEpiLock provides cell-type-specific variant impacts within a given peak region. We applied scEpiLock to various scATAC-seq datasets and found that it achieves an area under receiver operating characteristic curve (AUC) of ~0.9 and an area under precision recall (AUPR) above 0.7. Besides, scEpiLock’s object detection condenses coarse peaks to only ⅓ of their original size while still reporting higher conservation scores. In addition, we applied scEpiLock on brain scATAC-seq data and reported several genome-wide association studies (GWAS) variants disrupting regulatory elements around known risk genes for Alzheimer’s disease, demonstrating its potential to provide cell-type-specific biological insights in disease studies. Full article
Show Figures

Figure 1

Back to TopTop