Motivation
Single-cell RNA-sequencing (scRNA-seq) is widely used to reveal cellular heterogeneity, complex disease mechanisms, and cell differentiation processes. Due to high sparsity and complex gene expression patterns, scRNA-seq data presents a large number of dropout events, affecting downstream tasks such as cell clustering and pseudo-time analysis. Restoring the expression levels of genes is essential for reducing technical noise and facilitating downstream analysis. However, Existing scRNA-seq data imputation methods ignore the topological structure information of scRNA-seq data and cannot comprehensively utilize the relationships between cells.
Results
Here, we propose a single-cell Graph Contrastive Learning method for scRNA-seq data imputation, named scGCL, which integrates graph contrastive learning and Zero-inflated Negative Binomial (ZINB) distribution to estimate dropout values. scGCL summarizes global and local semantic information through contrastive learning and selects positive samples to enhance the representation of target nodes. To capture the global probability distribution, scGCL introduces an autoencoder based on the ZINB distribution, which reconstructs the scRNA-seq data based on the prior distribution. Through extensive experiments, we verify that scGCL outperforms existing state-of-the-art imputation methods in clustering performance and gene imputation on 14 scRNA-seq datasets. Further, we find that scGCL can enhance the expression patterns of specific genes in Alzheimer’s disease datasets.
Availability
https://github.com/zehaoxiong123/scGCL
Supplementary information
Supplementary data are available at Bioinformatics online.