The integration of single-cell RNA-sequencing datasets from multiple sources is critical for deciphering cell-to-cell heterogeneities and interactions in complex biological systems. We present a novel unsupervised batch effect removal framework, called iMAP, based on both deep autoencoders and generative adversarial networks. Compared with current methods, iMAP shows superior, robust, and scalable performance in terms of both reliably detecting the batch-specific cells and effectively mixing distributions of the batch-shared cell types. Applying iMAP to tumor microenvironment datasets from two platforms, Smart-seq2 and 10x Genomics, we find that iMAP can leverage the powers of both platforms to discover novel cell-cell interactions.
Understanding the biological functions of molecules in specific human tissues or cell types is crucial for gaining insights into human physiology and disease. Doing so requires systematically uncovering associations among multilevel elements (disease phenotypes, tissues, cell types and molecules), which is challenging because these elements are heterogeneous and the available data are incomplete. To address this challenge, we describe a new methodological framework, called Graph Local InfoMax (GLIM), based on a human multilevel network (HMLN) that we established by introducing multiple tissues and cell types on top of molecular networks. GLIM can systematically mine the potential relationships between multilevel elements by embedding the features of the HMLN through contrastive learning. Our simulation results demonstrated that GLIM consistently outperforms other state-of-the-art algorithms in disease gene prediction. Moreover, GLIM was also successfully used to infer cell markers and rewire intercellular and molecular interactions in the context of specific tissues or diseases. As a typical case, the tissue-cell-molecule network underlying gastritis and gastric cancer was uncovered for the first time by GLIM, providing systematic insights into the mechanism underlying the occurrence and development of gastric cancer. Overall, our methodological framework has the potential to systematically uncover complex disease mechanisms and mine high-quality relationships among phenotypic, tissue, cellular and molecular elements.
Background: In traditional Chinese medicine, it is believed that the “tongue coating is produced by fumigation of stomach gas”, and that tongue coating can reflect the health status of humans, especially stomach health. Therefore, studying the relationship between the microbiome of the tongue coating and that of the gastric fluid is of great significance for understanding the biological basis of tongue diagnosis. Methods: This study profiled the microbiomes of the tongue coating and the gastric fluid in 35 gastritis patients using metagenomic sequencing technology, systematically constructed the microbial atlas of tongue coating and gastric juice, and described for the first time the characteristics shared by the two sites. Results: There was a significant correlation between tongue coating and gastric juice in terms of microbial species composition and overall diversity. In terms of species composition, the two sites were dominated by five phyla, namely Actinobacteria, Bacteroidetes, Firmicutes, Fusobacteria and Proteobacteria, and most of the gastric microbial species could be detected in the patient's own tongue coating. In terms of overall diversity, a significant correlation was found between the alpha diversity of the tongue coating microbiome and that of the gastric juice microbiome. Furthermore, in terms of abundance, 4 classes, 2 orders, 4 families, 18 genera and 46 species were found to correlate significantly between the tongue coating and the gastric fluid. Conclusions: The results provide microbiome-based scientific evidence for tongue diagnosis, and offer a new perspective for understanding the biological basis of tongue diagnosis.
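The diversity comparison described above can be illustrated with a minimal sketch. It assumes the Shannon index as the alpha diversity measure and Spearman rank correlation as the test of association; the abstract names neither, so both are assumptions, and the taxon counts below are made-up illustrative values, not data from the study.

```python
import math

def shannon_index(counts):
    """Shannon alpha diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical taxon counts, one row per patient (illustrative values only).
tongue = [[50, 30, 20], [90, 5, 5], [40, 40, 20], [70, 20, 10]]
gastric = [[45, 35, 20], [85, 10, 5], [34, 33, 33], [60, 30, 10]]

tongue_alpha = [shannon_index(sample) for sample in tongue]
gastric_alpha = [shannon_index(sample) for sample in gastric]
rho = spearman(tongue_alpha, gastric_alpha)
print(f"Spearman rho between paired alpha diversities: {rho:.3f}")
```

A rank-based coefficient is used here because diversity indices are often not normally distributed; a real analysis would also report a p-value and correct for multiple testing across taxa.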
Summary: Although many quantitative structure–activity relationship (QSAR) models are trained and evaluated for their predictive merits, understanding what the models have learned is of critical importance. However, the interpretation and visualization of QSAR model results remain challenging, especially for ‘black box’ models such as deep neural networks (DNNs). Here, we take a step forward to interpret the learned chemical features from DNN QSAR models, and present VISAR, an interactive tool for visualizing the structure–activity relationship. VISAR first provides functions to construct and train DNN models. Then VISAR builds the activity landscapes based on a series of compounds using the trained model, showing the correlation between the chemical feature space and the experimental activity space after model training, and allowing for knowledge mining from a global perspective. VISAR also maps the gradients of the chemical features to the corresponding compounds as contribution weights for each atom, and visualizes the positive and negative contributor substructures suggested by the models from a local perspective. Using the web application of VISAR, users can interactively explore the activity landscape and the color-coded atom contributions. We propose that VISAR could serve as a helpful tool for training and interactive analysis of DNN QSAR models, providing insights for drug design and an additional level of model validation. Availability and implementation: The source code and usage instructions for VISAR are available on GitHub at https://github.com/qid12/visar. Contact: shaoli@mail.tsinghua.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
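The idea of mapping feature gradients to per-atom contribution weights can be sketched as follows. This is not VISAR's code or API: the tiny one-hidden-layer network, its weights, the `bit_to_atoms` lookup and the rule of splitting a bit's gradient equally among its atoms are all assumptions made for illustration.

```python
import math

def forward(x, W1, b1, w2, b2):
    """One hidden tanh layer followed by a linear output (a toy stand-in for a DNN)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    y = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, y

def input_gradient(x, W1, b1, w2, b2):
    """Analytic gradient of the output with respect to each input feature:
    dy/dx_i = sum_j w2[j] * (1 - h_j^2) * W1[j][i]."""
    hidden, _ = forward(x, W1, b1, w2, b2)
    return [sum(w2[j] * (1 - hidden[j] ** 2) * W1[j][i] for j in range(len(W1)))
            for i in range(len(x))]

def atom_contributions(x, grad, bit_to_atoms, n_atoms):
    """Spread each active bit's gradient equally over the atoms that set that bit
    (the equal split is an assumption of this sketch)."""
    weights = [0.0] * n_atoms
    for bit, atoms in bit_to_atoms.items():
        if x[bit] == 0:
            continue  # inactive bits are not present in this compound
        share = grad[bit] / len(atoms)
        for atom in atoms:
            weights[atom] += share
    return weights

# Hypothetical 4-bit fingerprint for a 4-atom compound (illustrative values only).
x = [1, 0, 1, 1]
W1 = [[0.5, -0.2, 0.1, 0.0], [-0.3, 0.4, 0.0, 0.6]]
b1 = [0.0, 0.1]
w2 = [1.0, -0.5]
b2 = 0.2
bit_to_atoms = {0: [0, 1], 1: [2], 2: [1, 2], 3: [3]}

grad = input_gradient(x, W1, b1, w2, b2)
weights = atom_contributions(x, grad, bit_to_atoms, n_atoms=4)
for atom, weight in enumerate(weights):
    label = "positive" if weight > 0 else "negative" if weight < 0 else "neutral"
    print(f"atom {atom}: {weight:+.4f} ({label} contributor)")
```

In practice the gradient would come from an autodiff framework and the bit-to-atom lookup from the fingerprinting step; the sign of each atom weight is what a color-coded display would encode.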
Spatially resolved transcriptomics (SRT) has greatly expanded our understanding of the spatial patterns of gene expression in histological tissue sections. However, most currently available platforms cannot provide in situ single-cell spatial transcriptomics, limiting their biological applications. Here, to reconstruct SRT in silico at single-cell resolution, we propose St2cell, which combines deep learning-based frameworks with a novel convex quadratic programming (CQP)-based model. St2cell can thoroughly leverage information in high-resolution (HR) histological images, enabling the accurate segmentation of in situ single cells and identification of their transcriptomes. Applying St2cell to various SRT datasets, we demonstrated the reliability of the reconstructed transcriptomes. The single-cell resolution provided by our proposed method greatly improved the detection of fine spatial architectures and further facilitated integration with single-cell RNA-sequencing data. Moreover, in a breast cancer tissue, St2cell identified general spatial structures and co-occurrence patterns of cell types in the tumor microenvironment. St2cell is also computationally efficient and easily accessible, making it a promising tool for SRT studies.
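For readers unfamiliar with convex quadratic programming, the sketch below solves about the simplest CQP that fits this setting: split one spot's measured expression of a gene among the cells segmented inside that spot, staying as close as possible (in squared error) to a prior guess while keeping every share non-negative and the shares summing to the spot total. The abstract does not state St2cell's objective or constraints, so this formulation, the function name `allocate_spot_expression` and the numbers are assumptions for illustration only. The problem is a Euclidean projection onto a scaled simplex, which has a well-known sort-based closed-form solution.

```python
def allocate_spot_expression(prior, total):
    """Solve: minimize sum_i (x_i - prior_i)^2  subject to  x_i >= 0, sum_i x_i = total.

    This is the Euclidean projection of `prior` onto a simplex scaled to `total`,
    computed with the standard sort-and-threshold algorithm.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    descending = sorted(prior, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, value in enumerate(descending, start=1):
        cumulative += value
        candidate = (cumulative - total) / j
        # The last j for which this holds gives the optimal shift theta.
        if value - candidate > 0:
            theta = candidate
    # Shift every prior by theta and clip at zero.
    return [max(p - theta, 0.0) for p in prior]

# Hypothetical prior per-cell estimates for one gene in one spot (illustrative values).
prior = [1.2, 0.9, 0.1]
spot_total = 2.0
shares = allocate_spot_expression(prior, spot_total)
print("per-cell allocation:", [round(s, 4) for s in shares])
print("sum of allocation:", round(sum(shares), 4))
```

Larger problems with coupled constraints across genes or spots would normally be handed to a general-purpose QP solver rather than solved in closed form.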