2022
DOI: 10.1101/2022.05.13.491738
Preprint

Predictive and robust gene selection for spatial transcriptomics

Abstract: Fluorescence in situ hybridization (FISH) is a widely used method for visualizing gene expression in cells and tissues. A key challenge is determining a small panel of genes (typically less than 1% of the genome) to probe in a FISH experiment that are most informative about gene expression, either in general or for a specific experimental objective. We introduce predictive and robust probe selection (PROPOSE), a method that uses deep learning to identify informative marker genes using data from single-cell RNA…

Citations: cited by 4 publications (6 citation statements)
References: 82 publications
“…To evaluate the top features selected by Matilda across multiple data modalities and those selected from RNA modality using popular methods such as t-test and limma (7), and those specifically designed for scRNA-seq (e.g. MAST (8), ROC), and recently proposed deep learning feature selection methods, including PROPOSE (35) and scCapsNet (41), we compared their utility in classifying each cell type in each dataset. We found that cell-type-specific features selected by Matilda from multiple data modalities on average resulted in more accurate discrimination of their respective cell types as shown by the scatter plot and the overall rankings of methods in each dataset (Figure 5E and Supplementary Figure S8).…”
Section: Results; citation type: mentioning
confidence: 99%
“…The PROPOSE procedure (35) was used for feature selection of RNA modality from all datasets. Following the author's pipeline (https://github.com/iancovert/propose), the raw count matrices were first binarized to {0,1} according to the sign of the values and then used for model training using the ‘PROPOSE’ function in propose package (Github version 41fd568) with the number of marker genes as 100 and other parameters as default.…”
Section: Methods; citation type: mentioning
confidence: 99%
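
The binarization step described in the Methods statement above can be sketched as follows. This is a minimal illustration only, assuming non-negative raw counts so that "the sign of the values" reduces to a positive-versus-zero test. The function name `binarize_counts` and the toy matrix are hypothetical; the sketch does not call the propose package or reproduce its API.

```python
def binarize_counts(counts):
    """Map a cells-by-genes count matrix to {0, 1}.

    Each entry becomes 1 if the raw count is positive and 0 otherwise.
    Assumes counts are non-negative, as raw scRNA-seq counts are.
    """
    return [[1 if value > 0 else 0 for value in row] for row in counts]


# Toy matrix with 2 cells and 3 genes (hypothetical values, illustration only).
raw_counts = [
    [0, 3, 0],
    [5, 0, 1],
]
binary_counts = binarize_counts(raw_counts)
print(binary_counts)
```

Binarizing discards expression magnitude and keeps only whether a gene was detected in a cell.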
“…The most crucial aspect is the integration of histological imaging information with gene expression data through spatial location. In SRT, gene expression data suffers from issues of sparsity and zero inflation, which are key factors that interfere with downstream analysis [33,77]. Previous research has shown that histological imaging information can predict gene expression data [30][31][32].…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…SRT data are characterized by high sparsity, discreteness, and variance greater than the mean, specifically manifested as a high number of genes expressed at zero (zero inflation) [33]. Previous research has found that the zero-inflated negative binomial (ZINB) distribution can effectively characterize gene expression in SRT [34].…”
Section: Feature Reconstruction; citation type: mentioning
confidence: 99%
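
For reference, the zero-inflated negative binomial distribution named in the statement above has the following standard form. The symbols are assumptions introduced here, not notation from the citing paper: $\pi$ is the zero-inflation probability, $\mu$ the negative binomial mean, and $\theta$ the dispersion parameter.

```latex
\begin{align}
P(Y = 0) &= \pi + (1 - \pi)\left(\frac{\theta}{\theta + \mu}\right)^{\theta}, \\
P(Y = y) &= (1 - \pi)\,\frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!}
            \left(\frac{\theta}{\theta + \mu}\right)^{\theta}
            \left(\frac{\mu}{\theta + \mu}\right)^{y}, \qquad y = 1, 2, \dots
\end{align}
```

The negative binomial component has variance $\mu + \mu^{2}/\theta$, which exceeds the mean $\mu$, and the extra mass $\pi$ at zero accounts for the zero inflation the statement describes.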
“…To study the relationship of gene-expression differentiation programs of SI TRM to anatomical location in situ, we performed spatial transcriptomic profiling (Xenium, 10X Genomics) 17,18 on SI from male mice adoptively transferred with female P14 CD8 T cells and analyzed over the course of LCMV infection (days 6, 8, 30, and 90). A 350-plex target gene panel was designed using a reference single nuclei RNAseq dataset to inform a prioritization algorithm based on predictive deep learning 19 (Extended Data Fig. 1b).…”
Section: A Spatial Framework for SI TRM; citation type: mentioning
confidence: 99%