Spatially resolved transcriptomics technologies enable the mapping of multiplexed gene expression profiles within tissue contexts. To explore spatial gene expression patterns in complex tissues, computational methods have been developed to identify spatially variable genes within single tissue slices. However, methods are lacking for identifying genes with differential spatial expression patterns (DSEPs) across multiple slices or conditions, a setting that is becoming increasingly common in complex experimental designs. The challenges include the complexity of jointly modeling gene expression and spatial information across slices, scalability issues in constructing large-scale cell graphs, and the mixture of factors underlying inter-slice heterogeneity. We propose DSEP gene identification as a new task and develop River, an interpretable deep learning-based method, to solve it. River comprises a two-branch prediction model architecture and a post-hoc attribution method that prioritizes DSEP genes explaining condition differences. River’s design for modeling spatially informed gene expression makes it scalable to large-scale spatial omics datasets. We also propose strategies to decouple the spatial and non-spatial components of River’s outcomes. We validated River’s performance using simulated datasets and applied it to identify DSEP genes/proteins in diverse biological contexts, including embryo development, diabetes-induced alterations in spermatogenesis, and lupus-induced splenic changes. In a human triple-negative breast cancer dataset, River identified generalizable survival-related DSEPs, which were validated in unseen patient groups. River does not rely on specific data distribution assumptions and is compatible with various spatial omics data types, making it a versatile method for analyzing complex tissue architectures across multiple biological conditions.