Recently, RNA velocity has driven a paradigm shift in single-cell RNA sequencing (scRNA-seq) studies, allowing the reconstruction and prediction of directed trajectories in cell differentiation and state transitions. However, most existing methods model the dynamics of each gene separately with ordinary differential equations (ODEs), which can lead to erroneous results because such models cannot fully capture the intrinsically stochastic nature of transcriptional dynamics governed by a cell-specific latent time shared across genes. Here, we present SDEvelo, a novel deep generative approach that infers RNA velocity by modeling the dynamics of unspliced and spliced RNAs with multivariate stochastic differential equations (SDEs). Uniquely, SDEvelo explicitly models the inherent uncertainty in transcriptional dynamics while estimating a cell-specific latent time across genes. Using simulated data and four scRNA-seq and spatial transcriptomics datasets, we show that SDEvelo can model the random dynamic patterns of mature-state cells while accurately detecting carcinogenesis. Additionally, the estimated gene-shared latent time can facilitate many downstream analyses for biological discovery. We demonstrate that SDEvelo is computationally scalable and applicable to both scRNA-seq and sequencing-based spatial transcriptomics data.
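To make the contrast between ODE and SDE modeling concrete, the following is a minimal sketch, not SDEvelo's actual model or implementation. It assumes the commonly used single-gene splicing kinetics (transcription rate `alpha`, splicing rate `beta`, degradation rate `gamma`, all held constant) and adds Gaussian noise terms with hypothetical scales `sigma_u` and `sigma_s`, integrated with the Euler-Maruyama scheme. The function name, parameter values, and the clipping of abundances at zero are illustrative assumptions. Setting both noise scales to zero recovers the deterministic ODE.

```python
import math
import random


def simulate_unspliced_spliced(alpha, beta, gamma, sigma_u, sigma_s,
                               t_end=10.0, n_steps=1000, seed=0):
    """Euler-Maruyama simulation of one gene's unspliced (u) and spliced (s) RNA.

    Assumed illustrative dynamics (not taken from the SDEvelo paper):
        du = (alpha - beta * u) dt + sigma_u dW_u
        ds = (beta * u - gamma * s) dt + sigma_s dW_s
    Returns a list of (t, u, s) tuples of length n_steps + 1.
    """
    rng = random.Random(seed)          # local generator for reproducibility
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)            # Brownian increments scale with sqrt(dt)
    u, s = 0.0, 0.0                    # start with no RNA
    path = [(0.0, u, s)]
    for k in range(1, n_steps + 1):
        du = (alpha - beta * u) * dt + sigma_u * sqrt_dt * rng.gauss(0.0, 1.0)
        ds = (beta * u - gamma * s) * dt + sigma_s * sqrt_dt * rng.gauss(0.0, 1.0)
        # Clip at zero so abundances stay non-negative (a simplification).
        u = max(u + du, 0.0)
        s = max(s + ds, 0.0)
        path.append((k * dt, u, s))
    return path


# Deterministic (ODE) trajectory versus one stochastic (SDE) realization.
ode_path = simulate_unspliced_spliced(2.0, 1.0, 0.5, 0.0, 0.0, t_end=40.0)
sde_path = simulate_unspliced_spliced(2.0, 1.0, 0.5, 0.3, 0.3, t_end=40.0)
```

In this sketch the noise-free trajectory settles at the steady state `u = alpha / beta`, `s = alpha / gamma`, whereas the stochastic trajectory keeps fluctuating around it, which is the kind of random behavior in mature-state cells that a purely deterministic model cannot represent.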