Deep learning has recently attracted significant attention as a method for improving catchment modeling. Compared with process‐based models, however, deep learning is often criticized for its lack of interpretability. One solution is to combine a process‐based hydrological model with a deep‐learning residual error model, so that the strengths of each are exploited. In classical residual error models, Bayesian inference via Markov chain Monte Carlo (MCMC) is commonly used to estimate uncertainty. However, deep neural networks typically have very large numbers of parameters, which makes MCMC sampling impractical. Here, we introduce an alternative to Bayesian MCMC sampling, stochastic variational inference (SVI), which has recently been developed for Bayesian deep learning in natural language processing. We implement SVI in a Long Short‐Term Memory (LSTM) network and use it to construct residual error models for process‐based hydrological models. The approach is evaluated in two catchments in China with contrasting geographical and climatic characteristics, the Tangnaihai catchment and the Shiquan catchment. Compared with a Bayesian linear regression model, the Bayesian LSTM provides better uncertainty estimates. Specifically, the proposed method improves the Continuous Ranked Probability Score (CRPS) by over 10% in both catchments. In the Tangnaihai catchment, it yields uncertainty intervals that are more than 10% narrower in terms of Sharpness, with slightly better Reliability. In the Shiquan catchment, it yields comparable uncertainty intervals with better Reliability. Further, our study highlights the scalability of SVI to high‐dimensional parameter spaces in hydrological applications (e.g., distributed hydrological models, groundwater models).
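For readers unfamiliar with the CRPS, the sketch below shows one common way to estimate it from posterior predictive draws, using the identity CRPS = E|X - y| - 0.5 E|X - X'|, where X and X' are independent draws from the predictive distribution and y is the observation. It is an illustration only and not necessarily the estimator used in this study; the function names `crps_ensemble` and `relative_improvement` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def crps_ensemble(samples, obs):
    """Sample-based CRPS estimate for a single observation.

    Assumption: `samples` are draws from the predictive distribution
    (e.g., simulated streamflow plus sampled residual errors).
    Lower values indicate a better probabilistic prediction.
    """
    samples = np.asarray(samples, dtype=float)
    # E|X - y|: mean absolute error of the draws against the observation
    term_obs = np.mean(np.abs(samples - obs))
    # E|X - X'|: mean absolute difference over all pairs of draws
    term_spread = np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term_obs - 0.5 * term_spread


def relative_improvement(crps_baseline, crps_new):
    """Fractional CRPS reduction of a new model relative to a baseline."""
    return (crps_baseline - crps_new) / crps_baseline
```

Averaging `crps_ensemble` over all time steps gives a single score per model, and `relative_improvement` then expresses the difference between two models as a fraction of the baseline score. How the percentage improvements reported here were computed is not specified in this abstract, so this convention is an assumption.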