One of the main goals of pre‐stack seismic inversion is the estimation of elastic properties (i.e. P‐ and S‐wave velocities and density) and litho‐fluid classes in the investigated area. To this end, many inversion strategies have been proposed, but the most popular is a two‐step approach: first, elastic properties are inferred from pre‐stack data, and then a classification algorithm converts the outcomes of the first stage into litho‐fluid facies. In this work, we propose an alternative approach based on recurrent neural networks. We train two bidirectional long short‐term memory networks to approximate the inverse mappings from pre‐stack seismic data to elastic properties and to litho‐fluid classes. In the elastic inversion, we also use a Monte Carlo simulation approach to propagate onto the model space the uncertainties related both to noise contamination in the data and to the modelling error introduced by the network approximation. One crucial aspect of any machine‐learning inversion strategy is the definition of an appropriate training set. In this case, the models forming the training and validation examples are drawn from previously defined elastic and facies prior models derived from actual well log recordings. In particular, we assume a Gaussian‐mixture elastic prior, and we also take into account the uncertainties affecting the estimation of the transition probabilities of facies. We invert each seismic gather independently, and in this context, the generation of the training set and the learning process can be accomplished with very limited computational effort on a common notebook computer. Once trained, the networks estimate the elastic properties, the litho‐fluid facies and the related uncertainties from the pre‐stack data in near real‐time. Synthetic and field data inversions are used to validate the proposed method.
The network predictions are also benchmarked against the outcomes of a more standard two‐step approach that combines a linear elastic inversion with a subsequent point‐wise Bayesian classification. Our results demonstrate that the implemented algorithm provides more accurate elastic property estimations and facies predictions than the standard inversion strategy. In particular, the predictions provided by the long short‐term memory networks are less affected by erroneous assumptions about the noise statistics and the prior model, and by errors in the estimated source wavelet.
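To make the training‐set generation more concrete, the following minimal Python sketch draws facies profiles from a first‐order Markov chain and then samples elastic properties from the Gaussian component associated with each facies, which together give a Gaussian‐mixture elastic prior. It is an illustration under stated assumptions, not the implementation used in this work: the facies names, the transition matrix, the means and standard deviations, and the function names are all hypothetical, and the three elastic properties are sampled independently for simplicity, whereas a well‐log‐derived prior would normally include their correlations. The uncertainty on the transition probabilities and the forward modelling that converts each model into a seismic gather are omitted.

```python
import random

# Hypothetical facies labels (illustrative only, not taken from the paper).
FACIES = ["shale", "brine_sand", "gas_sand"]

# Hypothetical downward transition matrix: row = current facies,
# column = facies of the next (deeper) sample. Each row sums to 1.
# Brine sand never passes directly into gas sand going downward.
TRANSITION = [
    [0.90, 0.06, 0.04],
    [0.10, 0.90, 0.00],
    [0.08, 0.07, 0.85],
]

# Hypothetical Gaussian component per facies: (mean, std) for
# Vp [m/s], Vs [m/s] and density [g/cm^3]. Independent properties
# are a simplification; a full covariance would correlate them.
COMPONENTS = {
    0: [(3000.0, 100.0), (1500.0, 80.0), (2.45, 0.03)],
    1: [(3200.0, 120.0), (1800.0, 90.0), (2.30, 0.03)],
    2: [(2900.0, 120.0), (1850.0, 90.0), (2.15, 0.03)],
}


def sample_facies(n_samples, rng):
    """Draw one vertical facies profile from the Markov chain."""
    state = rng.randrange(len(FACIES))  # uniform initial facies (assumption)
    profile = [state]
    for _ in range(n_samples - 1):
        state = rng.choices(range(len(FACIES)), weights=TRANSITION[state])[0]
        profile.append(state)
    return profile


def sample_elastic(profile, rng):
    """Draw (Vp, Vs, density) at each sample from its facies component."""
    return [
        tuple(rng.gauss(mean, std) for mean, std in COMPONENTS[facies])
        for facies in profile
    ]


def make_training_models(n_models, n_samples, seed=0):
    """Return a list of (facies_profile, elastic_profile) pairs."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        facies = sample_facies(n_samples, rng)
        models.append((facies, sample_elastic(facies, rng)))
    return models
```

In a complete workflow, each sampled elastic profile would be passed through a forward‐modelling operator and contaminated with noise to obtain the input gather, while the facies and elastic profiles would serve as the targets for the two networks.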