Oilfield production forecasting plays a crucial role in petroleum production. However, traditional methods, such as the Arps decline curve, merely reflect the trend of production history and fail to account for the impact of adjustments to production measures. Numerical simulation methods can account for a wide range of factors, but their complexity and high computational cost limit their practical application. This study proposes a method for oil well production forecasting that integrates the attention mechanism with the Seq2Seq structure. Using data from multiple wells in the ultrahigh water cut phase of the SL oilfield in China, a learning sample data set is established, with dynamic time warping (DTW) used for feature sorting. A transformer-based Seq2Seq deep learning model that accounts for the impact of liquid lifting measures is then established for production forecasting. Comprehensive testing validates the performance of the model: the coefficients of determination on the training, validation, and testing sets are 0.9730, 0.9649, and 0.9461, respectively, clearly outperforming the traditional long short-term memory (LSTM) model. Taking nine sample wells as an example, the model predicts oil production for the next 12 months under different liquid lifting magnitudes, and a genetic algorithm is used to determine the optimal liquid lifting magnitude and the corresponding monthly average incremental oil production for each well. The results align well with the empirical trends of the target oilfield, demonstrating that the model can serve as an effective tool for predicting well production and analyzing liquid lifting potential.
By optimizing the liquid lifting magnitudes of oil wells on the basis of accurate oil production predictions, the model supports intelligent decision-making and improved production strategies, offering practical value for oilfield management.
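
As an illustration of the DTW-based feature sorting mentioned above, the following is a minimal sketch rather than the procedure used in this study. It assumes that each candidate feature is a (normalized) time series and that features are ranked by their DTW distance to the oil production series, with a smaller distance indicating greater similarity. The function names `dtw_distance` and `rank_features_by_dtw`, the feature names, and the sample values are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using absolute difference as the local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_features_by_dtw(features, target):
    """Assumed ranking rule: sort feature names by ascending DTW distance
    to the target series (most similar first)."""
    return sorted(features, key=lambda name: dtw_distance(features[name], target))


# Hypothetical, illustrative values only (not data from the SL oilfield)
oil_rate = [1.0, 2.0, 2.0, 3.0, 4.0]
candidates = {
    "water_cut": [9.0, 9.0, 9.0, 9.0],
    "liquid_rate": [1.0, 2.0, 3.0, 4.0],
}
ranking = rank_features_by_dtw(candidates, oil_rate)
```

Because DTW allows sequences to be stretched or compressed along the time axis, it can compare series whose responses are shifted or differ in length, which a pointwise distance cannot do. The dynamic program above runs in time proportional to the product of the two sequence lengths.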