Sea surface temperature (SST) is one of the most important and widely used physical parameters in oceanography and meteorology. Besides direct measurement, remote sensing, and numerical models, a variety of data-driven models for obtaining SST have been developed as a wealth of SST data has accumulated. Because the ocean is a complex dynamic system, the distribution and variation of SST are affected by many factors, which makes accurate prediction challenging. To address this challenge and improve prediction accuracy, a multi-variable long short-term memory (LSTM) model is proposed that takes wind speed and sea-level air pressure, together with SST, as inputs. Furthermore, two attention mechanisms are introduced to optimize the model: an interdimensional attention strategy, similar to a positional encoding matrix, is used to focus on important historical moments of the multi-dimensional input, and a self-attention strategy is adopted to smooth the data during training. Forty-three years of monthly mean SST and meteorological data from the fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5) are collected to train and test the model for the sea areas around China. The performance of the model is evaluated with four statistical measures, namely the coefficient of determination, root mean squared error, mean absolute error, and mean absolute percentage error, which range over 0.9138–0.991, 0.3928–0.8789, 0.3213–0.6803, and 0.1067–0.2336, respectively. The prediction results indicate that the proposed model is superior to both the LSTM-only model and models that take only SST as input, and confirm that it is promising for oceanographic and meteorological research.
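For readers who want to reproduce the four evaluation statistics named above, the following is a minimal sketch using their standard textbook definitions. The function name `regression_metrics`, its argument names, and the sample values are illustrative assumptions, not taken from the study. The percentage error is returned here as a fraction; whether the reported values are fractions or percentages is not stated in the text.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE for two equal-length sequences.

    Hypothetical helper for illustration only; standard definitions are
    assumed. MAPE is undefined if any observed value is zero.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Reported as a fraction; multiply by 100 for a percentage.
        "mape": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Made-up observed and predicted SST values in degrees Celsius.
observed = [20.0, 22.0, 24.0, 26.0]
predicted = [21.0, 22.0, 23.0, 26.0]
metrics = regression_metrics(observed, predicted)
```

With gridded data, such a function would typically be applied per grid point or per sea area and the resulting values summarized as ranges, but the exact aggregation used in the study is not specified here.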