Accurate forecasting of ocean surface currents is crucial for planning marine activities such as fisheries, shipping, and pollution control. Previous studies have often neglected the spatiotemporal correlations and interdependencies among ocean elements, which leads to suboptimal accuracy in medium- to long-term forecasts, especially in regions with intricate current systems. This paper proposes an adaptive spatiotemporal and multi-element fusion network (ASTMEN) for ocean surface current forecasting. Specifically, we use an improved Swin Transformer (Swin-T) to compute self-attention at each time step, adaptively generating multi-element time series that carry spatial dependencies. We then use a Long Short-Term Memory (LSTM) network to encode and decode these series along the temporal and multi-element feature dimensions, yielding the surface current forecasts. The study area is the Kuroshio region in the northwest Pacific Ocean, and the data come from an ocean reanalysis dataset. Experimental results show that ASTMEN significantly outperforms the baseline model and the climate state method, and that it is the only model whose correlation coefficient remains above 0.8 at day 12. In the summer experiments, when the currents are most variable, ASTMEN provides better forecasts at the sea-land interface and at the junctions of different currents, suggesting that it can help close the gap left by the poor performance of previous methods in complex current fields.
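
To make the day-12 skill criterion concrete, the following is a minimal sketch of how a correlation coefficient could be tracked per forecast lead day. It assumes a plain Pearson correlation over flattened grid values; the abstract does not say which definition is used (an anomaly correlation is equally plausible). The function names `pearson_correlation` and `last_skillful_day` are hypothetical, and the 0.8 threshold is taken from the text above.

```python
import math


def pearson_correlation(forecast, reference):
    """Pearson correlation between two equal-length sequences of grid values.

    Assumption: each field has been flattened to a 1-D list of floats
    (for example, current speed at every ocean grid point).
    """
    if len(forecast) != len(reference) or len(forecast) < 2:
        raise ValueError("inputs must have the same length (>= 2)")
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_r = sum(reference) / n
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(forecast, reference))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_f == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant field")
    return cov / math.sqrt(var_f * var_r)


def last_skillful_day(forecasts_by_day, references_by_day, threshold=0.8):
    """Return the last lead day (1-based) before correlation first drops to
    or below `threshold`; return 0 if day 1 already fails."""
    last = 0
    for day, (fcst, ref) in enumerate(
        zip(forecasts_by_day, references_by_day), start=1
    ):
        if pearson_correlation(fcst, ref) > threshold:
            last = day
        else:
            break
    return last


if __name__ == "__main__":
    # Synthetic toy fields for illustration only; not data from the study.
    forecasts = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
    references = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1]]
    print(last_skillful_day(forecasts, references))
```

Under this reading, a model that "remains above 0.8 at day 12" would return at least 12 from `last_skillful_day` when given twelve or more lead days of forecast and reanalysis fields.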