Long short-term memory (LSTM) networks are widely applied in both academia and industry. However, there is no reliable criterion for selecting LSTM hyperparameters. Although classic methods such as random search and grid search have achieved some success, they still suffer from local optima and convergence problems. In this research, we propose using the grey wolf optimizer (GWO) to search for the hyperparameters of an LSTM, thereby combining the strength of metaheuristics in global optimization with the strength of LSTM in prediction. In this model, the number of hidden-layer nodes and the learning rate of the LSTM are treated as the prey, and the grey wolf pack provides a simple but efficient mechanism for searching for the optimal hyperparameters. Benchmark tests on several basic functions were conducted, and the results were verified through a comparative study against random search, support vector regression, and several other regression methods. We then applied the algorithm to predicting the degradation trend of an airborne fuel pump. The ergodicity and convergence of the algorithm are proved mathematically on the basis of Markov process theory. The benchmark tests show that the GWO-LSTM model performs well on data with a low overall slope and high local fluctuation. In the airborne fuel pump application, the proposed model, trained on a dataset of 5700 points, predicted a sequence of 300 points with a root mean square error of 0.617 after 30 optimization iterations, down from 2.512 before optimization. This result further demonstrates that the proposed algorithm can make predictions with high accuracy. Overall, the effectiveness of the GWO-LSTM model is verified from theoretical proof, through benchmark tests, to application on an actual product.
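The search described above can be illustrated with a minimal sketch of a basic grey wolf optimizer over the two hyperparameters. It is not the implementation used in this work: the function names (`gwo_minimize`, `surrogate_rmse`), the pack size, the search bounds, and the choice to search the learning rate on a log10 scale are all assumptions made for illustration, and the surrogate objective is a hypothetical stand-in for training an LSTM and returning its validation error.

```python
import random


def gwo_minimize(objective, bounds, n_wolves=8, n_iter=30, seed=0):
    """Minimise objective(position) with a basic grey wolf optimizer.

    Returns the best (score, position) pair found. Requires n_wolves >= 3.
    """
    rng = random.Random(seed)
    # Initialise the pack uniformly inside the search box.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # best three (score, position) pairs seen so far

    def rank(current_leaders):
        # Score the current pack and keep the three best candidates
        # (alpha, beta, delta), including the leaders from earlier rounds.
        scored = current_leaders + [(objective(w), list(w)) for w in wolves]
        return sorted(scored, key=lambda pair: pair[0])[:3]

    for t in range(n_iter):
        leaders = rank(leaders)
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        for w in wolves:
            for d, (lo, hi) in enumerate(bounds):
                pulls = []
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a  # exploration vs. exploitation
                    C = 2.0 * rng.random()          # random weight on the leader
                    dist = abs(C * leader[d] - w[d])
                    pulls.append(leader[d] - A * dist)
                # Move to the mean of the three pulls, clipped to the bounds.
                w[d] = min(max(sum(pulls) / len(pulls), lo), hi)
    return rank(leaders)[0]


def surrogate_rmse(position):
    """Hypothetical stand-in for 'train an LSTM, return validation RMSE'."""
    hidden_nodes = round(position[0])  # number of hidden-layer nodes
    log10_lr = position[1]             # learning rate, searched on a log scale
    return ((hidden_nodes - 64) / 64.0) ** 2 + (log10_lr + 2.0) ** 2


# Assumed search box: 8 to 256 hidden nodes, learning rate 1e-4 to 1e-1.
best_score, best_position = gwo_minimize(surrogate_rmse, [(8, 256), (-4, -1)])
print(round(best_position[0]), 10 ** best_position[1], best_score)
```

In an actual run, `surrogate_rmse` would be replaced by a function that trains the LSTM with the candidate hyperparameters and returns its prediction error, so each evaluation is far more expensive than in this sketch. Keeping the three best positions seen so far as leaders is a design choice of the sketch; it ensures the returned score never gets worse between iterations.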