Soybeans have a balanced amino acid profile and high nutritional value. This paper investigates the feasibility of identifying soybeans from three typical origins (Argentina, the United States, and China) using terahertz (THz) spectroscopy optimized by interval partial least squares (iPLS) and combined with chemometrics. First, the THz frequency-domain spectrum was optimized using iPLS. Then, 168 soybean samples were selected as the calibration set, and soybean origin identification models were built using the extreme learning machine (ELM), genetic algorithm support vector machine (GA-SVM), and artificial bee colony algorithm support vector machine (ABC-SVM), each combined with 8 pre-processing techniques. Finally, the models were validated on a test set of 57 samples, and the overall identification accuracy of the ABC-SVM model reached 94.74%. The experimental results showed that, with iPLS optimization and an appropriate pre-processing technique, THz spectroscopy combined with chemometrics could accurately identify the origin of soybeans.

INDEX TERMS Terahertz, origin, pre-processing technique, iPLS, chemometrics

I. INTRODUCTION

Soybeans from different origins differ considerably in appearance, color, nutritional value, and internal chemical composition [1]-[2]. According to the database of the Food and Agriculture Organization of the United Nations, the United States and Argentina were among the world's top three soybean-producing countries in 2019. They are also the main sources of China's soybean imports.