Background: A critical issue for the International English Language Testing System (IELTS) is the validity of the IELTS listening comprehension test (hereafter IELTS LCT). Although the validity of IELTS listening has been investigated, it has not been examined with reference to multiple sources of evidence regarding item-internal factors. To bridge this gap, we investigated its construct validity using structural equation modelling (SEM) and assessed differential item functioning (DIF) through cognitive diagnostic modelling (CDM) and the Mantel-Haenszel (MH) procedure. Methods: Participants first signed a consent form; then 480 participants were administered a proficiency test designed by the University of Cambridge; next, 463 of the 480 participants were administered a 40-item IELTS LCT developed by the University of Cambridge. Finally, the data were analyzed with LISREL to probe the construct validity of the test, and both MH and CDM were used to detect potential DIF items so that the DIF findings would be more reliable. Results: The results of the first study confirmed an appropriate model fit: all four constructs of the IELTS LCT, i.e., gap filling, diagram labelling, multiple choice, and short answer, made a statistically significant contribution to the test. However, construct-related evidence alone may not establish overall validity. Given this, the second study examined DIF items to evaluate the validity of the IELTS LCT: MH detected 15 DIF items, and CDM detected at least 6 and at most 12 DIF items. Conclusions: Because of its international nature and worldwide evaluative role, IELTS needs an approximately (not absolutely) stable factor structure that is invariant across populations and cultures.
Naturally, a test that is highly valid in one context might show some degree of invalidity for some related constructs in another context. With this in mind, we do not recommend treating the perspective taken in this research as a one-size-fits-all model: we make neither generalizations nor broader claims on the basis of the present study.
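For readers unfamiliar with the MH procedure named above, the following is a minimal sketch of the standard Mantel-Haenszel common odds ratio for a single item, not the authors' actual analysis pipeline. The function names, the (a, b, c, d) table layout, and the example counts are hypothetical and chosen only for illustration. Examinees are assumed to be matched into strata by total score, with a reference and a focal group compared within each stratum.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Pooled (common) odds ratio across matched ability strata.

    Each stratum is a tuple (a, b, c, d) of counts for one item:
      a = reference group, correct    b = reference group, incorrect
      c = focal group, correct        d = focal group, incorrect
    A value near 1.0 suggests no uniform DIF on the item.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined: no discordant cells")
    return numerator / denominator


def ets_delta(alpha):
    """Rescale the odds ratio to the ETS delta metric: -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)


# Hypothetical counts for one item across two score strata (illustration only).
example_strata = [(10, 10, 10, 10), (20, 5, 20, 5)]
alpha = mantel_haenszel_odds_ratio(example_strata)
print(alpha, ets_delta(alpha))
```

On the delta metric, a negative value means the item favours the reference group and a positive value means it favours the focal group. A full DIF analysis would also compute the MH chi-square statistic to test significance, which this sketch omits.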