This study examined the equivalence and reliability of the two versions of the Vocabulary Levels Test in an Iranian context. This study was motivated by the fact that the Vocabulary Levels test is increasingly being used in Iran for both research and pedagogical purposes without having been checked for validity and reliability in this context. The equivalence and reliability of the two versions of the test were examined through the parallel-form approach to reliability in Classical True Score theory. Seventy-five intermediate learners of English as a foreign language at the Iran Language Institute took the two versions of the test with one week interval between the two administrations in a counterbalanced fashion. To examine the equivalence of the two versions, the means and variances of the scores obtained for the two tests were compared using paired-sample t-test and one-way ANOVA, respectively. The results of the analyses indicated that the difference between the means of the two versions was significant, and the two versions cannot be considered as parallel forms. To assess the reliability of the two versions, the correlation between the scores obtained from them was estimated using Pearson Product Moment correlation. The results of the analyses showed that the two versions are highly correlated and are reliable tests. It is concluded that the two versions should not be treated as equivalent in longitudinal and gain score studies.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.