EDUCATIONAL TESTING SERVICE, ETS, the ETS logo, GRE, TOEFL, and the TOEFL logo are registered trademarks of Educational Testing Service.
Abstract

We estimated the reliability of scores on four forms of the Test of English as a Foreign Language (TOEFL®) using a hybrid IRT model. Overall reliability differed very little between the analysis that assumed the testlet items to be independent and the analysis that modeled their dependence. A larger difference in reliability was found when the test sections were analyzed individually. There we found as much as a 40 percent overestimate of reliability for the reading comprehension testlets, with the longer testlets of the newest form of the TOEFL test showing the most local dependence. The listening comprehension testlets exhibited much less local dependence. We also found that the test was unidimensional enough for the use of univariate IRT to be efficacious, and that the reading comprehension testlets showed essentially no differential functioning by sex.

This paper will be appearing in the journal Educational and Psychological Measurement, and all references to it should be to that source.

1 This research was supported by the TOEFL Policy Council and we are pleased to acknowledge its help. In addition, the data we analyzed were made available to us by Linda Tang, who also advised us about some of their idiosyncratic characteristics; the data files themselves were prepared by Stephen Laue. We are grateful to Dan Eignor, Jacqueline Ross, and Mary Schedl for their careful reading of an earlier draft that helped us to correct some errors and allowed us to say better what we meant; also for most of the semicolons. Last, we would like to thank David Thissen for his always helpful advice. Of course none of these people or organizations should be held responsible for any errors herein; we claim total responsibility for those ourselves.
This work was accomplished while the second author was an ETS post-doctoral fellow; his current address is 5804 Zone 5, Pimville, Johannesburg 1808, Republic of South Africa.

The Test of English as a Foreign Language (TOEFL®) was developed in 1963 by the National Council on the Testing of English as a Foreign Language. The Council was formed through the cooperative effort of more than 30 public and private organizations concerned with testing the English proficiency of nonnative speakers of the language applying for admission to institutions in the United States. In 1965, Educational Testing Service (ETS®) and the College Board assumed joint responsibility for the program. In 1973, a cooperative arrangement for the operation of the program was entered into by ETS, the College Board, and the Graduate Record Examinations (GRE®) Board. The membership of the College Board is composed of schools, colleges, school systems, and educational associations; GRE Board members are associated with graduate education.

ETS administers the TOEFL program under the general direction of a Policy Council that was established by, and is affiliated with, the sponsoring organizations. Members of...