Since Cox and Vargas (1966) introduced their pretest-posttest validity index for criterion-referenced test items, a great number of additions and modifications have followed. All are based on the idea of gain scoring; that is, they are computed from the differences between proportions of pretest and posttest item responses. Although the method is simple and generally considered the prototype of criterion-referenced item analysis, it has many serious disadvantages. Some of these go back to the fact that it leads to indices that presuppose a dual test administration and rest on population-dependent item p values. Others have to do with the global information about discriminating power that these indices provide, the implicit weighting they suppose, and the meaningless maximization of posttest scores they lead to. Analyzing the pretest-posttest method from a latent trait point of view, it is proposed to replace indices like Cox and Vargas' Dpp by an evaluation of the item information function at the mastery score. An empirical study was conducted to compare the differences in item selection between the two methods.

Thanks are due to Fred N. Kerlinger, Gideon J. Mellenbergh, Robert F. van Naerssen, and Egbert Warries for their helpful comments; to Hans van Aalst, Fred Boesenkool, Kees Hellingman, Ton Heuvelman, Rien Steen, Niels Veldhuizen, Ronny Wierstra, and Theo Wubbels for participating in the empirical study and for computational assistance; and to Paula Achterberg for typing the manuscript.

As in any other area of educational and psychological measurement, more attention has been paid to reliability than to validity aspects of criterion-referenced measurement. Several test parameters have been proposed and compared with their norm-referenced counterparts, assessment methods have been introduced and examined using both real and simulated data, and the criterion-referenced reliability problem seems well on its way to a great diversity of solutions (Hambleton & Novick, 1973; Huynh, 1976a, 1976b; Livingston, 1972; Marshall, 1975). That less powerful efforts have been made to tackle the validity problem may in part be due to a standpoint advocated by, for example, Millman (1974). According to this standpoint, criterion-referenced validity is the same as content validity, and to establish it the construction of a well-defined domain of items is sufficient. Once an item is included in the domain, no empirical information or item analysis can or should alter its validity.
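To make the contrast drawn in the abstract concrete, the short sketch below computes a gain-score index of the Cox-Vargas type (posttest proportion correct minus pretest proportion correct for a single item) and evaluates an item information function at a mastery score. The choice of the two-parameter logistic model, the item parameters, the response vectors, and the mastery score are illustrative assumptions only; this excerpt does not specify which latent trait model the paper adopts.

    import numpy as np

    def dpp(pretest, posttest):
        """Gain-score index of the Cox-Vargas type: posttest proportion
        correct minus pretest proportion correct for one item (0/1 scores)."""
        return np.mean(posttest) - np.mean(pretest)

    def item_information_2pl(theta, a, b):
        """Item information of a two-parameter logistic item (assumed model):
        I(theta) = a^2 * P(theta) * (1 - P(theta))."""
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return a ** 2 * p * (1.0 - p)

    # Hypothetical data: responses of ten examinees to one item, pre and post.
    pre = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
    post = np.array([1, 1, 1, 0, 1, 1, 0, 1, 1, 0])
    theta_m = 0.5  # assumed mastery score on the latent scale

    print("Dpp:", dpp(pre, post))                                # 0.7 - 0.3 = 0.4
    print("I(theta_m):", item_information_2pl(theta_m, a=1.2, b=0.0))

The gain-score index depends only on the two observed proportions and hence on the particular groups tested, whereas the information function is evaluated on the latent scale at the mastery score, which is the shift the abstract proposes.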
These authors distinguish three approaches to the validity problem. The first is the aforementioned item form or item generation rule approach. In it, the fixed syntactical structure and variable elements of item sentences are used to define domains and, eventually with the aid of the computer, to sample items. Item validity is automatically guaranteed because the definition of the domain and the construction of the items are accomplished by the same set of rules. The second approach is a judgmental procedure in which content specialists are retained. The judgmental task may assume ...
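To give a concrete, if oversimplified, picture of such a generation rule, the sketch below uses a fixed sentence frame with two variable slots: the domain is the set of all possible slot fillings, and a test is assembled by sampling items from it by computer. The arithmetic frame and the replacement sets are invented for illustration and are not taken from the paper.

    import itertools
    import random

    # Illustrative item form: a fixed sentence frame with variable elements.
    # The domain is every item obtainable by filling the slots; items are
    # then sampled from that domain to assemble a test.
    frame = "What is {a} + {b}?"
    replacement_sets = {"a": range(1, 10), "b": range(1, 10)}

    domain = [frame.format(a=a, b=b)
              for a, b in itertools.product(replacement_sets["a"],
                                            replacement_sets["b"])]

    test_items = random.sample(domain, k=5)  # computer-sampled items
    print(test_items)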