2010
DOI: 10.3847/aer2010024
Do Concept Inventories Actually Measure Anything?

Abstract: Although concept inventories are among the most frequently used tools in the physics and astronomy education communities, they are rarely evaluated using item response theory (IRT). When IRT models fit the data, they offer sample-independent estimates of item and person parameters. IRT may also provide a way to measure students' learning gains that circumvents some known issues with Hake's normalized gain. In this paper, we review the essentials of IRT while simultaneously applying it to the Star Properties Concept Inventory …
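For reference, Hake's normalized gain mentioned in the abstract is conventionally defined as follows (a standard formulation, not quoted from this paper), where the angle brackets denote class-average pretest and posttest percentage scores:

    g = \frac{\langle \mathrm{post} \rangle - \langle \mathrm{pre} \rangle}{100\% - \langle \mathrm{pre} \rangle}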

Cited by 60 publications (77 citation statements). References 48 publications.
“…13,37 Finally, item discrimination is calculated with the point-biserial correlation between student performance on the item and their overall performance on the NGCI. 25,34 For item discrimination, anything greater than 0.30 indicates that students' scores on that item are well-correlated with their total scores. 13,25 Additional information from student response patterns and interviews is critical for providing context to these statistics and to further the discussion of the validity of the NGCI.…”
Section: Methods (mentioning; confidence: 99%)
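As a concrete illustration of the discrimination statistic described in that passage, here is a minimal Python sketch (the function name and the use of NumPy are illustrative assumptions, not the cited authors' code); it correlates each dichotomous item score with students' total scores:

    import numpy as np

    def item_discrimination(responses):
        # responses: (n_students, n_items) array of 0/1 item scores.
        # For a 0/1 variable, the point-biserial correlation equals the
        # Pearson correlation, so np.corrcoef computes it directly.
        x = np.asarray(responses, dtype=float)
        total = x.sum(axis=1)  # overall performance, as in the quote
        return np.array([np.corrcoef(x[:, j], total)[0, 1]
                         for j in range(x.shape[1])])

    # Values above 0.30 are conventionally taken as acceptable discrimination.

A common refinement, not used in the quoted passage, is to correlate each item with the rest-of-test total (total minus that item) so the item does not inflate its own correlation.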
“…Values over 0.70 are conventionally accepted as being internally reliable. 25,36 Item difficulty is calculated as the percentage of students who answered incorrectly, such that a very high percentage indicates that most students answered incorrectly (i.e., the item may be too difficult) and a very low percentage indicates that most students answered correctly (i.e., the item may be too easy). In order for a survey to be of appropriate difficulty, most items should have difficulty values between 0.20 and 0.80.…”
Section: Methods (mentioning; confidence: 99%)
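A matching Python sketch for the classical-test-theory statistics in this passage. Note the reliability coefficient is an assumption: the excerpt names only the 0.70 convention, and Cronbach's alpha is used here as the common choice for internal consistency.

    import numpy as np

    def item_difficulty(responses):
        # Fraction of students answering each item incorrectly, following
        # the quoted convention (many texts instead report fraction correct).
        return 1.0 - np.asarray(responses, dtype=float).mean(axis=0)

    def cronbach_alpha(responses):
        # Internal-consistency estimate: alpha = k/(k-1) * (1 - sum of
        # item variances / variance of total scores).
        x = np.asarray(responses, dtype=float)
        k = x.shape[1]
        item_var = x.var(axis=0, ddof=1).sum()
        total_var = x.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1.0 - item_var / total_var)

    # Flag items outside the quoted difficulty range of 0.20-0.80 as
    # candidates for being too easy or too hard.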
“…A recent study in astronomy 37 has pointed out the weaknesses of measures based on classical test theory and normalized learning gain calculations, and the advantages of Rasch-based calculations. In the present work, the Rasch learning gain, RLG, will be calculated in the same manner (eq 5) used by Wallace and Bailey.…”
Section: Journal of Chemical Education (mentioning; confidence: 99%)
“…In the present work, the Rasch learning gain, RLG, will be calculated in the same manner (eq 5) used by Wallace and Bailey. 37 This difference calculation uses the Rasch estimate of ability, not the raw score. This is acceptable here because the Rasch ability estimates represent true linear measures and are suitable for use in a difference calculation, 23 where raw scores are not.…”
Section: Journal of Chemical Education (mentioning; confidence: 99%)
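A minimal Python sketch of the Rasch learning gain as these passages describe it: estimate ability by maximum likelihood given item difficulties, then difference the pre- and post-test abilities. This is an illustrative implementation under standard Rasch assumptions, not Wallace and Bailey's eq 5 verbatim; the function names and the known-difficulties setup are assumptions.

    import numpy as np

    def rasch_ability(x, b, tol=1e-6, max_iter=50):
        # MLE of ability theta for a 0/1 response vector x, given item
        # difficulties b (in logits). Zero and perfect scores have no
        # finite MLE in the Rasch model.
        x, b = np.asarray(x, float), np.asarray(b, float)
        r = x.sum()
        if r == 0 or r == len(x):
            raise ValueError("no finite MLE for a zero or perfect score")
        theta = 0.0
        for _ in range(max_iter):
            p = 1.0 / (1.0 + np.exp(-(theta - b)))   # Rasch success probs
            step = (r - p.sum()) / p.dot(1.0 - p)    # Newton-Raphson update
            theta += step
            if abs(step) < tol:
                break
        return theta

    def rasch_learning_gain(pre, post, b):
        # RLG = theta_post - theta_pre. Taking a simple difference is
        # defensible because Rasch abilities lie on a linear (interval)
        # logit scale, unlike raw scores.
        return rasch_ability(post, b) - rasch_ability(pre, b)

This makes the quoted point concrete: the gain is computed on the logit (ability) scale, so equal differences mean the same amount of measured learning anywhere on the scale, which raw-score differences cannot guarantee.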