2002 Annual Conference Proceedings
DOI: 10.18260/1-2--10943

Rubric Development And Inter Rater Reliability Issues In Assessing Learning Outcomes

Abstract: This paper describes the development of rubrics that help evaluate student performance and relate that performance directly to the educational objectives of the program. Issues in accounting for different constituencies, selecting items for evaluation, and minimizing time required for data analysis are discussed. Aspects of testing the rubrics for consistency between different faculty raters are presented, as well as a specific example of how inconsistencies were addressed. Finally, a consideration of the dif…

Cited by 24 publications (18 citation statements) | References 6 publications
“…Chance agreement given four scoring levels and three graders would be .34 and .06 by the liberal and conservative definitions, respectively. Newell et al (2002) found comparable levels of agreement for three graders using a rubric for grading students' solutions of chemical engineering problems, a task that was not writing based. The rubric developed by Newell et al also had four scoring levels within each dimension.…”
Section: Results (mentioning)
confidence: 90%
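The .34 and .06 chance-agreement figures quoted above can be checked by brute-force enumeration. A minimal sketch, assuming four equally likely integer levels (coded 0-3) and three independent graders; "liberal" means all three scores fall within 1 point of one another and "conservative" means exact agreement, following the quoted definitions:

```python
from itertools import product

LEVELS = range(4)  # four scoring levels, assumed coded 0-3
GRADERS = 3        # three independent graders

triples = list(product(LEVELS, repeat=GRADERS))  # all 64 score combinations

# Liberal definition: all three scores within 1 point of one another.
liberal = sum(max(t) - min(t) <= 1 for t in triples) / len(triples)

# Conservative definition: all three scores identical.
conservative = sum(len(set(t)) == 1 for t in triples) / len(triples)

print(f"liberal chance agreement:      {liberal:.2f}")       # 0.34
print(f"conservative chance agreement: {conservative:.2f}")  # 0.06
```

The enumeration gives 22/64 ≈ .34 for the liberal definition and 4/64 ≈ .06 for the conservative one, matching the values reported in the quote.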
“…Agreement was defined liberally as all scores assigned for a dimension by the three graders being within 1 point of one another. These criteria are accepted in the measurement literature (Tinsley & Weiss, 2000) and have been applied in past studies of interrater agreement for grading rubrics (Newell et al, 2002). Agreement on total overall score out of 24 possible points (8 dimensions × 3 points maximum for each) for the 40 papers was also calculated and is described in the results.…”
Section: Evaluating Interrater Agreement (mentioning)
confidence: 99%
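A sketch of the agreement computation described in the quoted passage, assuming a score array of shape (papers, dimensions, graders) with integer levels 0-3. The data here are random and purely illustrative, and applying the same within-1-point band to the 24-point totals is an assumption for illustration; the quoted passage does not specify the criterion used for totals:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical scores: 40 papers x 8 dimensions x 3 graders, levels 0-3.
scores = rng.integers(0, 4, size=(40, 8, 3))

# Liberal agreement per dimension: the three graders' scores span <= 1 point.
spread = scores.max(axis=2) - scores.min(axis=2)  # shape (40, 8)
dimension_agreement = (spread <= 1).mean()

# Total score per grader: sum over the 8 dimensions (24 points maximum).
totals = scores.sum(axis=1)  # shape (40, 3)
total_spread = totals.max(axis=1) - totals.min(axis=1)
total_agreement = (total_spread <= 1).mean()  # within-1-point band, assumed

print(f"per-dimension agreement rate: {dimension_agreement:.2f}")
print(f"total-score agreement rate:   {total_agreement:.2f}")
```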
“…For example, Stellmack, Konheim-Kalkstein, Manor, Massey, and Schmitz (2009) found low interrater agreement (agreement between reviewers in .37 of the scores they assigned) for graders who developed and refined a grading rubric over several months. Newell, Dahm, and Newell (2002) also reported comparably low interrater agreement (.47 proportion of agreement measured in the same way as Stellmack, Konheim-Kalkstein, Manor, Massey, & Schmitz, 2009) in the grading of student writing with a rubric.¹ Indeed, subjectivity and low interrater agreement in evaluating scientific writing are implicitly acknowledged in the peer-review process when an editor seeks reviews from multiple reviewers.…”
(mentioning)
confidence: 93%
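The .37 and .47 figures above are simple proportions of matching scores between graders. A minimal illustrative helper, with hypothetical data not drawn from either study:

```python
def proportion_agreement(scores_a, scores_b):
    """Fraction of items on which two graders assigned the same score."""
    if len(scores_a) != len(scores_b):
        raise ValueError("graders must score the same items")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)

# Hypothetical scores from two graders over ten papers (levels 0-3).
grader_1 = [3, 2, 2, 1, 0, 3, 2, 1, 1, 2]
grader_2 = [3, 1, 2, 1, 1, 3, 2, 0, 1, 2]
print(proportion_agreement(grader_1, grader_2))  # 0.7
```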
“…A decision was made to have a four-level scale for the rubric, which is consistent with other university-wide holistic rubrics & minimizes the tendency to rate in the middle of odd number level scales. 33,38 Listing includes a description of the various performance levels that are used to write the dimension descriptions. 31 Engineering faculty reviewed the Paul-Elder critical thinking framework to identify key Elements of Thought and Universal Intellectual Standards that would be applicable across engineering courses.…”
Section: Development and Initial Validation of a Holistic Engineering Critical Thinking Rubric (mentioning)
confidence: 99%