Although Direct Behavior Ratings (DBRs) were originally conceived of as a marriage of direct behavioral observation and indirect behavior rating scales, recent research has indicated that they are affected by rater idiosyncrasies (rater effects), as are other indirect forms of behavioral assessment. Most of this research has been conducted using generalizability theory (GT), yet another approach, many-facet Rasch measurement (MFRM), has recently been used to illuminate the previously opaque nature of these rater idiosyncrasies. The purpose of this study was to use both approaches (GT and MFRM) to consider rater effects with 126 second- through fifth-grade students who were rated on two DBR-Multi-Item Scales by four raters (22 of these ratings were fully crossed). Results indicated the presence of rater effects and revealed nuances about their nature, including differences across construct domains, items that are potentially more susceptible to rater effects than others, and specific raters who appear to have been more susceptible to rater effects than other raters. These findings further indicate the indirect nature of DBRs and offer potential avenues for addressing and ameliorating rater effects in research and practice.
Impact and Implications
Our study examined whether scores on a popular behavioral assessment, Direct Behavior Ratings-Multi-Item Scales (DBR-MIS), were affected by rater effects, which refer to score differences attributable to rater characteristics rather than student behavior. We used two methodologies that enable a fine-grained examination of these rater effects. Our results illuminated nuanced features of DBR-MIS that hold promise to better address rater effects in future research and practice.