Large-scale educational testing data often contain a vast number of variables pertaining to test takers, schools, or access to educational resources, information that can help explain relationships between test-taker performance and the learning environment. This study examines approaches to incorporating latent and observed explanatory variables as predictors in cognitive diagnostic models (CDMs). Methods for specifying and simultaneously estimating observed and latent variables (the latter estimated using item response theory) as predictors of attribute mastery were examined. Real-world analyses of large-scale international testing data were conducted to demonstrate the application. Simulation studies examined parameter recovery and attribute classification when simultaneously estimating multiple latent predictors (with dichotomous and polytomous items as indicators of the latent construct) and observed predictors across varying sample sizes and numbers of attributes. Results showed stable parameter recovery and consistent attribute classification. Implications for latent predictors and attribute specifications are discussed.
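As a rough illustration of the modeling idea described in this abstract (not the authors' specification), the sketch below simulates a single attribute whose mastery probability depends on one observed covariate and one latent ability through a logistic link, with item responses generated from a DINA-style slip/guess model. All names, coefficient values, and the choice of item model are assumptions made for the example.

```python
# Illustrative sketch only: attribute mastery predicted by an observed covariate
# and a latent (IRT-style) ability; values and model form are hypothetical.
import numpy as np

rng = np.random.default_rng(2024)
n_persons, n_items = 1000, 10

x = rng.binomial(1, 0.5, n_persons)          # observed predictor (e.g., resource access)
theta = rng.normal(0, 1, n_persons)          # latent ability (would be IRT-estimated in practice)

# Logistic regression of mastery on both predictors (hypothetical coefficients)
beta0, beta_x, beta_theta = -0.5, 0.8, 1.2
p_mastery = 1 / (1 + np.exp(-(beta0 + beta_x * x + beta_theta * theta)))
alpha = rng.binomial(1, p_mastery)           # attribute mastery (0/1)

# DINA-like responses: masters answer correctly unless they slip; non-masters guess
slip, guess = 0.1, 0.2
p_correct = np.where(alpha[:, None] == 1, 1 - slip, guess)
responses = rng.binomial(1, p_correct, (n_persons, n_items))

print("Mastery rate:", alpha.mean())
print("Mean score, masters vs non-masters:",
      responses[alpha == 1].mean(), responses[alpha == 0].mean())
```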
Medical schools and certification agencies should consider the implications of assigning weights with respect to composite score reliability and the consequences for pass-fail decisions.
Introduction: Resuscitation skills decay as early as 4 months after course acquisition. Gaps in research remain regarding the ideal educational modalities, timing, and frequency of curricula required to optimize skills retention. Our objective was to evaluate the impact on retention of resuscitation skills 8 months after the Pediatric Advanced Life Support (PALS) course when reinforced by an adjunct simulation-based curriculum 4 months after PALS certification. We hypothesized there would be improved retention in the intervention group. Methods: This is a partial, double-blind, randomized controlled study. First-year pediatric residents were randomized to an intervention or control group. The intervention group participated in a simulation-based curriculum grounded in principles of deliberate practice and debriefing. The control group received no intervention. T-tests were used to compare mean percent scores (M) from simulation-based assessments and multiple-choice tests immediately following the PALS course and after 8 months. Results: The intervention group (n = 12) had improved overall retention of resuscitation skills at 8 months compared with the control group (n = 12) (mean, 0.57 ± 0.05 vs 0.52 ± 0.06; P = 0.037). No significant difference existed between individual skills stations. The intervention group also had greater retention of cognitive knowledge (mean, 0.78 ± 0.09 vs 0.68 ± 0.14; P = 0.049). Residents performed 61% of assessment items correctly immediately following the PALS course. Conclusions: Resuscitation skills acquisition from the PALS course and retention are suboptimal. These findings support the use of simulation-based curricula as course adjuncts to extend retention beyond 4 months.
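For readers unfamiliar with the analysis named in the Methods, the sketch below shows an independent-samples t-test comparing two groups' mean percent scores. The score vectors are simulated placeholders seeded with the summary statistics reported in the abstract, not the study's data.

```python
# Minimal sketch of the group comparison described in the abstract; data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
intervention = rng.normal(0.57, 0.05, 12)   # n = 12; mean and SD taken from the abstract
control = rng.normal(0.52, 0.06, 12)        # n = 12

t_stat, p_value = stats.ttest_ind(intervention, control)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```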
Recent changes to the patient note (PN) format of the United States Medical Licensing Examination have challenged medical schools to improve the instruction and assessment of students taking the Step 2 Clinical Skills examination. The purpose of this study was to gather validity evidence regarding response process and internal structure, focusing on inter-rater reliability and generalizability, to determine whether a locally developed PN scoring rubric and scoring guidelines could yield reproducible PN scores. A randomly selected subsample of historical data (post-encounter PNs from 55 of 177 medical students) was rescored by six trained faculty raters in November-December 2014. Inter-rater reliability (percent exact agreement and kappa) was calculated for five standardized patient cases administered in a local graduation competency examination. Generalizability studies were conducted to examine the overall reliability. Qualitative data were collected through surveys and a rater-debriefing meeting. The overall inter-rater reliability (weighted kappa) was .79 (Documentation = .63, Differential Diagnosis = .90, Justification = .48, and Workup = .54). The majority of score variance was due to case specificity (13%) and case-task specificity (31%), indicating differences in student performance by case and by case-task interactions. Variance associated with raters and their interactions was modest (<5%). Raters felt that justification was the most difficult task to score and that having case- and level-specific scoring guidelines during training was most helpful for calibration. The overall inter-rater reliability indicates a high level of confidence in the consistency of note scores. Designs for scoring notes may optimize reliability by balancing the number of raters and cases.
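To make the weighted-kappa statistic reported above concrete, the sketch below computes quadratic-weighted kappa between two raters' ordinal rubric scores. The score vectors are made-up placeholders, not data from this study.

```python
# Illustrative sketch only: quadratic-weighted kappa for two raters' rubric scores.
from sklearn.metrics import cohen_kappa_score

rater_a = [3, 2, 4, 4, 1, 3, 2, 4, 3, 2]   # hypothetical rubric scores (1-4 scale)
rater_b = [3, 3, 4, 4, 1, 2, 2, 4, 3, 1]

# Quadratic weights penalize larger disagreements more heavily on ordinal scales
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"Weighted kappa: {kappa:.2f}")
```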