An actuarial risk assessment instrument can be considered valid if independent investigations using novel samples can replicate the findings of the instrument's development study. To qualify as a replication, a study has to adhere to the methodological protocol of the development study with respect to key design characteristics and has to follow the manual-recommended guidelines for test administration. A systematic search was conducted to identify predictive validity studies on three commonly used actuarial instruments: the Violence Risk Appraisal Guide (VRAG), the Sex Offender Risk Appraisal Guide (SORAG), and the Static-99. Sample (sex, age, criminal history) and design (follow-up, attrition, recidivism) characteristics, as well as markers of assessment integrity (scoring reliability, item omissions, prorating procedure), were extracted from 84 studies comprising 108 samples. None of the replications matched the development study of the instrument they were attempting to cross-validate with respect to key sample and design characteristics. Furthermore, none of the replications strictly followed the manual-recommended guidelines for the instruments' administration. Additional replication studies that follow the methodological protocols outlined in actuarial instruments' development studies are needed before claims of generalizability can be made.
Intimate partner violence (IPV) is a major public health issue; worldwide, almost 1 in 3 women is affected. Police involvement in IPV cases has substantially increased because of “proarrest” and “procharging” policies and the enforcement of laws protecting victims of domestic violence. In the course of these changes, several front-line instruments have been developed to structure police risk assessment and decision-making strategies in such cases. One of these is the Ontario Domestic Assault Risk Assessment (ODARA). To investigate its validity in a Swiss police setting, a total cohort of male IPV offenders was retrospectively assessed for a fixed time at risk of 5 years. The recidivism base rate was 32% when recidivism was defined as subsequent police-registered IPV. Although ODARA scores were significantly correlated with IPV recidivism, they showed poor discrimination and calibration. Despite comparable base rates of recidivism, the Zurich sample scored significantly higher on the ODARA than the development sample. This mismatch between the expected and observed recidivism rates resulted in an overestimation of risk, especially in the two highest risk bins. Several reasons for these deviations, such as level of intervention, victim’s reporting behavior, and the dynamic nature of IPV, are discussed.
Introduction: The performance of violence risk assessment instruments can be primarily investigated by analysing two psychometric properties: discrimination and calibration. Although many studies have examined the discrimination capacity of the Violence Risk Appraisal Guide (VRAG) and other actuarial risk assessment tools, few have evaluated how well calibrated these instruments are. The aim of the present investigation was to replicate the development study of the VRAG in Europe including measurements of discrimination and calibration.
Method: Using a prospective study design, we assessed a total cohort of violent offenders in the Zurich Canton of Switzerland using the VRAG prior to discharge from prisons, secure facilities, and outpatient clinics. Assessors adhered strictly to the assessment protocol set out in the instrument’s manual. After controlling for attrition, 206 offenders were followed in the community for a fixed period of 7 years. We used charges and convictions for subsequent violent offenses as the outcomes. Receiver operating characteristic analysis was conducted to measure discrimination, and Sanders’ decomposition of the Brier score as well as Bayesian credible intervals were calculated to measure calibration.
Results: The discrimination of the VRAG’s risk bins was modest (area under the curve = 0.72, 95% CI = 0.63–0.81, p < 0.05). However, the calibration of the tool was poor, with Sanders’ calibration score suggesting an average assessment error of 21% in the probabilistic estimates associated with each bin. The Bayesian credible intervals revealed that in five out of nine risk bins the intervals did not contain the expected risk rates.
Discussion: Measurement of the calibration validity of risk assessment instruments needs to be improved, as has been done with respect to discrimination. Additional replication studies that focus on the calibration of actuarial risk assessment instruments are needed. Meanwhile, we recommend caution when using the VRAG probabilistic risk estimates in practice.
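As a rough illustration of the two properties this abstract distinguishes, the sketch below computes an area under the ROC curve (discrimination) and one common form of Sanders' decomposition of the Brier score into a calibration term and a refinement term. The function names and the toy data are assumptions made for illustration; they are not taken from the study, and the study's own software and exact formulas are not described in the abstract.

```python
from collections import defaultdict


def auc(scores, outcomes):
    """Probability that a random recidivist outscores a random non-recidivist.

    Ties count as half a win (Mann-Whitney formulation of the AUC).
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((f - y) ** 2 for f, y in zip(forecasts, outcomes)) / len(outcomes)


def sanders_decomposition(forecasts, outcomes):
    """Split the Brier score into (calibration, refinement).

    Cases are grouped by identical forecast value, which mimics risk bins
    that each carry one published probability estimate.
    """
    groups = defaultdict(list)
    for f, y in zip(forecasts, outcomes):
        groups[f].append(y)
    total = len(outcomes)
    calibration = 0.0
    refinement = 0.0
    for f, ys in groups.items():
        observed = sum(ys) / len(ys)          # observed rate in this bin
        calibration += len(ys) * (f - observed) ** 2
        refinement += len(ys) * observed * (1 - observed)
    return calibration / total, refinement / total


# Hypothetical toy data: three risk bins with assumed probabilities.
toy_forecasts = [0.1] * 4 + [0.5] * 4 + [0.9] * 2
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

if __name__ == "__main__":
    cal, ref = sanders_decomposition(toy_forecasts, toy_outcomes)
    print("AUC:", auc(toy_forecasts, toy_outcomes))
    print("Brier:", brier_score(toy_forecasts, toy_outcomes))
    print("calibration:", cal, "refinement:", ref)
```

In this form the calibration and refinement terms sum to the Brier score, so a tool can rank offenders reasonably well (a decent AUC) while its bin-level probabilities are still off (a large calibration term).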
This study evaluated the validity of the Static-99 and Static-99R in assessing sexual recidivism in Switzerland, based on a sample of 142 male sex offenders. Both tools showed predictive validity, but the Static-99R had better discrimination (OR = 1.82, AUC = .81) and calibration (Brier = .078, P/E = 0.96) than the Static-99. A cut score of four on the Static-99R maximized sensitivity (92.9%) and specificity (60.2%). However, although most offenders (98.7%) with a score < 4 did not commit sexual offenses in the 5-year follow-up period, only one in five (20.3%) offenders with a score ≥ 4 actually recidivated. Furthermore, the predicted number of recidivists in the well above average risk category (Static-99R ≥ 6) was 24% higher than expected in routine samples. The results suggest that the Static-99R may be a useful screening tool to identify low-risk individuals but offenders with scores ≥ 4 should be subjected to a more thorough assessment.
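The cut-score figures in this abstract follow from a 2×2 classification table, and a small sketch shows why high sensitivity can coexist with a low positive predictive value when the base rate is low. The abstract does not report the underlying counts; the counts used below are an assumption, chosen only because they sum to 142 and reproduce the four reported percentages. The function name is hypothetical.

```python
def cut_score_metrics(tp, fn, tn, fp):
    """Classification metrics for a binary cut score.

    tp: recidivists at or above the cut score
    fn: recidivists below the cut score
    tn: non-recidivists below the cut score
    fp: non-recidivists at or above the cut score
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # recidivists correctly flagged
        "specificity": tn / (tn + fp),   # non-recidivists correctly cleared
        "ppv": tp / (tp + fp),           # flagged cases who recidivated
        "npv": tn / (tn + fn),           # cleared cases who did not
        "base_rate": (tp + fn) / total,
    }


# ASSUMED counts (not given in the abstract), consistent with n = 142.
assumed_counts = {"tp": 13, "fn": 1, "tn": 77, "fp": 51}

if __name__ == "__main__":
    for name, value in cut_score_metrics(**assumed_counts).items():
        print(f"{name}: {value:.1%}")
```

Under these assumed counts, only about one in ten offenders recidivates, so even a cut score that catches nearly all recidivists flags many more non-recidivists, which is consistent with the abstract's suggestion to use the tool for screening out low-risk cases.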
The Sexual Sadism Scale (SeSaS) was developed to assist in the diagnosis of sexual sadism, and it revealed adequate psychometric properties in prior research. This study cross-validated the SeSaS in Switzerland using a sample of 179 male sex offenders. Specifically, the SeSaS conformed to a Mokken model of double monotonicity (scalability coefficient [H] = .46, coefficient of reproducibility [CR] = .89), indicating that it measures a unidimensional construct of sexual sadism with hierarchically ordered items. The reliability of the scale was acceptable to high (ρ = .80, λ2 = .75, κ = .88). In addition, the SeSaS was strongly associated with sexual sadism diagnoses based on mental health manuals (r_pb = .60, odds ratio [OR] = 13.02, area under the curve [AUC] = 1) but not with recidivism. The results suggest that the use of the SeSaS may improve the validity and reliability of sexual sadism diagnoses, therefore playing a role in the assessment and management of sex offenders.
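The point-biserial correlation reported in this abstract relates a continuous scale score to a binary diagnosis. A minimal sketch of the standard formula follows; the function name and the example data are hypothetical and not drawn from the study.

```python
import math


def point_biserial(scores, diagnosis):
    """Point-biserial correlation between a score and a 0/1 indicator.

    Uses r_pb = (M1 - M0) / s * sqrt(p * q), where s is the population
    standard deviation of all scores and p is the share of diagnosed cases.
    This equals Pearson's r computed on the same data.
    """
    n = len(scores)
    diagnosed = [s for s, d in zip(scores, diagnosis) if d == 1]
    undiagnosed = [s for s, d in zip(scores, diagnosis) if d == 0]
    p = len(diagnosed) / n
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(diagnosed) / len(diagnosed) - sum(undiagnosed) / len(undiagnosed)
    return mean_diff / sd * math.sqrt(p * (1 - p))


# Hypothetical scale scores and diagnoses, for illustration only.
example_scores = [0, 1, 1, 2, 3, 5, 6, 8]
example_diagnosis = [0, 0, 0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    print(round(point_biserial(example_scores, example_diagnosis), 3))
```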