This article was originally submitted for publication to the Editor of Advances in Methods and Practices in Psychological Science (AMPPS) in 2015. When the submitted manuscript was subsequently posted online (Silberzahn et al., 2015), it received some media attention, and two of the authors were invited to write a brief commentary in Nature advocating for greater crowdsourcing of data analysis by scientists. This commentary, arguing that crowdsourced research "can balance discussions, validate findings and better inform policy" (Silberzahn & Uhlmann, 2015, p. 189), included a new figure that displayed the analytic teams' effect-size estimates and cited the submitted manuscript as the source of the findings, with a link to the preprint. However, the authors forgot to add a citation of the Nature commentary to the final published version of the AMPPS article, or to note that the main findings had been previously publicized via the commentary, the online preprint, research presentations at conferences and universities, and media reports by others. The authors regret the oversight.
Twenty-nine teams involving 61 analysts used the same dataset to address the same research question: whether soccer referees are more likely to give red cards to dark-skin-toned players than to light-skin-toned players. Analytic approaches varied widely across teams, and estimated effect sizes ranged from 0.89 to 2.93 in odds-ratio units, with a median of 1.31. Twenty teams (69%) found a statistically significant positive effect, and nine teams (31%) observed a non-significant relationship. Overall, the 29 different analyses used 21 unique combinations of covariates. We found that neither analysts' prior beliefs about the effect, nor their level of expertise, nor the peer-rated quality of their analyses readily explained the variation in outcomes. This suggests that significant variation in the analysis of complex data may be difficult to avoid, even by experts with honest intentions. Crowdsourcing data analysis, a strategy in which numerous research teams are recruited to investigate the same research question simultaneously, makes transparent how defensible, yet subjective, analytic choices influence research results.
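For readers unfamiliar with odds-ratio effect sizes, the following is a minimal sketch of how one such estimate could be obtained via logistic regression. The data are simulated and the variable names are hypothetical; the actual teams analyzed a rich observational dataset with many candidate covariates.

```python
# A minimal sketch (not any team's actual analysis) of estimating an
# effect in odds-ratio units via logistic regression on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 20000
skin_tone = rng.uniform(0, 1, n)            # hypothetical: 0 = very light, 1 = very dark
# Simulate rare red-card events with a true odds ratio of ~1.3
logit_p = -4.0 + np.log(1.3) * skin_tone
red_card = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(skin_tone)
fit = sm.Logit(red_card, X).fit(disp=0)
print("odds ratio:", np.exp(fit.params[1]))  # exp(coefficient), ~1.3 here
print("p-value:   ", fit.pvalues[1])
```

Different defensible choices (which covariates to adjust for, how to model repeated player-referee pairings) would change the estimate, which is precisely the variation the crowdsourced project documents.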
Three measures of internal consistency – Kuder-Richardson Formula 20 (KR20), Cronbach’s alpha (α), and person separation reliability (R) – are considered. KR20 and α are common measures in classical test theory, whereas R was developed in modern test theory and, more precisely, in Rasch measurement. All three measures specify the observed variance as the sum of true variance and error variance. However, they differ in how these quantities are obtained. KR20 uses the error variance of an “average” respondent from the sample, which overestimates the error variance of respondents with high or low scores. Conversely, R uses the actual average error variance of the sample. KR20 and α use respondents’ test scores in calculating the observed variance. This is potentially misleading because test scores are not linear representations of the underlying variable, whereas the calculation of variance requires linearity. In contrast, if the data fit the Rasch model, the measures estimated for each respondent are on a linear scale and thus numerically suitable for calculating the observed variance. Given these differences, R is expected to be a better index of internal consistency than KR20 and α. The present work compares the three measures on simulated data sets with dichotomous and polytomous items. It is shown that all estimates of internal consistency decrease as the skewness of the score distribution increases, with R decreasing to a larger extent. Thus, R is more conservative than KR20 and α, and prevents test users from believing that a test has better measurement characteristics than it actually has. In addition, it is shown that Rasch-based infit and outfit person statistics can be used for handling data sets containing random responses. Two options are described. The first involves computing a more conservative estimate of internal consistency. The second involves detecting individuals with random responses. When there are a few individuals with a substantial number of random responses, infit and outfit correctly detect almost all of them. Once these individuals are removed, a “cleaned” data set is obtained that can be used to compute a less biased estimate of internal consistency.
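As a concrete illustration, here is a minimal sketch of KR20 and Cronbach's alpha computed on dichotomous responses simulated under a Rasch model. All names are illustrative; person separation reliability R would additionally require estimating Rasch person measures and their standard errors, which is omitted here.

```python
# A minimal sketch of KR20 and Cronbach's alpha (rows = respondents,
# columns = items) on data simulated under a dichotomous Rasch model.
import numpy as np

rng = np.random.default_rng(1)
ability = rng.normal(0, 1, size=500)          # person parameters (theta)
difficulty = np.linspace(-2, 2, 20)           # item parameters (beta)
p = 1 / (1 + np.exp(-(ability[:, None] - difficulty[None, :])))
X = rng.binomial(1, p)                        # 500 x 20 dichotomous data

k = X.shape[1]
var_total = X.sum(axis=1).var(ddof=1)         # variance of raw test scores

# Cronbach's alpha: ratio of summed item variances to total-score variance
alpha = k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum() / var_total)

# KR20: dichotomous special case, using item proportions p*(1-p)
pi = X.mean(axis=0)
kr20 = k / (k - 1) * (1 - (pi * (1 - pi)).sum() / var_total)

print(f"alpha = {alpha:.3f}, KR20 = {kr20:.3f}")  # nearly identical here
```

Skewing the score distribution (e.g., by making the items much easier than the simulated abilities warrant) lowers both estimates, which is the behavior the simulations in this work examine.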
The present work explores the connections between cognitive diagnostic models (CDMs) and knowledge space theory (KST) and shows that these two quite distinct approaches overlap. It is proved that the Multiple Strategy DINA (deterministic input, noisy AND-gate) model and the CBLIM, a competence-based extension of the basic local independence model (BLIM), are in fact equivalent. To demonstrate the benefits of integrating the two theoretical perspectives, it is shown that a fairly complete picture of the identifiability of these models emerges by combining results from both camps. The impact of the results is illustrated with an empirical example, and topics for further research are pointed out.
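For orientation, the following sketch shows the item response rule of the single-strategy DINA model, which the Multiple Strategy DINA model generalizes. The notation (attribute vector alpha, Q-matrix row q, slip and guess parameters) is standard in the CDM literature; the code itself is only illustrative and is not taken from the paper.

```python
# A minimal sketch of the single-strategy DINA item response rule.
import numpy as np

def dina_prob(alpha, q_row, slip, guess):
    """P(correct response) for one item: eta = 1 iff the respondent
    masters every attribute the item requires (q_row == 1); then the
    success probability is 1 - slip, otherwise it is guess."""
    eta = np.all(alpha[q_row == 1] == 1)
    return (1 - slip) if eta else guess

alpha = np.array([1, 0, 1])   # respondent masters attributes 1 and 3
q_row = np.array([1, 0, 1])   # item requires attributes 1 and 3
print(dina_prob(alpha, q_row, slip=0.1, guess=0.2))  # 0.9, since eta = 1
```

The multiple-strategy extension allows an item to be solved via any one of several attribute patterns, which is where the correspondence with KST states becomes visible.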
Overall survival from BC diagnosis was 47.69 ± 22.25 months (range 33-84, median 45.5 months); it was 52.25 ± 14.57 months (range 33-84, median 48.5 months) for the HR patients and 43.79 ± 27.14 months (range 9-101, median 39 months) for the RFA patients. Overall survival from BCLM treatment was 21.12 ± 12.78 months (range 9-64, median 15.5 months); in detail, it was 29.42 ± 14.53 months (range 12-64, median 29.5 months) for the resected patients and 14 ± 4.45 months (range 9-24, median 13.5 months) for the patients treated by RFA, a strongly significant survival advantage for the operated patients (p = 0.001). Overall disease-free survival from BCLM was 15.96 ± 13.16 months (range 3-64, median 12 months); disease-free survival was 23.22 ± 16.2 months (range 8-64, median 18.5 months) for the resected patients and 9.64 ± 4.22 months (range 3-18, median 9 months) for the patients treated by RFA (Fig. 1). Overall 1-, 2-, and 5-year actuarial survival was 80.7, 57, and 31%, respectively; it was 100, 66.6, and 34% for the resected patients and 64.2, 21.4, and 11.5% for the RFA patients.
Fig. 1 Kaplan-Meier analysis of survival after BC and BCLM treatment. GROUP 1 = resection; GROUP 2 = RFA. Overall survival from BC treatment (months): p = 0.082, ns; overall survival from BCLM treatment (months): p = 0.001.
CONCLUSIONS: Aggressive treatment of isolated BCLM may improve survival for these patients.
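As an illustration of the kind of analysis summarized in Fig. 1, here is a minimal sketch of a Kaplan-Meier fit and log-rank comparison using the lifelines library. The survival times below are made up for the example and are not the study's patient data.

```python
# A minimal sketch of a Kaplan-Meier / log-rank comparison of two
# treatment groups, on fabricated example data.
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(2)
# Hypothetical survival times (months) from BCLM treatment, with event
# indicators (1 = death observed, 0 = censored)
t_hr = rng.exponential(30, 12).round(1)    # GROUP 1: resection
t_rfa = rng.exponential(14, 14).round(1)   # GROUP 2: RFA
e_hr = np.ones_like(t_hr)
e_rfa = np.ones_like(t_rfa)

kmf = KaplanMeierFitter()
kmf.fit(t_hr, e_hr, label="GROUP 1: resection")
print("median survival (months):", kmf.median_survival_time_)

result = logrank_test(t_hr, t_rfa,
                      event_observed_A=e_hr, event_observed_B=e_rfa)
print(f"log-rank p = {result.p_value:.3f}")
```

With the study's actual data, the same comparison yields the reported p = 0.001 for survival from BCLM treatment and a non-significant p = 0.082 for survival from BC treatment.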