Heathers and his colleagues have proposed a variety of tests to detect inconsistencies in research data, including the GRIM, SPRITE, DEBIT, and RIVETS tests. Here we focus on relatively simple ways of examining binary data for impossible or inconsistent results, using binomial tests to evaluate whether anomalous results could be explained as random typographical errors. Hypothetical data are used to illustrate our suggested procedures. Advantages and limitations of the approaches are discussed.
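The kind of procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' own procedure: the function names, the 1% typographical error rate, and the counts are all hypothetical assumptions chosen for the example. The first function checks whether a reported percentage could arise from any whole-number count in a sample of size n; the second gives the binomial probability of seeing at least k anomalies if each reported value were independently mistyped at the assumed rate.

```python
from math import comb


def percentage_is_possible(reported_pct, n, decimals=1):
    """True if some whole count k out of n rounds to the reported percentage."""
    # Half of the last reported digit, plus a small float tolerance,
    # so that either rounding convention is accepted.
    half_unit = 0.5 * 10 ** (-decimals) + 1e-9
    return any(abs(100 * k / n - reported_pct) <= half_unit for k in range(n + 1))


def binomial_upper_tail(k, n, p):
    """P(X >= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 40 reported percentages, 6 of them impossible for the
# stated sample sizes, and an ASSUMED independent typo rate of 1% per value.
n_values = 40
n_anomalies = 6
assumed_typo_rate = 0.01

p_value = binomial_upper_tail(n_anomalies, n_values, assumed_typo_rate)
print(percentage_is_possible(37.0, 20))  # with n = 20, only multiples of 5% can occur
print(f"P(at least {n_anomalies} typos in {n_values} values) = {p_value:.2e}")
```

A very small tail probability would suggest that random typing errors alone are an unlikely explanation, but the conclusion depends heavily on the assumed error rate and on the assumption that errors are independent.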
Occasionally, scientific reports have omitted information on standard deviations, making effect sizes difficult or impossible to estimate. In such situations, several scholars have recommended estimating the standard deviation by dividing the range of the distribution (highest value minus lowest value) by four. However, there appears to be little evidence to confirm the validity of this approach. Articles from 2012 to 2015 in the journal Marriage & Family Review were surveyed to find instances where demographic variables (age, education, duration of relationship, number of children) were reported with both standard deviations and ranges. Ratios of range to standard deviation were estimated using several rules of thumb and more complex formulas and compared with the ratios actually observed. Results indicated that dividing by 5 generally provided a more accurate estimate of the actual standard deviations. However, the accuracy of the predicted ratio between range and standard deviation was substantially related to the position of the mean within the range of scores: larger divisors were needed as the mean approached either the minimum or the maximum value of the demographic variable (skew). Other recent formulae for estimating the standard deviation were also evaluated, but the skew-based approach appeared to be more accurate than the others. Further investigation in other samples is needed, however, because the skew-based approach was derived from observation of the present data and might not replicate in other data sets.
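As a rough illustration of the range-based rules of thumb discussed above, the following Python sketch compares the divide-by-4 and divide-by-5 estimates and computes where the mean sits within the range. The function names and the age values are hypothetical, and the skew-based adjustment itself is not reproduced here because its formula is not given in this abstract.

```python
def sd_from_range(minimum, maximum, divisor=4.0):
    """Estimate a standard deviation as (maximum - minimum) / divisor."""
    if maximum <= minimum:
        raise ValueError("maximum must exceed minimum")
    return (maximum - minimum) / divisor


def mean_position(mean, minimum, maximum):
    """Relative position of the mean within the range (0 = minimum, 1 = maximum)."""
    if maximum <= minimum:
        raise ValueError("maximum must exceed minimum")
    return (mean - minimum) / (maximum - minimum)


# Hypothetical report: ages range from 20 to 60 with a mean of 30.
low, high, mean_age = 20, 60, 30

print(sd_from_range(low, high, divisor=4))  # classic range / 4 rule
print(sd_from_range(low, high, divisor=5))  # range / 5 alternative
print(mean_position(mean_age, low, high))   # values near 0 or 1 indicate skew
```

In this hypothetical case the mean lies a quarter of the way up the range, which by the findings summarized above would point toward a divisor larger than would be needed for a mean near the midpoint.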
Heathers and his colleagues have proposed a variety of tests to detect inconsistencies in research data, including the GRIM, SPRITE, DEBIT, and RIVETS tests. Binary data are common in social science research, for such variables as male/female, rural/urban, white/nonwhite, or college educated/not college educated. However, the standard deviation for binary data is a direct mathematical function of the mean score. We show how standard deviations vary as a function of the mean and how the maximum possible standard deviation varies as a function of sample size for a mean of .50. Implications for detecting fraudulent data are discussed.
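The relationship described above can be made concrete with a short Python sketch. It assumes the variable is coded 0/1 and that the reported standard deviation uses the n - 1 (sample) denominator, in which case the standard deviation is sqrt(n / (n - 1) * mean * (1 - mean)). The function names and the tolerance are hypothetical choices for illustration, not taken from the article.

```python
from math import sqrt


def binary_sd(mean, n, sample=True):
    """Standard deviation of a 0/1 variable implied by its mean and sample size."""
    if not 0 <= mean <= 1:
        raise ValueError("the mean of a 0/1 variable must lie in [0, 1]")
    if n < 2:
        raise ValueError("n must be at least 2")
    variance = mean * (1 - mean)
    if sample:
        variance *= n / (n - 1)  # n - 1 denominator (an assumption about the report)
    return sqrt(variance)


def sd_is_consistent(mean, reported_sd, n, tolerance=0.005):
    """Crude check that a reported SD matches the one implied by the mean.

    The tolerance is an assumed allowance for two-decimal rounding; it does
    not account for rounding of the reported mean itself.
    """
    return abs(binary_sd(mean, n) - reported_sd) <= tolerance


# Maximum possible SD (at a mean of .50) for several sample sizes.
for n in (10, 50, 100, 1000):
    print(n, round(binary_sd(0.5, n), 4))

# A reported SD of 0.90 for a binary variable with mean .50 cannot be right.
print(sd_is_consistent(0.5, 0.90, 100))
```

Because the maximum shrinks toward .50 as the sample size grows, any reported standard deviation well above that ceiling for a binary variable signals an error or inconsistency in the report.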