Scientific publications have not traditionally been accompanied by data, either during the peer review process or when published. Concern has arisen that the literature in many fields may contain inaccuracies or errors that cannot be detected without inspecting the original data. Here, we introduce SPRITE (Sample Parameter Reconstruction via Iterative TEchniques), a heuristic method for reconstructing plausible samples from the descriptive statistics of granular data, allowing reviewers, editors, readers, and future researchers to gain insight into the possible distributions of item values in the original data set. This paper presents the principles of operation of SPRITE, as well as worked examples of its practical use for error detection in real published work. Full source code for three software implementations of SPRITE (in MATLAB, R, and Python) and two web-based implementations requiring no local installation (1, 2) are available for readers.
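To make the heuristic concrete, here is a minimal sketch of a SPRITE-style search in Python. This is our own illustration under stated assumptions, not the published MATLAB, R, or Python implementations: the function name, the flat starting sample, and the iteration budget are all illustrative choices. Starting from an integer sample whose sum matches the reported mean, the loop repeatedly nudges a random pair of values in opposite directions, preserving the mean while walking the standard deviation toward the reported value.

```python
import random
import statistics

def sprite_sketch(mean, sd, n, lo, hi, decimals=2, max_iters=50_000):
    """Search for an integer sample consistent with a reported mean and SD.

    A simplified SPRITE-style heuristic for illustration; returns one
    plausible sample, or None if the search fails.
    """
    # The mean must first pass a granularity (GRIM-style) check:
    # the sum of n integers is itself an integer.
    total = round(mean * n)
    if round(total / n, decimals) != round(mean, decimals):
        return None
    # Start from the flattest integer sample with the required sum.
    base, extra = divmod(total, n)
    sample = [base + 1] * extra + [base] * (n - extra)
    for _ in range(max_iters):
        current = statistics.stdev(sample)
        if round(current, decimals) == round(sd, decimals):
            return sorted(sample)
        i, j = random.sample(range(n), 2)
        if sample[i] < sample[j]:
            i, j = j, i                      # ensure sample[i] >= sample[j]
        if current < sd:
            # Spread the pair apart to raise the SD; the sum (and hence
            # the mean) is unchanged. Respect the scale bounds [lo, hi].
            if sample[i] < hi and sample[j] > lo:
                sample[i] += 1
                sample[j] -= 1
        elif sample[i] > sample[j]:
            # Pull the pair together to lower the SD.
            sample[i] -= 1
            sample[j] += 1
    return None

# e.g. a 1-7 rating item: sprite_sketch(mean=3.75, sd=1.59, n=20, lo=1, hi=7)
```

Because every move shifts one value up and another down by the same amount, the sum never changes; only the spread does, so the random walk can home in on the reported standard deviation without ever violating the reported mean.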
Background: We present the results of a reanalysis of four articles from the Cornell Food and Brand Lab, based on data collected from diners at an Italian restaurant buffet. Method: We checked whether the reported means, standard deviations, and test statistics were compatible with the stated sample sizes, and we recalculated the test statistics and p values. We also applied deductive logic to see whether the claims made in each article were compatible with the claims made in the others. We have so far been unable to obtain the data from the authors of the four articles. Results: A thorough reading of the articles and careful reanalysis of the results revealed a wide range of problems. The sample sizes for the number of diners in each condition are incongruous, both within and between the four articles. In some cases, the degrees of freedom of between-participant test statistics are larger than the sample size, which is impossible. Many of the reported F and t statistics are inconsistent with the reported means and standard deviations. In some cases, the number of possible inconsistencies for a single statistic was such that we were unable to determine which of its components were incorrect. Our Appendix reports approximately 150 inconsistencies in these four articles, identified from the reported statistics alone. Conclusions: We hope that our analysis will encourage readers, using and extending the simple methods that we describe, to undertake their own efforts to verify published results, and that such initiatives will improve the accuracy and reproducibility of the scientific literature. We also anticipate that the editors of the journals that published these four articles may wish to consider whether any corrective action is required.
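The recalculation step is easy to reproduce from summary statistics alone. The sketch below assumes a standard pooled-variance two-sample t test; the function name and the numbers in the example are illustrative placeholders, not values from the four articles.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Recompute a pooled-variance (Student's) t statistic from the
    summary statistics a paper reports, for comparison with the
    paper's own t value."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # t statistic and its degrees of freedom

# Illustrative numbers only:
t, df = two_sample_t(5.2, 1.1, 30, 4.6, 1.3, 32)
print(f"t({df}) = {t:.2f}")  # a clear mismatch with the reported value flags an inconsistency
```

The same comparison catches the degrees-of-freedom problem noted above: for a between-participants test, the degrees of freedom can never exceed the total sample size.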
GRIMMER (Granularity-Related Inconsistency of Means Mapped to Error Repeats) builds upon the GRIM test and allows testing whether reported measures of variability are mathematically possible. GRIMMER relies on the statistical phenomenon that variances display a simple repetitive pattern when the data are discrete, i.e., granular. This observation makes it possible to construct an algorithm that can quickly determine whether a reported statistic of any size or precision is consistent with the stated sample size and granularity. My implementation of the test is available at PrePubMed (http://www.prepubmed.org/grimmer) and currently supports testing variances, standard deviations, and standard errors for integer data. It is possible to extend the test to other measures of variability, such as mean absolute deviation, or to apply it to non-integer data, such as data reported to halves or tenths. The ability of the test to identify inconsistent statistics depends on four factors: (1) the sample size; (2) the granularity of the data; (3) the precision (number of decimals) of the reported statistic; and (4) the size of the standard deviation or standard error (but not of the variance). The test is most powerful when the sample size is small, the granularity is large, the statistic is reported to a large number of decimal places, and the standard deviation or standard error is small (the variance is immune to size considerations). The test has important implications for any field that routinely reports statistics for granular data to at least two decimal places, because it can help identify errors in publications, and it should be used by journals during their initial screening of new submissions. The errors detected can range from something as innocent as a typo or rounding error to serious statistical mistakes or, unfortunately, even fraud. In this report I describe the mathematical foundations of the GRIMMER test and the algorithm I use to implement it.
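The core of such an algorithm can be sketched as a necessary-condition check. The code below is not the PrePubMed implementation; it is a simplified illustration for integer data that exploits two facts: the sum and the sum of squares of n integers are both integers, and (because x² and x always have the same parity) they must share the same parity.

```python
import math

def grimmer_sketch(mean, sd, n, decimals=2):
    """Simplified GRIMMER-style check for integer data.

    Tests necessary conditions only: False proves the mean/SD pair
    impossible; True means the pair is merely not ruled out.
    """
    # GRIM check: the sum of n integers must itself be an integer.
    total = round(mean * n)
    if round(total / n, decimals) != round(mean, decimals):
        return False
    # The sum of squares Q is also an integer, and
    #   variance = (Q - total**2 / n) / (n - 1),
    # so the reported SD pins Q to a narrow integer range.
    lo_var = max(sd - 0.5 * 10**-decimals, 0.0) ** 2
    hi_var = (sd + 0.5 * 10**-decimals) ** 2
    q_lo = math.ceil(lo_var * (n - 1) + total**2 / n)
    q_hi = math.floor(hi_var * (n - 1) + total**2 / n)
    for q in range(q_lo, q_hi + 1):
        if (q - total) % 2 != 0:
            continue  # Q must share the sum's parity for integer data
        var = (q - total**2 / n) / (n - 1)
        if var >= 0 and round(math.sqrt(var), decimals) == round(sd, decimals):
            return True
    return False

# e.g. grimmer_sketch(2.50, 1.00, 10) -> False: no 10 integers with mean
# 2.50 have an SD that rounds to 1.00 (0.97 and 1.08 are the nearest achievable).
```

As the abstract notes, the gaps between achievable values shrink as the sample size grows or the reported precision falls, which is why the test is most powerful for small samples reported to many decimal places.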
We previously reported over 150 inconsistencies in a series of four articles (the "pizza papers") from the Cornell Food and Brand Lab that described a study of eating habits at an all-you-can-eat pizza buffet. The lab's initial response led us to investigate more of its work, and our investigation has now identified issues with at least 45 publications from this lab. Perhaps because of the growing media attention, Cornell and the lab have released a statement concerning the pizza papers, which includes a response to the inconsistencies along with data and code. Many of the inconsistencies were identified with the new technique of granularity testing, and this case has the highest density of granularity inconsistencies that we know of. It is also the first time a data set has been made public after granularity concerns were raised, making it a highly suitable case study for demonstrating the accuracy and potential of the technique. A third-party audit of the lab's response is also important, given the continuing misconduct investigation and the likelihood of future reports and data releases. Our careful inspection of the data set revealed no evidence of fabrication, but we found the lab's report confusing, incomplete, and error-prone. In addition, we found the number of missing, unusual, and logically impossible responses in the data set highly concerning. Given the unsound theory, poor methodology, questionable data, and numerous errors, we find it remarkable that these four papers were published, and we recommend that all four be retracted.