We recently released our new Citation Statement Search feature, which allows users to search 900M citation statements extracted from 26M full-text articles. This is a powerful new tool for finding expert analyses and opinions on nearly any topic. It’s also powerful for finding some things scientists probably shouldn’t be writing in their papers.
Here we review 7 common phrases seen in the scientific literature that scientists probably shouldn’t be writing.
Researchers have been known to suggest that their findings are trending “towards significance,” meaning their p-values were slightly above p=0.05. This is problematic for two reasons. First, it suggests that if circumstances were slightly different (e.g., if more data were collected), the p-value would be lower; in fact, it could just as well be higher (as Geoff Cumming likes to point out, it might as well be trending away from significance). Second, if one were to collect more data because of that result, one would be violating the principle that decisions about data collection and analysis should never be based on the data themselves.
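Both objections can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not an analysis from any particular study: the data are drawn from a normal distribution with mean 0 and known standard deviation 1 (so the null hypothesis is true), the test is a two-sided z-test, and the function names, sample sizes, and the 0.10 upper bound for “trending” are all hypothetical choices made for illustration.

```python
import math
import random


def two_sided_p(sample):
    """Two-sided z-test p-value for H0: mean = 0, assuming a known sd of 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_sims=5000, n_start=20, n_step=10, n_max=100, alpha=0.05, seed=1):
    """Compare one planned test with 'collect more until significant'."""
    rng = random.Random(seed)
    fixed_hits = 0    # significant at the planned sample size, tested once
    peeking_hits = 0  # significant at any look while data keep being added
    trending = 0      # first-look p-value in [alpha, 0.10)
    trending_up = 0   # ...whose p-value rose after one more batch of data
    for _ in range(n_sims):
        # The null hypothesis is true: there is no effect to find.
        data = [rng.gauss(0, 1) for _ in range(n_start)]
        first_p = two_sided_p(data)
        fixed_hits += first_p < alpha
        p = first_p
        looks = 0
        # Data-dependent stopping: keep adding data until p < alpha or n_max.
        while p >= alpha and len(data) < n_max:
            data.extend(rng.gauss(0, 1) for _ in range(n_step))
            p = two_sided_p(data)
            looks += 1
            if looks == 1 and alpha <= first_p < 0.10:
                trending += 1
                trending_up += p > first_p
        peeking_hits += p < alpha
    return {
        "fixed_rate": fixed_hits / n_sims,
        "peeking_rate": peeking_hits / n_sims,
        "trending_up_share": trending_up / max(trending, 1),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the false-positive rate of the single planned test stays close to alpha, while repeatedly testing and adding data pushes it well above alpha. Among the “trending” results, the p-value rises after the next batch of data more often than it falls, which is what one would expect when there is no true effect.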
Related to the phrase above, “nearly significant” is not something a scientist should ever write, because there is no such thing as being nearly statistically significant. This phrase is in direct contravention of the logic of frequentist statistics, which rests on a strict decision rule: whether or not the obtained p-value is below the cutoff (alpha). In much the same way as it is impossible to be “a little bit pregnant,” it is impossible to be “nearly significant.”
Researchers often cite a preliminary finding that they are writing up in another paper, labeling it “data not shown.” While this is tempting, it amounts to making an unverifiable claim. Given the rise of open repositories and other places where we can share early or even incomplete findings, the reasons for citing “data not shown” are dwindling.
Much like the example above, sharing your code and data is important for the verification of studies. In today’s world, there is little reason not to share your code. Of course, some code is subject to restrictive licensing or privacy concerns, but even in those cases anonymized or sample code can be provided.
The recent preprint, “Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals,” highlights numerous phrases that we should hope not to see in the scientific literature, because established terminology has a precise meaning meant to help communication. As an example, “profound neural organization” in place of “deep neural network” is one phrase that should not be used because it does not have a precise meaning, thereby making scientific communication more difficult.
The phrase “put simply” is commonly used in scientific papers to try to distill complex topics into a sentence or two. In practice, however, this is rarely achieved. As one researcher noted recently, “Academics will say ‘put simply’, and follow it with the least comprehensible sentence you’ve ever read.” This rings true, as you can see from the many examples in the search.
Scientific papers are littered with puns. It’s enough, stop with that garbage. Only kidding, keep it up!
What are some more phrases you can think of? Search Citation Statements!