Background
The manual processes used for risk assessments are not scaling to the amount of data available. Although automated approaches appear promising, they must be transparent in a public policy setting.
Objective
Our goal is to create an automated approach that moves beyond retrieval to the extraction step of the information synthesis process, where evidence is characterized as supporting, refuting, or neutral with respect to a given outcome.
Methods
We combine knowledge resources and natural language processing to resolve coordinated ellipses and thus avoid surface-level differences between concepts in an ontology and outcomes in an abstract. As with a systematic review, the search criteria and the inclusion and exclusion criteria are explicit.
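To illustrate what resolving a coordinated ellipsis involves, the following minimal sketch expands a phrase such as "cell proliferation and death" into the two full outcome mentions it stands for. It is a naive string heuristic written for illustration only, not the knowledge-resource and NLP pipeline used in this work; the function name `expand_coordination` and the rule that single-word conjuncts inherit the leading modifier are assumptions.

```python
import re


def expand_coordination(phrase: str) -> list[str]:
    """Naively expand a coordinated phrase that shares a leading modifier.

    Hypothetical helper for illustration: it only handles the pattern
    '<modifier> <head1>, <head2> and <head3>' and ignores cases where the
    shared element trails the conjuncts (e.g. 'breast and lung cancer').
    """
    # Split on commas and on the conjunctions 'and' / 'or'.
    parts = [p.strip() for p in re.split(r",|\band\b|\bor\b", phrase) if p.strip()]
    if len(parts) < 2:
        return parts

    first_tokens = parts[0].split()
    if len(first_tokens) < 2:
        # No leading modifier to share, so nothing to expand.
        return parts

    shared = " ".join(first_tokens[:-1])
    expanded = [parts[0]]
    for part in parts[1:]:
        # Assumption: a single-word conjunct has elided the shared modifier.
        expanded.append(f"{shared} {part}" if len(part.split()) == 1 else part)
    return expanded


print(expand_coordination("cell proliferation and death"))
```

After expansion, each mention can be compared with an ontology concept label directly, which is the surface-level mismatch the resolution step is meant to remove.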
Results
The system scales to 482K abstracts on 27 chemicals. Results for three endpoints that are critical for cancer risk assessments show that refuting evidence (where the outcome decreased) was higher for cell proliferation (45.9%) and general cell changes (37.7%) than for cell death (25.0%). Moreover, cell death was the only endpoint where supporting claims were the majority (61.3%). If the number of abstracts that measure an outcome were used as a proxy for association, there would be a stronger association with cell proliferation than with cell death (20/27 chemicals). However, if the amount of supporting evidence (where the outcome increased) were used, the conclusion would change for 21/27 chemicals (20 from proliferation to death and 1 from death to proliferation).
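The difference between the two proxies can be sketched in a few lines. The example below uses made-up counts for a single hypothetical chemical (they are not results from this study) and assumes that "amount of supporting evidence" means the raw number of supporting abstracts; the class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class EvidenceCounts:
    """Abstract counts for one chemical-endpoint pair, by direction of the outcome."""
    supporting: int  # outcome increased
    neutral: int     # no change reported
    refuting: int    # outcome decreased

    @property
    def total(self) -> int:
        return self.supporting + self.neutral + self.refuting


def stronger_by_volume(counts: dict[str, EvidenceCounts]) -> str:
    """Retrieval-step proxy: the endpoint measured in the most abstracts."""
    return max(counts, key=lambda endpoint: counts[endpoint].total)


def stronger_by_support(counts: dict[str, EvidenceCounts]) -> str:
    """Extraction-step proxy: the endpoint with the most supporting abstracts."""
    return max(counts, key=lambda endpoint: counts[endpoint].supporting)


# Hypothetical counts for one chemical, chosen only to show a disagreement.
example = {
    "cell proliferation": EvidenceCounts(supporting=40, neutral=15, refuting=45),
    "cell death": EvidenceCounts(supporting=49, neutral=11, refuting=20),
}

print(stronger_by_volume(example))   # cell proliferation (100 vs 80 abstracts)
print(stronger_by_support(example))  # cell death (49 vs 40 supporting abstracts)
```

With these illustrative numbers the two proxies point to different endpoints, which is the kind of reversal reported above for 21 of the 27 chemicals.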
Conclusions
We provide decision makers with a visual representation of supporting, neutral, and refuting evidence whilst maintaining the reproducibility and transparency needed for public policy. Our findings show that results from the retrieval step, where the number of abstracts that measure an outcome is reported, can be misleading if not accompanied by results from the extraction step, where the directionality of the outcome is established.