2022
DOI: 10.1098/rspb.2021.2721

Past and future uses of text mining in ecology and evolution

Abstract: Ecology and evolutionary biology, like other scientific fields, are experiencing an exponential growth of academic manuscripts. As domain knowledge accumulates, scientists will need new computational approaches for identifying relevant literature to read and include in formal literature reviews and meta-analyses. Importantly, these approaches can also facilitate automated, large-scale data synthesis tasks and build structured databases from the information in the texts of primary journal articles, books, grey …

Cited by 21 publications (31 citation statements)
References 105 publications
“…The above applications of SFWO rely on our ability to compile large databases on the trophic ecology of soil consumers. Although much data is available from previously published research, the creation of literature-based datasets requires significant manual investment for literature searching, acquisition, screening, data extraction, and harmonisation of entities, such as species or trait names [29]. An alternative is to use information extraction approaches to automatically turn unstructured text into structured data, an approach currently taken by the Specialised Information Service Biodiversity Research (BIOfid), which aims at extracting structured data from legacy Central European literature through semantic role labelling.…”
Section: Results (mentioning)
confidence: 99%
“…A number of the above applications of SFWO depend on our ability to collect large species-level databases on the trophic ecology of soil consumers. Although much data is available from previously published research, the creation of literature-based datasets requires significant manual investment for literature searching, acquisition, screening, data extraction, and harmonisation of entities, such as species or trait names (Farrell et al, 2022). A good ontology is consensual in nature, which means it should capture domain knowledge in a way that is accepted by the community. This is also key to a widespread adoption of a semantic model to the point it becomes a standard.…” [Footnotes carried over from the citing page: 6 https://github.com/biocodellc/ontology-data-pipeline; 7 https://github.com/nleguillarme/inteGraph]
Section: Trophic Information Extraction (mentioning)
confidence: 99%
“…With current data availability, focusing on such details would strongly limit the 'across species' and 'across realms' aims of our study, compromising the potential to provide generalized results. However, we believe these approaches could be valid in the near future as techniques to mine host-parasite data improve (Farrell et al, 2022).…”
Section: Discussion (mentioning)
confidence: 99%
“…Culturomics (Ladle et al., 2016; Correia et al., 2021a, b) and text mining (Farrell et al., 2022) have become increasingly useful tools for analyzing conservation issues. However, these data can often be complex, and there are several limitations to my data.…”
Section: Discussion (mentioning)
confidence: 99%