2021
DOI: 10.1186/s12859-021-03997-w

geneRFinder: gene finding in distinct metagenomic data complexities

Abstract: Background: Microbes perform a fundamental economic, social, and environmental role in our society. Metagenomics makes it possible to investigate microbes in their natural environments (the complex communities) and their interactions. The way they act is usually estimated by looking at the functions they play in those environments, and their responsibility is measured by their genes. The advances of next-generation sequencing technology have facilitated metagenomics research; however, it also creat…

Cited by 9 publications (3 citation statements)
References 41 publications (25 reference statements)
“…While, in order to be able to correctly perform classification independent of a read’s offset within the CDS, we also automatically determine which of the six possible frames the read is in. This prerequisite step in itself is a novel application of machine learning to ORF detection, as current tools either (i) rely purely on presence/absence of start/stop codons without further interpretation of the sequence (such as getorf [28] or OrfM [29]) or (ii) return all candidate sequences for each read without clearly resolving potentially contradictory hypotheses (such as FragGeneScan [30], CNN-MGP [31] or geneRFinder [32]). However, since this is a proof-of-concept work, we do not, in contrast to these existing tools, examine reads that are outside of CDSs in this paper.…”
Section: Introduction (mentioning)
confidence: 99%
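
The statement above contrasts learned frame detection with tools that scan only for start and stop codons. As a rough illustration of that simpler idea, the sketch below enumerates the six reading frames of a DNA string and reports stretches running from an ATG to the first in-frame stop codon. It is a minimal sketch, not the algorithm of getorf, OrfM, geneRFinder, or any other cited tool. The function names, the `min_codons` parameter, and the choice of ATG as the only start codon are assumptions made for illustration.

```python
# Illustrative only: hypothetical helpers, not code from any cited tool.
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard stop codons
START_CODON = "ATG"                   # assumption: ATG is the only start codon
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def six_frames(seq):
    """Yield (frame_label, codon_list) for three forward and three reverse frames."""
    seq = seq.upper()
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Split into complete codons only; a trailing partial codon is dropped.
            codons = [strand_seq[i:i + 3]
                      for i in range(offset, len(strand_seq) - 2, 3)]
            yield f"{strand}{offset + 1}", codons


def naive_orfs(seq, min_codons=2):
    """Return (frame, orf) pairs running from ATG to the first in-frame stop.

    min_codons is the minimum number of codons before the stop codon.
    """
    found = []
    for frame, codons in six_frames(seq):
        start = None
        for idx, codon in enumerate(codons):
            if start is None and codon == START_CODON:
                start = idx
            elif start is not None and codon in STOP_CODONS:
                if idx - start >= min_codons:
                    found.append((frame, "".join(codons[start:idx + 1])))
                start = None  # resume scanning after this stop codon
    return found


if __name__ == "__main__":
    for frame, orf in naive_orfs("ATGAAATAA"):
        print(frame, orf)
```

A scan like this returns every candidate it finds and makes no attempt to decide which frame is the true one, which is the limitation the citing authors point to.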
“…While metagenomics and binning approaches provide deeper and highly specific information on the taxonomy and functional capacity of individual members within a complex microbial community, this is accompanied by higher comparative costs, data burden and increased complexity of downstream bioinformatic processes (Meziti et al., 2021). These limitations consequently restrict the analysis of large metagenomics datasets to organizations with access to high-performance computing clusters; yet, even with access to specialized tools, data analysis can take multiple days to a week (Tikariha and Purohit, 2019; Silva et al., 2021; Chivian et al., 2023). Nevertheless, improvements to downstream analysis pipelines and technologies are constantly occurring and with time will likely mitigate these challenges (Meziti et al., 2019, 2021; Tikariha and Purohit, 2019; Nelson et al., 2020).…”
Section: Omic Tools In Bioremediation Research (mentioning)
confidence: 99%
“…Coming back to the study of genes in metagenomes: it is important to note that numerous computational techniques have been developed to identify genes that are present in an environmental sample. In a broader sense, these techniques fall into two categories: (a) gene prediction tools – which attempt to predict genes ab initio using only the sequences [48, 59, 77], and (b) gene classification/annotation tools – which aim to annotate a metagenome sample against a database of known genes. The latter category is more relevant for functional profiling since we need to have prior knowledge of the functions of known genes.…”
Section: Introduction (mentioning)
confidence: 99%