2018
DOI: 10.1101/307157
Preprint

Machine learning based prediction of functional capabilities in metagenomically assembled microbial genomes

Abstract: The increasing popularity of genome-resolved metagenomics (the binning of genomes of potentially uncultured organisms directly from environmental DNA) has resulted in a deluge of draft genomes. There is a pressing need to develop methods to interpret these data. Here, we used machine learning to predict functional and metabolic traits of microbes from their genomes. We collated an extensive database of 84 phenotypic traits associated with 9407 prokaryotic genomes and trained different machine learning models…

Cited by 6 publications (8 citation statements)
References 45 publications (37 reference statements)
“…Moreover, such methods are rather time-consuming and computationally demanding, thus representing a bottleneck for efficient sequence data analysis (Sharma et al, 2015). ML algorithms could potentially increase the accuracy and speed of clinically and epidemiologically relevant predictions (Farrell et al, 2018). However, to yield accurate predictions, besides the choice of the most appropriate algorithm and a set of well-defined inputs and outputs of interest, ML-based strategies generally require large amounts of high-quality training data (Baker et al, 2018).…”
Section: Discussion (mentioning)
confidence: 99%
“…This presents a limitation, as currently microbial genome databases are known to be biased toward cultivable pathogenic bacteria. The current lack of large and comprehensive databases can be considered as the key bottleneck for the application of ML methods (Farrell et al, 2018). Hence, future improvements can be expected to come from better data curation and collection, in addition to development of new and improved classification algorithms (Farrell et al, 2018).…”
Section: Discussion (mentioning)
confidence: 99%