2021
DOI: 10.1093/bib/bbab128

Machine learning approach to gene essentiality prediction: a review

Abstract: Essential genes are critical for the growth and survival of any organism. The machine learning approach complements the experimental methods to minimize the resources required for essentiality assays. Previous studies revealed the need to discover relevant features that significantly classify essential genes, improve on the generalizability of prediction models across organisms, and construct a robust gold standard as the class label for the train data to enhance prediction. Findings also show that a significa…

Cited by 68 publications (45 citation statements)
References: 134 publications
“…Despite an increased interest in machine learning for essentiality prediction, current methods still suffer from limitations in their accuracy and ability to generalize across environmental conditions or species [6]. This is partly due to the lack of gene featurization strategies that are predictive of essentiality.…”
Section: Discussion (mentioning)
confidence: 99%
“…Due to the cost of such screens, there is substantial interest in computational methods that can rapidly explore the impact of gene deletions and complement current experimental efforts to determine gene essentiality. Such approaches typically employ machine learning in combination with various properties such as sequence homology and gene-function ontologies [5,6].…”
Section: Introduction (mentioning)
confidence: 99%
“…Some of the cited methods with higher observed AUC use alternative features (e.g., network topology, gene ontology-based features, more gene expression) and larger training set sizes. Our method can be improved by exploring some of these alternative features (see recent review [66]), as was seen in Figure 3 where an expanded feature set provided modest improvement in overall mean and variance of the AUC. Furthermore, exploration of what types of genes are misclassified (Tables S1 and S2) may help suggest the types of features that should be included.…”
Section: Discussion (mentioning)
confidence: 99%
“…Area under precision-recall curve is a more effective metric than area under receiver optimizer characteristics curve when applied on highly skewed tasks [31, 84]. Matthews correlation coefficient [106] has also been successfully used in SL prediction study [24] of which the samples are highly imbalanced. Besides, Li et al.…”
Section: Challenges and Future Work (mentioning)
confidence: 99%
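
The point made in the last citation statement, that the area under the precision-recall curve and the Matthews correlation coefficient are more informative than the area under the receiver operating characteristic (ROC) curve on highly skewed data, can be illustrated with a small sketch. This is not code from any of the cited papers. The toy dataset (2 positives among 100 samples), the function names, and the top-10 decision rule are all assumptions chosen for illustration, and the average-precision helper assumes there are no tied scores.

```python
from math import sqrt


def roc_auc(labels, scores):
    """ROC AUC as the chance a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits


def mcc(labels, preds):
    """Matthews correlation coefficient from binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 100 samples sorted by descending score,
# with the only two positives at ranks 5 and 10.
scores = [1 - i / 100 for i in range(100)]
labels = [0] * 100
labels[4] = 1
labels[9] = 1

# Hypothetical decision rule: call the 10 highest-scoring samples positive.
preds = [1 if i < 10 else 0 for i in range(100)]

auroc = roc_auc(labels, scores)
auprc = average_precision(labels, scores)
coef = mcc(labels, preds)
accuracy = sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)

print(f"ROC AUC: {auroc:.3f}")
print(f"Average precision: {auprc:.3f}")
print(f"MCC: {coef:.3f}")
print(f"Accuracy: {accuracy:.3f}")
```

On this toy data the ROC AUC is about 0.94 and the accuracy is 0.92, while the average precision is 0.20 and the MCC is about 0.43. The two ranking-based and threshold-based summaries that ignore class balance look strong even though eight of the ten predicted positives are wrong, which is the behaviour the quoted passage warns about.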