2022
DOI: 10.3390/diagnostics12081997

A Computational Approach to Identification of Candidate Biomarkers in High-Dimensional Molecular Data

Abstract: Complex high-dimensional datasets that are challenging to analyze are frequently produced through ‘-omics’ profiling. Typically, these datasets contain more genomic features than samples, limiting the use of multivariable statistical and machine learning-based approaches to analysis. Therefore, effective alternative approaches are urgently needed to identify features-of-interest in ‘-omics’ data. In this study, we present the molecular feature selection tool, a novel, ensemble-based, feature selection applicat…

Cited by 5 publications (6 citation statements)
References 41 publications (62 reference statements)
“…To select mutational motifs most predictive of the POLE driver status, feature selection was performed using MFeaST [85]. Classification model development was carried out with the scikit-learn Python package (v. 1.1.1) and the MATLAB Classification Learner App with 5-fold cross-validation and default tuning hyper-parameters. To identify the most accurate classification model, several machine learning classifiers were applied, namely linear discriminant, support vector machine, gradient boosting, AdaBoostClassifier, Bagging classifier, KNeighborsClassifier, Decision Tree, Random Forest, Gaussian Naive Bayes.…”
Section: Methods | Citation type: mentioning
confidence: 99%
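
The comparison described in the citation statement above can be sketched roughly as follows. This is only an illustration with scikit-learn, not the citing authors' code: the data are synthetic (in the cited study the inputs would be the MFeaST-selected mutational motifs), the `compare_classifiers` helper is a hypothetical name, and `random_state` is fixed purely for reproducibility while all other hyper-parameters stay at their defaults.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 120 samples, 10 pre-selected features.
X, y = make_classification(
    n_samples=120, n_features=10, n_informative=4, random_state=0
)

# The classifier families named in the statement, with default settings.
classifiers = {
    "linear discriminant": LinearDiscriminantAnalysis(),
    "support vector machine": SVC(),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "bagging": BaggingClassifier(random_state=0),
    "k-nearest neighbors": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "Gaussian naive Bayes": GaussianNB(),
}


def compare_classifiers(X, y, models, folds=5):
    """Return mean cross-validated accuracy per model, best first."""
    scores = {
        name: cross_val_score(model, X, y, cv=folds).mean()
        for name, model in models.items()
    }
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


results = compare_classifiers(X, y, classifiers)
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

With an integer `cv`, `cross_val_score` uses stratified folds for classifiers, so each of the five folds keeps roughly the same class balance as the full dataset.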
“…To perform feature selection, we recommend using an ensemble feature selection tool, MFeaST [10]. We have successfully applied this tool in many omics studies [1, 11, 12, 14, 15, 16, 17, 47, 48, 49]. For download instructions, see before you begin – software installation and directory set-up.…”
Section: Step-by-step Methods Details | Citation type: mentioning
confidence: 99%
“…To overcome this issue, we use a more generalizable ensemble feature selection approach that uses multiple feature selection algorithms. The Molecular Feature Selection Tool (MFeaST) [10] ranks all available features based on their combined score from different types of feature selection algorithms. Compared to other ensemble approaches, which are limited to a subset of feature selection algorithms, MFeaST uses filter, wrapper, and embedded techniques.…”
Section: Before You Begin | Citation type: mentioning
confidence: 99%
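
The idea of ranking features by a combined score across several selection methods can be illustrated with a small standard-library sketch. The statement above does not say how MFeaST combines its scores, so the mean-rank rule used here is an assumption chosen for simplicity; the function names, feature names, and per-method scores are likewise hypothetical.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def ensemble_ranking(feature_names, method_scores):
    """Order features by mean rank across methods (assumed combination rule)."""
    per_method_ranks = [ranks_from_scores(s) for s in method_scores.values()]
    combined = {
        name: mean(ranks[i] for ranks in per_method_ranks)
        for i, name in enumerate(feature_names)
    }
    return sorted(combined.items(), key=lambda kv: kv[1])


# Hypothetical importance scores from one method of each type.
features = ["gene_A", "gene_B", "gene_C", "gene_D"]
scores = {
    "filter": [0.9, 0.2, 0.5, 0.1],
    "wrapper": [0.7, 0.6, 0.8, 0.1],
    "embedded": [0.8, 0.1, 0.6, 0.3],
}

ranking = ensemble_ranking(features, scores)
for name, mean_rank in ranking:
    print(f"{name}: mean rank {mean_rank:.2f}")
```

Averaging ranks rather than raw scores keeps methods with different score scales from dominating the combined ordering, which is one common reason to aggregate this way.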
“…To identify peaks that best discriminate between patients who suffered from early disease recurrence and patients with no early recurrence at different time points, we used a machine learning-based feature selection algorithm (MFeaST) [11]. The algorithm relies on a set of machine learning techniques to rank the features based on their ability to discriminate between two groups and is highly effective at identifying molecular patterns among samples.…”
Section: Methods | Citation type: mentioning
confidence: 99%