2022
DOI: 10.1109/tcbb.2020.3029952

The γ-OMP Algorithm for Feature Selection With Application to Gene Expression Data

Abstract: Feature selection for predictive analytics is the problem of identifying a minimal-size subset of features that is maximally predictive of an outcome of interest. To apply to molecular data, feature selection algorithms need to be scalable to tens of thousands of features. In this paper, we propose γ-OMP, a generalisation of the highly-scalable Orthogonal Matching Pursuit feature selection algorithm. γ-OMP can handle (a) various types of outcomes, such as continuous, binary, nominal, time-to-event, (b) discret…

Cited by 10 publications (7 citation statements)
References: 51 publications
“…εpilogi is a greedy feature selection algorithm based on the generalization of the Orthogonal Matching Pursuit algorithm (Pati et al 1993) called the γ-OMP (Tsagris et al 2022). γ-OMP generalizes the standard OMP to any type of outcome, any type of predictor feature, metric for measuring residuals, and predictive model used internally by the algorithm.…”
Section: Methods (mentioning)
confidence: 99%
“…using logistic regression) with the selected features and computes the residuals of the model (e.g. deviance residuals or raw residuals) (Tsagris et al 2022). Next, εpilogi selects as the next-best feature to include the one that is mostly correlated with the residuals.…”
Section: Methods (mentioning)
confidence: 99%
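
The step quoted above (fit a model on the selected features, compute residuals, pick the feature most correlated with them) can be illustrated with a minimal sketch. It assumes a binary outcome, raw residuals (observed minus fitted probability), and a small Newton-Raphson logistic fit written here for self-containment. The function names `fit_logistic` and `next_feature` are hypothetical and are not taken from εpilogi or any γ-OMP implementation.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression with intercept by Newton-Raphson.

    Returns the fitted probabilities. Illustrative only: it assumes the
    classes are not perfectly separable.
    """
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        grad = Xd.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible.
        hess = Xd.T @ (Xd * w[:, None]) + 1e-8 * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def next_feature(X, y, selected):
    """Index of the unselected feature most correlated with the raw residuals."""
    if selected:
        p = fit_logistic(X[:, selected], y)
    else:
        # No features yet: the intercept-only model predicts the class frequency.
        p = np.full(len(y), y.mean())
    resid = y - p  # raw residuals (assumption; deviance residuals are another option)
    Xc = X - X.mean(axis=0)
    rc = resid - resid.mean()
    corr = (Xc.T @ rc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(rc) + 1e-12)
    score = np.abs(corr)
    for j in selected:
        score[j] = -np.inf  # never re-select a feature
    return int(np.argmax(score))
```

Calling `next_feature(X, y, [])` gives the first feature, and calling it again with that index in `selected` gives the next candidate. A stopping rule would be needed to turn this into a complete selection procedure.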
“…gOMP is a faster and more flexible implementation of the OMP with the options for using different models and stopping criteria (Tsagris et al, 2020). After standardizing the data, covariances of the response and the predictor variables are calculated and then the feature with the maximum covariance is chosen.…”
Section: Methodologies (mentioning)
confidence: 99%
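
The first selection step described in this statement (standardize, compute covariances with the response, take the maximum) might look like the sketch below. It assumes a continuous response and ranks features by absolute covariance, which is an assumption on top of the quoted text. `first_feature` is a hypothetical name, not an Rfast function.

```python
import numpy as np

def first_feature(X, y):
    """Index of the feature with the largest absolute covariance with y
    after standardizing both (equivalent to the largest absolute correlation)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    cov = Xs.T @ ys / len(y)
    return int(np.argmax(np.abs(cov)))
```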
“…In other words, the gOMP algorithm continues to select features since the criterion of minimum 2 BIC difference is met. Rfast library has been used for the gOMP implementation (Tsagris et al, 2020).…”
Section: Methodologies (mentioning)
confidence: 99%
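
A stopping rule of the kind mentioned here, where selection continues only while BIC improves by at least 2, can be sketched for a linear model as follows. This is a rough illustration under stated assumptions (Gaussian linear regression, BIC computed from the residual sum of squares, a threshold parameter named `min_bic_drop`). It does not reproduce the Rfast implementation, and `bic_linear` and `omp_bic` are hypothetical names.

```python
import numpy as np

def bic_linear(X, y, cols):
    """BIC and residuals of an ordinary least-squares fit on the given columns."""
    n = len(y)
    Xd = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    rss = float(resid @ resid)
    k = Xd.shape[1] + 1  # regression coefficients plus the error variance
    return n * np.log(rss / n) + k * np.log(n), resid

def omp_bic(X, y, min_bic_drop=2.0):
    """Greedy OMP-style selection that stops when BIC drops by less than min_bic_drop."""
    Xc = X - X.mean(axis=0)
    norms = np.linalg.norm(Xc, axis=0) + 1e-12
    selected = []
    bic, resid = bic_linear(X, y, selected)
    while len(selected) < X.shape[1]:
        # Score each feature by its absolute correlation with the current residuals.
        score = np.abs(Xc.T @ resid) / norms
        for j in selected:
            score[j] = -np.inf
        j = int(np.argmax(score))
        new_bic, new_resid = bic_linear(X, y, selected + [j])
        if bic - new_bic < min_bic_drop:
            break  # improvement too small: stop selecting
        selected.append(j)
        bic, resid = new_bic, new_resid
    return selected
```

Raising `min_bic_drop` makes the procedure more conservative, so fewer features are kept.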