2016
DOI: 10.1142/s0219720016500293

Evaluating feature-selection stability in next-generation proteomics

Abstract: Identifying reproducible yet relevant features is a major challenge in biological research. This is well documented in genomics data. Using a proposed set of three reliability benchmarks, we find that this issue exists also in proteomics for commonly used feature-selection methods, e.g. t-test and recursive feature elimination. Moreover, due to high test variability, selecting the top proteins based on p-value ranks - even when restricted to high-abundance proteins - does no…
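
The reproducibility problem described in the abstract can be illustrated with a small sketch. It is not the paper's method: the three reliability benchmarks are not specified in this excerpt, so the code below substitutes a generic measure, the mean pairwise Jaccard overlap of top-k selections across random subsamples. All function names (`t_statistic`, `top_k_features`, `selection_stability`) and parameters are hypothetical. Features are ranked by the absolute pooled-variance t-statistic; with equal subsample sizes per class the degrees of freedom are the same for every feature, so this ordering matches a ranking by two-sided p-value.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance two-sample t-statistic (needs >= 2 values per group)."""
    na, nb = len(a), len(b)
    diff = mean(a) - mean(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        # No within-group spread: identical groups give 0, otherwise infinite.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled * (1 / na + 1 / nb))


def top_k_features(group_a, group_b, k):
    """Return the indices of the k features with the largest |t|.

    group_a and group_b are lists of samples; each sample is a list of
    feature values (e.g. protein abundances) in the same feature order.
    """
    n_features = len(group_a[0])
    scores = []
    for j in range(n_features):
        a = [sample[j] for sample in group_a]
        b = [sample[j] for sample in group_b]
        scores.append(abs(t_statistic(a, b)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(x, y):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def selection_stability(group_a, group_b, k, subsample_size, n_rounds, seed=0):
    """Mean pairwise Jaccard overlap of top-k sets over random subsamples.

    Assumes subsample_size >= 2 and n_rounds >= 2. This is an illustrative
    stand-in for a stability benchmark, not the one used in the paper.
    """
    rng = random.Random(seed)
    selections = []
    for _ in range(n_rounds):
        sub_a = rng.sample(group_a, subsample_size)
        sub_b = rng.sample(group_b, subsample_size)
        selections.append(top_k_features(sub_a, sub_b, k))
    overlaps = [
        jaccard(selections[i], selections[j])
        for i in range(n_rounds)
        for j in range(i + 1, n_rounds)
    ]
    return mean(overlaps)
```

A value near 1 means the same features are selected regardless of which samples are drawn; a value near 0 means the selection depends almost entirely on the particular subsample.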

Cited by 69 publications (61 citation statements)
References 32 publications
“…The idea of fuzzification has also been used earlier in a few gene expression profile analysis methods [7, 8] and also proteomic profile analysis methods [6, 9]. However, these works merely use it as a component of their respective methods, and do not study its role and effectiveness as a normalization procedure.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although subnets or clusters are predictable from large biological networks, real biological complexes are enriched for biological signal, far outperforming predicted complexes/subnets from reference networks [19, 31, 33, 34]. Here, known human protein complexes derived from the CORUM database are used [35].…”
Section: Methods (mentioning)
confidence: 99%
“…QPSP, and the rank-based network approaches (RBNAs), SNET (SubNET) [29], FSNET (Fuzzy SNET) and PFSNET (Paired FSNET) [30] have been shown to be highly stable and robust, these techniques are similar in that they use a fuzzy weighting system on proteins ranked by expression [31] (see Methods). …”
Section: Introduction (mentioning)
confidence: 99%
“…The advent of network-based analysis methods as feature-selection methods can help resolve irreproducibility [4, 6-9]. Whilst design is nontrivial and requires proper integration of bio-statistics, networks and proteomics [4, 6-9], network-based approaches are already contributing towards resolving idiosyncratic coverage and consistency problems in clinical proteomics [10-12]. Soh et al. [13] and Lim et al. [14, 15] have further demonstrated that network-based feature-selection methods are highly reproducible, and select phenotypically relevant features.…”
Section: Significance of the Study (mentioning)
confidence: 99%
“…The advent of network‐based analysis methods as feature‐selection methods can help resolve irreproducibility. Whilst design is nontrivial and requires proper integration of bio‐statistics, networks and proteomics, network‐based approaches are already contributing towards resolving idiosyncratic coverage and consistency problems in clinical proteomics. Soh et al.…”
Section: Introduction (mentioning)
confidence: 99%