2002
DOI: 10.1093/bioinformatics/18.suppl_1.s136

Discovering statistically significant biclusters in gene expression data

Abstract: In gene expression data, a bicluster is a subset of the genes exhibiting consistent patterns over a subset of the conditions. We propose a new method to detect significant biclusters in large expression datasets. Our approach is graph theoretic coupled with statistical modelling of the data. Under plausible assumptions, our algorithm is polynomial and is guaranteed to find the most significant biclusters. We tested our method on a collection of yeast expression profiles and on a human cancer dataset. Cross val…

Citations: cited by 767 publications (589 citation statements)
References: 17 publications
“…Moreover, it allows overlap among modules, which is essential when analyzing systems with multiple-function genes. Extant analysis techniques (8, 9, 17, 19–21) lack one or more of these characteristics. SAMBA is built to exploit the emerging repositories of very large-scale functional genomics data and is highly efficient and scalable in both memory and speed.…”
Section: Results (mentioning)
confidence: 99%
“…The thresholding BMF algorithm is used in our experiments since the datasets are generally sparse. We compare it with four other methods, BiMax [23], ISA [11,10], SAMBA [26], and Binary Non-Orthogonal Matrix Decomposition [19] (BND for short). Note that BND is based on heuristics while BMF is based on non-linear programming.…”
Section: Results Analysis (mentioning)
confidence: 99%
“…-Exhaustive bicluster enumeration: enumerating all possible biclusters to identify the best ones in exponential time [10]. -Distribution parameter identification: assuming the data is generated from a model and trying to fit parameters of that model by minimizing a certain criterion [7].…”
Section: Related Work (mentioning)
confidence: 99%
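
To make the "exhaustive bicluster enumeration" category in the last statement concrete, here is a minimal brute-force sketch. It is not SAMBA and not the method of any cited paper: it assumes the expression data has already been binarized to a 0/1 matrix, treats a bicluster as an all-ones submatrix, and uses the hypothetical names `enumerate_biclusters`, `min_rows`, `min_cols`, and `EXAMPLE`. Because it loops over every subset of rows, its running time grows exponentially with the number of genes, which is the cost the statement refers to.

```python
from itertools import combinations

def enumerate_biclusters(matrix, min_rows=2, min_cols=2):
    """Brute-force search for maximal all-ones biclusters in a 0/1 matrix.

    Illustrative assumption: rows are genes, columns are conditions, and a
    1 means the gene responds under that condition.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    found = set()
    # Try every subset of at least min_rows rows: exponential in n_rows.
    for size in range(min_rows, n_rows + 1):
        for rows in combinations(range(n_rows), size):
            # Columns on which every chosen row is 1.
            cols = tuple(c for c in range(n_cols)
                         if all(matrix[r][c] for r in rows))
            if len(cols) < min_cols:
                continue
            # Extend to every row that is 1 on all of those columns,
            # so that the reported bicluster is maximal.
            closed = tuple(r for r in range(n_rows)
                           if all(matrix[r][c] for c in cols))
            found.add((closed, cols))
    return sorted(found)

# Toy binarized matrix (4 genes x 4 conditions), made up for illustration.
EXAMPLE = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]

if __name__ == "__main__":
    for genes, conditions in enumerate_biclusters(EXAMPLE):
        print("genes", genes, "conditions", conditions)
```

On the toy matrix this reports three overlapping biclusters, for example genes (0, 1, 3) over conditions (0, 1). Even at this scale the search visits every row subset, which is why the approaches quoted above turn to heuristics, model fitting, or graph-theoretic methods on real datasets.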