2009
DOI: 10.2202/1544-6115.1426

Detecting Outlier Samples in Microarray Data

Abstract: In this paper, we address the problem of detecting outlier samples with highly different expression patterns in microarray data. Although outliers are not common, they appear even in widely used benchmark data sets and can negatively affect microarray data analysis. It is important to identify outliers in order to explore underlying experimental or biological problems and remove erroneous data. We propose an outlier detection method based on principal component analysis (PCA) and robust estimation of Mahalanob…
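
As a rough illustration of the general idea named in the abstract (PCA followed by a robust Mahalanobis-type distance), a minimal sketch might look like the following. It is not the paper's method, whose details are cut off above. The function name `pca_robust_outliers` and the parameters `n_components` and `threshold` are hypothetical, and per-component median/MAD scaling is swapped in here as a simple stand-in for a robust covariance estimator.

```python
import numpy as np


def pca_robust_outliers(X, n_components=2, threshold=3.0):
    """Flag outlier samples (rows of X) using robustly scaled PCA scores.

    Assumption: X is a samples-by-genes matrix. The threshold is an
    illustrative cutoff, not a value taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    # Center each gene (column) on its median, which is less sensitive
    # to a few extreme samples than the mean.
    centered = X - np.median(X, axis=0)
    # Classical PCA via SVD: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    # Robust location and scale per component: median and scaled MAD.
    med = np.median(scores, axis=0)
    mad = 1.4826 * np.median(np.abs(scores - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against division by zero
    # Mahalanobis-type distance in PC space with a diagonal robust scale.
    dist = np.sqrt((((scores - med) / mad) ** 2).sum(axis=1))
    return dist, dist > threshold
```

Calling `pca_robust_outliers(X)` returns one distance per sample and a boolean mask of flagged samples. A fuller treatment would typically replace the diagonal median/MAD scaling with a robust covariance estimate and choose the cutoff from a chi-square quantile.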

Cited by 70 publications (62 citation statements)
References 31 publications
“…Extreme expression values that lie outside the mean level of variation observed in the study were referred to as outliers. We used the principal component analysis (PCA) outlier detection method 17 for identifying outliers in this study. Of the five samples identified as outliers, 3 were from active and 2 from placebo patients.…”
Section: Results (citation type: mentioning)
confidence: 99%
“…Other metrics such as correlation or quantity of outlier spots may be used for outlier array detection [32]. Similar methods exist for detecting biological microarray outliers using principal component analysis (PCA) [33]. Despite the abundance of robust methods for processing microarray data, concerns with data quality still hinder adoption of the technology for clinical applications.…”
Section: Section II Experimental Methods (citation type: mentioning)
confidence: 99%
“…It is a widely used benchmark datasets in outlier identification. Many researchers reported their findings on outliers for this dataset [6][7][8]. Our algorithm is used to address the colon cancer dataset and identified sample outliers which are N36, N34, T36, T33 T30, T2, and N8 where "N" represent the normal samples and "T" represents the tumor samples.…”
Section: Transitional Dataset (citation type: mentioning)
confidence: 99%
“…The clean datasets are obtained by removing the outliers from the original dataset. Outliers reported in [6][7][8] and those identified by our algorithm are listed in Table 1 in which the MR values based on the clean datasets are also given. We can find that every method can give a very small MR and our algorithm gives the smallest.…”
Section: Transitional Dataset (citation type: mentioning)
confidence: 99%