2003
DOI: 10.1101/gr.104003

A Classification-Based Machine Learning Approach for the Analysis of Genome-Wide Expression Data

Abstract: Three important areas of data analysis for global gene expression analysis are class discovery, class prediction, and finding dysregulated genes (biomarkers). The clinical application of microarray data will require marker genes whose expression patterns are sufficiently well understood to allow accurate predictions on disease subclass membership. Commonly used methods of analysis include hierarchical clustering algorithms, t-, F-, and Z-tests, and machine learning approaches. We describe an approach called th…

Cited by 46 publications (30 citation statements)
References 42 publications
“…Our gene expression biomarkers were used to distinguish emphysema cases from controls in the cohort of Spira and colleagues by average linkage hierarchical clustering with Euclidean distance. These analyses were performed using the Gene Expression Data Analyzer (http://bioinformatics2.pitt.edu/GE2/GEDA.html) (26). Cases were subjects who met the clinical criteria and underwent lung volume reduction surgery.…”
Section: Class Prediction (mentioning)
confidence: 99%
“…We have used two methods (Lyons-Weiler et al, 2003) to identify differentially expressed genes. Before data from the same slide or among different slides could be compared, it was necessary to normalize the data (i.e.…”
Section: RNA Isolation, Microarray Hybridization and Identification (mentioning)
confidence: 99%
“…Figure 1 was generated by using a previously published analysis method. To further narrow down the gene list, we analysed data and identified dysregulated genes by using the maximum difference subset (MDSS) algorithm as described by Lyons-Weiler et al (2003). The advantages of this approach are (1) that it combines classification algorithms, classical statistics, and elements of machine learning, (2) it eliminates the arbitrariness of setting a threshold of statistical significance, and (3) by assimilating prediction accuracy, the MDSS algorithm is able to acquire the critical threshold of statistical significance (P value).…”
Section: RNA Isolation, Microarray Hybridization and Identification (mentioning)
confidence: 99%
“…Almost all genes among the selected 21 genes have been identified previously as containing abnormalities in AML or another form of leukemia. Most of the genes reported by Lyons-Weiler et al (2003) are also found in our DPLS gene set (HoxA9, PIG-B, MACH-alpha-2 protein, BPI Bactericidal/permeability increasing protein, Autoantigen PM-SCL, ERGIC-53 Protein, and so on). Figure 2(b) shows a heat map of the leukemia gene expression data based on the 21 selected genes most relevant for discrimination between success and failure of AML leukemia treatment.…”
Section: Supervised Clustering Between AML Subclasses (mentioning)
confidence: 98%