2003
DOI: 10.1007/978-3-540-45145-7_36

Alternative Parallelization Strategies in EST Clustering

Abstract: One of the fundamental components of large-scale gene discovery projects is that of clustering of Expressed Sequence Tags (ESTs) from complementary DNA (cDNA) clone libraries. Clustering is used to create non-redundant catalogs and indices of these sequences. In particular, clustering of ESTs is frequently used to estimate the number of genes derived from cDNA-based gene discovery efforts. This paper presents a novel parallel extension to an EST clustering program, UIcluster4, that incorporates altern…

Cited by 8 publications (4 citation statements)
References 8 publications

“…Although the gross number of clusters differed between the UniGene sets derived by UIcluster and NCBI, the cluster composition was very similar. Previously reported results demonstrated 2.5-5% difference in cluster composition between the two strategies (26).…”
Section: Gene Discovery (mentioning)
confidence: 83%
“…The complexity of the comparison feature in the algorithm is reduced because of the hashing table [21] generated as clusters were formed. In the recent years, the hashing technique is used to speed up the comparison of cDNA sequence because comparing numbers has proved to be much faster than comparing text [22][23].…”
Section: Phase II - Initial Clustering (mentioning)
confidence: 99%
“…The clustering algorithm proposed to solve the stated problem in this research is the hashing technique which eliminates the clustering performance issue; meanwhile parallelization approach is used to solve the computational and memory problem. Both techniques were used in clustering of 'Expressed Sequence Tags' (ESTs) from complementary DNA (cDNA) clone libraries [23]. For this research however clusters are grouped together according to scenarios and an individual cluster may be a member to more than one scenario and this computation is done simultaneously.…”
Section: Phase III - Cluster Joining (mentioning)
confidence: 99%
“…The ESTs generated from specific tissues represents the presence of active mRNAs in the selected tissue and sampling conditions. The creation of an EST database has several advantages including a fast and inexpensive way to discover novel genes, rapid identification of active genes, identification of exon–intron structure, generation of information on gene expression, gene regulation and sequence diversity, comparative genomic study, serve as markers or tags for transcripts, development of markers for reference genetic map and recovery of full-length cDNAs and genomic sequences (Ho et al 2007; Luro et al 2008; Thanh et al 2011; Trivedi et al 2003; Ye et al 2010; Zeng et al 2010).…”
Section: Introduction (mentioning)
confidence: 99%