Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004.
DOI: 10.1109/csb.2004.1332454

AZuRe, a scalable system for automated term disambiguation of gene and protein names

Cited by 21 publications (19 citation statements)
References 12 publications
“…[28] in which a training set is automatically generated for each human gene, a thesaurus-based method [30] in which a reference description based on either its annotations or MEDLINE abstracts is created for each human gene, and a general method [36] in which Entrez Gene is used to create a profile for each gene sense. All these existing methods need a lot of efforts to create either a training set or a profile for each ambiguous gene symbol.…”
Section: Related Abbreviations (mentioning)
confidence: 99%
“…Both stop-word removal and stemming can be seen as methods for reducing the dimensionality of the feature vector. Podowski et al. (2004) used two bag-of-words feature vectors to disambiguate gene names: one to distinguish between gene and NIT meanings, and one to distinguish between different gene meanings of the same gene name. On a set of 6,521 manually annotated documents relating to 46 different genes, they achieved an accuracy of 88.8% (calculated based on table in original article).…”
Section: Feature Vectors (mentioning)
confidence: 99%
“…When the latter were included, 233% additional "gene" instances were retrieved, most of which were false positives. In several other studies [17-19], it was also suggested that solving this ambiguity problem is an important requirement for large-scale application of text-mining tools in the biomedical field.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, Podowski et al [19] used Bayesian classifier models to disambiguate gene symbols found in LocusLink [28]. Interestingly, their system can distinguish between gene and non-gene meanings of a symbol, acknowledging the fact that many gene symbols are abbreviations of terms with non-gene meanings.…”
Section: Introduction (mentioning)
confidence: 99%
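
The citation statements above describe a two-stage scheme: bag-of-words features, a first decision between gene and non-gene meanings, then a choice among the gene meanings of the same symbol, made with Bayesian classifiers. A minimal sketch of that idea follows. It is not the AZuRe implementation: the symbol "ABC", the sense labels, the tiny stop-word list, and every training sentence are invented for illustration, and a plain multinomial naive Bayes with add-one smoothing stands in for whatever models the original system used.

```python
import math
from collections import Counter

# Toy stop-word list (an assumption; real systems use much larger lists).
STOP_WORDS = {"the", "of", "a", "in", "and", "is", "to", "was"}


def bag_of_words(text):
    """Lowercase, split on whitespace, drop stop words, and count tokens."""
    tokens = [t.strip(".,;()") for t in text.lower().split()]
    return Counter(t for t in tokens if t and t not in STOP_WORDS)


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.doc_counts = Counter(labels)   # documents per label
        self.word_counts = {}               # label -> token counts
        self.vocab = set()
        for text, label in zip(docs, labels):
            bag = bag_of_words(text)
            self.word_counts.setdefault(label, Counter()).update(bag)
            self.vocab.update(bag)
        return self

    def predict(self, text):
        bag = bag_of_words(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word, freq in bag.items():
                score += freq * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Stage 1: gene vs non-gene meaning of the hypothetical symbol "ABC".
stage1 = NaiveBayes().fit(
    [
        "ABC expression was elevated in tumor cells",
        "mutation of ABC alters protein binding",
        "the ABC model of care improved patient outcomes",
        "ABC guidelines for clinical management",
    ],
    ["gene", "gene", "non-gene", "non-gene"],
)

# Stage 2: which gene sense, trained only on gene-meaning documents.
stage2 = NaiveBayes().fit(
    [
        "ABC kinase activity in liver tissue",
        "ABC transporter in neuron membrane",
    ],
    ["sense_A", "sense_B"],
)


def disambiguate(text):
    """Return 'non-gene', or the gene sense chosen by the second classifier."""
    if stage1.predict(text) != "gene":
        return "non-gene"
    return stage2.predict(text)


print(disambiguate("ABC kinase expression in tumor cells"))
print(disambiguate("ABC guidelines improved patient care"))
```

Keeping the two decisions in separate classifiers means the gene-sense model never has to account for non-gene vocabulary, which is one plausible reason to split the problem this way.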