2007
DOI: 10.1007/978-3-540-70939-8_26

Rule-Based Protein Term Identification with Help from Automatic Species Tagging

Abstract: In biomedical articles, terms often refer to different protein entities. For example, an arbitrary occurrence of the term p53 might denote thousands of proteins across a number of species. A human annotator is able to resolve this ambiguity relatively easily by looking at its context and, if necessary, by searching an appropriate protein database. However, this phenomenon may cause much trouble for a text mining system, which does not understand human languages and hence cannot identify the correct prote…

Citations: cited by 11 publications (12 citation statements)
References: 16 publications
“…As noted in previous work (Krauthammer and Nenadic, 2004;Chen et al, 2005;Wang, 2007), determining the correct species for the protein mentions is a very important step towards TI. However, as far as we know, there has been little work in species disambiguation and in to what extent resolving species ambiguity can help TI.…”
Section: Related Work (mentioning)
confidence: 89%
“…One direction for future work is to conduct more curation experiments so that the variability between curators can be smoothed (e.g., some curators may prefer seeing more accurate NLP output whereas others may prefer higher recall). Meanwhile, we plan to improve the matching systems by integrating ontology processors and species disambiguators [7].…”
Section: Discussion (mentioning)
confidence: 99%
“…Tissues were to be assigned to MeSH IDs and proteins to RefSeq IDs. We selected only human proteins for this experiment, because although species is a major source of ambiguity in biological entities [7], we wanted to focus on investigating how matching techniques affect curation speed in this work. Curation was carried out using an in-house curation tool (as shown in Figure 1).…”
Section: Methods (mentioning)
confidence: 99%
“…Species are assigned to proteins using a machine learning based tagger trained on contextual and species word features [14]. The species information and a set of heuristics are used to choose the most probable identifiers from the set of candidates proposed by the matcher.…”
Section: Term Identification (mentioning)
confidence: 99%