Proceedings of the 2004 ACM Symposium on Applied Computing 2004
DOI: 10.1145/967900.967925

Classifying biological articles using web resources

Abstract: Text classification systems for biomedical literature aim to select articles relevant to a specific issue from large corpora. Most systems with acceptable accuracy rely on domain knowledge, which is very expensive and does not provide a general solution. This paper presents a novel approach to text classification of biomedical literature that uses information extracted from related web resources. We validated this approach by implementing the proposed method and testing it on the KDD2002 Cu…

Cited by 10 publications (10 citation statements)
References 13 publications
“…For instance, in KDD2002 Cup challenge: bio-text task, statistical text classification systems reasoning without considering domain knowledge achieved also poor results [7]. An effective approach is to obtain the required domain knowledge from publicly available resources [8]. …”
Section: Discussion (mentioning)
confidence: 99%
“…In addition to domain-independent methods, specialized methods for classifying biomedical documents have been proposed that incorporate information from outside the documents, such as from the Unified Medical Language System [18] or from the Web [4]. Other methods use special information within documents such as figure captions [28] and image data [35].…”
Section: Methods (mentioning)
confidence: 99%
“…CAC avoids the complexities of creating rules and patterns covering all possible cases or creating training sets that are too specific to be extended to new domains [29]. Besides avoiding direct human intervention, automatically collected domain knowledge is usually much larger than manually generated domain knowledge and does not become outdated, since public databases can be tracked for updates as they evolve [10].…”
Section: Introduction (mentioning)
confidence: 99%