2006
DOI: 10.1186/1471-2105-7-140

Exploring supervised and unsupervised methods to detect topics in biomedical text

Abstract: Background: Topic detection is a task that automatically identifies topics (e.g., "biochemistry" and "protein structure") in scientific articles based on information content. Topic detection will benefit many other natural language processing tasks including information retrieval, text summarization and question answering; and is a necessary step towards the building of an information system that provides an efficient way for biologists to seek information from an ocean of literature.

Cited by 28 publications (15 citation statements)
References 13 publications
“…The results showed that the model developed using SVM had the best prediction performance compared to those developed using other algorithms. This is in concordance with other studies which frequently showed that models developed using SVM outperforms those developed using other learning algorithms [20]. …”
Section: Discussion; citation type: supporting
confidence: 92%
“…Semantics refers to the study of 'meanings' linked to their words in linguistic studies [25]. Semantic features could be applied by using MeSH and the Unified Medical Language System (UMLS) concepts and semantic types [12,20]. UMLS is the most extensive known database of synonyms and concepts relations of biomedical and health-related terms, maintained by National Library of Medicine.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…They clustered these into 9 categories, one for each Wikipedia category and one additional cluster. While in reality, it is almost impossible to pre-define the number of the clusters for varied topics in biomedical domain, Lee and colleagues [33] compared supervised and unsupervised methods to detect topics in biomedical texts and found that the performance of supervised topic spotting methods was better. They also found that unsupervised hierarchical clustering was robust and more readily applicable in real world settings.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…When using hierarchical clustering to group documents and generate labels for the clusters, the vector space model is often adopted to produce term or keyword vectors, which help indicate similarity among documents [30,32,33]. Subsequently, documents are clustered into several subgroups, and terms or keywords that are salient for a given cluster are extracted as the theme (or label) for the cluster.…”
Section: Introduction; citation type: mentioning
confidence: 99%
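
The last statement above describes a pipeline: represent documents as term vectors, cluster them hierarchically, then take each cluster's most salient terms as its label. The following is a minimal sketch of that idea, not the method of the cited paper or of the citing authors. The whitespace tokenizer, the smoothed TF-IDF weighting, average-linkage merging, the fixed `n_clusters` stopping rule, and the four toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_similarity(a, b, vectors):
    """Average pairwise similarity between two clusters (average linkage)."""
    total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def cluster_documents(vectors, n_clusters):
    """Agglomerative clustering: merge the most similar pair until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_similarity(clusters[x], clusters[y], vectors)
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


def label_clusters(clusters, vectors, top_k=2):
    """Label each cluster with its top_k terms by summed TF-IDF weight."""
    labels = []
    for members in clusters:
        totals = Counter()
        for i in members:
            totals.update(vectors[i])  # adds the weights term by term
        ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
        labels.append([term for term, _ in ranked[:top_k]])
    return labels


# Toy documents, invented for illustration only.
docs = [
    "protein structure folding protein",
    "protein folding structure helix",
    "gene expression regulation gene",
    "gene regulation expression promoter",
]
vectors = tfidf_vectors(docs)
clusters = cluster_documents(vectors, n_clusters=2)
labels = label_clusters(clusters, vectors)
print(clusters, labels)
```

Fixing `n_clusters` in advance is the simplest stopping rule, but as one of the statements above notes, the number of topics is rarely known beforehand in the biomedical domain; a similarity threshold for stopping the merges would be one alternative.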