2005
DOI: 10.1007/11562214_30

PLSI Utilization for Automatic Thesaurus Construction

Abstract: When acquiring synonyms from large corpora, it is important to deal not only with such surface information as the context of the words but also their latent semantics. This paper describes how to utilize a latent semantic model PLSI to acquire synonyms automatically from large corpora. PLSI has been shown to achieve a better performance than conventional methods such as tf·idf and LSI, making it applicable to automatic thesaurus construction. Also, various PLSI techniques have been shown to be effect…

Cited by 9 publications (9 citation statements)
References 8 publications
“…The data are available at http://vsearch.cl.cs.okayama-u.ac.jp/index.php †† The detailed setting is described in Hagiwara et al. [5]. ††† The number of output clusters will be the same as the number of words, i.e., tokens.…”
Section: Results (mentioning)
confidence: 99%
“…The major word clustering schemes are word distribution-based methods [1], [2] and decomposition-based methods [3]-[5]. The decompositional methods cluster words on the basis of the orthogonal vectors of the assumed latent semantic clusters between words.…”
Section: Background Issues (mentioning)
confidence: 99%
“…As a vector-based approach of collecting verb synonyms, we employ Hagiwara's approach [5] which applies PLSI to decreasing unrelated vector space. The Hagiwara's approach extracting similar verbs to a key verb on the basis of similarity of vectors between verbs.…”
Section: PLSI-based Approach (mentioning)
confidence: 99%
“…The major methods of word clustering are word distribution-based methods [6,11] and decomposition-based methods [12,10,5]. The decompositional methods cluster words on the basis of orthogonal vectors of assumed latent semantic clusters between words. The aim of the methods is to make a compact orthogonal vectors based on global optimization such as EM algorithm.…”
Section: Background Issues (mentioning)
confidence: 99%