2005
DOI: 10.1017/s1351324905003876

Automatic bilingual lexicon acquisition using random indexing of parallel corpora

Abstract: This paper presents a very simple and effective approach to using parallel corpora for automatic bilingual lexicon acquisition. The approach, which uses the Random Indexing vector space methodology, is based on finding correlations between terms based on their distributional characteristics. The approach requires a minimum of preprocessing and linguistic knowledge, and is efficient, fast and scalable. In this paper, we explain how our approach differs from traditional cooccurrence-based word alignment algorith…
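
The idea in the abstract can be illustrated with a minimal sketch. It assumes that each aligned sentence pair receives one random index vector and that every word, in either language, accumulates the index vectors of the pairs it occurs in; translation candidates are then ranked by cosine similarity. This is one plausible reading of the abstract, not the authors' implementation: the function names, the dimensionality, the number of non-zero elements, and the toy English-Swedish corpus are all hypothetical.

```python
import math
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary vector: `nonzeros` random positions set to +1 or -1."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((1, -1))
    return vec


def build_context_vectors(aligned_pairs, dim=200, nonzeros=4, seed=0):
    """Accumulate, per word, the index vectors of the aligned pairs it occurs in."""
    rng = random.Random(seed)
    src = defaultdict(lambda: [0] * dim)
    tgt = defaultdict(lambda: [0] * dim)
    for src_sent, tgt_sent in aligned_pairs:
        iv = index_vector(dim, nonzeros, rng)  # one vector per aligned pair (assumption)
        for vectors, sentence in ((src, src_sent), (tgt, tgt_sent)):
            for word in sentence.lower().split():
                ctx = vectors[word]
                for i, x in enumerate(iv):
                    ctx[i] += x
    return src, tgt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def best_translation(word, src, tgt):
    """Return the target word whose context vector is closest to that of `word`."""
    query = src.get(word)
    if query is None:
        return None
    return max(tgt, key=lambda t: cosine(query, tgt[t]))


# Hypothetical toy corpus of aligned sentence pairs.
pairs = [
    ("the dog sleeps", "hunden sover"),
    ("the cat sleeps", "katten sover"),
    ("the dog eats", "hunden äter"),
    ("the cat eats", "katten äter"),
]
src, tgt = build_context_vectors(pairs)
print(best_translation("dog", src, tgt))
```

In this toy corpus "dog" and "hunden" occur in exactly the same aligned pairs, so their context vectors coincide; real corpora give only approximate matches, and a word such as "the" has no one-to-one counterpart here.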

Cited by 46 publications (49 citation statements)
References 15 publications
“…These index vectors are sparse, high-dimensional, and ternary, which means that their dimensionality (d) is on the order of hundreds, and that they consist of a small number( ) of randomly distributed +1s and -1s, with the rest of the elements of the vectors set to 0. In our work each element is allocated with the following probability [13]:…”
Section: Random Indexing (Ri); citation type: mentioning
confidence: 99%
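
The sparse ternary index vectors described in the statement above can be sketched as follows. The quoted passage elides the exact allocation probabilities, so this sketch makes a different, simpler assumption: it places a fixed number of non-zero elements at random positions and picks each sign uniformly. The name `make_index_vector` and the values `dim=300` and `nonzeros=6` are hypothetical choices for illustration only.

```python
import random


def make_index_vector(dim, nonzeros, rng):
    """Return a ternary vector of length `dim` with `nonzeros` entries in {+1, -1}.

    Assumption: a fixed count of non-zero elements, not the per-element
    probability scheme that the quoted passage refers to but does not show.
    """
    if nonzeros > dim:
        raise ValueError("nonzeros must not exceed dim")
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((1, -1))  # sign chosen uniformly at random
    return vec


rng = random.Random(0)
v = make_index_vector(dim=300, nonzeros=6, rng=rng)
print(sum(1 for x in v if x != 0), "non-zero elements out of", len(v))
```

Because almost all elements are zero, two independently drawn vectors of this kind are nearly orthogonal with high probability, which is what lets them act as approximately distinct labels.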
“…Since 2000, it has been studied and empirically validated in a number of experiments and usages in distributional similarity problems [12,13]. However, few of the Random Indexing approaches have been employed into the field of Web mining, especially for the discovery of Web user access patterns.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Such considerations taken into account a word space model can be applied to basically any language. For instance, word space models have been applied to, among many other languages; English, German, Spanish, Swedish as well as Japanese (Hassel 2005, Sahlgren & Karlgren 2005, Sahlgren 2006). …”
Section: Word Space Models; citation type: mentioning
confidence: 99%
“…Sahlgren & Karlgren [21] demonstrated that Random Indexing can be applied to parallel texts for automatic bilingual lexicon acquisition. Sahlgren & Cöster [7] used Random Indexing to carry out text categorization.…”
Section: Random Indexing; citation type: mentioning
confidence: 99%