2013
DOI: 10.1016/j.sbspro.2013.10.620

Extracting Comparable Articles from Wikipedia and Measuring their Comparabilities

Published in: Corpus Resources for Descriptive and Applied Studies. Current Challenges and Future Directions: Selected Papers from the 5th International Conference on Corpus Linguistics (CILC2013)

Abstract: Parallel corpora are not available for all domains and languages, but statistical methods in multilingual research domains require huge parallel or comparable corpora. Comparable corpora can be used when parallel corpora are insufficient or unavailable for specific domains and languages. In this paper, we propose …

Cited by 14 publications
(22 citation statements)
References 2 publications
“…These methods are based on bilingual dictionaries [10,16,19], or based on cross-lingual Information retrieval (CL-IR) [7,1,21], or based on cross-lingual Latent Semantic Indexing (CL-LSI) system [2,11,6,14].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the dictionary based method [10,16,19], two cross-lingual documents d_a and d_e are comparable if a maximum of words in d_a are translations of words in d_e, so a bilingual dictionary can be used to look-up the translation of words in both documents. The drawbacks of this approach are the dependency on bilingual dictionaries which are not always available, and the necessity to use morphological analyzers for languages that can be inflected.…”
Section: Introduction (mentioning)
confidence: 99%
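
The dictionary-based criterion quoted above can be illustrated with a minimal sketch. It is not the method of the cited papers. The function name, the scoring rule (the share of distinct source words that have at least one dictionary translation in the target document), and the toy French-English dictionary are all assumptions made for illustration only.

```python
def dictionary_comparability(source_tokens, target_tokens, bilingual_dict):
    """Return the fraction of distinct source words that have at least one
    dictionary translation occurring in the target document.

    bilingual_dict maps a source word to a set of candidate translations.
    This scoring rule is a hypothetical stand-in for "a maximum of words
    in d_a are translations of words in d_e".
    """
    source_vocab = set(source_tokens)
    target_vocab = set(target_tokens)
    if not source_vocab:
        return 0.0
    matched = sum(
        1
        for word in source_vocab
        # A word counts as matched if any of its translations is in the target.
        if bilingual_dict.get(word, set()) & target_vocab
    )
    return matched / len(source_vocab)


# Toy data (hypothetical): a tiny dictionary and two tokenized documents.
toy_dict = {
    "chat": {"cat"},
    "noir": {"black"},
    "dort": {"sleeps", "sleep"},
}
d_a = ["le", "chat", "noir", "dort"]
d_e = ["the", "black", "cat", "sleeps"]

score = dictionary_comparability(d_a, d_e, toy_dict)
print(score)
```

In this toy case the word "le" has no dictionary entry, so it lowers the score. The same thing happens to an inflected form such as "chats" unless it is first reduced to its base form, which reflects the two drawbacks named in the quotation: dependence on dictionary coverage and the need for morphological analysis.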
See 3 more Smart Citations