2007
DOI: 10.1016/j.patrec.2007.07.011

Multilingual news clustering: Feature translation vs. identification of cognate named entities

Cited by 14 publications (12 citation statements)
References: 15 publications
“…Their experiments show that clustering based only on NEs is complementary to clustering based on keywords, because using more information about NEs in the clustering, as their category, produce meaningful layers and group of documents. In a previous work (Montalvo, Martínez, Casillas, & Fresno, ), we showed that the use of NEs as the only type of features to represent news documents leads to good multilingual news clustering results through the identification of equivalent NEs, either by translation or by cognate identification. Additionally, in Montalvo, Fresno, and Martínez () we proposed a new similarity measure based only on the NEs shared by news documents; the results were promising and comparable with those obtained with such well‐known similarity measures as the cosine or correlation coefficient.…”
Section: Related Work (mentioning)
confidence: 76%
“…Therefore, it is crucial to be able to establish as many correspondences as possible among common NEs, as much within the same language as between both sides of the comparable collections of news. In addition, NE categories are very useful as additional information for use in similarity evaluations of news documents because we can vary the importance of common entities depending on their category or we can combine the information of common entities differently depending on their category (Montalvo et al., , ).…”
Section: Representation of the Collection of Comparable News (mentioning)
confidence: 99%
“…Our approach for multilingual news clustering is based on the representation of the news documents by means of the cognate Named Entities they contain. Previously, we obtained encouraging preliminary results using only the cognate Named Entities as news representative features [43]. We measured the similarity between news documents by computing the Named Entities they shared.…”
Section: Multilingual News Clustering (mentioning)
confidence: 99%
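
The statement above describes measuring document similarity from shared cognate Named Entities. A minimal sketch of that idea follows. It is not the cited authors' measure: the function names, the character-overlap test for cognates, the 0.8 threshold, and the Dice-style score are all assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(name: str) -> str:
    """Lowercase and strip diacritics so 'Martínez' and 'Martinez' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def are_cognates(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical cognate test: character-level similarity above a threshold."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def shared_entity_similarity(doc_a: set, doc_b: set, threshold: float = 0.8) -> float:
    """Dice-style score: share of entities in either document that have
    a cognate counterpart in the other document."""
    if not doc_a or not doc_b:
        return 0.0
    matched_a = sum(any(are_cognates(x, y, threshold) for y in doc_b) for x in doc_a)
    matched_b = sum(any(are_cognates(y, x, threshold) for x in doc_a) for y in doc_b)
    return (matched_a + matched_b) / (len(doc_a) + len(doc_b))


# Illustrative entity sets for an English and a Spanish news item (made-up data).
english = {"Kofi Annan", "United Nations", "Baghdad"}
spanish = {"Kofi Annan", "Naciones Unidas", "Bagdad"}
score = shared_entity_similarity(english, spanish)
print(f"shared-entity similarity: {score:.2f}")
```

In this sketch, "Baghdad" and "Bagdad" are matched by spelling alone, while "United Nations" and "Naciones Unidas" are not. That gap is where the translation-based alternative mentioned in the citation statements would be needed.
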
“…To the best of our knowledge, existing clustering approaches for comparable corpora are customized for a small set (two or three) of languages (Montalvo et al, 2007). Most of them are not generalizable to many languages as they employ bilingual dictionaries and the translation is performed sequentially considering only pairs of languages.…”
Section: Introduction (mentioning)
confidence: 99%