Proceedings of the Workshop on Multi-Source Multilingual Information Extraction and Summarization (MMIES '08), 2008
DOI: 10.3115/1613172.1613176
Learning to match names across languages

Abstract: We report on research on matching names in different scripts across languages. We explore two trainable approaches based on comparing pronunciations. The first, a cross-lingual approach, uses an automatic name-matching program that exploits rules based on phonological comparisons of the two languages carried out by humans. The second, a monolingual approach, relies only on automatic comparison of the phonological representations of each pair. Alignments produced by each approach are fed to a machine learning algorithm…
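The abstract describes a pipeline in which phonological representations of a name pair are aligned and the alignment is handed to a learner. As a rough, hypothetical illustration of that idea (not the authors' actual system), the Python sketch below converts names to toy phoneme sequences, aligns them with standard edit-distance dynamic programming, and produces a normalized similarity score; `to_phonemes` is an invented stand-in for a real grapheme-to-phoneme converter.

```python
# Minimal sketch of phonological name matching (illustrative only; not the
# system from the paper). Phoneme conversion here is a toy lookup table.

def to_phonemes(name: str) -> list[str]:
    """Toy grapheme-to-phoneme stand-in; a real system would use a G2P model."""
    table = {"ph": "f", "kh": "k", "sh": "S"}  # hypothetical digraph mappings
    out, i = [], 0
    while i < len(name):
        if name[i:i + 2].lower() in table:
            out.append(table[name[i:i + 2].lower()])
            i += 2
        else:
            out.append(name[i].lower())
            i += 1
    return out

def alignment_cost(a: list[str], b: list[str]) -> float:
    """Standard Levenshtein dynamic program over phoneme sequences."""
    m, n = len(a), len(b)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = float(i)
    for j in range(n + 1):
        dp[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else 1.0
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + sub)
    return dp[m][n]

def match_score(name1: str, name2: str) -> float:
    """Normalized similarity in [0, 1]; a trained model would replace this."""
    p1, p2 = to_phonemes(name1), to_phonemes(name2)
    return 1.0 - alignment_cost(p1, p2) / max(len(p1), len(p2), 1)

print(match_score("Khalid", "Kalid"))   # high similarity
print(match_score("Khalid", "Miller"))  # low similarity
```

In the paper's actual setup, the alignments themselves (not just a score) are passed as input to a machine learning algorithm; the score above merely shows the comparison step.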

Cited by 5 publications (3 citation statements)
References 3 publications
“…Another type of cost-sensitive learning is the weighting method, which applies a weight to each training instance representing its misclassification cost. Such modifications have been made to common classification algorithms such as proximity-based classifiers [50], Support Vector Machines (SVMs) [51], decision trees [52], [53], and rule-based classifiers [54].…”
Section: Supervised Learning
confidence: 99%
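The instance-weighting idea in this excerpt maps directly onto the per-sample weights most learning libraries expose. Below is a minimal sketch, assuming scikit-learn and a synthetic imbalanced dataset (both are illustrative choices, not taken from the cited works): minority-class instances are given a higher misclassification cost, and the same weights are passed to an SVM and a decision tree.

```python
# Cost-sensitive learning via instance weighting (illustrative sketch).
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic imbalanced data: 900 majority (label 0) vs. 100 minority (label 1).
X = np.vstack([rng.normal(0.0, 1.0, (900, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.array([0] * 900 + [1] * 100)

# Weight each training instance by its assumed misclassification cost:
# errors on the minority class are treated as 9x more costly here.
cost = {0: 1.0, 1: 9.0}
sample_weight = np.array([cost[label] for label in y])

svm = SVC(kernel="rbf").fit(X, y, sample_weight=sample_weight)
tree = DecisionTreeClassifier(max_depth=5).fit(X, y, sample_weight=sample_weight)
```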
“…Conventional techniques: Time series analysis [40], [41], [42]; Signal processing [43], [44]; Information theory [47], [48], [3]; Statistical (non-parametric [3], [14], [19], [20]; parametric [3], [14], [19], [20]; spectral [45], [46], [3]). Data-driven techniques: Supervised [49], [50], [51], [52], [53], [54]; Semi-supervised [55], [56], [57], [58]; Unsupervised [59], [60], [61]; Reinforcement [62], [63], [64], [65], [66]; Deep (RBM [80], [56], [81], [82], [83]; CNN [68], [69]; GAN [77], [78], [79], [74]; Autoencoder [71], …”
Section: Conventional Techniques
confidence: 99%
“…The most common methods at the data level are random oversampling [17] and random undersampling [27]. The former randomly duplicates minority-class samples, while the latter randomly discards majority-class samples, in order to rebalance the dataset.…”
Section: Introduction
confidence: 99%
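The two data-level methods named in this excerpt are simple enough to state as code. Below is a minimal numpy-only sketch (illustrative, not from the cited papers [17] and [27]): oversampling duplicates randomly chosen minority samples until the classes balance, while undersampling randomly discards majority samples down to the minority count.

```python
# Random oversampling and undersampling (illustrative sketch).
import numpy as np

def random_oversample(X, y, minority_label, rng):
    """Duplicate randomly chosen minority samples until classes balance."""
    minority = np.where(y == minority_label)[0]
    majority = np.where(y != minority_label)[0]
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([np.arange(len(y)), extra])
    return X[idx], y[idx]

def random_undersample(X, y, minority_label, rng):
    """Randomly discard majority samples down to the minority count."""
    minority = np.where(y == minority_label)[0]
    majority = np.where(y != minority_label)[0]
    keep = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([minority, keep])
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = np.array([0] * 950 + [1] * 50)
X_over, y_over = random_oversample(X, y, minority_label=1, rng=rng)
X_under, y_under = random_undersample(X, y, minority_label=1, rng=rng)
print(np.bincount(y_over), np.bincount(y_under))  # [950 950] and [50 50]
```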