Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2014
DOI: 10.3115/v1/d14-1037

A Graph-based Approach for Contextual Text Normalization

Abstract: The informal nature of social media text renders it very difficult to be automatically processed by natural language processing tools. Text normalization, which corresponds to restoring the non-standard words to their canonical forms, provides a solution to this challenge. We introduce an unsupervised text normalization approach that utilizes not only lexical, but also contextual and grammatical features of social text. The contextual and grammatical features are extracted from a word association graph built b…

Citations: cited by 24 publications (18 citation statements)
References: 11 publications
“…The performance of NER tools can be improved by normalizing the text of the test data and by using a large amount of training data. Normalization is carried out by tracking down non-standard words and replacing them with the appropriate standard words [7], [17], while a large amount of training data can be obtained by processing articles relevant to the scope of the case [4], [9]-[11], [18], [19].…”
Section: Research Methods (unclassified)
“…However, the phonetic similarity used in these systems cannot be applied to Chinese words since Pinyin has its own specific characteristics, which do not easily map to English, for determining phonetic similarity. Another main application of phonetic similarity algorithms is text normalization (Xia et al., 2006; Li et al., 2003; Han et al., 2012; Sonmez and Ozgur, 2014; Qian et al., 2015), where phonetic similarity is measured by a combination of initial and final similarities. However, the encodings used in these approaches are too coarse-grained, yielding low F1 measures.…”
Section: Related Work (mentioning)
confidence: 99%
“…The task is generally treated as a noisy channel problem (Pennell and Liu, 2014; Cook and Stevenson, 2009; Yang and Eisenstein, 2013; Sonmez and Ozgur, 2014) or a translation problem (Aw et al., 2006; Contractor et al., 2010; Li and Liu, 2012; Zhang et al., 2014c). For English, most recent work (Han and Baldwin, 2011; Gouws et al., 2011; Han et al., 2012) uses two-step unsupervised approaches to first detect and then normalize informal words.…”
Section: Related Work (mentioning)
confidence: 99%