Proceedings of the 7th ACM India Computing Conference 2014
DOI: 10.1145/2675744.2675762
Finding acronym expansion using semi-Markov conditional random fields

Abstract: Acronyms are heavily used out-of-vocabulary (OOV) terms in SMS messages, search queries, and social media postings. The performance of text mining algorithms such as part-of-speech (POS) tagging, named entity recognition, and chunking often suffers when they are applied to such noisy text. Text normalization systems are developed to normalize the noisy text, and acronym mapping and expansion has become an important component of the text normalization process. Since manually collecting acronyms and their corresponding expansions from the …
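The distinguishing property of the semi-Markov model named in the title is that it scores and labels variable-length *segments* of the input rather than individual tokens, which suits expansion spans that cover several words. The sketch below is a toy segment-level Viterbi decoder with a hand-written scoring function; the function names, labels, and scores are illustrative assumptions, not the paper's trained CRF.

```python
# Toy semi-Markov (segment-scoring) Viterbi decoder: labels variable-length
# segments of a token sequence, the key property distinguishing a semi-Markov
# CRF from a token-level CRF. The scorer is a hand-written stand-in, not the
# paper's trained model.

def semi_markov_viterbi(tokens, labels, score, max_len=4):
    """Return the highest-scoring segmentation as (segment, label) pairs."""
    n = len(tokens)
    best = [(float("-inf"), None)] * (n + 1)  # best[i]: best score up to position i
    best[0] = (0.0, None)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):   # candidate segment start
            for lab in labels:
                s = best[j][0] + score(tokens[j:i], lab)
                if s > best[i][0]:
                    best[i] = (s, (j, lab))       # keep backpointer
    segments, i = [], n
    while i > 0:                                  # recover segmentation
        j, lab = best[i][1]
        segments.append((tuple(tokens[j:i]), lab))
        i = j
    return list(reversed(segments))

# Hypothetical scorer: reward a segment whose word initials spell the acronym.
ACRONYM = "asap"

def toy_score(segment, label):
    if label == "EXP":
        initials = "".join(w[0] for w in segment).lower()
        return 2.0 if initials == ACRONYM else -1.0
    return 0.1  # mild per-segment reward for "other" text

tokens = ["reply", "as", "soon", "as", "possible", "thanks"]
decoded = semi_markov_viterbi(tokens, ["O", "EXP"], toy_score)
# decoded labels ("as", "soon", "as", "possible") as a single EXP segment
```

Because the decoder considers segments up to `max_len` tokens, the four-word expansion is scored as one unit, something a token-level Viterbi pass cannot do directly.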

Cited by 2 publications (1 citation statement)
References 8 publications
“…Second, most of the existing datasets are in the medical domain, ignoring the challenges in other scientific domains. While there are a few datasets for general domain (e.g., Wikipedia), online forums, news and scientific documents (Thakker et al., 2017; Li et al., 2018; Charbonnier and Wartena, 2018; Liu et al., 2011; Harris and Srinivasan, 2019), they still suffer from either noisy examples inevitable in the heuristically generated datasets (Charbonnier and Wartena, 2018; Ciosici et al., 2019; Thakker et al., 2017) or their small size, which makes them inappropriate for training advanced methods (e.g., deep neural networks) (Prokofyev et al., 2013; Harris and Srinivasan, 2019; Nautial et al., 2014).…”