Building English – Punjabi Aligned Parallel Corpora of Nouns from Comparable Corpora
Dilshad Kaur,
Satwinder Singh
Abstract:Comparable corpora are the right resources for extracting parallel data due to their abundant availability. It is of great importance where parallel data are scarce. In this study, the focus is placed on building of parallel data for Punjabi and English language pair. The raw data were collected from web contents of “Mann Ki Baat”, which is a collection of textual speeches of Prime Minister of India Mr. Narendra Modi broadcasted every last Sunday of the month. Data were cleaned and pre-processed using a natura… Show more
Set email alert for when this publication receives citations?
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.