This work applies powerful sequence-to-sequence models, which have achieved excellent performance on encoder-decoder language translation tasks. The language transformer model used here follows the sequence-to-sequence approach, in which a Long Short-Term Memory (LSTM) network maps the input sequence to a vector of fixed dimensionality and another deep LSTM then decodes the target sequence from that vector. Model quality was evaluated with the BLEU score, and the LSTM's BLEU score was penalized on out-of-vocabulary words. The LSTM did not have difficulty with either long or short sentences. The deep LSTM setup reached its English-Japanese translation accuracy at an order of magnitude faster speed on both GPU and CPU. Variety was introduced into the data to evaluate robustness using the BLEU score. Finally, merging the two different types of datasets gave the best result, with the highest BLEU score of 40.1.
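To make the BLEU penalty on out-of-vocabulary words concrete, the following is a minimal sketch of a sentence-level BLEU computation. It is not the evaluation script used in this work: the function names, the `<unk>` placeholder token, the bigram cutoff (`max_n=2`), and the example sentences are all assumptions chosen for illustration, and the score is on a 0 to 1 scale rather than the 0 to 100 scale on which a value such as 40.1 is usually reported.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def sentence_bleu(candidate, reference, max_n=2):
    """Geometric mean of the modified precisions times a brevity penalty.
    max_n=2 is an illustrative choice; BLEU is commonly computed up to 4-grams."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical example: the decoder emits "<unk>" for a word outside its vocabulary.
reference = "the cat sat on the mat".split()
exact = "the cat sat on the mat".split()
with_unk = "the cat sat on the <unk>".split()

print(sentence_bleu(exact, reference))
print(sentence_bleu(with_unk, reference))
```

The single `<unk>` token removes one unigram match and one bigram match, so the second score is lower than the first even though every other word is correct.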
Recently, the NLP community has started showing interest in the challenging task of Hostile Post Detection. This paper presents our system for the Shared Task @ Constraint2021 on "Hostile Post Detection in Hindi". The data for this shared task was collected from Twitter and Facebook and is provided in Hindi Devanagari script. It is a multi-label, multi-class classification problem in which each data instance is annotated with one or more of five classes: fake, hate, offensive, defamation, and non-hostile. We propose a two-level architecture made up of BERT-based classifiers and statistical classifiers to solve this problem. Our team, 'Albatross', achieved a coarse-grained hostility F1 score of 0.9709 on the Hostile Post Detection in Hindi subtask and secured 2nd rank out of 45 teams for the task. Our submissions ranked 2nd and 3rd out of a total of 156 submissions, with coarse-grained hostility F1 scores of 0.9709 and 0.9703 respectively. Our fine-grained scores are also very encouraging and can be improved with further fine-tuning. The code is publicly available.
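The label structure described above can be sketched as follows. This is an illustration only, not the Albatross system: the helper names and the toy annotations are hypothetical, the mapping from the five classes to a coarse hostile / non-hostile label is an assumption about how the coarse-grained score is derived, and the F1 shown is computed for the hostile class alone, which may differ from the averaging used by the shared task.

```python
HOSTILE_LABELS = ("fake", "hate", "offensive", "defamation")
ALL_LABELS = HOSTILE_LABELS + ("non-hostile",)


def to_multi_hot(labels):
    """Encode a set of class names as a 0/1 vector in ALL_LABELS order."""
    return [1 if name in labels else 0 for name in ALL_LABELS]


def to_coarse(labels):
    """Assumed coarse mapping: any hostile class makes the post 'hostile'."""
    if any(name in HOSTILE_LABELS for name in labels):
        return "hostile"
    return "non-hostile"


def f1_for_class(gold, predicted, positive):
    """F1 for one class from parallel lists of gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy annotations (hypothetical, not taken from the shared-task data).
gold = [{"fake", "hate"}, {"non-hostile"}, {"offensive"}, {"non-hostile"}]
pred = [{"fake"}, {"hate"}, {"offensive"}, {"non-hostile"}]

gold_coarse = [to_coarse(labels) for labels in gold]
pred_coarse = [to_coarse(labels) for labels in pred]

print(to_multi_hot(gold[0]))
print(f1_for_class(gold_coarse, pred_coarse, "hostile"))
```

In the toy data, the first prediction misses the hate label yet still counts as correct at the coarse level, which illustrates why coarse-grained scores can be much higher than fine-grained ones.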