Reading comprehension (RC) on social media such as Twitter is a critical yet challenging task due to the noisy, informal, but informative nature of user-generated text. Most existing RC models are developed on formal datasets such as news articles and Wikipedia documents, which severely limits their performance when directly applied to the noisy and informal texts of social media. Moreover, these models focus on a single type of RC, either extractive or generative, and ignore the integration of the two. To address these challenges, we propose a noisy user-generated text-oriented RC (NUT-RC) model. In particular, we first introduce a set of text normalizers to transform noisy and informal texts into formal ones. We then integrate an extractive and a generative RC model through a multi-task learning mechanism and an answer selection module. Experimental results on TweetQA demonstrate that our NUT-RC model significantly outperforms state-of-the-art social media-oriented RC models.
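The pipeline described above can be sketched in miniature. The snippet below is an illustrative sketch, not the paper's implementation: the normalization lexicon, candidate scores, and the interpolation weight `lam` are all hypothetical, and real text normalizers and RC models are far richer than these stand-ins.

```python
# Hypothetical sketch of the three stages: (1) normalize a noisy tweet,
# (2) combine extractive and generative training objectives with a
# multi-task weighted loss, (3) select the final answer among candidates
# produced by both RC models. All names and values are illustrative.

def normalize(tweet, lexicon):
    """Map informal tokens to formal ones via a (toy) normalization lexicon."""
    return " ".join(lexicon.get(tok, tok) for tok in tweet.split())

def multitask_loss(extractive_loss, generative_loss, lam=0.5):
    """Weighted sum of the two task objectives; lam is a tunable weight."""
    return lam * extractive_loss + (1.0 - lam) * generative_loss

def select_answer(candidates):
    """Answer selection: pick the highest-scoring (answer, score) candidate
    from the pooled extractive and generative outputs."""
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]

lexicon = {"u": "you", "gonna": "going to"}          # illustrative entries
print(normalize("u gonna watch?", lexicon))          # formalized text

extractive_cands = [("the opening ceremony", 0.82)]  # illustrative scores
generative_cands = [("the 2016 opening ceremony", 0.91)]
print(select_answer(extractive_cands + generative_cands))
```

In this toy setup the selection module simply takes the argmax over pooled candidate scores; the actual answer selection module in the paper may score candidates jointly with the question and passage.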
Entity relation classification aims to classify the semantic relationship between two marked entities in a given sentence and plays a vital role in various natural language processing applications. However, existing studies focus on exploiting mono-lingual data in English, owing to the lack of labeled data in other languages. How to effectively leverage a richly-labeled language to help a poorly-labeled one remains an open problem. In this paper, we propose a language adaptation framework for cross-lingual entity relation classification. The basic idea is to employ adversarial neural networks (AdvNN) to transfer feature representations from one language to another. In particular, the framework enables feature imitation via the competition between a sentence encoder and a rival language discriminator, yielding effective representations. To verify the effectiveness of AdvNN, we introduce two adversarial structures: dual-channel AdvNN and single-channel AdvNN. Experimental results on the ACE 2005 multilingual training corpus show that single-channel AdvNN achieves the best performance in both the unsupervised and semi-supervised scenarios, yielding improvements of 6.61% and 2.98% over the state-of-the-art, respectively. Compared with baselines that directly adopt a machine translation module, both dual-channel and single-channel AdvNN significantly improve the performance (F1) of cross-lingual entity relation classification. Moreover, extensive analysis and discussion demonstrate the appropriateness and effectiveness of different parameter settings in our language adaptation framework.
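The adversarial competition between encoder and discriminator is commonly realized with a gradient-reversal trick. The sketch below is a minimal NumPy illustration of that idea under our own assumptions (it is not the paper's implementation, and `lam`, the gradients, and the helper names are all hypothetical): the discriminator trains normally, but the gradient it sends back to the shared features is sign-flipped, so the sentence encoder is pushed toward language-invariant representations.

```python
import numpy as np

# Illustrative sketch of adversarial language adaptation via gradient
# reversal. On the forward pass features flow unchanged to the language
# discriminator; on the backward pass the discriminator's gradient is
# multiplied by -lam before reaching the encoder, so the encoder *ascends*
# the discriminator loss while also descending the relation task loss.

def grad_reverse(grad, lam=1.0):
    """Backward behavior of a gradient-reversal layer: flip and scale."""
    return -lam * grad

def encoder_feature_grad(task_grad, disc_grad, lam=1.0):
    """Total gradient reaching the encoder's features: the relation task
    gradient plus the reversed language-discriminator gradient."""
    return task_grad + grad_reverse(disc_grad, lam)

# Toy gradients w.r.t. a 2-dim feature vector (values are illustrative).
task_grad = np.array([0.2, -0.1])
disc_grad = np.array([0.5, 0.3])
print(encoder_feature_grad(task_grad, disc_grad, lam=1.0))
```

With `lam = 0` the adversarial signal vanishes and training reduces to plain supervised relation classification; increasing `lam` trades task accuracy for language invariance of the shared features.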