Hongxuan Tang scite author profile

Hongxuan Tang

5 Publications

14 Citation Statements Received

62 Citation Statements Given


Affiliations

Baidu (China), Soochow University

Publications


DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications

Tang¹, Li², Liu³ et al. 2021


Machine reading comprehension (MRC) is a crucial task in natural language processing and has achieved remarkable advancements. However, most neural MRC models are still far from robust and fail to generalize well in real-world applications. In order to comprehensively verify the robustness and generalization of MRC models, we introduce a real-world Chinese dataset, DuReader_robust. It is designed to evaluate MRC models from three aspects: over-sensitivity, over-stability and generalization. Compared to previous work, the instances in DuReader_robust are natural texts rather than altered, unnatural texts. The dataset thus presents the challenges of applying MRC models to real-world applications. The experimental results show that MRC models do not perform well on the challenge test set. Moreover, we analyze the behavior of existing models on the challenge test set, which may provide suggestions for future model development. The dataset and code are publicly available at https://github.com/baidu/DuReader.


DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications

Tang, Liu et al. 2020

Preprint


A Fine-grained Interpretability Evaluation Benchmark for Neural NLP

Wang¹, Shen², Peng³ et al. 2022


A Fine-grained Interpretability Evaluation Benchmark for Neural NLP

Wang¹, Shen², Peng³ et al. 2022

Preprint


While there is increasing concern about the interpretability of neural models, the evaluation of interpretability remains an open problem due to the lack of proper evaluation datasets and metrics. In this paper, we present a novel benchmark to evaluate the interpretability of both neural models and saliency methods. This benchmark covers three representative NLP tasks: sentiment analysis, textual similarity and reading comprehension, each provided with both English and Chinese annotated data. In order to precisely evaluate interpretability, we provide token-level rationales that are carefully annotated to be sufficient, compact and comprehensive. We also design a new metric, i.e., the consistency between the rationales before and after perturbations, to uniformly evaluate the interpretability of models and saliency methods on different tasks. Based on this benchmark, we conduct experiments on three typical models with three saliency methods, and unveil their strengths and weaknesses in terms of interpretability. We will release this benchmark at https://xyz and hope it can facilitate research on building trustworthy systems.


Separate Answer Decoding for Multi-class Question Generation

Yan, Zhu et al. 2019


