Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017
DOI: 10.18653/v1/s17-2009

IIT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Question Answering and Implicit Dialogue Identification

Abstract: In this paper we present our system for Answer Selection and Ranking in Community Question Answering, built as part of our participation in SemEval-2017 Task 3. We develop a Support Vector Machine (SVM) based system that makes use of textual, domain-specific, word-embedding and topic-modeling features. In addition, we propose a novel method for dialogue chain identification in comment threads. Our primary submission won subtask C, outperforming the other systems on all primary evaluation metrics. We pe…
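The abstract describes ranking candidate comments with an SVM over several feature groups. Below is a minimal sketch of that pattern, assuming scikit-learn; the feature extractors are illustrative placeholders, not the authors' exact feature set.

```python
# Minimal sketch of an SVM-based answer ranker combining heterogeneous
# feature groups (textual, embedding, topic), in the spirit of the system
# described above. The feature stubs are assumptions, not the paper's features.
import numpy as np
from sklearn.svm import SVC

def features(question, comment):
    """Concatenate feature groups into one vector (stub implementations)."""
    textual = [len(set(question.split()) & set(comment.split()))]  # word overlap
    embedding = [0.0]  # e.g. cosine of averaged word vectors
    topic = [0.0]      # e.g. similarity of topic distributions
    return textual + embedding + topic

# One row per (question, comment) pair; y = 1 marks a good answer.
question = "how can I renew my visa"
comments = ["you renew it at the immigration office", "nice weather today"]
X = np.array([features(question, c) for c in comments])
y = np.array([1, 0])

clf = SVC(kernel="rbf").fit(X, y)
# Rank comments by signed distance from the separating hyperplane.
order = np.argsort(-clf.decision_function(X))
```

Ranking by `decision_function` is one common way to turn a binary SVM classifier into a ranker for answer selection.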

Cited by 10 publications (10 citation statements) | References 12 publications
“…Extracting Keyphrases. Keyphrases are extracted from clarification questions using the RAKE algorithm [16], which is an efficient way to find noun phrases. This algorithm has been used in a similar setting, where CQA comments should be matched to related questions [12]. We tokenize the keyphrases and consider each token individually.…”
Section: System Components (mentioning)
confidence: 99%
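For concreteness, here is a hedged sketch of the quoted keyphrase step using the rake_nltk package, one common RAKE implementation; the cited system may use a different one, and the sample question is invented.

```python
# pip install rake_nltk  (also requires the NLTK 'stopwords' corpus)
from rake_nltk import Rake

rake = Rake()  # default: NLTK English stopwords, punctuation as delimiters
rake.extract_keywords_from_text(
    "Did you mean the visa renewal fee for a family residence permit?"
)
keyphrases = rake.get_ranked_phrases()  # highest-scoring phrases first

# As in the excerpt: tokenize each keyphrase and consider tokens individually.
tokens = [token for phrase in keyphrases for token in phrase.split()]
```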
“…Table 1 summarizes the results of different methods on the SemEval 2017 dataset. For our methods, the term "single" denotes that we only consider word-to-word matches as in Equation 14, while "multi" means that we also consider matches at multiple scales.

Method | MAP | MRR
Baseline (IR) | 9.18 | 10.11
Baseline (random) | 5.77 | 7.69
(Tian et al. 2017) | 10.64 | 11.09
(Zhang et al. 2017a) | 13.23 | 14.27
(Xie et al. 2017) | 13.48 | 16.04
(Filice, Da Martino, and Moschitti 2017) | 14.35 | 16.07
(Koreeda et al. 2017) | 14.71 | 16.48
(Nandi et al. 2017) | 15… | …

The compared systems cover feature-based approaches (Filice, Da Martino, and Moschitti 2017; Xie et al. 2017; Nandi et al. 2017) and neural networks (Tian et al. 2017; Zhang et al. 2017a; Koreeda et al. 2017). For the single-scale model, the MAP is increased from 14.67 to 17.25, while for the multi-scale model, the number is increased from 14.80 to 17.91.…”
Section: Training Hyper-parameters (mentioning)
confidence: 99%
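Table 1 reports MAP and MRR. As a reference point, both metrics can be computed from ranked binary relevance judgments; the sketch below is a generic implementation, not the official SemEval scorer.

```python
# MAP and MRR over per-query relevance lists, ordered as the system ranked them.
def average_precision(relevance):
    """relevance: list of 0/1 labels in ranked order for one query."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / rank  # precision at each relevant position
    return score / hits if hits else 0.0

def mean_average_precision(rankings):
    return sum(average_precision(r) for r in rankings) / len(rankings)

def mean_reciprocal_rank(rankings):
    total = 0.0
    for relevance in rankings:
        for rank, rel in enumerate(relevance, start=1):
            if rel:
                total += 1.0 / rank  # reciprocal rank of first relevant item
                break
    return total / len(rankings)

rankings = [[0, 1, 0, 1], [1, 0, 0, 0]]  # two toy queries
print(mean_average_precision(rankings), mean_reciprocal_rank(rankings))  # 0.75 0.75
```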
“…For classification tasks like question similarity across community QA forums, machine learning classification algorithms like Support Vector Machines (SVMs) have been used (Šaina et al., 2017; Nandi et al., 2017; Xie et al., 2017; Mihaylova et al., 2016; Wang and Poupart, 2016). Recently, advances in deep neural network architectures have also led to the use of Convolutional Neural Networks (CNNs) (Šaina et al., 2017; Mohtarami et al., 2016), which perform reasonably well for selecting the correct answer in cQA forums.…”
Section: Related Work (mentioning)
confidence: 99%
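A toy illustration of the SVM-based question-similarity setup this excerpt surveys, assuming scikit-learn; the TF-IDF cosine and length-difference features are stand-ins, not the feature set of any cited system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
import numpy as np

pairs = [
    ("how do I reset my password", "forgot my password, how to reset it"),
    ("best pizza place in town", "how do I reset my password"),
]
labels = [1, 0]  # 1 = the two questions are similar

# TF-IDF rows are L2-normalised by default, so a dot product of two rows
# equals their cosine similarity.
vec = TfidfVectorizer().fit([q for pair in pairs for q in pair])

def pair_features(q1, q2):
    a, b = vec.transform([q1]), vec.transform([q2])
    cosine = float(a.multiply(b).sum())
    length_diff = abs(len(q1.split()) - len(q2.split()))
    return [cosine, length_diff]

X = np.array([pair_features(q1, q2) for q1, q2 in pairs])
clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```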
“…Other works in the space include the use of Random Forests (Wang and Poupart, 2016) and topic models to match the questions at both the term level and the topic level (Zhang et al., 2014). There have also been works on translation-based retrieval models (Jeon et al., 2005; Zhou et al., 2011); XGBoost (Feng et al., 2017); Feedforward Neural Networks (NN) (Wang and Poupart, 2016); … (Wang and Poupart, 2016; Mohtarami et al., 2016; Nandi et al., 2017); and metadata-based features (Mohtarami et al., 2016; Mihaylova et al., 2016; Xie et al., 2017).…”
Section: Related Work (mentioning)
confidence: 99%
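The term-level/topic-level matching idea attributed to Zhang et al. (2014) can be sketched roughly as follows, assuming scikit-learn's LDA; the corpus, topic count, and cosine-similarity choice are toy assumptions, not the cited method's exact formulation.

```python
# Represent each question as an LDA topic distribution and compare
# distributions, as a rough stand-in for topic-level question matching.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
import numpy as np

corpus = [
    "reset password account login",
    "pizza restaurant food recommendation",
    "forgot login password help",
]

counts = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
theta = lda.transform(counts)  # per-question topic distributions

def topic_similarity(i, j):
    """Cosine similarity between two questions' topic distributions."""
    a, b = theta[i], theta[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(topic_similarity(0, 2))  # the two password questions should score higher
```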