Data-Driven Answer Selection in Community QA Systems

Nie, Liqiang; Wei, Xiaochi; Zhang, Dongxiang; Wang, Xiang; Gao, Zhipeng; Yang, Yi

doi:10.1109/tkde.2017.2669982

Cited by 64 publications

(23 citation statements)

References 37 publications

“…Thread-wise, we examine how accurate our output user scores are in ranking trustworthy respondents within a single discussion by measuring both Spearman’s coefficient and nDCG@2 with regard to the ordering provided by our credibility proxy. We find the forum-wise ranking metrics meaningful as answer ranking, based on user credibility in our case, is a well-formulated task in CQA [49–51]. The measurement results are presented in Table 6.…”

Section: Results (mentioning)

confidence: 99%

Neural side effect discovery from user credibility and experience-assessed online health discussions

et al. 2020

Background: Health 2.0 allows patients and caregivers to conveniently seek medical information and advice via e-portals and online discussion forums, especially regarding potential drug side effects. Although online health communities are helpful platforms for obtaining non-professional opinions, they pose risks in communicating unreliable and insufficient information in terms of quality and quantity. Existing methods for extracting user-reported adverse drug reactions (ADRs) from online health forums are not only insufficiently accurate, as they disregard user credibility and drug experience, but also expensive, as they rely on supervised ground truth annotation of individual statements. We propose a NEural ArchiTecture for Drug side effect prediction (NEAT), which is optimized on the task of drug side effect discovery based on a complete discussion while being attentive to user credibility and experience, thus addressing the mentioned shortcomings. We train our neural model in a self-supervised fashion using ground truth drug side effects from mayoclinic.org. NEAT learns to assign each user a score that is descriptive of their credibility and highlights the critical textual segments of their post.

Results: Experiments show that NEAT improves drug side effect discovery from online health discussions by 3.04% over user-credibility-agnostic baselines, and by 9.94% over non-neural baselines, in terms of F1. Additionally, the latent credibility scores learned by the model correlate well with trustworthiness signals, such as the number of “thanks” received from other forum members, and improve on credibility heuristics such as number of posts by 0.113 in terms of Spearman’s rank correlation coefficient. Experience-based self-supervised attention highlights critical phrases such as mentioned side effects, and enhances fully supervised ADR extraction models based on sequence labelling by 5.502% in terms of precision.

Conclusions: NEAT considers both user credibility and experience in online health forums, making feasible a self-supervised approach to side effect prediction for mentioned drugs. The derived user credibility and attention mechanism are transferable and improve downstream ADR extraction models. Our approach enhances automatic drug side effect discovery and fosters research in several domains including pharmacovigilance and clinical studies.

“…Other researchers have tested different approaches to solve the same problem. In particular, Nie et al. (2017) have developed an approach to 'resolve' questions by recommending a list of potential solutions chosen among those answers provided in response to other similar, possibly duplicate questions. As such, unlike ours, their approach has the potential to 'resolve' also questions with no answers (e.g., new answers).…”

Section: Discussion (mentioning)

confidence: 99%

An empirical assessment of best-answer prediction models in technical Q&A sites

2018

Technical Q&A sites have become essential for software engineers as they constantly seek help from other experts to solve their work problems. Despite their success, many questions remain unresolved, sometimes because the asker does not acknowledge any helpful answer. In these cases, an information seeker can only browse all the answers within a question thread to assess their quality as potential solutions. We approach this time-consuming problem as a binary-classification task where a best-answer prediction model is built to identify the accepted answer among those within a resolved question thread, and the candidate solutions to those questions that have received answers but are still unresolved. In this paper, we report on a study aimed at assessing 26 best-answer prediction models in two steps. First, we study how models perform when predicting best answers in Stack Overflow, the most popular Q&A site for software engineers. Then, we assess performance in a cross-platform setting where the prediction models are trained on Stack Overflow and tested on other technical Q&A sites. Our findings show that the choice of the classifier and automated parameter tuning have a large impact on the prediction of the best answer. We also demonstrate that our approach to the best-answer prediction problem is generalizable across technical Q&A sites. Finally, we provide practical recommendations to Q&A platform designers to curate and preserve the crowdsourced knowledge shared through these sites.

“…Nie et al. present an algorithm for ranking answer candidates from all of the available answers for a new question. They rely on four types of features, named deep, topic-level, statistical, and user-centric [32].…”

Section: Related Work (mentioning)

confidence: 99%

Duplicate Question Detection in Stack Overflow: A Reproducibility Study

Silva, Paixão, Maia 2018

Preprint

Stack Overflow has become a fundamental element of the developer toolset. This growing influence has been accompanied by an effort from the Stack Overflow community to maintain the quality of its content. One of the problems that jeopardizes that quality is the continuous growth of duplicated questions. To solve this problem, prior works focused on automatically detecting duplicated questions. Two important solutions are DupPredictor and Dupe. Despite reporting significant results, neither work makes its implementation publicly available, hindering subsequent works in the scientific literature that rely on them. We executed an empirical study as a reproduction of DupPredictor and Dupe. Our results, not robust when attempted with different sets of tools and data sets, show that the barriers to reproducing these approaches are high. Furthermore, when applied to more recent data, we observe a performance decay in both of our reproductions in terms of recall-rate over time, as the number of questions increases. Our findings suggest that subsequent works concerning the detection of duplicated questions in Question and Answer communities require more investigation to assert their findings.
