TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy

Venktesh, V.; Mohania, Mukesh; Goyal, Vikram

doi:10.1007/978-3-030-86517-7_24

Cited by 4 publications

(4 citation statements)

References 10 publications


“…We conduct experiments on the following datasets: ARC (Xu et al, 2019), QC-Science (Mohania et al, 2021), and EURLEX57K (Chalkidis et al, 2019). Details of datasets, metrics, and training details are in Appendix.…”

Section: Methods (mentioning)

confidence: 99%

“…For comparison, in addition to simple baselines, we employ some state-of-the-art methods including BERT (prototype) (Snell et al, 2017), TagRec (Mohania et al, 2021), TagRec++ (Viswanathan et al, 2022), and Poly-encoder. For ablations, built on the bi-encoder (BERT) method, we present three variants: Bi-encoder (BERT) + CEAA, Bi-encoder (DPR), and Bi-encoder (DPR) + CEAA, where the comparisons between the variants could highlight the contribution of transfer learning and CEAA.…”

Section: Methods (mentioning)

confidence: 99%

“…Text classification in the education domain is reportedly difficult as the tags (or, labels) are hierarchical (Xu et al, 2019; Goel et al, 2022; Mohania et al, 2021), grow flexibly, and can be multi-labeled (Medini et al, 2019; Dekel and Shamir, 2010). Though retrieval-based methods were effective for such long-tailed and multilabel datasets (Zhang et al, 2022), they relied on vanilla BERT (Devlin et al, 2018) models, leaving room for improvement, for which we leverage question-answering fine-tuned retrieval models (Karpukhin et al, 2020).…”

Section: Related Work (mentioning)

confidence: 99%

“…However, applying auto-tagging for real-world education is challenging due to data scarcity. This is because auto-tagging has a potentially very large label space, ranging from subject topics to knowledge components (KC) (Zhang et al, 2015; Koedinger et al, 2012; Mohania et al, 2021; Viswanathan et al, 2022). The resulting data scarcity decreases performance on rare labels during training (Chalkidis et al, 2020; Lu et al, 2020; Snell et al, 2017; Choi et al, 2022).…”

Section: Introduction (mentioning)

confidence: 99%


A Study on the Perception of SNS Users on Artificial Intelligence Design Using Text Mining

Lee¹

2023

kmms


Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content, such as questions and textbooks. However, auto-tagging suffers from a data scarcity problem, which stems from two major challenges: 1) it possesses a large tag space and 2) it is multi-label. Though a retrieval approach is reportedly good at low-resource scenarios, there have been fewer efforts to directly address the data scarcity problem. To mitigate these issues, here we propose a novel retrieval approach CEAA that provides effective learning in educational text classification. Our main contributions are as follows: 1) we leverage transfer learning from question-answering datasets, and 2) we propose a simple but effective data augmentation method introducing cross-encoder style texts to a bi-encoder architecture for more efficient inference. An extensive set of experiments shows that our proposed method is effective in multi-label scenarios and low-resource tags compared to state-of-the-art models.
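The retrieval framing in the abstract above can be illustrated with a minimal sketch: the question and each tag are encoded separately (the bi-encoder idea), and tags are ranked by vector similarity so that several tags can be returned for one question. This is not the CEAA method itself. The `encode` function is a toy bag-of-words stand-in for a learned encoder such as BERT, and the tag strings, function names, and `k` are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a learned text encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_tags(question, tags, k=2):
    """Return the k tags whose vectors are most similar to the question."""
    q_vec = encode(question)
    # Tag vectors do not depend on the question, so they could be
    # computed once and cached; that is what makes this setup cheap
    # at inference time compared with scoring each pair jointly.
    tag_vecs = {tag: encode(tag) for tag in tags}
    ranked = sorted(tags, key=lambda t: cosine(q_vec, tag_vecs[t]), reverse=True)
    return ranked[:k]


# Hypothetical hierarchical tags written as flat strings.
tags = [
    "science physics force and motion",
    "science biology cell structure",
    "science chemistry acids and bases",
]
top = rank_tags("What force causes motion of a falling object", tags, k=2)
```

Returning the top `k` tags rather than a single best match is what lets a retrieval setup of this kind handle multi-label questions; a real system would replace `encode` with a trained model.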
