Automatic question generation: a review of methodologies, datasets, evaluation metrics, and applications

Mulla, Nikahat; Gharpure, Prachi

doi:10.1007/s13748-023-00295-9

Cited by 30 publications

(12 citation statements)

References 102 publications

“…According to Cao and Wang [3], these corpora are limited to the generation of simple fact-based questions. Furthermore, as stated in [20,21], the majority of these QA datasets are borrowed or crowd-sourced from open-source platforms such as Wikipedia articles, and the questions generally do not incorporate multiple sentences as their basis. There is a notable QG dataset for educational purposes called LearningQ [4], which utilizes complete articles or videos as contexts, resulting in a substantial portion of sentences within the contexts being irrelevant to the specific target question.…”

Section: Datasets Used For QG (mentioning)

confidence: 99%

Harnessing the Power of Prompt-based Techniques for Generating School-Level Questions using Large Language Models

Maity, Deroy, Sarkar (2023)

Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation

Designing high-quality educational questions is a challenging and time-consuming task. In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions. However, current question-answering (QA) datasets are inadequate for conducting our experiments on prompt-based question generation (QG) in an educational setting. Therefore, we curate a new QG dataset called EduProbe for school-level subjects, by leveraging the rich content of NCERT textbooks. We carefully annotate this dataset as quadruples of 1) Context: a segment upon which the question is formed; 2) Long Prompt: a long textual cue for the question (i.e., a longer sequence of words or phrases, covering the main theme of the context); 3) Short Prompt: a short textual cue for the question (i.e., a condensed representation of the key information or focus of the context); 4) Question: a deep question that aligns with the context and is coherent with the prompts. We investigate several prompt-based QG methods by fine-tuning pre-trained transformer-based large language models (LLMs), namely PEGASUS, T5, MBART, and BART. Moreover, we explore the performance of two general-purpose pre-trained LLMs, Text-Davinci-003 and GPT-3.5-Turbo, without any further training. By performing automatic evaluation, we show that T5 (with long prompt) outperforms all other models, but still falls short of the human baseline. Under human evaluation criteria, Text-Davinci-003 usually shows better results than other models under various prompt settings. Even in the case of human evaluation criteria, QG models mostly fall short of the human baseline. Our code and dataset are available at: https://github.com/my625/PromptQG

CCS Concepts: Computing methodologies → Natural language generation; Language resources. Applied computing → Education.
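The quadruple annotation described in this abstract can be pictured as a small record type. The sketch below is only an illustration: the class name `QGExample`, its field names, the `build_model_input` helper, and the "prompt: ... context: ..." template are all assumptions made here, not the schema or input format of the EduProbe repository, and the sample record is invented rather than taken from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QGExample:
    """One hypothetical quadruple; field names are illustrative, not the dataset's schema."""
    context: str
    long_prompt: str
    short_prompt: str
    question: str


def build_model_input(example: QGExample, prompt_kind: str = "long") -> str:
    """Join the chosen prompt and the context into one source string.

    The "prompt: ... context: ..." template is an assumption for illustration.
    """
    if prompt_kind == "long":
        prompt = example.long_prompt
    elif prompt_kind == "short":
        prompt = example.short_prompt
    elif prompt_kind == "none":
        prompt = ""
    else:
        raise ValueError(f"unknown prompt_kind: {prompt_kind!r}")

    parts = []
    if prompt:
        parts.append(f"prompt: {prompt}")
    parts.append(f"context: {example.context}")
    return " ".join(parts)


# Invented sample record, not drawn from EduProbe.
example = QGExample(
    context="Plants make food from sunlight, water and carbon dioxide.",
    long_prompt="how plants use sunlight, water and carbon dioxide to make food",
    short_prompt="photosynthesis",
    question="Explain how plants prepare their own food.",
)

source_text = build_model_input(example, "short")  # input side of a fine-tuning pair
target_text = example.question                     # output side of a fine-tuning pair
print(source_text)
```

Under these assumptions, a sequence-to-sequence model would be fine-tuned on (`source_text`, `target_text`) pairs, and switching `prompt_kind` between "long", "short", and "none" mirrors the different prompt settings the abstract compares.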

“…In early studies, two primary taxonomies based on the abstraction of answers [6], [7] were introduced. Recently, a new schema was proposed based on the form of possible answers into four categories factual, multiple sentences spanning, yes/no, and deep understanding [8]. Similarly, we divide questions into two groups objective questions where the answer is retrievable from the given text regardless of answer types, and subjective questions where a subjective answer is provided by individuals and the text only provides a particular topic to be questioned.…”

Section: Introduction (mentioning)

confidence: 99%

Opinerium: Subjective Question Generation Using Large Language Models

Babakhani, Lommatzsch, Brodt et al. (2024)

IEEE Access

“…The goal is to generate natural-language questions that are useful and fluent. Many approaches also attempt to generate the corresponding answers, or use the answer to generate the question (Kurdi et al 2020; Mulla and Gharpure 2023). Due to their recent success in NLP, recent QG research has been dominated by the use of Transformer-based large language models (LLMs) (Kurdi et al 2020; Liu et al 2023).…”

Section: Introduction (mentioning)

confidence: 99%

“…These LLMs are deep learning models trained on massive corpora of data to improve their generative performance. The reason for applying this approach in QG research is in large part due to its significant performance improvements over earlier rule-based and other types of systems (Kurdi et al 2020; Steuer et al 2021; Mulla and Gharpure 2023).…”

Section: Introduction (mentioning)

confidence: 99%

“…Aligning with the common LLM training objective of the next-token prediction, the emerging paradigm for QG is to feed a textual input, called a prompt, to an LLM for the model to complete (Mulla and Gharpure 2023). Designing this prompt to generate a desired output can be a difficult task, which has resulted in a new research direction called prompt engineering.…”

Section: Introduction (mentioning)

confidence: 99%

How Teachers Can Use Large Language Models and Bloom’s Taxonomy to Create Educational Quizzes

Elkins, Kochmar, Cheung et al. (2024)

AAAI

Question generation (QG) is a natural language processing task with an abundance of potential benefits and use cases in the educational domain. In order for this potential to be realized, QG systems must be designed and validated with pedagogical needs in mind. However, little research has assessed or designed QG approaches with the input of real teachers or students. This paper applies a large language model-based QG approach where questions are generated with learning goals derived from Bloom's taxonomy. The automatically generated questions are used in multiple experiments designed to assess how teachers use them in practice. The results demonstrate that teachers prefer to write quizzes with automatically generated questions, and that such quizzes have no loss in quality compared to handwritten versions. Further, several metrics indicate that automatically generated questions can even improve the quality of the quizzes created, showing the promise for large scale use of QG in the classroom setting.
