2023
DOI: 10.1007/s13748-023-00295-9

Automatic question generation: a review of methodologies, datasets, evaluation metrics, and applications

Abstract: Question generation in natural language has a wide variety of applications. It can help chatbots generate interesting questions, and it can automate the process of generating questions from a piece of text. Most modern conversational systems require question generation ability to identify the user’s needs and serve customers better. Generating questions in natural language is now a more evolved task, which also includes generating questions for an image or video. …

Cited by 30 publications (12 citation statements)
References 102 publications
“…According to Cao and Wang [3], these corpora are limited to the generation of simple fact-based questions. Furthermore, as stated in [20,21], the majority of these QA datasets are borrowed or crowd-sourced from open-source platforms such as Wikipedia articles, and the questions generally do not incorporate multiple sentences as their basis. There is a notable QG dataset for educational purposes called LearningQ [4], which utilizes complete articles or videos as contexts, resulting in a substantial portion of sentences within the contexts being irrelevant to the specific target question.…”
Section: Datasets Used For QG (mentioning)
confidence: 99%
“…In early studies, two primary taxonomies based on the abstraction of answers [6], [7] were introduced. Recently, a new schema was proposed based on the form of possible answers into four categories factual, multiple sentences spanning, yes/no, and deep understanding [8]. Similarly, we divide questions into two groups objective questions where the answer is retrievable from the given text regardless of answer types, and subjective questions where a subjective answer is provided by individuals and the text only provides a particular topic to be questioned.…”
Section: Introduction (mentioning)
confidence: 99%
“…The goal is to generate natural-language questions that are useful and fluent. Many approaches also attempt to generate the corresponding answers, or use the answer to generate the question (Kurdi et al 2020;Mulla and Gharpure 2023). Due to their recent success in NLP, recent QG research has been dominated by the use of Transformer-based large language models (LLMs) (Kurdi et al 2020;Liu et al 2023).…”
Section: Introduction (mentioning)
confidence: 99%
“…These LLMs are deep learning models trained on massive corpora of data to improve their generative performance. The reason for applying this approach in QG research is in large part due to its significant performance improvements over earlier rule-based and other types of systems (Kurdi et al 2020;Steuer et al 2021;Mulla and Gharpure 2023).…”
Section: Introduction (mentioning)
confidence: 99%