Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.285
MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs

Abstract: The prosperity of Massive Open Online Courses (MOOCs) provides fodder for much NLP and AI research on education applications, e.g., course concept extraction, prerequisite relation discovery, etc. However, the publicly available MOOC datasets are limited in size and offer few types of data, which hinders advanced models and novel attempts on related topics. Therefore, we present MOOCCube, a large-scale data repository of over 700 MOOC courses, 100k concepts, 8 million student behaviors with an external resource…

Cited by 97 publications (50 citation statements)
References 17 publications
“…Besides, the concept graph is not able to cover different domains. Beyond the document recommendation, some data inherently appears with the prerequisite relationships, like lecture data created in the MOOCs [40,41], as different lectures credited by the same student are time-variant. This notorious problem can be automatically sidestepped by leveraging citation relationship as we did.…”
Section: Related Work
confidence: 99%
“…They include mathematics ASSISTments 1 [32,55], problem solving interactions [18], or multiple choice questions 2 [56,57]), all of which are noteworthy although they do not focus on implicit feedback and engagement. MOOCCube is a very recently released dataset that contains a spectrum of different statistics relating to learner-MOOC interactions including implicit and explicit test taking activity [62]. Although this dataset may contain data that can be used to predict learner engagement with implicit feedback, the dataset has only been used in prerequisite detection task which is very different.…”
Section: Related Datasets
confidence: 99%
“…Developing artificial intelligence systems that, mildly at least, understand the structure of knowledge is foundational to building effective recommendation systems for lifelong education [13,41,62], as well as for many other applications related to knowledge management and tracing [35,61]. In the context of personalised education, the research communities have tirelessly worked on building Intelligent Tutoring Systems (ITS) which have been their focus from the early applications of AI in education [19].…”
Section: Introduction
confidence: 99%
“…With the rapid growth of online educational resources in diverse fields, people need an efficient way to acquire new knowledge. Building a concept graph can help people design a correct and efficient study path (ALSaad et al, 2018;Yu et al, 2020). There are mainly two approaches to learning prerequisite relations between concepts: one is to extract the relations directly from course content, video sequences, textbooks, or Wikipedia articles (Yang et al, 2015b;Pan et al, 2017;Alzetta et al, 2019), but this approach requires extra work on feature engineering and keyword extraction.…”
Section: Introduction
confidence: 99%
“…Existing methods formulate this question as a classification task. A typical method is to encode concept pairs and train a classifier to predict if there is a prerequisite relation (Alzetta et al, 2019;Yu et al, 2020). However, this method requires annotated prerequisite pairs during training.…”
Section: Introduction
confidence: 99%
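The classifier-based formulation quoted above (encode a concept pair, then train a binary classifier to predict whether a prerequisite relation holds) can be sketched as follows. This is a minimal illustration, not the method from any cited paper: the concept names, toy random embeddings, and annotated pairs are all hypothetical placeholders standing in for real course-derived features and labeled data such as MOOCCube's.

```python
# Hedged sketch of prerequisite-relation classification over concept pairs.
# All embeddings and labels below are toy placeholders, NOT MOOCCube data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy embeddings for four concepts (in practice, learned from course text/videos).
concepts = ["set", "function", "limit", "derivative"]
embeddings = {c: rng.normal(size=8) for c in concepts}

# Hypothetical annotated pairs: (a, b, label), label 1 = "a is a prerequisite of b".
pairs = [
    ("set", "function", 1),
    ("function", "limit", 1),
    ("limit", "derivative", 1),
    ("derivative", "set", 0),
    ("limit", "set", 0),
    ("derivative", "function", 0),
]

def encode(a, b):
    """Encode a concept pair as the concatenation of the two concept embeddings."""
    return np.concatenate([embeddings[a], embeddings[b]])

X = np.stack([encode(a, b) for a, b, _ in pairs])
y = [label for _, _, label in pairs]

# Train a binary classifier on encoded pairs; any classifier could stand in here.
clf = LogisticRegression().fit(X, y)

# Query an unseen pair; output is a 0/1 prerequisite judgment.
pred = clf.predict([encode("set", "limit")])[0]
```

Note that, as the quoted passage points out, this supervised setup requires annotated prerequisite pairs at training time, which is precisely the cost that motivates the alternative approaches discussed there.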