The emergence of big data in educational contexts has led to new data-driven approaches to support informed decision making and efforts to improve educational effectiveness. Digital traces of student behavior promise a more scalable and finer-grained understanding and support of learning processes, which were previously too costly to obtain with traditional data sources and methodologies. This synthetic review describes the affordances and applications of microlevel (e.g., clickstream data), mesolevel (e.g., text data), and macrolevel (e.g., institutional data) big data. For instance, clickstream data are often used to operationalize and understand knowledge, cognitive strategies, and behavioral processes in order to personalize and enhance instruction and learning. Corpora of student writing are often analyzed with natural language processing techniques to relate linguistic features to cognitive, social, behavioral, and affective processes. Institutional data are often used to improve student and administrative decision making through course guidance systems and early-warning systems. Furthermore, this review outlines current challenges of accessing, analyzing, and using big data. Such challenges include balancing data privacy and protection with data sharing and research, training researchers in educational data science methodologies, and navigating the tensions between explanation and prediction. We argue that addressing these challenges is worthwhile given the potential benefits of mining big data in education.
Student clickstream data (time-stamped records of click events in online courses) can provide fine-grained information about student learning. Such data enable researchers and instructors to collect information at scale about how each student navigates through and interacts with online education resources, potentially enabling objective and rich insight into the learning experience beyond self-reports and intermittent assessments. Yet analyses of these data often require advanced analytic techniques, because the data provide only a partial and noisy record of students' actions. Consequently, these data are not always accessible or useful for course instructors and administrators. In this paper, we provide an overview of the use of clickstream data to define and identify behavioral patterns that are related to student learning outcomes. Through discussions of four studies, we provide examples of the complexities and particular considerations of using these data to examine student self-regulated learning.
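To make the idea of deriving behavioral patterns from time-stamped click events concrete, the following is a minimal sketch and not the method of any of the studies discussed. The record layout, the event names, the `summarize` helper, and the 30-minute inactivity threshold used to split sessions are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical clickstream records: (student_id, ISO timestamp, event type).
events = [
    ("s1", "2024-01-08T09:00:00", "play_video"),
    ("s1", "2024-01-08T09:05:00", "pause_video"),
    ("s1", "2024-01-08T11:00:00", "open_quiz"),
    ("s2", "2024-01-08T10:00:00", "play_video"),
    ("s2", "2024-01-08T10:10:00", "open_quiz"),
]

# Assumed threshold: a gap longer than this starts a new study session.
SESSION_GAP = timedelta(minutes=30)


def summarize(events, gap=SESSION_GAP):
    """Return {student_id: {"clicks": n, "sessions": m}} from raw events."""
    by_student = defaultdict(list)
    for student_id, timestamp, _event_type in events:
        by_student[student_id].append(datetime.fromisoformat(timestamp))

    summary = {}
    for student_id, times in by_student.items():
        times.sort()
        sessions = 1
        # Count a new session whenever consecutive clicks are far apart.
        for previous, current in zip(times, times[1:]):
            if current - previous > gap:
                sessions += 1
        summary[student_id] = {"clicks": len(times), "sessions": sessions}
    return summary


features = summarize(events)
```

Even this simple summary depends on an analyst's choice (the session gap), which illustrates why such records are only a partial and noisy view of what students actually did.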