We conducted two AB experiments (treatment vs. control) in a massive open online course. The first experiment evaluates deliberate practice activities (DPAs) for developing problem-solving expertise, as measured by traditional physics problems. We find that a more interactive, drag-and-drop format of DPA produces faster learning than a multiple-choice format, but that DPAs do not improve performance on traditional physics problems more than normal homework practice. The second experiment shows that a different video shooting setting can improve the fluency of the instructor, which in turn improves student engagement, although it has no significant impact on learning outcomes. These two cases demonstrate the potential of MOOC AB experiments as an open-ended research tool, but they also reveal limitations. We discuss the three most important challenges: the wide distribution of students, the "open-book" nature of assessments, and the large quantity and variety of data, and we suggest possible methods for coping with them.
General background

With tens of thousands of registered students, massive open online courses (MOOCs) hold great promise for data-driven education research (Reich 2015; Williams 2013). Research on MOOCs started almost simultaneously with the launch of the first MOOCs, covering topics from student behavior (Han et al. 2013; Kizilcec et al. 2013; Koedinger et al. 2015) and instructional strategies (Kim et al. 2014; Piech et al. 2013) to learning outcomes (Colvin et al. 2014). An important recent advance in MOOC-based education research is the capability to conduct an "AB experiment" on major MOOC platforms (Anderson et al. 2014; edX, n.d.; Williams and Williams 2013). Also called split testing, AB testing, or randomized controlled trials, AB experiments are a research method used in a wide range of disciplines, from marketing and business intelligence to medical research and the social sciences (Chalmers et al. 1981; Kohavi and Longbotham 2015). Simply put, in a MOOC AB experiment, different instructional materials are presented to different groups of students, and the effectiveness of the materials is evaluated by measuring students' behavior and learning outcomes with various metrics.

MOOC AB experiments can be a powerful tool for sorting through the enormous variety of instructional choices. Koedinger et al. (2013) estimated the raw number of possible instructional choices to be on the order of an impossible 3^30, including decisions such as more worked examples vs. more tests, immediate vs. delayed feedback, and longer vs. shorter videos. With large numbers of students and automated group randomization, multiple MOOC AB experiments can easily be implemented in a single course, enabling researchers to accumulate evidence on different instructional designs at a much faster rate. However, the intrinsic "openness" of MOOCs also poses unique challenges for both the design of AB experiments and the analysis of their results. For example, unlike most clinical ...