The growth of Massive Open Online Courses (MOOCs) has been remarkable in recent years. A significant amount of MOOC content is delivered as video, and participants often navigate non-linearly to browse through a video. This paper proposes the design of a system that provides non-linear navigation in educational videos using features derived from a combination of the audio and visual content of a video. It offers multiple dimensions for quickly navigating to a given point of interest in a video: a customized, dynamic, time-aware word cloud; video pages; and a 2-D timeline. In the word cloud, the relative placement of words indicates their temporal ordering in the video, while color codes represent acoustic stress. The 2-D timeline presents multiple occurrences of a keyword/concept in the video in response to a user's click in the word cloud. Additionally, the visual content is analyzed to identify frames with "maximum written content", referred to as video pages. We conducted a user study with 20 participants to evaluate the proposed system and compared it with the transcript-based interfaces used by major MOOC providers. Our findings suggest that the proposed system leads to statistically significant savings in navigation time, especially on multimodal navigation tasks.
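To make the video-page idea concrete, the sketch below shows one possible way such frames could be detected; the abstract does not specify the paper's actual algorithm, so this is a minimal illustration assuming OpenCV, with a simple ink-density proxy for "written content" and hypothetical function names (`written_content_score`, `detect_video_pages`).

```python
# Illustrative sketch only: approximates "written content" per frame by the
# density of ink-like pixels after adaptive thresholding, and keeps a frame
# as a video page when the score drops sharply afterwards (e.g. a board or
# slide change), suggesting the previous frame held the most writing.
import cv2
import numpy as np

def written_content_score(frame):
    """Rough proxy (assumption) for the amount of writing in a frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Dark strokes on a light background become white pixels after
    # inverse adaptive thresholding.
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 25, 15)
    return float(np.count_nonzero(binary)) / binary.size

def detect_video_pages(video_path, sample_every=30, drop_ratio=0.5):
    """Return (frame_index, frame) pairs with locally maximal written content."""
    cap = cv2.VideoCapture(video_path)
    pages, prev_score, prev_frame, prev_idx, idx = [], 0.0, None, 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:
            score = written_content_score(frame)
            if prev_frame is not None and score < drop_ratio * prev_score:
                pages.append((prev_idx, prev_frame))
            prev_score, prev_frame, prev_idx = score, frame, idx
        idx += 1
    cap.release()
    return pages
```

In a navigation interface of this kind, each detected video page could serve as a thumbnail entry point, letting a viewer jump directly to the moment when a board or slide was fully written.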