The goal of learning analytics (LA) is to better understand and improve education. There are indications that LA would benefit from drawing on multiple data sources rather than relying solely on log data, which are typically collected when learning takes place in front of a computer screen. Learning, however, is not always mediated by a digital system that can capture digital traces. Learning analytics and the learning sciences are therefore increasingly focusing on multimodality: the combination of multiple sensor data streams, where each data type is a modality or mode. During the learning process, sensors can capture observable events, such as learners' behavior and interactions, as well as the learning context. However, to improve and evaluate the effectiveness of such analytical systems, it is necessary to examine how research is conducted in the emerging area of multimodal learning analytics (MMLA).

Multimodal sensor data consist of continuously evolving sets of sequences recorded over time. Such data are essential to MMLA applications, but preparing and processing these datasets is difficult; one recurring preparation step, aligning streams sampled at different rates, is sketched at the end of this section. Moreover, MMLA offers opportunities for understanding and supporting collaborative problem solving, although implementing MMLA systems during face-to-face work can be challenging.

This thesis has three objectives:

1. To examine current research methodologies and assess current interest and research themes in MMLA.
2. To investigate the data management aspects of developing IoT-based MMLA systems.
3. To develop an initial sample design for MMLA systems.

We use three main approaches to achieve these objectives. First, we review the existing literature to identify established research methodologies and trending research themes in MMLA. Second, to analyze how multimodal sensor data can be managed, we use an AR case study to identify the main challenges in managing multimodal data; a sketch of a common reading envelope for such streams follows below.
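For the data management objective, one recurring design question is how to normalize heterogeneous IoT sensor readings into a common record format before storage. The following is a minimal sketch under stated assumptions, not the thesis design: the envelope fields, device names, and modalities are hypothetical, and only the Python standard library is used.

```python
# Minimal sketch (assumptions, not the thesis design): a common envelope
# for heterogeneous MMLA sensor readings, so that streams from different
# IoT devices can be stored and queried uniformly.
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any

@dataclass
class SensorReading:
    device_id: str   # which physical sensor produced this reading (hypothetical IDs below)
    modality: str    # e.g. "heart_rate", "gaze", "audio"
    timestamp: str   # ISO 8601 in UTC, to allow later cross-stream alignment
    payload: dict[str, Any] = field(default_factory=dict)  # modality-specific values

    def to_json(self) -> str:
        """Serialize the reading to JSON for transport or storage."""
        return json.dumps(asdict(self))

# Usage: wrap two readings from different modalities in the same envelope.
now = datetime.now(timezone.utc).isoformat()
readings = [
    SensorReading("wristband-01", "heart_rate", now, {"bpm": 74}),
    SensorReading("eyetracker-02", "gaze", now, {"x": 0.42, "y": 0.17}),
]
for r in readings:
    print(r.to_json())
```

Keeping every stream's timestamps in a single reference clock (UTC here) is one design choice that makes cross-modal alignment, as in the sketch that follows, feasible at all.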
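The preparation difficulty noted earlier can be made concrete with a minimal sketch, assuming pandas and two hypothetical streams (the stream names, sampling rates, and values are illustrative, not data from the thesis): modalities are often sampled at different rates on different clocks, so a first processing step is to align them on a shared timeline.

```python
# Minimal sketch (illustrative data, not from the thesis): time-aligning
# a 1 Hz heart-rate stream with a 2 Hz gaze stream, a typical MMLA
# preparation step before any joint analysis.
import pandas as pd

# Hypothetical heart-rate stream sampled at 1 Hz.
heart_rate = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01 10:00:00", periods=5, freq="1s"),
    "bpm": [72, 74, 73, 75, 76],
})

# Hypothetical gaze stream sampled at 2 Hz (a finer clock granularity).
gaze = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01 10:00:00", periods=10, freq="500ms"),
    "fixation_x": [0.1, 0.2, 0.2, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6],
})

# Align the faster stream to the slower one by nearest timestamp,
# tolerating a small amount of clock drift between the two sensors.
aligned = pd.merge_asof(
    heart_rate.sort_values("timestamp"),
    gaze.sort_values("timestamp"),
    on="timestamp",
    direction="nearest",
    tolerance=pd.Timedelta("250ms"),
)
print(aligned)
```

The tolerance parameter is one way to handle the clock drift that real multi-device deployments exhibit; readings with no counterpart within the window simply remain unmatched rather than being paired incorrectly.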