Human emotions unfold over time, and more affective computing research has to prioritize capturing this crucial component of real-world affect. Modeling dynamic emotional stimuli requires solving the twin challenges of time-series modeling and of collecting high-quality time-series datasets. We begin by assessing the state of the art in time-series emotion recognition, and we review contemporary time-series approaches in affective computing, including discriminative and generative models. We then introduce the first version of the Stanford Emotional Narratives Dataset (SENDv1): a set of rich, multimodal videos of self-paced, unscripted emotional narratives, annotated for emotional valence over time. The complex narratives and naturalistic expressions in this dataset provide a challenging test for contemporary time-series emotion recognition models. We demonstrate several baseline and state-of-the-art modeling approaches on the SEND, including a Long Short-Term Memory model and a multimodal Variational Recurrent Neural Network, which perform comparably to the human benchmark. We end by discussing the implications for future research in time-series affective computing.
... recognition. Specifically, we define time-series modeling as taking in temporally continuous input data and producing temporally continuous output, with an explicit consideration of how information is propagated over time. For instance, to engage in such inference, a social robot in conversation with its user would have to take in a continuous stream of sensor data, process it, and reason about its user's emotions over time, perhaps after every second or after every sentence, as well as across many sentences in the conversation and across multiple conversations [11].
Despite the progress that has been made in time-series emotion recognition in the past decade, the field is still far from affective robots that can understand human emotions in daily life.
What is needed to achieve this ambitious goal? We suggest that the biggest barriers to overcome are (1) the inherent difficulty of building computational time-series models, and (2) the difficulty of collecting high-quality datasets. To address the first gap, we conduct a review covering different machine-learning-based approaches to time-series modeling (Section 2). We begin by discussing the most common time-series techniques in affective computing: deep neural network models, part of a broader class of discriminative models. We also cover generative time-series approaches, which are comparatively less popular within affective computing, but offer interesting modeling capabilities and hold exciting potential for emotion understanding.
We turn next to the second gap: researchers need high-quality time-series datasets on which to train models. These are expensive to construct, in terms of both the production of stimuli and the collection of time-series annotations of emotion and affective labeling [12]. There are several existing time-series datasets that have been used by the ...