Unobtrusive and accurate ambulatory methods are needed to monitor long-term sleep patterns for improving health. Previously developed ambulatory sleep detection methods rely, in whole or in part, on self-reported diary data as ground truth, which is problematic because diaries are often not filled out accurately. This paper presents an algorithm that uses multimodal data from smartphones and wearable technologies to detect sleep/wake state and sleep onset/offset using a recurrent neural network with long short-term memory (LSTM) cells to synthesize temporal information. We collected 5580 days of multimodal data from 186 participants and compared the new method for sleep/wake classification and sleep onset/offset detection against (1) non-temporal machine learning methods and (2) a state-of-the-art actigraphy software package. The new LSTM method achieved a sleep/wake classification accuracy of 96.5% and sleep onset/offset detection F1 scores of 0.86 and 0.84, with mean absolute errors of 5.0 and 5.5 min, respectively, when compared with sleep/wake state and sleep onset/offset assessed using actigraphy and sleep diaries. The LSTM results were statistically superior to those of the non-temporal machine learning algorithms and the actigraphy software. We show good generalization of the new algorithm by comparing participant-dependent and participant-independent models, and we show how to make the model nearly real-time with only slightly reduced performance.
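To make the modeling setup concrete, the following is a minimal sketch of an LSTM that labels each time step of a multimodal feature sequence as sleep or wake; sleep onset and offset can then be read off the transitions in the predicted label sequence. The feature count, sequence length, hidden size, and threshold are illustrative assumptions, not the configuration reported in this paper.

```python
# Hypothetical sketch: per-time-step sleep/wake classification with an LSTM.
# Dimensions and hyperparameters below are assumptions for illustration only.
import torch
import torch.nn as nn

class SleepWakeLSTM(nn.Module):
    def __init__(self, n_features: int = 10, hidden_size: int = 64, num_layers: int = 1):
        super().__init__()
        # The LSTM synthesizes temporal context across the sequence of
        # per-minute multimodal features (e.g., wearable and phone signals).
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size,
                            num_layers=num_layers, batch_first=True)
        # Linear head producing one sleep-probability logit per time step.
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features) -> (batch, seq_len) sleep probabilities.
        out, _ = self.lstm(x)
        return torch.sigmoid(self.head(out)).squeeze(-1)

if __name__ == "__main__":
    model = SleepWakeLSTM()
    # One simulated day at 1-minute resolution: 1440 steps, 10 features (assumed).
    x = torch.randn(2, 1440, 10)
    probs = model(x)                      # (2, 1440) sleep probabilities
    sleep_wake = (probs > 0.5).int()      # threshold to sleep/wake labels
    # Sleep onset/offset correspond to wake->sleep and sleep->wake transitions
    # in the predicted label sequence.
    print(sleep_wake.shape)
```

In this framing, the sequence model sees an entire day of features at once; a near-real-time variant would instead restrict the input to past (and limited future) context, which is the trade-off behind the slightly reduced performance noted above.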