Understanding sleep and its perturbation by environment, mutation, or medication remains a central problem in biomedical research. Its examination in animal models rests on brain state analysis via classification of electroencephalographic (EEG) signatures. Traditionally, these states are classified by trained human experts through visual inspection of raw EEG recordings, a laborious task prone to inter-individual variability. Recently, machine learning approaches have been developed to automate this process, but their generalization capabilities are often insufficient, especially across animals from different experimental studies. To address this challenge, we crafted a convolutional neural network-based architecture to produce domain-invariant predictions, and furthermore integrated a hidden Markov model to constrain state dynamics based upon known sleep physiology. Our method, which we named SPINDLE (Sleep Phase Identification with Neural networks for Domain-invariant LEarning), was validated using data from four animal cohorts from three independent sleep labs, and achieved average agreement rates of 99%, 98%, 93%, and 97% with scorings from five human experts from different labs, essentially duplicating human capability. It generalized across different genetic mutants, surgery procedures, recording setups and even different species, far exceeding state-of-the-art solutions that we tested in parallel on this task. Moreover, we show that these scored data can be processed for downstream analyses identical to those from human-scored data, in particular by demonstrating the ability to detect mutation-induced sleep alteration. We provide to the scientific community free usage of SPINDLE and benchmarking datasets as an online server at https://sleeplearning.ethz.ch. Our aim is to catalyze high-throughput and well-standardized experimental studies in order to improve our understanding of sleep.