“…Indeed, the advantage of time-lapse imaging is that temporal models of cell behavior can encode (or learn) the transitions of states over time (Held et al, 2010;Bove et al, 2017;Soelistyo et al, 2022;Gallusser et al, 2023). Incorporating features such as local (or collective) motion, neighbourhood embeddings or cell state classification can be used to generate rich representations (Bove et al, 2017;Driscoll et al, 2019;Andrews et al, 2021;Gradeci et al, 2021;De Vries et al, 2022;Ko et al, 2022;Malin-Mayor et al, 2022;Yamamoto et al, 2022;Viana et al, 2023). Increasingly, selfsupervised methods (such as variational autoencoders) are being used to learn explainable representations directly from the image data ( (Zaritsky et al, 2021;Soelistyo et al, 2022;Wu et al, 2022)).…”