“…We considered five supervised benchmark methods that use the labeled set alone: (i) LASSO-penalized logistic regression [16,17,34,37–39], (ii) random forest (RF) [40,41], (iii) linear discriminant analysis (LDA) [42], and (iv) an LSTM-gated recurrent neural network (RNN) [24,39,43,44], each trained with raw feature counts C_{i,t}, as well as (v) LDA trained with patient-timepoint embeddings generated without weights, which we refer to as LDA_embed. In addition, we considered a semi-supervised benchmark: a hidden Markov model (HMM) [26–29,45,46] with multivariate Gaussian emissions, trained with the weight-free embeddings. Only the HMM and the RNN leverage the longitudinal nature of the data; all other comparator methods train models that predict Y_t from concurrent features only (C_{i,t} or the weight-free embeddings), without considering the time sequence.…”
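The difference between the concurrent-feature comparators and the sequence-aware ones can be pictured as a difference in how the same records are laid out before training. The sketch below is illustrative only and is not the authors' pipeline: the toy `patients` dictionary, the helper names `concurrent_rows` and `sequences`, and the use of `None` for an unlabeled timepoint are all assumptions made for this example. The excerpt does not say how unlabeled timepoints are passed to each model, so here they are simply kept in the sequences.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy data (not from the paper): patient id -> time-ordered visits.
# Each visit is (raw feature counts, label); a label of None means "unlabeled".
Visit = Tuple[List[int], Optional[int]]

patients: Dict[str, List[Visit]] = {
    "p1": [([2, 0, 1], 0), ([0, 3, 1], None), ([1, 4, 0], 1)],
    "p2": [([0, 0, 2], 0), ([1, 1, 1], 0)],
}


def concurrent_rows(data: Dict[str, List[Visit]]):
    """Flatten labeled visits into independent (features, label) rows.

    This is the layout a concurrent-feature classifier would see:
    which patient a row came from, and when, is discarded.
    """
    X, y = [], []
    for visits in data.values():
        for counts, label in visits:
            if label is not None:
                X.append(counts)
                y.append(label)
    return X, y


def sequences(data: Dict[str, List[Visit]]):
    """Keep each patient's visits as one ordered sequence.

    This is the layout a sequence model would see; unlabeled
    visits stay in place with a label of None.
    """
    feats = [[counts for counts, _ in visits] for visits in data.values()]
    labels = [[label for _, label in visits] for visits in data.values()]
    return feats, labels


X, y = concurrent_rows(patients)
seq_X, seq_y = sequences(patients)
print(len(X), y)                   # 4 [0, 1, 0, 0]
print([len(s) for s in seq_X])     # [3, 2]
```

In this layout, the flattened rows could be handed to any per-row classifier, while the per-patient sequences preserve the ordering that a recurrent or hidden-state model relies on.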