“…We implemented a hierarchical attention-based deep learning structure and 4 baseline machine learning algorithms, including logistic regression, random forest, support vector machine, and XGBoost. 18 The deep learning algorithm was developed in a prior study 19; it incorporates a convolutional neural network for the purpose of handling word variations, recurrent neural network for context, and attention layers for interpretation of the prediction. In the deep learning model, each note section was regarded as a sequence of tokens (including words and punctuation), with individual words represented by word embeddings, for which we used word2vec and trained 100-dimensional embeddings on a large corpus of 3 729 838 notes from 10 837 patients with an initial MCI diagnosis between January 1, 2017, and February 29, 2020.…”
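
To make the hierarchical attention idea in the passage above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple dot-product attention with fixed context vectors, uses hypothetical 4-dimensional toy embeddings in place of the 100-dimensional word2vec vectors, and omits the convolutional and recurrent layers entirely. All names (`EMBEDDINGS`, `attention_pool`, `encode_note`) and all numeric values are illustrative assumptions. The sketch shows only how word-level attention within each note section, followed by section-level attention, yields weights that can be inspected for interpretation.

```python
import math

# Hypothetical toy embeddings (4 dimensions). The study trained
# 100-dimensional word2vec embeddings; these values are made up.
EMBEDDINGS = {
    "patient": [0.1, 0.0, 0.2, 0.1],
    "reports": [0.0, 0.1, 0.0, 0.2],
    "memory":  [0.9, 0.1, 0.3, 0.0],
    "loss":    [0.8, 0.2, 0.1, 0.1],
    "gait":    [0.0, 0.3, 0.1, 0.0],
    "normal":  [0.1, 0.1, 0.0, 0.0],
    ".":       [0.0, 0.0, 0.0, 0.0],
}
UNKNOWN = [0.0, 0.0, 0.0, 0.0]  # fallback for out-of-vocabulary tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Score each vector by its dot product with `context`, normalize the
    scores with softmax, and return (weighted sum, attention weights)."""
    scores = [sum(a * b for a, b in zip(vec, context)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors))
              for d in range(dim)]
    return pooled, weights


def encode_note(sections, word_context, section_context):
    """Two-level attention: tokens -> section vectors -> note vector.
    The CNN and RNN layers described in the passage are omitted here."""
    section_vectors, word_weights = [], []
    for tokens in sections:
        vectors = [EMBEDDINGS.get(tok, UNKNOWN) for tok in tokens]
        pooled, weights = attention_pool(vectors, word_context)
        section_vectors.append(pooled)
        word_weights.append(weights)
    note_vector, section_weights = attention_pool(section_vectors,
                                                  section_context)
    return note_vector, section_weights, word_weights


# Each note section is a sequence of tokens, including punctuation.
sections = [
    ["patient", "reports", "memory", "loss", "."],
    ["gait", "normal", "."],
]
# Context vectors are learned parameters in a real model; fixed here.
word_context = [1.0, 0.0, 0.0, 0.0]
section_context = [1.0, 0.0, 0.0, 0.0]

note_vector, section_weights, word_weights = encode_note(
    sections, word_context, section_context)

for tokens, weights in zip(sections, word_weights):
    print([(tok, round(w, 3)) for tok, w in zip(tokens, weights)])
print("section weights:", [round(w, 3) for w in section_weights])
```

In a trained model the context vectors and embeddings would be learned jointly with the classifier; the attention weights printed at the end are the quantities one would inspect to see which tokens and sections contributed most to a prediction.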