“…Clinical notes [22] and non-structured interviews [23][24][25] are also used, often together with a more precise diagnosis based on the PCL-5 self-report questionnaire, which follows the DSM-5 criteria [26], or on the SCID semi-structured interview (American Psychiatric Association 2013). To build NLP models, many kinds of linguistic features are extracted: statistical features (number of words, number of words per sentence), morpho-syntactic features (proportion of first-person pronouns, verb tense), topic models (LDA, LSA), word vector representations (Word2Vec, Doc2Vec, GloVe, fastText), contextual embedding vectors (BERT, RoBERTa), graph-based features [28], coherence [29] and readability features [30], external resources such as LIWC [31], sentiment analysis scores such as LabMT [32], TextBlob (Loria, 2018) or FLAIR [34], and transfer learning methods such as DLATK [35], which uses models pre-trained on social media data. The models used for the classification task, which consists of separating people with PTSD from people without it, are mainly Random Forest (RF) [36], Logistic Regression (LR), CNN, LSTM, and transformers [16,17].…”
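As an illustration of the simplest feature families named in the passage (statistical and morpho-syntactic), the following is a minimal Python sketch, not the method of any cited study. The function name `surface_features`, the pronoun list `FIRST_PERSON`, the naive regex tokenization, and the example sentence are all assumptions introduced here for illustration.

```python
import re

# Assumption: a small, hand-picked set of English first-person pronouns.
# Contractions such as "I'm" are not handled by this sketch.
FIRST_PERSON = {"i", "me", "my", "mine", "myself",
                "we", "us", "our", "ours", "ourselves"}


def surface_features(text: str) -> dict:
    """Compute a few statistical and morpho-syntactic features for one text."""
    # Naive sentence split on ., ! and ? (a real tokenizer would also
    # handle abbreviations, ellipses, and similar cases).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenization: lowercase runs of letters, with an
    # optional apostrophe part (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

    n_words = len(words)
    n_sentences = len(sentences)
    n_first_person = sum(1 for w in words if w in FIRST_PERSON)

    return {
        "n_words": n_words,
        "words_per_sentence": n_words / n_sentences if n_sentences else 0.0,
        "first_person_ratio": n_first_person / n_words if n_words else 0.0,
    }


# Made-up example sentence, used only to show the output shape.
features = surface_features("I could not sleep. My heart was racing!")
print(features)
```

In a pipeline of the kind the passage describes, a dictionary like this would typically be one slice of a larger feature vector, concatenated with topic, embedding, or lexicon-based features before being passed to a classifier such as Logistic Regression or Random Forest.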