“…In fact, one-hot encoding is the simplest method of text encoding, but one of its disadvantages is the high-dimensionality problem discussed earlier. In addition, some researchers employ discrete representation approaches such as Bag of Words (BOW), Term Frequency-Inverse Document Frequency (TF-IDF) and N-grams to represent words [11,127,132,159,163,165,188,193], but these methods still suffer from data sparsity and high dimensionality [173]. Therefore, many primary studies further investigate the effectiveness of pre-trained word embedding models, such as Word2vec [19,25,46,76,171,202,212] and GloVe [76], to learn non-contextual representations from the semantic features extracted from Android APKs.…”
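The sparsity and dimensionality issues of these discrete encodings can be illustrated with a minimal sketch: every vector has one dimension per vocabulary entry, so a one-hot vector is almost entirely zeros, and BOW/TF-IDF vectors grow with the vocabulary. The toy corpus, token names, and helper functions below are illustrative assumptions, not drawn from the surveyed studies.

```python
from collections import Counter
import math

# Toy "documents": token sequences (e.g., API calls) extracted from APKs.
# These names are purely illustrative.
docs = [
    ["sendTextMessage", "getDeviceId", "sendTextMessage"],
    ["getDeviceId", "openConnection"],
    ["openConnection", "sendTextMessage"],
]

# Vector length equals vocabulary size -- the source of high dimensionality.
vocab = sorted({w for d in docs for w in d})

def one_hot(word):
    # One-hot: exactly one non-zero entry per word; all others are zero.
    return [1 if v == word else 0 for v in vocab]

def bow(doc):
    # Bag of Words: raw term counts over the whole vocabulary.
    counts = Counter(doc)
    return [counts[v] for v in vocab]

def tf_idf(doc):
    # TF-IDF: term frequency weighted by inverse document frequency,
    # so terms occurring in every document are down-weighted.
    n = len(docs)
    counts = Counter(doc)
    vec = []
    for v in vocab:
        tf = counts[v] / len(doc)
        df = sum(1 for d in docs if v in d)
        vec.append(tf * math.log(n / df))
    return vec

print(one_hot("getDeviceId"))  # [1, 0, 0]
print(bow(docs[0]))            # [1, 0, 2]
```

Even in this three-token vocabulary, most entries in each vector are zero; with the tens of thousands of distinct tokens found in real APK corpora, the same construction yields the sparse, high-dimensional vectors the passage criticises, which is what motivates dense embeddings such as Word2vec and GloVe.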