In the recent past, inspired by the success of deep long short-term memory (LSTM) models, approaches similar to word2vec (Mikolov et al., 2013) have been proposed to learn latent-space encodings directly from variable-length sequences (Ding et al., 2019). Such direct sequence-to-latent-space encoding methods generalize well (Zemouri, 2020); however, they usually rely on the availability of a large training dataset. Consequently, the direct extraction of latent-space features from a limited number of sequences, such as bioluminescent proteins (Zhang et al., 2021), antioxidant proteins (Olsen et al., 2020), extracellular matrix (ECM) proteins (Kabir et al., 2018), antifreeze proteins (AFPs) (Kandaswamy et al., 2011), or other protein classes, remains a challenging problem.
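To make the idea of sequence-to-latent-space encoding concrete, the sketch below maps variable-length amino-acid sequences to fixed-size latent vectors using an embedding layer followed by an LSTM. This is a minimal illustration only, assuming PyTorch, a toy 20-residue vocabulary, and arbitrary dimensions; it does not reproduce the architectures of the cited works.

```python
# Minimal sketch (assumptions: PyTorch, toy amino-acid vocabulary, illustrative
# dimensions) of encoding variable-length sequences into fixed-size latent vectors.
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"                          # 20 standard residues
AA_TO_IDX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}   # index 0 reserved for padding

class SequenceEncoder(nn.Module):
    """Encode a variable-length residue sequence into a fixed-size latent vector."""
    def __init__(self, vocab_size=21, embed_dim=32, latent_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, latent_dim, batch_first=True)

    def forward(self, padded_seqs, lengths):
        x = self.embed(padded_seqs)                           # (batch, max_len, embed_dim)
        packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)                       # final hidden state per sequence
        return h_n[-1]                                        # (batch, latent_dim)

# Usage: two sequences of different lengths, padded to a common width.
seqs = ["MKTAYIAK", "GAVLI"]
lengths = torch.tensor([len(s) for s in seqs])
padded = torch.zeros(len(seqs), int(lengths.max()), dtype=torch.long)
for row, s in enumerate(seqs):
    padded[row, :len(s)] = torch.tensor([AA_TO_IDX[aa] for aa in s])

latent = SequenceEncoder()(padded, lengths)                   # shape: (2, 64)
print(latent.shape)
```

Training such an encoder end-to-end is what typically demands a large labeled dataset, which is precisely the limitation highlighted above for small protein classes.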