“…Recent protein function prediction methods rely on different sources of information, such as sequence, interactions, protein tertiary structure, literature, coexpression, phylogenetic analysis, or the information provided in GO [4–20]. The methods may use sequence domain annotations [5,6,8,11,21], directly apply deep convolutional neural networks (CNNs) [13] or language models such as LSTMs [9] and transformers [14], or use pretrained protein language models [10,15] to represent amino acid sequences. Models may also incorporate protein-protein interactions through knowledge graph embeddings [12,16], k-nearest-neighbor approaches [21], and graph convolutional neural networks [6].…”