“…Examples of the types of data used as input include the wild-type amino acid sequence (Lin et al, 2022; Brandes et al, 2022), a multiple sequence alignment (MSA) (Ng and Henikoff, 2001; Balakrishnan et al, 2011; Lui and Tiana, 2013; Nielsen et al, 2017; Hopf et al, 2017; Riesselman et al, 2018; Laine et al, 2019) or the protein structure (Boomsma and Frellsen, 2017; Jing et al, 2021a; Hsu et al, 2022). Some methods have combined predictions from multiple protein data types at an aggregate level (Strokach et al, 2021; Høie et al, 2022; Cagiada et al, 2023; Nguyen and Hy, 2023), although some results suggest that a richer representation might be learned by combining multiple data types at the input level (Mansoor et al, 2021; Wu et al, 2023; Wang et al, 2022; Yang et al, 2022; Chen et al, 2023; Cheng et al, 2023; Zhang et al, 2023).…”