PickPocket: Pocket binding prediction for specific ligand families using neural networks.

Most biological functions of proteins occur through contacts with other proteins or ligands. The residues that form the contact surface of a ligand-binding pocket are usually located far apart in the protein sequence. Therefore, identifying such motifs is more challenging than identifying linear protein domains. To discover new binding sites, we developed a tool called PickPocket that focuses on a small set of user-defined ligands and uses neural networks to train a ligand-binding prediction model. We tested PickPocket on fatty acid-like ligands because of their structural similarities and their under-representation in the ligand-pocket binding literature. Our results show that, for fatty acid-like molecules, pocket descriptors and secondary structures are enough to obtain predictions with accuracy >90% using a dataset of 1740 manually curated ligand-binding pockets. The trained model also successfully predicted the ligand-binding pockets in unseen structural data from two recently reported fatty acid-binding proteins. We think that PickPocket can help to discover new protein functions by investigating the binding sites of specific ligand families. The source code and all datasets used in this work are freely available at https://github.com/benjaminviart/PickPocket .
Author Summary: Most biological functions of a protein are defined by its interactions with other proteins or ligands. The cavity of the protein structure that receives a ligand, also called a pocket, is made of residues that are usually located far apart in the sequence. Understanding the complementarity of pocket and ligand is therefore a real challenge. To discover new binding sites, we developed a tool called PickPocket that focuses on a small set of user-defined ligands to train a prediction model. Our results show that, for fatty acid-like molecules, pocket descriptors (such as volume, shape, and hydrophobicity) and secondary structures are enough to obtain predictions with accuracy >90% using a dataset of 1740 manually curated ligand-binding pockets. The trained model also successfully predicted the ligand-binding pockets in unseen structural data from two recently reported fatty acid-binding proteins. We think that PickPocket can help to discover new protein functions by investigating the binding sites of specific ligand families.