In living cells, proteins bind small molecules (or “ligands”) through a “conformational selection” mechanism, in which only a subset of protein conformations can bind a given small molecule well, while most other conformations cannot. The present work uses machine learning to identify, across a very large number of protein:ligand complexes, which protein properties are associated with the capacity to bind small molecules. To do so, we calculate 40 physicochemical properties for about 1.5 million protein:ligand conformation pairs. We describe a machine learning approach that identifies the physicochemical descriptors of a protein that maximize the prediction rate of potential binding conformations for the test-case proteins ADORA2A (Adenosine A2a Receptor), ADRB2 (Adrenoceptor Beta 2) and OPRK1 (Opioid Receptor Kappa 1). We find that suitable machine learning techniques can increase the identification of “binding protein conformations” by an order of magnitude, compared with random selection, in an otherwise very large ensemble of protein conformations. This opens the door to the systematic identification of such “binding conformations” for proteins and provides a big data approach to the conformational selection mechanism.
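The comparison with random selection can be made concrete with an enrichment factor: the fraction of binding conformations among the top-ranked conformations, divided by the fraction in the whole ensemble. The sketch below is only an illustration under stated assumptions; the function name `enrichment_factor`, the `top_fraction` parameter, and the toy scores and labels are hypothetical and are not taken from the study, which may quantify the improvement differently.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """Hit rate among the top-scored conformations divided by the overall hit rate.

    scores: model scores, one per conformation (higher = more likely to bind).
    labels: 1 for a binding conformation, 0 otherwise.
    top_fraction: share of the ensemble selected by the model (assumed parameter).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    total_hits = sum(labels)
    if total_hits == 0:
        raise ValueError("labels contain no binding conformations")

    # Select at least one conformation, even for very small ensembles.
    n_selected = max(1, int(len(scores) * top_fraction))

    # Rank conformations by descending model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    selected_hits = sum(label for _, label in ranked[:n_selected])

    selected_rate = selected_hits / n_selected   # hit rate in the model's selection
    random_rate = total_hits / len(labels)       # expected hit rate of random selection
    return selected_rate / random_rate


# Toy data (hypothetical): 10 conformations, 2 of which bind.
toy_scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.05, 0.15]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
ef_top20 = enrichment_factor(toy_scores, toy_labels, top_fraction=0.2)
```

A value of 1 means the model does no better than random selection; an "order of magnitude" improvement corresponds to a value of roughly 10 under this definition.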