Neural methods for molecule property prediction require an efficient encoding of the structure-property relationship to be accurate. Recent work using graph algorithms shows limited generalization in the latent molecule encoding space. We build a Transformer-based molecule encoder and property predictor network with a novel input featurization that performs significantly better than existing methods. We further adapt our model to semi-supervised learning so that it also performs well on the limited experimental data usually available in practice.
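As a rough sketch of the kind of architecture described above (not the exact featurization, pooling, or training setup used in this work), the following PyTorch snippet encodes tokenized SMILES with a Transformer encoder and predicts a single scalar property; the vocabulary size, model dimensions, and mean pooling are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SmilesPropertyPredictor(nn.Module):
    """Illustrative Transformer encoder over tokenized SMILES with a regression head."""
    def __init__(self, vocab_size=64, d_model=128, nhead=4, num_layers=4, max_len=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)  # single scalar property

    def forward(self, tokens, padding_mask):
        # tokens: (batch, seq_len) integer ids; padding_mask: True where padded
        positions = torch.arange(tokens.size(1), device=tokens.device).unsqueeze(0)
        x = self.embed(tokens) + self.pos(positions)
        h = self.encoder(x, src_key_padding_mask=padding_mask)
        # mean-pool over non-padded positions, then predict the property
        keep = (~padding_mask).unsqueeze(-1).float()
        pooled = (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        return self.head(pooled).squeeze(-1)

# Example: two random token sequences of length 50, no padding
model = SmilesPropertyPredictor()
tokens = torch.randint(1, 64, (2, 50))
padding_mask = torch.zeros(2, 50, dtype=torch.bool)
predictions = model(tokens, padding_mask)  # shape (2,)
```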
Related Research

Most work on property prediction in recent years has followed one of two paths: (1) use of better, well-formulated molecule descriptors/fingerprints such as ECFP [7]; (2) use of novel model architectures that operate on SMILES strings or structural graphs. We propose a new approach in the latter path that combines the state-of-the-art Transformer technique originally used for language modeling.
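For concreteness, a descriptor-based baseline along the lines of path (1) can be built with RDKit's Morgan fingerprint (radius 2 roughly corresponds to ECFP4); the bit width and the choice of downstream regressor are assumptions made for this sketch, not choices from the cited works.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def ecfp4_fingerprint(smiles, n_bits=2048):
    """Return an ECFP4-style bit vector (Morgan fingerprint, radius 2) for one SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=np.int8)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

# Example: featurize aspirin; the vector can feed any standard regressor/classifier
vec = ecfp4_fingerprint("CC(=O)Oc1ccccc1C(=O)O")
print(vec.shape, int(vec.sum()))
```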