Despite 25 years of progress in understanding the molecular mechanisms of olfaction, it is still not possible to predict whether a given molecule will have a perceived odor, or what olfactory percept it will produce. To address this stimulus-percept problem for olfaction, we organized the crowd-sourced DREAM Olfaction Prediction Challenge. Working from a large olfactory psychophysical dataset, teams developed machine learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models predicted odor intensity and pleasantness with high accuracy, and also successfully predicted eight semantic descriptors ("garlic", "fish", "sweet", "fruit", "burnt", "spices", "flower", "sour"). Regularized linear models performed nearly as well as random-forest-based approaches, with a predictive accuracy that closely approaches a key theoretical limit. The models presented here make it possible to predict the perceptual qualities of virtually any molecule with an impressive degree of accuracy, and to reverse-engineer the smell of a molecule.
One Sentence Summary: Results of a crowdsourcing competition show that it is possible to accurately predict and reverse-engineer the smell of a molecule.
bioRxiv preprint first posted online Oct. 21, 2016; doi: http://dx.doi.org/10.1101/082495. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Main Text: In vision and hearing, the wavelength of light and frequency of sound are highly predictive of color and tone. In contrast, it is not currently possible to predict the smell of a molecule from its chemical structure (1, 2). This stimulus-percept problem has been difficult to solve in olfaction because odor stimuli do not vary continuously in stimulus space, and the size and dimensionality of olfactory perceptual space are unknown (1, 3, 4). Some molecules with very similar chemical structures can be discriminated by humans (5, 6), and molecules with very different structures sometimes produce nearly identical percepts (2). Recent computational efforts developed models to relate chemical structure to odor percept (2, 7-11), but many relied on psychophysical data from a single 30-year-old study that used odorants with limited structural and perceptual diversity (12, 13).
Twenty-two teams competing in the DREAM Olfaction Prediction Challenge (14) were given a large, unpublished psychophysical dataset collected by Keller and Vosshall from 49 individuals who profiled 476 structurally and perceptually diverse molecules, including those that are unfamiliar, unpleasant, or nearly odorless (15) (Fig. 1a). We supplied 4884 physicochemical features of each of the molecules smelled by the subjects, including atom types, functional groups, and topological and geometrical properties that were computed using Dragon chemoinformatic software (version 6) (Fig. 1b).
Using a baseline linear model developed for the challenge and inspi...