“…The literature concerning generative models of molecules has exploded since the first work on the topic [Gómez-Bombarelli et al, 2018]. Current methods feature molecular representations such as SMILES [Janz et al, 2018, Segler et al, 2017, Skalic et al, 2019, Ertl et al, 2017, Lim et al, 2018, Kang and Cho, 2018, Sattarov et al, 2019, Gupta et al, 2018, Harel and Radinsky, 2018, Yoshikawa et al, 2018, Bjerrum and Sattarov, 2018, Mohammadi et al, 2019] and graphs [Simonovsky and Komodakis, 2018, Li et al, 2018a, De Cao and Kipf, 2018, Kusner et al, 2017, Dai et al, 2018, Samanta et al, 2019, Li et al, 2018b, Kajino, 2019, Jin et al, 2019, Bresson and Laurent, 2019, Lim et al, 2019, Pölsterl and Wachinger, 2019, Krenn et al, 2019, Maziarka et al, 2019, Madhawa et al, 2019, Shen, 2018, Korovina et al, 2019]. In this section, we conduct an empirical test of the hypothesis from [Gómez-Bombarelli et al, 2018] that the decoder's poor efficiency arises because data points are collected in "dead regions" of the latent space, far from the data on which the VAE was trained. We then use this information to construct a binary classification Bayesian Neural Network (BNN) that serves as a constraint function, outputting the probability that a latent point is valid; the details are discussed in the section on labelling criteria.…”
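As a rough illustration of how such a constraint function might be evaluated, the following is a minimal sketch, not the implementation used in the work quoted above. It assumes a one-hidden-layer network and approximates the predictive probability of validity by averaging over samples of the network weights. The function name `predict_valid_probability`, the layer sizes, the 0.5 threshold, and the randomly drawn "posterior samples" are all hypothetical placeholders; in practice the samples would come from a BNN trained on labelled latent points.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-x))


def predict_valid_probability(z, posterior_samples):
    """Monte Carlo estimate of p(valid | z).

    z: array of shape (n_points, latent_dim) or (latent_dim,).
    posterior_samples: list of (W1, b1, w2, b2) tuples, each one draw of
        the weights of a one-hidden-layer binary classifier (hypothetical
        architecture chosen for illustration only).
    Returns an array of shape (n_points,) with values in [0, 1].
    """
    z = np.atleast_2d(z)
    probs = []
    for W1, b1, w2, b2 in posterior_samples:
        hidden = np.tanh(z @ W1 + b1)            # (n_points, hidden_dim)
        probs.append(sigmoid(hidden @ w2 + b2))  # (n_points,)
    # Averaging over weight samples approximates the posterior predictive.
    return np.mean(probs, axis=0)


# Placeholder setup: random draws stand in for samples from a trained
# posterior, so the numbers below carry no chemical meaning.
rng = np.random.default_rng(0)
latent_dim, hidden_dim, n_samples = 8, 16, 50
posterior_samples = [
    (
        rng.normal(size=(latent_dim, hidden_dim)),
        rng.normal(size=hidden_dim),
        rng.normal(size=hidden_dim),
        rng.normal(),
    )
    for _ in range(n_samples)
]

latent_points = rng.normal(size=(3, latent_dim))
p_valid = predict_valid_probability(latent_points, posterior_samples)

# Assumed usage: treat a point as feasible when p(valid) reaches a threshold.
feasible = p_valid >= 0.5
```

An optimiser over the latent space could then restrict its candidates to points where `feasible` is true, or weight its acquisition by `p_valid`; which of these is appropriate depends on the labelling criteria, which this sketch does not attempt to reproduce.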