“…On the inference side, there are attentive NPs (Kim et al., 2019), which endow the encoder with self-attention (and thus make it Turing-complete; Pérez et al., 2021), and convolutional (conditional) NPs (Gordon et al., 2019; Foong et al., 2020), which add translation equivariance to the model. On the generative side, there are functional NPs (Louizos et al., 2019), which introduce dependence between the predictions by learning a relational graph structure over the latents z, and Gaussian NPs (Bruinsma et al., 2021), which achieve a similar property by replacing the generative likelihood with a GP, the mean and kernel of which are inferred from the latents.…”
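To make the Gaussian-NP idea concrete, the following is a minimal PyTorch sketch, not the implementation of Bruinsma et al. (2021): all names and the specific architecture (a low-rank feature-map kernel) are illustrative assumptions. The point it shows is the structural one from the paragraph above: the latent z parameterises the mean and kernel of a GP, so the predictive over targets is a correlated multivariate Gaussian rather than a factorised likelihood.

```python
import torch
import torch.nn as nn


class GaussianNPDecoder(nn.Module):
    """Sketch of a GNP-style decoder: the latent z parameterises a GP's
    mean function and kernel, yielding a correlated joint predictive."""

    def __init__(self, z_dim: int, x_dim: int, h_dim: int = 64):
        super().__init__()
        # Mean function m(x; z).
        self.mean_net = nn.Sequential(
            nn.Linear(z_dim + x_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, 1))
        # Feature map phi(x; z); the kernel is k(x, x') = phi(x)^T phi(x'),
        # a low-rank construction assumed here for simplicity.
        self.feat_net = nn.Sequential(
            nn.Linear(z_dim + x_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, h_dim))
        # Observation-noise variance (log-parameterised for positivity).
        self.log_noise = nn.Parameter(torch.tensor(-2.0))

    def forward(self, z: torch.Tensor, x_target: torch.Tensor):
        # z: (z_dim,) latent summary of the context; x_target: (n, x_dim).
        n = x_target.shape[0]
        zx = torch.cat([z.expand(n, -1), x_target], dim=-1)
        mean = self.mean_net(zx).squeeze(-1)            # (n,)
        feats = self.feat_net(zx)                       # (n, h_dim)
        cov = feats @ feats.T                           # PSD low-rank kernel
        cov = cov + torch.exp(self.log_noise) * torch.eye(n)
        # Joint Gaussian over all targets: predictions are dependent,
        # unlike the factorised likelihood of a vanilla (C)NP.
        return torch.distributions.MultivariateNormal(mean, cov)


decoder = GaussianNPDecoder(z_dim=32, x_dim=1)
pred = decoder(torch.randn(32), torch.linspace(-2, 2, 50).unsqueeze(-1))
loglik = pred.log_prob(torch.zeros(50))  # one correlated joint log-likelihood
```

Because the predictive is a full multivariate Gaussian, `log_prob` scores all targets jointly; in a factorised NP the same quantity would be a sum of independent per-point terms.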