“…posteriors (Ding and Gimpel, 2021), which generally follow Gaussian distributions and spherical Gaussian distributions with diagonal covariance matrices, respectively (Higgins et al., 2017; He et al., 2019; Li et al., 2019a). Such predefined forms confine VAEs to a smaller optimization space (Fang et al., 2019), thus restricting the expressivity of the model (Ding and Gimpel, 2021) and further leading to posterior collapse (Fang et al., 2019). A potential solution is therefore to adopt more expressive distribution forms for the priors and variational posteriors to improve representation capacity (Fang et al., 2019; Tomczak and Welling, 2018; Ding and Gimpel, 2021).…”
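
To make the role of these predefined forms concrete, the following is a minimal sketch rather than a method taken from the cited works. It assumes a standard normal prior N(0, I) and a diagonal-covariance Gaussian posterior, for which the KL term of the VAE objective has a well-known closed form. The function name `kl_diag_gaussian_to_standard_normal` and the example values are hypothetical, chosen only for illustration.

```python
import math


def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    Assumption: the prior is a standard normal and the posterior has a
    diagonal covariance, so the KL is a sum over latent dimensions.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


# Hypothetical informative posterior: its mean and variance differ from the prior.
kl_informative = kl_diag_gaussian_to_standard_normal([1.0, -0.5], [-1.0, 0.2])

# Hypothetical collapsed posterior: identical to the prior in every dimension,
# so the KL term vanishes and the latent code carries no information about x.
kl_collapsed = kl_diag_gaussian_to_standard_normal([0.0, 0.0], [0.0, 0.0])

print(f"informative posterior KL: {kl_informative:.4f}")
print(f"collapsed posterior KL:   {kl_collapsed:.4f}")
```

In this sketch, posterior collapse corresponds to the case where the KL term is driven to zero. A more expressive prior or posterior family would generally lack such a closed form, which is one reason the simple Gaussian choice is so common.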