The latent variable prior of the variational autoencoder (VAE) is usually a standard Gaussian distribution because it is convenient to compute, but this choice can lead to underfitting. This paper proposes a variational autoencoder with optimizing Gaussian mixture model priors. The method constructs the prior distribution from a Gaussian mixture model and uses the Kullback-Leibler (KL) divergence between the posterior and the prior to iteratively optimize the prior based on the data. A greedy algorithm is used to evaluate the KL divergence, which defines an approximate variational lower bound for the loss function and thereby realizes the VAE with optimizing Gaussian mixture model priors. Compared with the standard VAE, the proposed method obtains state-of-the-art results on the MNIST, Omniglot, and Frey Face datasets, which indicates that a VAE with optimizing Gaussian mixture model priors can learn a better model.
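To make the KL term between a Gaussian posterior and a Gaussian mixture prior concrete, here is a minimal sketch in plain Python. It is not the paper's greedy algorithm: it swaps in a simple Monte Carlo estimate, because the KL divergence to a mixture has no closed form. The function names, the diagonal-covariance assumption, and the sample count are all illustrative assumptions rather than details from the abstract.

```python
import math
import random


def log_normal_diag(z, mean, std):
    """Log density of a diagonal Gaussian evaluated at the point z."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
        for x, m, s in zip(z, mean, std)
    )


def log_mixture(z, weights, means, stds):
    """Log density of a Gaussian mixture, using log-sum-exp for stability."""
    terms = [
        math.log(w) + log_normal_diag(z, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def kl_posterior_to_mixture(q_mean, q_std, weights, means, stds,
                            n_samples=2000, seed=0):
    """Monte Carlo estimate of KL(q || p).

    q is a diagonal Gaussian posterior; p is a Gaussian mixture prior.
    Assumption: this sampling estimate stands in for the greedy
    approximation mentioned in the abstract, which is not specified there.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw z from q with the reparameterization z = mean + std * eps.
        z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(q_mean, q_std)]
        total += log_normal_diag(z, q_mean, q_std) - log_mixture(
            z, weights, means, stds
        )
    return total / n_samples


if __name__ == "__main__":
    # Posterior far from the origin: a standard Gaussian prior fits it poorly,
    # while a mixture with one component near the posterior fits it better.
    kl_standard = kl_posterior_to_mixture([3.0], [0.5], [1.0], [[0.0]], [[1.0]])
    kl_mixture = kl_posterior_to_mixture(
        [3.0], [0.5], [0.5, 0.5], [[0.0], [3.0]], [[1.0], [0.5]]
    )
    print("KL to standard Gaussian prior:", kl_standard)
    print("KL to two-component mixture prior:", kl_mixture)
```

In this toy setting the mixture prior yields the smaller KL term, which is the intuition behind replacing the standard Gaussian prior with a more flexible one.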
Human pose estimation is a fundamental but challenging task in computer vision. Estimating a human pose depends mainly on global information about the keypoint type and local information about the keypoint location. However, because every stage of a cascaded network processes its input in the same way, it is difficult for the individual stacked networks to differentiate their roles and collaborate. To address this problem, this paper introduces a new human pose estimation framework called the Multi-Scale Collaborative (MSC) network. A pre-processing network produces feature maps of different sizes and dispatches them to different locations in the stacked network, with small-scale features going to the front-end stacks and large-scale features going to the back-end stacks. A new loss function is also proposed for the MSC network: different keypoints have different loss weight coefficients at different scales, and these keypoint weight coefficients are adjusted dynamically from the top hourglass network to the bottom hourglass network. Experimental results show that the proposed method is competitive with state-of-the-art methods on the MPII and LSP challenge leaderboards.
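The per-keypoint, per-stack weighting described above can be sketched as a weighted sum of heatmap errors. The abstract does not give the actual weighting rule, so the linear interpolation between a first-stack and a last-stack weight vector below is purely a hypothetical schedule, as are the function names and the assumption that every stack is compared against heatmaps of the same size.

```python
def interpolate_weights(first_stack, last_stack, n_stacks):
    """Hypothetical schedule: per-keypoint weights move linearly from the
    first stack's values to the last stack's values."""
    schedule = []
    for s in range(n_stacks):
        t = s / (n_stacks - 1) if n_stacks > 1 else 0.0
        schedule.append(
            [a + t * (b - a) for a, b in zip(first_stack, last_stack)]
        )
    return schedule


def weighted_keypoint_loss(predictions, targets, weights):
    """Weighted mean squared error summed over stacks and keypoints.

    predictions[s][k] is the flattened heatmap for keypoint k at stack s,
    targets[k] is the flattened ground-truth heatmap for keypoint k,
    weights[s][k] is the loss weight for keypoint k at stack s.
    """
    total = 0.0
    for stack_pred, stack_weights in zip(predictions, weights):
        for pred, truth, w in zip(stack_pred, targets, stack_weights):
            mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)
            total += w * mse
    return total


if __name__ == "__main__":
    # Two keypoints, three stacks, tiny 2-pixel "heatmaps" for illustration.
    targets = [[1.0, 0.0], [0.0, 1.0]]
    predictions = [
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.8, 0.2], [0.3, 0.7]],
        [[0.9, 0.1], [0.1, 0.9]],
    ]
    weights = interpolate_weights([1.0, 2.0], [2.0, 1.0], n_stacks=3)
    print(weighted_keypoint_loss(predictions, targets, weights))
```

Letting the weights differ by stack is one way to push early and late stacks toward different keypoints, which is the kind of differentiation the abstract argues uniform cascades lack.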