Abstract-Vocal Tract Length Normalization (VTLN) is a very important speaker normalization technique for speech recognition tasks. In this paper, we propose the use of Gaussian posteriorgram of VTLN-warped spectral features for a Queryby-Example Spoken Term Detection (QbE-STD). This paper presents the use of a Gaussian Mixture Model (GMM) framework for estimation of VTLN warping factor. This GMM framework does not require phoneme-level transcription and hence, it can be useful for unsupervised tasks. We propose the iterative approach for VTLN warping factor estimation with two GMM training approaches, namely, Expectation-Maximization (EM) and Deterministic Annealing-Expectation Maximization (DAEM). The VTLN-warped Gaussian posteriorgram gave the better QbE-STD performance. The performance of TIMIT QbE-STD was investigated with different evaluation factors, such as a number of Gaussian components in GMM, various local constraints, and a number of iterations in VTLN warping factor estimation. VTLNwarped Gaussian posteriorgram reduces the speaker-specific variation in Gaussian posteriorgram and hence, it is expected to give better performance than Gaussian posteriorgram.