This paper introduces an adaptive, integrated utterance verification (UV) framework based on minimum verification error (MVE) training and intended for real applications. UV is traditionally treated as an add-on to automatic speech recognition (ASR) and is therefore designed separately from the ASR system models. This two-stage approach often fails to cope with mismatched conditions, such as a new speaker or a new acoustic environment that differs from the speaker population or environment on which the ASR system was trained. We propose an integrated solution that improves overall UV system performance under such conditions. The integration is accomplished by adapting and merging the target model for UV with the acoustic model for ASR, following the common MVE principle, at each iteration of the recognition stage. The proposed iterative procedure for UV model adaptation also revises the data segmentation and the decoded hypotheses. Under this framework, substantial improvements are obtained in both recognition and verification performance.