Abstract: This paper focuses on the generation of accurate phonetic segmentations. Statistical methods based on absolute and relative correction are discussed and evaluated on both monophone and biphone models to improve the segmentation results. The influence of search range on the statistical correction process is studied, and a state selection technique is used to enhance the correction results. This paper also explores the influence of the resolution (stepsize) of HMMs and proposes a multi-resolution fusion process to…
“…All the proposed refinement steps contribute to the improvements of segmentation results in terms of accuracy and MAE/RMSE. Compared with the previously reported work in Zhao et al (2013), the results presented in this paper are improved due to the use of isolated-unit training to obtain improved baseline models as suggested in Donovan (1996), and Yuan et al (2013), the inclusion of both CI and CD models in the fusion process, and the application of predictive models for refinements. As presented in Table 4, the proposed scheme exhibits higher accuracies on TIMIT as compared with recent studies.…”
Section: Improving Segmentation Results With the Hybrid Refinement Sc…
Type: mentioning
confidence: 61%
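The accuracy and MAE/RMSE figures named in the statement above can be illustrated with a minimal sketch. It assumes that accuracy means the fraction of automatic boundaries falling within a fixed tolerance of the manual reference; the 20 ms default, the function name `boundary_metrics`, and the sample values are assumptions for illustration and are not taken from the cited paper.

```python
import math


def boundary_metrics(auto_ms, ref_ms, tolerance_ms=20.0):
    """Return (accuracy, MAE, RMSE) for paired boundary times in milliseconds.

    tolerance_ms is an assumed agreement threshold, not a value from the paper.
    """
    if len(auto_ms) != len(ref_ms) or not auto_ms:
        raise ValueError("need two non-empty, equal-length sequences")
    # Signed error of each automatic boundary against its manual reference.
    errors = [a - r for a, r in zip(auto_ms, ref_ms)]
    n = len(errors)
    accuracy = sum(abs(e) <= tolerance_ms for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return accuracy, mae, rmse


# Hypothetical boundary times, for illustration only.
acc, mae, rmse = boundary_metrics([100, 210, 330, 480], [105, 200, 300, 470])
print(acc, mae, round(rmse, 2))
```

RMSE penalizes the single 30 ms outlier more heavily than MAE does, which is one reason the two are usually reported together.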
“…As the automatically detected boundary is defined by the onset of the first state of a phone, it is possible to calculate the correction term as a ratio, i.e., a relative term, of the state-level segmentations around the automatically detected boundaries (Zhao et al, 2013):…”
Section: Statistical Correction On Segmentation Results and Multi-resol…
Type: mentioning
confidence: 99%
“…The scheme presented in this paper is an extension of the previous work reported in Zhao et al (2013). It is a hybrid refinement scheme for phonetic segmentation consisting of three components: (1) statistical correction which addresses systematic biases of acoustic models using local information from the most relevant range (i.e., "state-selection" in Section 3); (2) a fusion method which incorporates complementary effects from acoustic models with various resolutions (i.e., stepsizes -the interval to extract each frame of feature vector) which are demonstrated to affect the results significantly; and (3) predictive models which correct non-systematic segmentation biases using predictive models.…”
Section: A Hybrid Scheme For Post-processing Of Phonetic Boundaries
Type: mentioning
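The first component described above, statistical correction of systematic bias, might be sketched as follows. This is a simplified stand-in and not the authors' method: it assumes an absolute correction in which the mean signed error of each boundary class is learned from manually segmented training data and then subtracted from new automatic boundaries. The class keys (phone pairs), the function names, and the sample values are hypothetical, and the state-selection and relative-correction variants are not modelled.

```python
from collections import defaultdict


def learn_bias(train):
    """Mean signed error (auto - reference, in ms) per boundary class.

    train: iterable of (boundary_class, auto_ms, ref_ms) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cls, auto_ms, ref_ms in train:
        sums[cls] += auto_ms - ref_ms
        counts[cls] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}


def apply_correction(boundaries, bias):
    """Subtract the learned bias; classes unseen in training stay unchanged.

    boundaries: iterable of (boundary_class, auto_ms) tuples.
    """
    return [(cls, auto_ms - bias.get(cls, 0.0)) for cls, auto_ms in boundaries]


# Hypothetical training and test boundaries, keyed by (left phone, right phone).
train = [(("s", "iy"), 110.0, 100.0),
         (("s", "iy"), 126.0, 120.0),
         (("iy", "t"), 295.0, 300.0)]
bias = learn_bias(train)
corrected = apply_correction(
    [(("s", "iy"), 208.0), (("iy", "t"), 400.0), (("t", "ax"), 50.0)], bias)
print(corrected)
```

A per-class mean can only remove systematic offsets; the residual, non-systematic errors are what the quoted statement assigns to the predictive models.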