The Basic Travel Expression Corpus (BTEC), developed by NICT, Japan, offers wide coverage of basic Japanese travel expressions with English counterparts and is intended as base data for developing high-quality speech translation systems. The English side of this corpus has been manually translated into Hindi and is used for the development of an English-Hindi speech translation system. In this paper, we present a statistical analysis of the translated Hindi BTEC corpus and describe the translation methodology adopted in its development. The statistical evaluations performed in the experiments provide information on the distribution of sentences, words, and phonemes and on their growth behavior, which gives direction for future enhancement of the corpus.