An approach to automatic translation is outlined that utilizes techniques of statistical information extraction from large data bases. The method is based on the availability of pairs of large corresponding texts that are translations of each other. In our case, the texts are in English and French. Fundamental to the technique is a complex glossary of correspondence of fixed locutions. The steps of the proposed translation process are: (1) Partition the source text into a set of fixed locutions. (2) Use the glossary plus contextual information to select the corresponding set of fixed locutions in the target language. (3) Arrange the words of the target fixed locutions into a sequence forming the target sentence. We have developed statistical techniques facilitating both the automatic creation of the glossary, and the performance of the three translation steps, all on the basis of an alignment of corresponding sentences in the two texts. While we are not yet able to provide examples of French / English translation, we present some encouraging intermediate results concerning glossary creation and the arrangement of target word sequences.
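The three steps listed in the abstract can be illustrated with a toy sketch. This is not the authors' system: the glossary entries, the scores, the bigram table, and all function names below are hypothetical. Context is ignored in step 2, and step 3 reorders whole locutions rather than individual words to keep the example small.

```python
from itertools import permutations

# Hypothetical toy glossary: source locution -> candidate target locutions
# with made-up scores (a real glossary would be estimated from aligned text).
GLOSSARY = {
    ("thank", "you"): {("merci",): 0.9, ("je", "vous", "remercie"): 0.1},
    ("very", "much"): {("beaucoup",): 0.8, ("très",): 0.2},
}
MAX_LOCUTION_LEN = 2  # longest source locution in the toy glossary

# Hypothetical bigram scores used to rank target word orders.
BIGRAM = {("merci", "beaucoup"): 0.9, ("beaucoup", "merci"): 0.1}


def partition(words):
    """Step 1: greedy longest-match split of the source into locutions."""
    i, out = 0, []
    while i < len(words):
        for n in range(min(MAX_LOCUTION_LEN, len(words) - i), 0, -1):
            cand = tuple(words[i:i + n])
            # Unknown single words fall through as one-word locutions.
            if cand in GLOSSARY or n == 1:
                out.append(cand)
                i += n
                break
    return out


def select(locutions):
    """Step 2: pick the best-scoring target locution (context ignored here)."""
    chosen = []
    for loc in locutions:
        options = GLOSSARY.get(loc)
        chosen.append(max(options, key=options.get) if options else loc)
    return chosen


def arrange(target_locutions):
    """Step 3: pick the locution order with the highest bigram product."""
    def score(order):
        words = [w for loc in order for w in loc]
        s = 1.0
        for a, b in zip(words, words[1:]):
            s *= BIGRAM.get((a, b), 0.01)  # small floor for unseen pairs
        return s

    best = max(permutations(target_locutions), key=score)
    return [w for loc in best for w in loc]


def translate(sentence):
    words = sentence.lower().split()
    return " ".join(arrange(select(partition(words))))


print(translate("Thank you very much"))
```

Trying every permutation is only workable for very short sentences; it is used here purely to make the role of step 3 visible.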
A method for estimating the parameters of hidden Markov models of speech is described. Parameter values are chosen to maximize the mutual information between an acoustic observation sequence and the corresponding word sequence. Recognition results are presented comparing this method with maximum likelihood estimation.
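For orientation, the two estimation criteria contrasted in this abstract are commonly written as follows. The notation is assumed here and not taken from the abstract: $Y_r$ is the acoustic observation sequence of training utterance $r$, $W_r$ its word sequence, $\theta$ the model parameters, and $P(w)$ a fixed language-model prior.

```latex
% Maximum likelihood: fit the acoustics of the correct word sequence only.
\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta} \sum_{r} \log p_{\theta}(Y_r \mid W_r)

% Maximum mutual information: also penalize probability given to competing word sequences.
\hat{\theta}_{\mathrm{MMI}} = \arg\max_{\theta} \sum_{r} \log
  \frac{p_{\theta}(Y_r \mid W_r)}{\sum_{w} p_{\theta}(Y_r \mid w)\, P(w)}
```

The denominator is the marginal likelihood of the acoustics over all candidate word sequences, so the second criterion rewards parameters that separate the correct transcription from its competitors rather than only fitting it.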