We previously proposed a machine translation method that acquires translation rules from translation examples using inductive learning, and evaluated it. The evaluation confirmed that the method requires many translation examples. To resolve this problem, we applied genetic algorithms to the method. In this paper, we describe the method with genetic algorithms and evaluate it experimentally. We confirmed that applying genetic algorithms increased the translation accuracy rate from 52.8% to 61.9%.
SUMMARY Rule-based machine translation analyzes source-language sentences using large-scale linguistic knowledge that the developer supplies beforehand. However, it is difficult to give the system complete linguistic knowledge in advance because natural language exhibits a wide variety of linguistic phenomena. Therefore, we worked to develop learning-based machine translation, in which a system automatically acquires translation rules from translation examples, that is, pairs of source- and target-language sentences. Existing learning-based machine translation, however, requires a large number of similar translation examples, so it cannot acquire enough useful translation rules from sparse translation examples. This paper proposes a method of machine translation using Recursive Chain-Link-type Learning, which can acquire many useful translation rules from sparse translation examples. Our system, based on this method, efficiently acquires translation rules from each translation example without requiring two similar translation examples. Translation rules are acquired by extracting corresponding parts between the source- and target-language sentences in translation examples. Our system determines those corresponding parts using previously acquired translation rules, so each acquired rule sets off a chain reaction in the acquisition of new translation rules. Evaluation experiments using our system demonstrated an effective translation rate of 61.1%. Moreover, the effective translation rate was 85.0% when sufficient learning data were given to our system.
A number of machine translation systems based on learning algorithms have been presented. These methods acquire translation rules from pairs of similar sentences in bilingual text corpora, which makes it difficult for the systems to acquire translation rules from sparse data. As a result, these methods require large amounts of training data in order to acquire high-quality translation rules. To overcome this problem, we propose a method of machine translation using Recursive Chain-link-type Learning. In our new method, the system can acquire many new high-quality translation rules from sparse translation examples based on already acquired translation rules. The acquisition of new translation rules therefore leads to the generation of still more translation rules, so the acquisition process resembles a linked chain. The results of evaluation experiments confirmed the effectiveness of Recursive Chain-link-type Learning.
We propose a new automatic evaluation metric for machine translation. Our proposed metric is obtained by adapting the Earth Mover's Distance (EMD) to the evaluation task. The EMD measures the distance between two probability distributions, each consisting of signatures that have a feature and a weight. We use word embeddings as the features, sentence-level tf-idf as the weights, and cosine similarity between two word embeddings as the distance between two features. Results show that our proposed metric can evaluate machine translation based on word meaning. Moreover, for the distance, cosine similarity and word position information are used together to address word-order differences. We designate this metric as Word Embedding-based automatic MT evaluation using Word Position Information (WE WPI). A meta-evaluation using the WMT16 metrics shared task set indicates that our WE WPI achieves the highest correlation with human judgment among several representative metrics.
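For orientation, the general EMD that this abstract refers to can be written as below. This is the standard textbook formulation, not necessarily the exact variant used in the paper: the symbols $p_i$, $q_j$, $w_{p_i}$, $w_{q_j}$, $f_{ij}$, and $d_{ij}$ are introduced here for illustration, and turning cosine similarity into a distance as $1 - \cos$ is an assumption, since the abstract does not state the exact form or how word position enters it.

```latex
% Signatures: hypothesis P = {(p_i, w_{p_i})}_{i=1}^{m}, reference Q = {(q_j, w_{q_j})}_{j=1}^{n}
% p_i, q_j : word embeddings (features);  w : sentence-level tf-idf weights
% d_{ij}   : ground distance between p_i and q_j (assumed form, see lead-in)
\begin{align}
  d_{ij} &= 1 - \frac{p_i \cdot q_j}{\lVert p_i \rVert \, \lVert q_j \rVert} \\
  \mathrm{EMD}(P, Q) &= \min_{f_{ij} \ge 0}
      \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} \, d_{ij}}
           {\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}} \\
  \text{subject to}\quad
      & \sum_{j=1}^{n} f_{ij} \le w_{p_i}, \qquad
        \sum_{i=1}^{m} f_{ij} \le w_{q_j}, \\
      & \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}
        = \min\!\Bigl( \sum_{i=1}^{m} w_{p_i}, \; \sum_{j=1}^{n} w_{q_j} \Bigr)
\end{align}
```

Under this reading, a lower EMD means that the hypothesis words can be "moved" onto the reference words at low cost, that is, the two sentences are closer in meaning.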