Machine translation makes it easy for people to communicate across languages. Multimodal machine translation, an important research direction within the field, uses feature information such as images and audio to help translation models produce higher-quality target-language output. However, the vast majority of existing work has been conducted on widely used corpora such as English, French, and German; far less research has addressed low-resource languages, leaving their translation quality relatively behind. This paper selects English-Hindi and English-Hausa corpora to study low-resource language translation. We use different models to extract image feature information, fuse the image features with text information during the text encoding stage of translation, and use the image features as additional information to assist the translation model. Compared with text-only machine translation, the experimental results show that our method improves BLEU by 3 points on the English-Hindi dataset and by 0.47 points on the English-Hausa dataset. In addition, we analyze how image feature information extracted by different feature-extraction models affects the translation results. Different models attend to different regions of an image; the ResNet model extracts more feature information than the VGG model and is therefore more effective for translation.
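The fusion this abstract describes can be sketched minimally as projecting a global image feature into the text embedding space and adding it to the source-token embeddings during encoding. The abstract does not specify the exact fusion mechanism, so the additive scheme, dimensions, and projection matrix `W_img` below are illustrative assumptions.

```python
import numpy as np

def fuse_image_text(token_emb, img_feat, W_img):
    """Additive image-text fusion: project a global image feature
    (e.g. a 2048-d ResNet pooling vector) into the text embedding
    space and add it to every source-token embedding."""
    img_proj = img_feat @ W_img        # project into text space: (d_model,)
    return token_emb + img_proj        # broadcast-add across all tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 512))     # 5 source-token embeddings, d_model=512
img = rng.normal(size=2048)            # ResNet-style global image feature
W = rng.normal(size=(2048, 512)) * 0.01
fused = fuse_image_text(tokens, img, W)
print(fused.shape)  # (5, 512)
```

The fused representations then pass through the encoder as usual, so the image acts as extra conditioning information rather than a separate input stream.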
At present, machine translation on the market depends on parallel sentence corpora, and the number of parallel sentences affects translation performance, especially for low-resource corpora. In recent years, using non-parallel corpora to learn cross-lingual word representations, and thereby obtaining bilingual sentence pairs with low resources and little supervision, has offered a new approach. In this paper, we propose a new method. First, we create cross-domain mappings from a small amount of monolingual data. Then we construct a classifier to extract bilingual parallel sentence pairs. Finally, we demonstrate the effectiveness of our method on the low-resource Uyghur-Chinese language pair through machine translation experiments and achieve good results.
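The extraction step above can be approximated with a simple scoring rule: once sentences from both languages are embedded in a shared cross-lingual space, candidate pairs with high cosine similarity are kept as parallel. The threshold classifier below is a stand-in for the paper's learned classifier; the vectors and threshold are illustrative assumptions.

```python
import numpy as np

def mine_parallel_pairs(src_vecs, tgt_vecs, threshold=0.8):
    """Score every source/target sentence pair by cosine similarity in a
    shared cross-lingual embedding space; keep pairs above the threshold."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T                      # (n_src, n_tgt) similarity matrix
    return [(i, j) for i in range(sims.shape[0])
                   for j in range(sims.shape[1]) if sims[i, j] >= threshold]

# Toy shared-space vectors: source sentence 0 aligns with target sentence 0.
src = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt = np.array([[0.9, 0.1], [-1.0, 0.2]])
pairs = mine_parallel_pairs(src, tgt)
print(pairs)  # [(0, 0)]
```

The mined pairs can then be fed to a standard translation model as training data, which is why the quality of the cross-lingual mapping directly bounds downstream translation quality.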
Traditional multimodal machine translation mainly introduces static images from other modalities to improve translation quality. In processing, a variety of methods are combined to improve the data and features so that the translation result approaches its upper limit, and some even rely on the sensitivity of sample-distance algorithms to the data. At the same time, multimodal MT suffers from problems such as a lack of semantic interaction in the attention mechanism within the same corpus, or over-encoding of identical text-image information and corpus-irrelevant information, which introduces excessive noise. To solve these problems, this article proposes a new input port that adds visual image processing to the decoder. The core idea is to combine visual image information with the traditional attention mechanism at each decoding time step. A dynamic router extracts the relevant visual features, integrates the multimodal visual features into the decoder, and predicts the target word with the aid of the visual information. Experiments were carried out on English, French, and Czech translation datasets of more than 30K sentences, demonstrating the advantage of extracting visual image features in the decoder.
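The per-time-step visual attention described above can be sketched as a scaled dot-product attention over image region features, conditioned on the current decoder state. The abstract does not detail the dynamic router, so the region grid size, projection matrices `W_q`/`W_k`, and dimensions below are illustrative assumptions.

```python
import numpy as np

def visual_attention(dec_state, region_feats, W_q, W_k):
    """At one decoding time step, attend over image region features
    (e.g. a 7x7 convolutional grid flattened to 49 regions) and return
    a visual context vector plus the attention weights."""
    q = dec_state @ W_q                     # query:  (d_k,)
    keys = region_feats @ W_k               # keys:   (n_regions, d_k)
    scores = keys @ q / np.sqrt(q.shape[0]) # scaled dot-product scores
    weights = np.exp(scores - scores.max()) # numerically stable softmax
    weights /= weights.sum()
    return weights @ region_feats, weights  # context: (d_feat,)

rng = np.random.default_rng(1)
state = rng.normal(size=512)                # current decoder hidden state
regions = rng.normal(size=(49, 2048))       # e.g. a 7x7 ResNet feature grid
Wq = rng.normal(size=(512, 64)) * 0.05
Wk = rng.normal(size=(2048, 64)) * 0.05
ctx, w = visual_attention(state, regions, Wq, Wk)
print(ctx.shape, w.shape)  # (2048,) (49,)
```

Because the query changes at every decoding step, each target word can attend to different image regions, which is what distinguishes decoder-side visual attention from encoding the image once and reusing a static feature.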