We present a novel method for discovering parallel sentences in comparable, non-parallel corpora. We train a maximum entropy classifier that, given a pair of sentences, can reliably determine whether or not they are translations of each other. Using this approach, we extract parallel data from large Chinese, Arabic, and English non-parallel newspaper corpora. We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system. We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. Thus, our method can be applied with great benefit to language pairs for which only scarce resources are available.
We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs using a signal-processing-inspired approach, we detect which segments of the source sentence are translated into segments in the target sentence, and which are not. This method enables us to extract useful machine translation training data even from very non-parallel corpora, which contain no parallel sentence pairs. We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system.
We develop two techniques for analyzing the effect of porting a machine translation system to a new domain. One is a macro-level analysis that measures how domain shift affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. We apply these methods to understand what happens when a Parliament-trained phrase-based machine translation system is applied in four very different domains: news, medical texts, scientific articles and movie subtitles. We present quantitative and qualitative experiments that highlight opportunities for future research in domain adaptation for machine translation.
(1) Background: Many studies suggest that Helicobacter pylori (H. pylori) infection is associated with a higher prevalence of anemia. The aim of this study is to explore this association in a pediatric population from the northeast of Romania; (2) Methods: A correlational retrospective study between infection with H. pylori and anemia was performed on a group of 542 children in a pediatric gastroenterology regional center in Northeast Romania; (3) Results: Out of 542 children with confirmed H. pylori infection, microcytic hypochromic anemia was present in 48 children, of whom 7 (14.5%) also had iron deficiency; (4) Conclusions: The study results demonstrate a significant association of H. pylori infection with iron-deficiency anemia and iron deficiency in children, in accordance with the results established in the published literature. Although the direct relationship between them is not yet clear, prevention represents one of the first clinical measures that need to be implemented when encountering a refractory moderate to severe iron-deficiency anemia, especially when it is associated with gastrointestinal tract symptoms.