This paper presents a heuristic method that uses information in the Japanese text, along with knowledge of English countability and number stored in transfer dictionaries, to determine the countability and number of English noun phrases. Incorporating this method into the machine translation system ALT-J/E helped to raise the percentage of noun phrases generated with correct use of articles and number from 65% to 73%.
This paper proposes a new method for learning bilingual collocations from sentence-aligned parallel corpora. Our method comprises two steps: (1) extracting useful word chunks (n-grams) by word-level sorting and (2) constructing bilingual collocations by combining the word chunks acquired in stage (1). We apply the method to a very challenging text pair: a stock market bulletin in Japanese and its abstract in English. Domain-specific collocations are well captured even if they were not contained in the dictionaries of economic terms.
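The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `max_n`, `min_count` and `threshold` parameters, and the use of the Dice coefficient to combine chunks are all assumptions made for illustration, since the abstract does not state which association measure is used. Step (1) sorts every word-level suffix of every sentence so that repeated n-grams become adjacent runs; step (2) pairs source and target chunks that occur in the same aligned sentence pairs.

```python
from itertools import groupby


def frequent_ngrams(sentences, max_n=3, min_count=2):
    """Step 1 (sketch): find repeated word n-grams by sorting word-level suffixes.

    After sorting, suffixes that share their first n words sit next to each
    other, so each n-gram can be counted as one contiguous run.
    """
    suffixes = sorted(tuple(s[i:]) for s in sentences for i in range(len(s)))
    chunks = {}
    for n in range(1, max_n + 1):
        prefixes = (suf[:n] for suf in suffixes if len(suf) >= n)
        for gram, run in groupby(prefixes):
            count = sum(1 for _ in run)
            if count >= min_count:
                chunks[gram] = count
    return chunks


def contains(sentence, gram):
    """True if the word sequence `gram` occurs contiguously in `sentence`."""
    n = len(gram)
    return any(tuple(sentence[i:i + n]) == gram
               for i in range(len(sentence) - n + 1))


def pair_chunks(src_sents, tgt_sents, src_chunks, tgt_chunks, threshold=0.8):
    """Step 2 (sketch): pair chunks that co-occur in aligned sentences.

    The Dice coefficient is an assumed stand-in for the paper's measure.
    """
    src_hits = {g: {k for k, s in enumerate(src_sents) if contains(s, g)}
                for g in src_chunks}
    tgt_hits = {g: {k for k, s in enumerate(tgt_sents) if contains(s, g)}
                for g in tgt_chunks}
    pairs = []
    for sg, s_ids in src_hits.items():
        for tg, t_ids in tgt_hits.items():
            dice = 2 * len(s_ids & t_ids) / (len(s_ids) + len(t_ids))
            if dice >= threshold:
                pairs.append((sg, tg, dice))
    return sorted(pairs, key=lambda p: -p[2])
```

Both functions expect sentences as lists of already-segmented words, and `src_sents[k]` is assumed to be aligned with `tgt_sents[k]`. Word segmentation of the Japanese side is outside the scope of this sketch.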
In optical character recognition and continuous speech recognition of a natural language, it has been difficult to detect characters that are wrongly deleted or inserted. To identify three types of errors, namely characters wrongly substituted, deleted or inserted in a Japanese "bunsetsu" or an English word, and to correct these errors, this paper proposes new methods using an m-th order Markov chain model for Japanese "kanji-kana" characters and English alphabets, assuming that the Markov probability of a correct chain of syllables or "kanji-kana" characters is greater than that of erroneous chains. From the results of the experiments, it is concluded that the methods are useful for detecting as well as correcting these errors in Japanese "bunsetsu" and English words.
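The underlying assumption, that a correct character chain has a higher Markov probability than an erroneous one, can be sketched in Python for English words. This is an illustrative sketch only: the class and function names, the add-one smoothing, and the `^`/`$` padding symbols are assumptions, and it handles only substitution errors. The abstract does not describe how deleted and inserted characters are scored, so those cases are left out rather than guessed.

```python
import math
from collections import Counter


class MarkovModel:
    """Character-level Markov chain of a given order with add-one smoothing."""

    def __init__(self, words, order=2, alphabet="abcdefghijklmnopqrstuvwxyz"):
        self.order = order
        self.alphabet = alphabet
        self.context_counts = Counter()
        self.ngram_counts = Counter()
        for w in words:
            padded = "^" * order + w + "$"  # assumed start/end markers
            for i in range(order, len(padded)):
                ctx = padded[i - order:i]
                self.context_counts[ctx] += 1
                self.ngram_counts[ctx + padded[i]] += 1

    def cond_logprob(self, ctx, ch):
        # Possible outcomes: every letter plus the end marker.
        vocab = len(self.alphabet) + 1
        return math.log((self.ngram_counts[ctx + ch] + 1)
                        / (self.context_counts[ctx] + vocab))

    def logprob(self, word):
        """Log probability of the whole chain, including the end marker."""
        padded = "^" * self.order + word + "$"
        return sum(self.cond_logprob(padded[i - self.order:i], padded[i])
                   for i in range(self.order, len(padded)))


def best_substitution(model, word):
    """Return the single-character substitution with the highest chain
    probability, or the word itself if no substitution scores higher."""
    best_score, best_word = model.logprob(word), word
    for i in range(len(word)):
        for ch in model.alphabet:
            cand = word[:i] + ch + word[i + 1:]
            score = model.logprob(cand)
            if score > best_score:
                best_score, best_word = score, cand
    return best_word
```

Restricting the search to substitutions keeps all candidates the same length, so their chain probabilities are directly comparable. Comparing strings of different lengths, as deletion and insertion correction would require, needs some form of normalization that this sketch does not attempt.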
This paper proposes a method to resolve intrasentential references of Japanese zero pronouns that is suitable for application in widely used, practical machine translation systems. The method focuses on semantic and pragmatic constraints such as conjunctions, verbal semantic attributes and modal expressions to determine intrasentential antecedents of Japanese zero pronouns. It is practical because it requires relatively little knowledge to be prepared beforehand while still resolving antecedents with good precision. The method was implemented in the Japanese-to-English machine translation system ALT-J/E. To evaluate its performance, we conducted a windowed test for 139 zero pronouns with intrasentential antecedents in a sentence set for evaluating the performance of Japanese-to-English machine translation systems (3718 sentences). According to the evaluation, intrasentential antecedents could be resolved correctly for 98% of the zero pronouns examined, using rules consistent with intersentential and extrasentential resolution. This accuracy was higher than that of the centering algorithm, a conventional method for resolving zero pronouns. Further examination of the evaluation showed that the method can achieve high accuracy using relatively simple rules.