This paper reports the latest performance of components and features of a project named Corpus-Centered Computation (C3), which targets a translation technology suitable for spoken language translation. C3 places corpora at the center of the technology. Translation knowledge is extracted from corpora by both EBMT and SMT methods, translation quality is gauged by referring to corpora, the best translation among multiple-engine outputs is selected based on corpora, and the corpora themselves are paraphrased or filtered by automated processes.
Example-based machine translation (EBMT) is based on a bilingual corpus. In EBMT, sentences similar to an input sentence are retrieved from a bilingual corpus, and output is then generated from the translations of those similar sentences. A similarity measure between the input sentence and each sentence in the bilingual corpus is therefore essential for EBMT: if retrieval misses some similar sentences, translation quality drops. In this paper, we describe a method to acquire synonymous expressions from a bilingual corpus and utilize them to expand the retrieval of similar sentences. Synonymous expressions are acquired from the differences between synonymous sentences, and synonymous sentences are clustered by the equivalence of their translations. Our method has the advantage of not relying on rich linguistic knowledge, such as sentence structure and dictionaries. We demonstrate the effect of applying our method to a simple EBMT system.
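The two-step idea above — cluster source sentences that share a translation, then take the differing spans within a cluster as candidate synonymous expressions — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the helper names and the use of `difflib` are assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def cluster_by_translation(pairs):
    """Group source sentences whose target translations are identical.

    pairs: iterable of (source_tokens, target_sentence).
    Returns only clusters with two or more members, since a singleton
    yields no synonymous differences.
    """
    clusters = defaultdict(list)
    for src, tgt in pairs:
        clusters[tgt].append(src)
    return [c for c in clusters.values() if len(c) > 1]

def differing_spans(a, b):
    """Return the word spans where two synonymous sentences differ.

    These differing spans are the candidate synonymous expressions
    (a simplification: the paper works over a full bilingual corpus,
    not a single sentence pair).
    """
    sm = SequenceMatcher(None, a, b)
    return [(" ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for op, i1, i2, j1, j2 in sm.get_opcodes() if op == "replace"]
```

For example, if "could you check it" and "would you check it" share the same translation, `differing_spans` extracts the pair ("could", "would") as a synonym candidate.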
In this paper, we propose incorporating similar sentence retrieval into machine translation to improve the translation of hard-to-translate input sentences. If a given input sentence is hard to translate, a sentence similar to it is retrieved from a monolingual corpus of translatable sentences and provided to the MT system instead of the original sentence. This method is advantageous in that it relies only on a monolingual corpus. The similarity between an input sentence and each sentence in the corpus is determined from the ratio of common N-grams. We use two conditions to improve retrieval precision and add a filtering method to avoid inappropriate sentences. An experiment using a Japanese-to-English MT system in a travel conversation domain shows that our method improves the translation quality of hard-to-translate input sentences by 9.8%.
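A similarity score based on the ratio of common N-grams can be sketched as below. The exact formula in the paper may differ; this version uses a Dice-style overlap over token bigrams, and the function names are illustrative.

```python
def ngrams(tokens, n):
    """All contiguous N-grams of the token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_similarity(a, b, n=2):
    """Ratio of N-grams shared by two token lists.

    Assumption: a Dice coefficient (2 * common / total), which is one
    common way to realize 'the ratio of common N-grams'.
    """
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    common = sum(min(ga.count(g), gb.count(g)) for g in set(ga))
    return 2 * common / (len(ga) + len(gb))
```

An input sentence would be compared against every sentence in the corpus of translatable sentences, and a highly similar one substituted before translation.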
This paper presents a method for acquiring synonyms from monolingual comparable text (MCT). MCT denotes a set of monolingual texts whose contents are similar and which can be obtained automatically. Our acquisition method takes advantage of a characteristic of MCT: the included words and their relations are confined. Our method uses contextual information from the single word on each side of the target words. To improve acquisition precision, a filter called prevention of outside appearance is applied. This method has the advantages that it requires only part-of-speech information and that it can acquire infrequent synonyms. We evaluated our method with two kinds of news article data: sentence-aligned parallel texts and document-aligned comparable texts. On the former data, our method acquires synonym pairs with 70.0% precision. Re-evaluation of incorrect word pairs against the source texts indicates that the method captures the appropriate parts of the source texts with 89.5% precision. On the latter data, acquisition precision reaches 76.0% in English and 76.3% in Japanese.
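The core matching step — treating words from paired comparable texts as synonym candidates when they share a one-word-window context — might look like the sketch below. This is a hypothetical simplification: the paper additionally uses part-of-speech information and the prevention-of-outside-appearance filter, neither of which is modeled here.

```python
from collections import defaultdict

def context_pairs(tokens):
    """Map each word to its set of (left, right) one-word contexts."""
    ctx = defaultdict(set)
    for i in range(1, len(tokens) - 1):
        ctx[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return ctx

def synonym_candidates(text_a, text_b):
    """Candidate synonym pairs across two comparable texts.

    Two distinct words are paired when they occur in at least one
    identical (left, right) context (an assumption about how the
    one-word-window criterion is applied).
    """
    ca, cb = context_pairs(text_a), context_pairs(text_b)
    return {(wa, wb)
            for wa, sa in ca.items() for wb, sb in cb.items()
            if wa != wb and sa & sb}
```

Because the criterion needs only one shared context occurrence, it can surface infrequent synonyms, consistent with the advantage the abstract claims.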
Example-based machine translation (EBMT) is a promising translation method for speech-to-speech translation because of its robustness. It retrieves example sentences similar to the input and adjusts their translations to obtain the output. However, its performance degrades when input sentences are long and when the style of the inputs differs from that of the example corpus. This paper proposes a method for retrieving "meaning-equivalent sentences" to overcome these two problems. A meaning-equivalent sentence shares the main meaning with an input despite lacking some unimportant information, and its translation corresponds to a "rough translation." The retrieval is based on content words, modality, and tense.
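Retrieval based on content words, modality, and tense could be sketched as follows: require the modality and tense of an example to match the input, then rank by content-word overlap. The Jaccard scoring and the function signature are assumptions for illustration; the paper's actual weighting is more elaborate.

```python
def retrieve_meaning_equivalent(input_cw, input_mod, input_tense, examples):
    """Return the example sentence that best matches the input.

    input_cw: set of content words of the input.
    examples: iterable of (content_words, modality, tense, sentence).
    Examples whose modality or tense disagree with the input are
    skipped; survivors are ranked by Jaccard overlap of content words
    (a hypothetical scoring choice).
    """
    best, best_score = None, 0.0
    for cw, mod, tense, sentence in examples:
        if mod != input_mod or tense != input_tense:
            continue
        score = len(input_cw & cw) / max(len(input_cw | cw), 1)
        if score > best_score:
            best, best_score = sentence, score
    return best
```

Because only the main content words must match, an example can be retrieved even when the input carries extra, unimportant information — the "meaning-equivalent" case the abstract describes.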