Daniel Marcu scite author profile

We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy wordlevel alignment models does not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.

show abstract

Statistical phrase-based translation

Koehn

2003

View full text Add to dashboard Cite

show abstract

Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory

Carlson¹,

Marcu

Okurowski³

2001

368

370

View full text Add to dashboard Cite

Abstract:We describe our experience in developing a discourse-annotated corpus for community-wide use. Working in the framework of Rhetorical Structure Theory, we were able to create a large annotated resource with very high consistency, using a well-defined methodology and protocol. This resource is made publicly available through the Linguistic Data Consortium to enable researchers to develop empirically grounded, discourse-specific applications.

show abstract

Search-based structured prediction

2009

View full text Add to dashboard Cite

We present SEARN, an algorithm for integrating SEARch and lEARNing to solve complex structured prediction problems such as those that occur in natural language, speech, computational biology, and vision. SEARN is a meta-algorithm that transforms these complex problems into simple classification problems to which any binary classifier may be applied. Unlike current algorithms for structured learning that require decomposition of both the loss function and the feature functions over the predicted structure, SEARN is able to learn prediction functions for any loss function and any class of features. Moreover, SEARN comes with a strong, natural theoretical guarantee: good performance on the derived classification problems implies good performance on the structured prediction problem.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Daniel Marcu

Statistical Phrase-Based Translation

Statistical phrase-based translation

Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory

Search-based structured prediction

Contact Info

Product

Resources

About