Automatic syntactic analysis of a corpus requires detailed lexical and morphological information that cannot always be harvested from traditional dictionaries. Therefore, the development of a treebank presents an opportunity to enrich the lexicon at the same time. In building NorGramBank, we use an incremental parsebanking approach, in which a corpus is parsed and disambiguated and then, after improvements to the grammar and the lexicon, reparsed. In this context, we have implemented a text preprocessing interface where annotators can enter unknown words or missing lexical information either before parsing or during disambiguation. The information added to the lexicon in this way may be of great interest both to lexicographers and to other language technology efforts.
In translation studies, the theoretical concept of 'translation unit' has traditionally been a subject of debate. This paper will discuss different views of the concept, relating it to the dichotomy between product-oriented and process-oriented translation studies. It will be argued that 'translation unit' has two readings: 'unit of analysis' in product-based studies, and 'unit of processing' in cognitive translation studies. With the exception of literary translation, translation services may now be said to fall within the domain of the language industry, which calls for considering the relevance of 'translation unit' to machine translation (MT). From a historical perspective, the concept will be related to the main issues of system design and translation quality.