The role of terminology in content management has often been underrated. The information industry has identified term extraction as an area requiring focus. It benefits both content authoring and the translation process. Supplying key product terms to translation services several weeks before the actual translation begins reduces translation time, improves translation quality, and saves effort (and thus money) by reducing duplication of work. Getting the key terms ready in a timely manner can be difficult without some automation. This paper describes the process of proposing, designing, developing, and deploying a terminology extraction tool. The tool extracts nouns and noun groups, excludes non-translatable terms and known product terms, and displays a context for each extracted item. Extraction is based on full parsing of the text with a broad-coverage parser. The tool is made available to users on a Web server.
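The extract, exclude, and show-context steps described in this abstract can be illustrated with a minimal sketch. It is not the tool from the paper: the paper relies on full parsing with a broad-coverage parser, whereas this sketch substitutes a crude heuristic that treats runs of non-stopword tokens as candidate noun groups, so on real text it would also pick up verbs and adjectives. The exclusion lists, the stopword list, and all function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical exclusion lists. A real deployment would load these from
# a termbase and a list of non-translatable strings.
KNOWN_PRODUCT_TERMS = {"web server"}
NON_TRANSLATABLE = {"db2"}

# Small stopword list used as chunk boundaries. This is an assumption that
# stands in for real parsing; it is not how the paper's tool finds nouns.
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and", "or",
    "is", "are", "be", "with", "for", "by", "it", "this", "that",
}


def split_sentences(text):
    """Split text into sentences at whitespace following ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_terms(sentence):
    """Yield lowercased runs of consecutive non-stopword tokens."""
    run = []
    for token in re.findall(r"[A-Za-z0-9][A-Za-z0-9-]*", sentence):
        if token.lower() in STOPWORDS:
            if run:
                yield " ".join(run).lower()
            run = []
        else:
            run.append(token)
    if run:
        yield " ".join(run).lower()


def extract_terms(text, known=KNOWN_PRODUCT_TERMS, skip=NON_TRANSLATABLE):
    """Map each new candidate term to the first sentence it appears in."""
    results = {}
    for sentence in split_sentences(text):
        for term in candidate_terms(sentence):
            if term in known or term in skip:
                continue  # already in the termbase, or not to be translated
            results.setdefault(term, sentence)  # keep the first context only
    return results


if __name__ == "__main__":
    sample = (
        "The storage adapter is on the Web server. "
        "The backup policy of the storage adapter is in DB2."
    )
    for term, context in extract_terms(sample).items():
        print(f"{term}: {context}")
```

On the sample text, the known product term and the non-translatable string are filtered out, and each remaining candidate is paired with the first sentence in which it occurs, which mirrors the context display mentioned above.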
Companies must translate their content if they want to operate multinationally. Both quality and speed of translation are key factors in determining market share in the target market. Proactively managing terminology, including pretranslating key terms for a translation project, has beneficial effects on these factors. However, in commercial environments, the volumes of content and, consequently, of the required terminology are typically large. Therefore, integrating terminology into the translation pipeline requires a process that is as automated as possible. Term extraction is the cornerstone of this process, but to maximize efficiency it requires a post-processing strategy that repurposes existing lexical resources. Terms extracted from corpora and subsequently translated should be channeled into the company termbase so that they can be leveraged for other purposes. These and other effective practices for processing extracted terms are discussed, based on the author's experiences in one large company.