The article describes the process of creating a Finnish-language FrameNet, or FinnFN, based on the original English-language FrameNet hosted at the International Computer Science Institute in Berkeley, California. We outline the goals and results of the FinnFN project, especially the creation of the FinnFrame corpus. The main aim of the project was to test the universal applicability of frame semantics by annotating real Finnish using the same frames and annotation conventions as in the original Berkeley FrameNet project. From Finnish newspaper corpora, 40,721 sentences were automatically retrieved and manually annotated as example sentences evoking certain frames; these sentences form the FinnFrame corpus. Applying the Berkeley FrameNet annotation conventions to Finnish required some modifications due to Finnish morphology, and a convention for annotating individual morphemes within words was introduced for phenomena such as compounding, comparatives and case endings. Various questions about cultural salience across the two languages arose during the project, but problematic situations occurred in only a few examples, which we also discuss in the article. The article shows that, barring a few minor instances, the universality hypothesis of frames is largely confirmed for languages as different as Finnish and English.
The article details how the FinnTransFrame corpus, a part of the FinnFrameNet project, was created. In addition to a large annotated frame semantic corpus of natural language examples, the project created a separate corpus of examples translated from English to Finnish. The research question behind the FinnTransFrame corpus was to what extent the various frames of the original Berkeley FrameNet transfer into Finnish in translated examples: what are the main problems, and how can they be categorized? A variety of Berkeley FrameNet examples were chosen from different frames and then translated by professionals. The FinnFrameNet annotation team checked all the examples and their translations to see whether the frames remained intact in translation. Problematic examples were tagged according to the type of problem encountered, with the main focus on fine-grained mismatches of meaning that caused frame changes even when the translation was the best possible one. The frame loss amounted to 4.2% of the 88,209 relevant example sentences. Filtering out sentences & Krister Lindén