The authors address the legal issues relating to the creation and use of language models. The article begins with an explanation of the development of language technologies. The authors analyse the technological process within the framework of copyright, related rights and personal data protection law. They also cover the commercial use of language models. Their main argument is that legal restrictions applicable to language data containing copyrighted material and personal data usually do not apply to language models. Language models are generally not considered derivative works. Given the wide range of language models, however, this position is not absolute.
The chapter analyses sentence structure at the layer where semantic role categories such as AGENT, PATIENT, OBJECT, etc. play a crucial role. The main objective is to find out which regularities prevail in the categorization of spatial characteristics of motion events in Estonian in terms of these categories, and which morphosyntactic means are used to encode them. For the analysis, a (mini) corpus containing sentences with a motion verb as predicate was automatically created. First, a brief overview of verbs of motion in Estonian is given. After that, the chapter concentrates on the analysis of semantic roles and their encoding in motion sentences in Estonian. We found it necessary to differentiate between the following roles: SOURCE, GOAL, ROUTE, LOCATION. The realization of each category by the morphosyntactic means of Estonian is described in detail: NPs in certain case forms, PPs, adverbs. One result is a statistical asymmetry in the use of SOURCE, GOAL, and LOCATION: GOAL is the most frequently encoded of these three categories.
The authors address the transformation of research data into open data. The article draws on experience in four countries: Sweden, Finland, Estonia and Lithuania. The transformation process presents several challenges, shaped by legal, organizational and individual factors. Research data often contain personal data. Research data may also be covered by intellectual property (IP) rights. This means that personal data and IP regulations should be integrated into the dissemination model. While there is a potential conflict between the policies for open data that aim to make data freely available and those of an entrepreneurial university that emphasize commercialization of research results, these policies need to be made compatible. Researchers producing data are vital for reconciling the two, but they currently lack the motivation to contribute towards the implementation of the open data policy due to missing career incentives. Keywords: open data; research data; entrepreneurial university; personal data protection; intellectual property; academic career incentives.
Overview. The article examines linguistic problems that arose during the morphological and semantic disambiguation of text corpora, focusing on the problem of determining the part of speech of a text word.