The Irish National Morphology Database is a human-verified, Official Standard-compliant dataset containing the inflected forms and other morpho-syntactic properties of Irish nouns, adjectives, verbs and prepositions. It is being developed by Foras na Gaeilge as part of the New English-Irish Dictionary project. This paper introduces this dataset and its accompanying software library Gramadán.
Recursion, and recursion-like design patterns, are used in the entry schemas of dictionaries to model subsenses and subentries. Recursion occurs when elements of a given type, such as sense, are allowed to contain elements of the same or similar type, such as sense or subsense. This article argues that recursion unnecessarily increases the computational complexity of entries, making dictionaries less easily processable by machines. The article shows how entry schemas can be simplified by re-engineering subsenses and subentries as relations (as in a relational database), so that we have only flat lists of senses and entries, while the is-subsense-of and is-subentry-of relations are encoded using pairs of unique identifiers. This design pattern losslessly records the same information as recursion (including, importantly, the listing order of items inside an entry) but decreases the complexity of the entry structure and makes dictionary entries more easily machine-processable.
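The flattening idea can be illustrated with a minimal sketch. This is not the article's own schema or code: the field names (`id`, `definition`, `subsenses`), the `s1`, `s2` identifier format, and the sample entry are assumptions made for illustration. The sketch converts a nested sense tree into a flat list of senses plus a list of (child, parent) identifier pairs, then rebuilds the tree to show that nothing, including listing order, is lost.

```python
def flatten(senses):
    """Return (flat_senses, relations) for a nested list of senses.

    flat_senses keeps the original listing order; relations holds
    (child_id, parent_id) pairs, i.e. the is-subsense-of relation.
    """
    flat, relations = [], []

    def walk(items, parent_id):
        for item in items:
            sense_id = f"s{len(flat) + 1}"  # hypothetical identifier scheme
            flat.append({"id": sense_id, "definition": item["definition"]})
            if parent_id is not None:
                relations.append((sense_id, parent_id))
            walk(item["subsenses"], sense_id)

    walk(senses, None)
    return flat, relations


def nest(flat, relations):
    """Rebuild the nested structure from the flat list and the relation pairs."""
    parent_of = dict(relations)

    def children(parent_id):
        return [
            {"definition": s["definition"], "subsenses": children(s["id"])}
            for s in flat
            if parent_of.get(s["id"]) == parent_id
        ]

    return children(None)


# Invented sample entry, for illustration only.
nested = [
    {"definition": "financial institution", "subsenses": [
        {"definition": "building of such an institution", "subsenses": []},
    ]},
    {"definition": "edge of a river", "subsenses": []},
]

flat, relations = flatten(nested)
print(flat)
print(relations)
print(nest(flat, relations) == nested)
```

In this sketch the order of the flat list carries the listing order, so the relation pairs only need to record which sense belongs under which.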
Irish, a low-resourced lesser-used language, is striving to punch above its weight when it comes to some of the digital language tools and resources available to its users. High-tech language tools and resources for Irish are being developed in a number of universities in Ireland and elsewhere, in language technology areas relating to search, parsing, proofing, speech, translation, etc. (Judge et al., 2012). This paper aims to highlight work done by researchers at Fiontar, Dublin City University (DCU), to make a number of valuable Irish-language terminological, lexicographical, onomastic, and folkloristic data stocks more readily accessible, usable, and manageable using web and database technologies. Tools built with these technologies have facilitated the re-organisation, distributed development, and more widespread dissemination of these data stocks, as well as the creation of new data stocks. These language tools, which are on a par with tools that are available to users of well-resourced languages (take for example the online interface of the multilingual terminology database of the European Union, IATE: http://iate.europa.eu/), are now enabling Irish language users, language professionals, and linguists to operate in an environment similar to that of their major language counterparts. The public interfaces of all Irish-language tools and resources developed by Fiontar are made available at http://www.gaois.ie/.
It is now commonplace to see surnames written in the Irish language in Ireland, yet there is no online resource for checking the standard spelling and grammar of Irish-language surnames. We propose a data structure for handling Irish-language surnames which comprises bilingual (Irish–English) clusters of surname forms. We present the first open, data-driven linguistic database of common Irish-language surnames, containing 664 surname clusters, and a method for deriving Irish-language inflected forms. Unlike other Irish surname dictionaries, our aim is not to list variants or explain origins, but rather to provide standard Irish-language surname forms via the web for use in the educational, cultural, and public spheres, as well as in the library and information sciences. The database can be queried via a web application, and the dataset is available to download under an open licence. The web application uses a comprehensive list of surname forms for query expansion. We envisage the database being applied to name authority control in Irish libraries to provide for bilingual access points.
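A bilingual surname cluster with form-based query expansion might look roughly like the sketch below. It is an assumption-laden illustration, not the database's actual schema or the web application's code: the class name, field names, and lookup functions are hypothetical, and the sample cluster was written for this example rather than taken from the dataset. The forms are listed by hand here; the paper's method for deriving inflected forms is not reproduced.

```python
from dataclasses import dataclass


@dataclass
class SurnameCluster:
    """Hypothetical cluster: Irish-language forms paired with English forms."""
    irish_forms: list[str]
    english_forms: list[str]


def build_index(clusters):
    """Map every known form (case-insensitively) to the clusters containing it."""
    index = {}
    for cluster in clusters:
        for form in cluster.irish_forms + cluster.english_forms:
            index.setdefault(form.casefold(), []).append(cluster)
    return index


def lookup(index, query):
    """Expand a query in either language to its full cluster(s), if any."""
    return index.get(query.strip().casefold(), [])


# Illustrative sample only; not taken from the published dataset.
clusters = [
    SurnameCluster(
        irish_forms=["Ó Murchú", "Ní Mhurchú", "Uí Mhurchú"],
        english_forms=["Murphy"],
    ),
]

index = build_index(clusters)
for match in lookup(index, "murphy"):
    print(match.irish_forms)
```

Indexing every form means a search on any variant, Irish or English, returns the whole cluster, which is the effect query expansion is meant to achieve.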