The GOLD Community of Practice: an infrastructure for linguistic data on the Web

Farrar, Scott; Lewis, William D.

doi:10.1007/s10579-007-9016-x

Cited by 16 publications

(15 citation statements)

References 13 publications


“…The GOLD ontology contains the basis linguistic knowledge of any theoretical framework. According to (Farrar and Lewis, 2005), GOLD defines linguistic knowledge as axioms, for example "a verb is a part of speech", and uses at the same time language neutral, for example "parts of speech are subclasses of gold: GrammaticalUnit". The classes are presented in the protégée editor and then expressed as concepts in the GOLD ontology (Farrar and Langendoen, 2003).…”

Section: Interoperability Issue

mentioning

confidence: 99%

“…Thus, GOLD is an abstract model and representation formalisms such as HPSG are the instantiation of this abstract model. (Farrar and Lewis, 2005) consider these instantiations as sub-communities of practice noted Communities Of Practice Extension (COPEs). COPEs, sub-communities or sub-ontologies designed the same nomenclature and extend the overall GOLD ontology (Wilcock, 2007).…”

Section: Interoperability Issue

mentioning

confidence: 99%


A New Method for Interoperability Between Lexical Resources Using MDA Approach

Lhioui

Haddar

Romary

2016

Advances in Intelligent Systems and Computing

“…ODIN was developed as part of the greater effort within the GOLD Community of Practice [10] and the Electronic Metastructure for Endangered Languages Data efforts, whose goals are to promote best practice standards and software, specifically those that facilitate interoperation over disparate sets of linguistic data. ODIN's genesis came from the realization that despite the fact that significant amounts of language data are being posted and maintained on the Web, there is no uniform search strategy for discovering these data, and most that can be discovered cannot be easily manipulated or used.…”

Section: The Odin Vision

mentioning

confidence: 99%

“…The initial search strategy behind ODIN was to comply with the Open Languages Archives Community (OLAC) model [19], that is, allow search by language name or code, either through OLAC's search interfaces or through a locally provided facility. IGT, however, has remarkably rich content that opens possibilities for other types of search.…”

Section: Enriching the Data

mentioning

confidence: 99%

ODIN: A Model for Adapting and Enriching Legacy Infrastructure

Lewis¹

2006

2006 Second IEEE International Conference on E-Science and Grid Computing (E-Science'06)

Self Cite


The Online Database of Interlinear Text (ODIN) 1 is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large amounts of language data are posted to the Web as part of scholarly discourse, making the existing "e-Linguistic infrastructure" surprisingly rich, most linguistic data available on the Web exists in legacy formats, is highly display-centric, and is often difficult to locate or interoperate over. ODIN seeks to leverage this existing infrastructure into a rich, searchable, and interoperable resource by converting readily available semi-structured data to content-centric, searchable formats. To do this, ODIN mines scholarly papers and webpages for instances of linguistic data, focusing mostly on interlinear texts, extracts them, identifies source languages, and makes the instances available to search. Through ODIN's standard search feature, users can locate data by language name or Ethnologue code, and display lists of data by document for languages of interest. The newer Advanced Search feature allows users to locate instances by grammatical markup that is used (e.g., NOM, ACC, ERG, PST, 3SG), and by linguistic constructions (e.g., passives, conditionals, possessives, raising constructions, etc.). The latter are made possible through additional enrichment of discovered data using automated statistical taggers and parsers.

The ODIN Vision

The Online Database of Interlinear Text (ODIN) is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. ODIN was developed as part of the greater effort within the GOLD Community of Practice [10] 2 and the Electronic Metastructure for Endangered Languages Data efforts 3, whose goals are to promote best practice standards and software, specifically those that facilitate interoperation over disparate sets of linguistic data.

1 http://www.csufresno.edu/odin
2 http://www.linguistics-ontology.org
3 http://emeld.org

ODIN's genesis came from the realization that despite the fact that significant amounts of language data are being posted and maintained on the Web, there is no uniform search strategy for discovering these data, and most that can be discovered cannot be easily manipulated or used. The e-Linguistics infrastructure may be expansive and rich, yet "discovering" language data on the Web often depends on haphazard, low-precision string-based search strategies (using tools such as Google or Yahoo), or even on decidedly low-tech discoveries made by word-of-mouth. In our pursuit of a better way to locate and use language data within the existing infrastructure, we came to realize certain norms in the presentation of data could be tapped for automated discovery and manipulation. One of the more typical semi-structured formats that linguists use is Interlinear Glossed Text, or IGT. We conceived of ODIN as a means to locate instances of IGT on the Web by language name and code, such that the linguist doing a search could be reasonably confident that th...


“…It includes general knowledge of writing systems and transcription systems that are core to the General Ontology of Linguistic Description (GOLD) (Farrar and Langendoen 2003). Other portions of OATS, including the relationships encoded for relating segments of transcription systems, or the computational representations of these elements, extend GOLD as a Community of Practice Extension (COPE) (Farrar and Lewis 2005). OATS provides interoperability for transcription systems and practical orthographies that map phones and phonemes in unique relationships to their graphemic representations.…”

mentioning

confidence: 99%

An ontology for accessing transcription systems

Moran

2011

Lang Resources & Evaluation


This paper presents the design and implementation of the Ontology for Accessing Transcription Systems (OATS), a knowledge base that supports interoperation over disparate transcription systems and practical orthographies. OATS uses RDF, SPARQL and Unicode to facilitate resource discovery and intelligent search over linguistic data. The knowledge base includes an ontological description of writing systems and relations for mapping transcription system segments to an interlingua pivot, the IPA. It includes orthographic and phonemic inventories from 203 African languages, which were mined from the Web. OATS is motivated by four use cases: querying data in the knowledge base via IPA, querying it in native orthography, error checking of digitized data, and conversion between transcription systems. The model in this paper implements each of these use cases.
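
The conversion use case named in this abstract (mapping one transcription system to another through an IPA pivot) can be illustrated with a minimal sketch. It is not OATS itself, which the abstract says is built on RDF and SPARQL; the two segment tables, the function names, and the greedy longest-match segmentation below are all assumptions made purely for illustration.

```python
# Hypothetical segment-to-IPA tables for two invented transcription systems.
# These are illustrative only and are not taken from the OATS knowledge base.
SYSTEM_A_TO_IPA = {"sh": "ʃ", "ng": "ŋ", "a": "a", "i": "i"}
SYSTEM_B_TO_IPA = {"š": "ʃ", "ŋ": "ŋ", "a": "a", "i": "i"}


def segment(text, inventory):
    """Split text into known segments using greedy longest-match."""
    segments = []
    max_len = max(len(s) for s in inventory)
    pos = 0
    while pos < len(text):
        # Try the longest candidate first so digraphs win over single letters.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in inventory:
                segments.append(candidate)
                pos += size
                break
        else:
            # No known segment starts here: a crude form of error checking.
            raise ValueError(f"no segment matches at position {pos}: {text[pos:]!r}")
    return segments


def to_ipa(text, table):
    """Map each segment of text to its IPA pivot symbol."""
    return [table[s] for s in segment(text, table)]


def from_ipa(ipa_segments, table):
    """Map IPA pivot symbols back into the target system's segments."""
    inverse = {ipa: seg for seg, ipa in table.items()}
    return "".join(inverse[p] for p in ipa_segments)


def convert(text, source_table, target_table):
    """Convert between two systems by way of the IPA pivot."""
    return from_ipa(to_ipa(text, source_table), target_table)


print(convert("shangi", SYSTEM_A_TO_IPA, SYSTEM_B_TO_IPA))
```

Routing every system through a single pivot means each new transcription system needs only one mapping table rather than one per pair of systems. The `ValueError` raised on an unknown segment is a rough stand-in for the error-checking use case.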

