Current Methods in Historical Semantics 2011
DOI: 10.1515/9783110252903.59

The NeoCrawler: identifying and retrieving neologisms from the internet and monitoring ongoing change

Abstract: Why do some new words manage to enter the lexicon and stay there while others drop out of use and are neither used nor heard anymore? Of interest to both lay people and linguists, this question has not been answered in an empirically convincing manner to date, mainly because systematic methods have not yet been found for spotting new words as soon as possible after their first occurrence and monitoring their early development and spread as exhaustively as possible. In this paper we present a new and improved t…

Cited by 26 publications (23 citation statements)
References 16 publications
“…More recently, researchers have focused on how words become lexicalized as they gradually settle in on particular forms and meanings and how words become institutionalized as they enter into the standard vocabulary of a language (Brinton & Traugott 2005). Lexicographers have also devoted considerable effort to identifying and defining neologisms, including through the analysis of newspaper corpora (Baayen & Renouf 1996) and internet search engine results (Kerremans et al 2011). Although there has been sustained interest from linguists on the formation and development of new words, regional patterns of lexical innovation (both in terms of their origin and spread) have been left largely unexplored.…”
Section: Theories Of Linguistic Innovation (mentioning)
confidence: 99%
“…Further filters then apply to eliminate spelling errors and Proper Nouns. Subsequent developments all replicate this architecture: OBNEO (Cabré and De Yzaguirre, 1995), NeoCrawler (Kerremans et al, 2012), Logoscope (Gérard et al, 2014) and more recently Neoveille (Cartier, 2016). Four main difficulties arise from this architecture: first, EDA cannot track semantic neologisms, as they use existing lexical units to convey innovative meanings; second, the design of a reference exclusion dictionary is not that obvious as it requires the existence of a machine-readable dictionary: this entails specific procedures to apply this architecture to less-resourced languages, and the availability of an up-to-date machine-readable dictionary for more resourced languages; third, the EDA architecture is not sufficient in itself: among unknown forms, most are Proper Nouns, spelling mistakes and other cases derived from corpus boilerplate removal: this entails a post-processing phase to separate these cases; fourth, these systems do not take into account the sociological and diatopic aspects of neologism, as they limit their corpora to specific domains: an ideal system should be able to extend its monitoring to new corpora and maintain diastratic metadata to characterize novel forms.…”
Section: Computational Models Of Neology (mentioning)
confidence: 96%
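The exclusion-dictionary idea described in the statement above can be illustrated with a minimal sketch. This is not the code of NeoCrawler or of any system named there; the function name `candidate_neologisms`, the toy lexicon, and the capitalisation heuristic for proper nouns are all assumptions made for illustration, and no spelling-error filter is attempted.

```python
import re


def candidate_neologisms(text, reference_lexicon):
    """Return lower-cased word forms in `text` that are absent from
    `reference_lexicon` (hypothetical stand-in for an exclusion dictionary)."""
    candidates = []
    seen = set()
    # Crude sentence split, so sentence-initial capitals are not
    # mistaken for proper nouns.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", sentence)
        for position, token in enumerate(tokens):
            form = token.lower()
            if form in reference_lexicon or form in seen:
                continue
            # Assumed proper-noun filter: capitalised and not sentence-initial.
            if position > 0 and token[0].isupper():
                continue
            seen.add(form)
            candidates.append(form)
    return candidates


# Toy data, invented for this example.
lexicon = {"the", "button", "is", "new", "we", "met", "in", "yesterday"}
sample = "The detweet button is new. We met Anna in Paris yesterday."
print(candidate_neologisms(sample, lexicon))  # ['detweet']
```

The sketch also makes the first difficulty in the statement concrete: a known form used with a new meaning is found in the lexicon and is therefore never reported.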
“…Diachronic changes in vocabulary are usually investigated by way of example, on the basis of particular phenomena or groups of phenomena. For example, one analyzes which lexical innovations can be observed over time (Kerremans/Stegmayr/Schmid 2012; Würschinger et al 2016; Müller-Spitzer/Wolfer/Kolpenig 2018). From a quantitative perspective, the search is not for new words in the linguistic sense, but either for new tokens or for new automatically lemmatized word types that appear in a new year, a new month or a new issue.…”
Section: Etablierte Wege Zur Darstellung Und Exploration Von Wortscha… (unclassified)
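The quantitative view in the statement above, counting word types that first appear in a given period, can be sketched as follows. The function name `new_types_by_period` and the sample data are hypothetical, and lower-casing is used only as a stand-in for the automatic lemmatization the statement mentions.

```python
def new_types_by_period(documents):
    """Map each period to the word types first seen in that period.

    `documents` is an iterable of (period, tokens) pairs in any order;
    periods must be sortable (for example years or "YYYY-MM" strings).
    """
    seen = set()
    first_seen = {}
    # Process periods chronologically so "first seen" is well defined.
    for period, tokens in sorted(documents, key=lambda item: item[0]):
        # Lower-casing is an assumed placeholder for real lemmatization.
        new_types = {token.lower() for token in tokens} - seen
        first_seen.setdefault(period, set()).update(new_types)
        seen |= new_types
    return {period: sorted(types) for period, types in first_seen.items()}


# Toy data, invented for this example.
docs = [
    (2011, ["blog", "tweet"]),
    (2010, ["blog", "web"]),
    (2011, ["Tweet", "selfie"]),
]
print(new_types_by_period(docs))
# {2010: ['blog', 'web'], 2011: ['selfie', 'tweet']}
```

As the statement notes, forms surfaced this way are new to the corpus, not necessarily new words in the linguistic sense.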