2020
DOI: 10.1093/ijl/ecaa022

Slipping Through the Cracks in e-Lexicography

Abstract: Despite the remarkable advances made in recent years to facilitate the lexicographer’s work of interpreting and synthesizing the complexity of language uncovered by corpora, an uncritical use of cutting-edge corpus tools and resources can instill a false sense of assurance. In this paper, authentic examples pertaining to wordlist use, collocation research and example selection that arose when compiling a real-world lexical database are discussed through the lens of problems that can easily slip through the cra…

citations
Cited by 12 publications
(7 citation statements)
references
References 17 publications
“…As Frankenberg‐Garcia et al. (2020, p. 14) remarked, “whereas there is no doubt that word sketches make compiling a collocation database much more efficient, it is essential to bear in mind their results hinge on a number of factors operating silently behind the scenes.” In our research, these factors most likely included the size of our corpus (868,709,843 tokens) and the representative nature of the web‐crawled texts. It is our belief that the ENGRI corpus is satisfactory both in size and representativeness, and since we also checked our results against the HrWaC corpus, we would argue our corpus search yielded reliable data and a fairly comprehensive word list of English words in Croatian.…”
Section: Results (mentioning)
confidence: 99%
“…Mass communication is considered in the context of its correctness and ecological safety (information literacy) (Frechette, Williams, 2016); language innovations quickly enter mass communication, which leads to numerous semantic errors in the future. To prevent this when replicating, e-lexicography becomes relevant (Frankenberg-Garcia et al, 2021; Granger, Paquot 2012). This type of innovative lexicography quickly develops according to the demands of the time, which is particularly productive in the context of the Digital Age (Sujon, Dyer, 2020).…”
Section: Discussion (mentioning)
confidence: 99%
“…(Paet & Risberg 2021: 970) Corpus methods empower lexicographers to work in a more systematic and objective manner, encompassing not only their own intuition but also the regular structures observed in the Estonian language community. All these advantages are widely acknowledged in contemporary international lexicography (e.g., see Storjohann 2021; Frankenberg-Garcia et al 2021). Still, in Estonia, there has been some confusion about the benefits of usage-based linguistics and corpus data for language planning (Veldre 2022; Vider 2022).…”
Section: Usage-based Linguistics and Corpus Linguistics (mentioning)
confidence: 99%