2018
DOI: 10.1371/journal.pone.0197775

Words by the tail: Assessing lexical diversity in scholarly titles using frequency-rank distribution tail fits

Abstract: This research assesses the evolution of lexical diversity in scholarly titles using a new indicator based on zipfian frequency-rank distribution tail fits. At the operational level, while both head and tail fits of zipfian word distributions are more independent of corpus size than other lexical diversity indicators, the latter however neatly outperforms the former in that regard. This benchmark-setting performance of zipfian distribution tails proves extremely handy in distinguishing actual patterns in lexica…
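
The abstract does not spell out how the tail fit is computed, so the following is only a minimal sketch of one plausible reading: rank words by frequency, keep the low-frequency end of the ranking, and fit a straight line to log-frequency against log-rank by ordinary least squares. The function name `zipf_tail_slope` and the `tail_fraction` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def zipf_tail_slope(tokens, tail_fraction=0.5):
    """Least-squares slope of log(frequency) vs. log(rank) over the tail.

    Assumption (not from the paper): the "tail" is the last
    `tail_fraction` of the ranked vocabulary.
    """
    # Frequencies sorted from most to least frequent; rank 1 is the top word.
    freqs = sorted(Counter(tokens).values(), reverse=True)
    n = len(freqs)
    start = int(n * (1.0 - tail_fraction))  # index of the first tail word

    xs = [math.log(rank) for rank in range(start + 1, n + 1)]
    ys = [math.log(freq) for freq in freqs[start:]]
    if len(xs) < 2:
        raise ValueError("need at least two ranked words in the tail")

    # Ordinary least squares for a line y = a + b * x; return b.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

For a corpus whose word frequencies are exactly proportional to 1/rank, this sketch returns a slope of -1, the classic Zipf exponent. How the slope would be turned into a diversity indicator is left open here, since the excerpt does not say.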

Citations: cited by 15 publications (10 citation statements)
References: 91 publications
“…There is, however, no objective way to accurately quantify topical diversity based on text data, such as keywords [20, 43]. It is not only changes to the natural environment that explain the increase in permafrost publications. There has also been a considerable increase in publications on permafrost at the third pole (i.e., the Qinghai-Tibetan Plateau).…”
Section: Discussion
mentioning
confidence: 99%
“…There is currently no consensus on which method most accurately quantifies the topical diversity of a text, which is partly related to the fact that the concept of topical diversity is elusive to define [20]. We applied three of the most commonly used methods on the keywords of papers for the 1998-2007 and 2008-2017 periods. The Measure of Textual Lexical Diversity (MTLD) method measures the mean length of a text string that maintains a certain TTR.…”
Section: Methods and Data
mentioning
confidence: 99%
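
To make the MTLD description in the statement above concrete, here is a minimal sketch of the usual formulation. TTR is the type-token ratio (distinct words divided by total words). The text is scanned token by token, and a "factor" is counted each time the running TTR falls to a threshold. The 0.72 threshold is the conventional default for MTLD and is an assumption here, as are the function names; the citing paper's exact settings are not given in the excerpt.

```python
def ttr(tokens):
    """Type-token ratio: distinct tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mtld_one_pass(tokens, threshold=0.72):
    """One directional MTLD pass (threshold value is an assumed default)."""
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        count += 1
        types.add(tok)
        # A full factor ends once the running TTR drops to the threshold.
        if len(types) / count <= threshold:
            factors += 1.0
            types.clear()
            count = 0
    if count > 0:
        # Leftover tokens contribute a partial factor, scaled by how far
        # their TTR has fallen from 1.0 toward the threshold.
        current_ttr = len(types) / count
        factors += (1.0 - current_ttr) / (1.0 - threshold)
    # Mean number of tokens per factor; infinite if TTR never dropped.
    return len(tokens) / factors if factors > 0 else float("inf")


def mtld(tokens, threshold=0.72):
    """Average of the forward and backward passes."""
    forward = mtld_one_pass(tokens, threshold)
    backward = mtld_one_pass(list(reversed(tokens)), threshold)
    return (forward + backward) / 2.0
```

A higher score means longer stretches of text sustain the target TTR, that is, more lexical diversity. On very short inputs such as keyword lists the partial-factor term dominates, so scores from this sketch should be read with caution.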
“…Titles, across the various text types typical of each domain, are increasingly a topic of interest in the study of specialized languages (Baicchi, 2003; Roy, 2008; Moore, 2010; Soler, 2011; Méndez et al., 2014), first of all because they allow the content of texts to be identified without reading them (Hoek, 1981). Owing to the mass production and accessibility of content, the importance of titles has in fact grown, with evidence that many specialists often rely solely on the information given in the title when including works as references in their own texts or when making technical decisions (Bérubé et al., 2018). The extraction and mining of titles to obtain keywords, lexicon, and terminology has thus been, for some time, a widespread and successful practice (Hu et al., 2005; Poulimenou et al., 2014), which motivates its consideration in the methodology presented here.…”
Section: Lexicon Extraction from Titles
unclassified
“…However, the application of lexico-syntactic patterns has recently been explored, drawing systematically on knowledge, in the form of linguistic filters that facilitate and automate the analysis of concordances for extracting both relevant lexical units and stable semantic relations that structure relational models (e.g., Wordnet, Framenet) and/or ontologies (Amaro, 2014; Cabezas-García; Faber, 2018; Faber, 2012; Gil-Berrozpe et al., 2017; San Martín, 2018).…”
unclassified