The importance of statistical patterns of language has been debated for decades. Although Zipf's law is perhaps the best-known case, Menzerath's law has recently entered the debate. Menzerath's law manifests in language, music and genomes as a tendency of the mean size of the parts to decrease as the number of parts increases. In genomes, for instance, it appears as a tendency of species with more chromosomes to have a smaller mean chromosome size. It has been argued that the instantiation of this law in genomes is not indicative of any parallel between language and genomes because (a) the law is inevitable and (b) noncoding DNA dominates genomes. Here mathematical, statistical, and conceptual challenges to these criticisms are discussed. Two major conclusions are drawn: the law is not inevitable, and languages also have a correlate of noncoding DNA. However, the wide range of manifestations of the law in and outside genomes suggests that the striking similarities between noncoding DNA and certain linguistic units could be anecdotal for understanding the recurrence of that statistical law. © 2012 Wiley Periodicals, Inc. Complexity 18: 11-17, 2013
Key Words: statistical laws; language; genomes; music; non-coding DNA; Menzerath's law
INTRODUCTION
Attempts to demonstrate that statistical patterns of language have a trivial explanation have a long history that goes back at least to the research by G. A. Miller and collaborators, who questioned the relevance of Zipf's law for word frequencies around 1960 [1-3]. Zipf's law states that the curve relating the frequency of a word, f, to its rank, r (the most frequent word having rank 1, the second most frequent word having rank 2, and so on), should follow $f \sim r^{-\alpha}$ [4].
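As a rough illustration of the rank-frequency relation stated above, the following Python sketch ranks word frequencies in a list of tokens and estimates the exponent with a least-squares fit in log-log space. The function names `rank_frequency` and `estimate_exponent` are hypothetical, and the log-log regression is only a simple stand-in, not the estimation procedure used in the cited literature.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return word frequencies sorted so that index 0 holds rank 1."""
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)


def estimate_exponent(frequencies):
    """Estimate alpha in f ~ r^(-alpha) by least squares on log f vs. log r.

    Assumes at least two ranks and strictly positive frequencies.
    This is an illustrative choice of estimator, not a recommended one.
    """
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The law is written with a negative exponent, so alpha = -slope.
    return -slope


if __name__ == "__main__":
    # Toy token list (hypothetical, not taken from any corpus).
    tokens = "the cat saw the dog and the dog saw a cat".split()
    freqs = rank_frequency(tokens)
    print(freqs, estimate_exponent(freqs))
```

On real text the estimate depends on the range of ranks included in the fit, so a single fitted value should be read with caution.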
Miller argued that if monkeys were chained "to typewriters until they had produced some very long and random sequence of characters" one would find "exactly the same 'Zipf curves' for the monkeys as for the human authors" [3]. In his view, Zipf's law would be an inevitable consequence of the fact that words are made of units, e.g., letters or phonemes. The typewriter argument has been revived many times since then [5-8]. However, rigorous analyses indicate that the curves do not really look the same, and parameters of this random typing model that give a good fit to real word frequencies are not forthcoming [9,10]. It has also been claimed that the finding of another statistical pattern of language, Menzerath's law, is inevitable [11]. P. Menzerath hypothesized that "the greater the whole, the smaller its constituents" ("Je größer das Ganze, desto kleiner die Teile") in the context of language [12] (p. 101). Converging research in music and genomes [13-16] suggests that Menzerath's law is a general law of natural and human-made systems. In this article, we leave the term Menzerath-Altmann law for referring to the exact mathematical dependency that has been proposed by the quantitative linguistics traditi...
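The tendency that Menzerath's law describes (mean part size decreasing as the number of parts grows) can be checked on data with a few lines of Python. The sketch below is only an assumption-laden illustration: the function names `mean_part_sizes` and `pearson` are hypothetical, the toy numbers are invented and are not genome measurements, and a Pearson correlation is just one simple way to summarize the trend.

```python
import math


def mean_part_sizes(wholes):
    """Map (number_of_parts, total_size) pairs to (number_of_parts, mean_size)."""
    return [(n_parts, total / n_parts) for n_parts, total in wholes]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical toy data: (number of parts, total size of the whole).
    toy = [(2, 100), (4, 120), (8, 160), (16, 240)]
    pairs = mean_part_sizes(toy)
    r = pearson([n for n, _ in pairs], [m for _, m in pairs])
    print(pairs, r)
```

A negative coefficient is consistent with the tendency, but it does not by itself say whether the pattern is trivial; that is the question the article takes up.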
Political speeches in election campaigns are designed to mobilize and attract the electorate with persuasive messages, and they argue mainly by appealing to emotions, thereby committing fallacies. This article presents a corpus of fallacies in political speeches written by candidates for the presidency of Mexico, with the aim of providing a Spanish-language linguistic resource for developing computational systems to mine them. To date, no corpus of fallacies is known for Spanish, and the argument corpora developed in the field of Argument Mining are limited to annotating argumentative structure and are not built from political speeches. The corpus was built from arguments extracted from the speeches, with manual annotation of premises and conclusions. Inter-annotator agreement was 0.692 using Cohen's kappa. Valid arguments and fallacies were then identified, yielding an agreement of 0.442 with the same index. As an additional contribution, a baseline for fallacy identification is presented using cosine similarity, support vector machine, logistic regression and decision tree methods, together with the extraction of affective terms from the arguments. This baseline achieved an F1-score of 0.62 and serves as a point of comparison for future research.
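For readers unfamiliar with the agreement index mentioned above, Cohen's kappa compares the observed agreement between two annotators, p_o, with the agreement expected by chance, p_e, as kappa = (p_o - p_e) / (1 - p_e). The following Python sketch computes it for two label sequences. The function name `cohen_kappa` and the toy labels are hypothetical; this is not the annotation data or tooling of the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical toy annotations: "F" = fallacy, "V" = valid argument.
    annotator_1 = ["F", "F", "V", "V"]
    annotator_2 = ["F", "V", "V", "V"]
    print(cohen_kappa(annotator_1, annotator_2))
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance.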