2018
DOI: 10.1080/09296174.2018.1523777

A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers

Abstract: Previous studies demonstrate that morphosyntactic plural markers and the structure of numeral systems have individually strong predictive power with regard to the usage of sortal classifiers in languages. We use these two factors as explanatory variables to train the computational classifier of random forests and evaluate the accuracy of their predictive power when selecting the existence/absence of sortal classifiers as response variable. Our results show that these two factors result in an excellent discrimi…

Cited by 7 publications (2 citation statements)
References 29 publications
“…Here are some linguistic studies by random forest algorithm. Her and Tang (2020), for example, ranked the feature importance by random forest to understand the predictive power of input variables. There are also some other similar cases, such as Deshors (2020b), and Wiechmann and Kerz (2014).…”
Section: Results Of Question
Citation type: mentioning
confidence: 99%
“…For (i), both experimental studies [25, 26] fail to find any evidence for an effect of Microcephalin and, while this could be a false negative (its effect is too weak to be detected even with more than 400 participants) or simply not captured by the tasks, it is worthwhile taking it at face value and assuming that Microcephalin could have very well been a false positive in the original [9] study. For (ii), the intervening years have seen a revolution in the methods used to ask cross-cultural questions [27], ranging from the generalisation of mixed-effects/hierarchical regression [28, 29], to the use of Bayesian methods [2, 30], permutation/randomisation [31], and of phylogenetics [32] and machine learning [33]. Likewise, the availability and quality of linguistic (and cultural) data has dramatically improved, with databases such as WALS Online [34], PHOIBLE [35], LAPSyD and D-PLACE [36] being easily accessed by humans and machines.…”
Section: Introduction
Citation type: mentioning
confidence: 99%