2006
DOI: 10.1093/llc/fql044

Use of the Chi-Squared Test to Examine Vocabulary Differences in English Language Corpora Representing Seven Different Countries

Cited by 88 publications (37 citation statements)
References 2 publications
“…It follows that the presence of some very specific texts, or even a single one, in a corpus may be sufficient to increase the frequency of certain words and thus to modify the words considered as being significantly more frequent in this corpus according to the Chi² and LL tests. This phenomenon is perfectly illustrated in the following example reported in Oakes and Farrow (2007) It is important to note that it is not just such extreme cases that invalidate the Chi² and LL tests. The simple fact that the probability of a word occurring in a text for a second time is far higher than that of having it for the first time, shows that non-independence is general and not occasional (Church, 2000).…”
Section: The Problem (mentioning)
confidence: 60%
“…An advantage of the range over many other measures of dispersion is that it is easily interpretable. It is important to compare the performance of the tests with and without a dispersion threshold because few studies use them, whereas Oakes and Farrow (2007) have shown that it is useful for filtering uninteresting words when using the Chi² test. Table 2 summarizes the main results of the analyses.…”
Section: Methods (mentioning)
confidence: 99%
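
The two statements above refer to the Chi² and LL (log-likelihood) tests and to the range used as a dispersion threshold. A minimal sketch of these ideas follows, assuming a 2x2 contingency table (word vs. all other tokens, corpus 1 vs. corpus 2) and taking the range to be the number of texts that contain the word. The function names and the toy data are hypothetical and are not taken from Oakes and Farrow or from the citing papers.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table
    [[a, b], [c, d]], where a/b are the word's counts in corpus 1/2
    and c/d are the counts of all other tokens in corpus 1/2."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def log_likelihood_2x2(a, b, c, d):
    """Log-likelihood ratio statistic (G2) for the same 2x2 table."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    total = 0.0
    for obs, row_total, col_total in cells:
        if obs > 0:  # empty cells contribute nothing to the sum
            total += obs * math.log(obs * n / (row_total * col_total))
    return 2.0 * total

def text_range(word, texts):
    """Range as a dispersion measure: number of texts containing the word."""
    return sum(1 for tokens in texts if word in tokens)

# Toy corpus (hypothetical): "zebra" is frequent overall but occurs in one text only.
corpus_1 = [["zebra"] * 5 + ["the"], ["the", "cat"], ["the", "dog"]]
min_range = 2  # assumed threshold: keep words found in at least 2 texts
candidates = ["zebra", "the"]
kept = [w for w in candidates if text_range(w, corpus_1) >= min_range]
```

Here `kept` retains only words that pass the assumed range threshold, so a word whose frequency comes from a single text is filtered out before its Chi² or LL score is considered. Both statistics treat every token as an independent observation, which is the assumption the first statement above calls into question.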