2015
DOI: 10.1002/asi.23283

Text clustering: An application with the State of the Union addresses

Abstract: This paper describes a clustering and authorship attribution study over the State of the Union addresses from 1790 to 2014 (224 speeches delivered by 41 presidents). To define the style of each presidency, we have applied a principal component analysis (PCA) based on the part‐of‐speech (POS) frequencies. From Roosevelt (1934), each president tends to own a distinctive style whereas previous presidents tend usually to share some stylistic aspects with others. Applying an automatic classification based on the fr…

citations
Cited by 30 publications
(11 citation statements)
references
References 20 publications
“…In the current study, the target application pursues a larger scale than a few authors in describing the lexical specificities associated with each U.S. president. Moreover, a recent study shows that when applying a clustering algorithm on this corpus, all speeches appearing under the same presidency tend to regroup themselves under the same cluster (Savoy, 2015).…”
Section: Term Specificity Measure
Citation type: mentioning
confidence: 93%
“…From a lexical point of view, the presidencies of Eisenhower, Kennedy, or Ford present only a few overused terms reused by the following presidents. When analyzing the style of the U.S. presidents (Savoy, 2015), we can see that these three presidents are strongly related to only one other president and relatively distant from the others. In other words, and from a lexical point of view, they are isolated.…”
Section: Lexical Leaders
Citation type: mentioning
confidence: 98%
“…Following Lei and Wen (2020), the present study employed the State of the Union Addresses delivered by 43 presidents of the United States of America as the dataset. The dataset was used for its features such as availability, long timespan, and comparability in genre (Savoy, 2015). First, all the texts of the addresses are freely available from the American Presidency Project homepage (http://presidency.proxied.lsit.ucsb.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…The methodological approach employed in this study is textual analysis. Textual analysis has often been used to qualitatively analyse communication data (Osei Fordjour, 2022; Savoy, 2015; Sikanku, 2020; Smith, 2017; Sowińska, 2013). Textual analysis is a methodology that helps researchers gather data and make sense of written or spoken language (Chung & Park, 2010; McKee, 2003).…”
Section: Methodology: Textual Analysis
Citation type: mentioning
confidence: 99%