2012
DOI: 10.1080/19331681.2012.669191

Tradeoffs in Accuracy and Efficiency in Supervised Learning Methods

Abstract: Text is becoming a central source of data for social science research. With advances in digitization and open records practices, the central challenge has in large part shifted away from availability to usability. Automated text classification methodologies are becoming increasingly important within political science because they hold the promise of substantially reducing the costs of converting text to data for a variety of tasks. In this paper, we consider a number of questions of interest to prospective use…

Cited by 59 publications (45 citation statements)
References 18 publications
“…The quality of the algorithm results depend on three factors [37]. First, the quality of the data provided may make it more ‘difficult’ or ‘easy’ it is to identify patterns.…”
Section: Discussion (mentioning)
confidence: 99%
“…Specifically, ForSight has been verified through comparison with surveys data and election results [Ceron et al, 2014; Hitlin, 2015]. These scholars, among others, have also verified the resilience of supervised learning programs based on the training set used for the program [Collingwood and Wilkerson, 2012; Hopkins and King, 2010]. Using a large and randomly distributed subset of the sample posts improves the accuracy of the program, in addition to extensive human coding [Collingwood and Wilkerson, 2012; Neuendorf, 2017].…”
Section: Twitter Data (mentioning)
confidence: 96%
“…Scholars have argued for applying this hybrid content analysis method to social media discourses as it possesses the reliability and efficiency of computer-based coding while preserving the latent validity of human coding [Su, Akin and Brossard, 2017]. Others have examined and verified such supervised learning programs [Collingwood and Wilkerson, 2012]. Specifically, ForSight has been verified through comparison with surveys data and election results [Ceron et al, 2014; Hitlin, 2015].…”
Section: Twitter Data (mentioning)
confidence: 99%
“…Such approaches have been successfully used for automatically determining the topic of a text, for example using machine learning (Sebastiani, 2002). In the comparative agendas project, a technique known as active learning was used to accurately determine the topic of large amounts of documents at a fraction of the cost of coding all documents (Hillard et al, 2008; Collingwood and Wilkerson, 2012). Recent work on unsupervised topic models (i.e.…”
Section: Automatic Analysis Of Political Communication (mentioning)
confidence: 99%