2013
DOI: 10.1145/2528272.2528276

"TweetGenie: automatic age prediction from tweets" by D. Nguyen, R. Gravel, D. Trieschnigg, and T. Meder; with Ching-man Au Yeung as coordinator

Abstract: A person's language use reveals much about the person's social identity, which is based on the social categories a person belongs to including age and gender. We discuss the development of TweetGenie, a computer program that predicts the age of Twitter users based on their language use. We explore age prediction in three different ways: classifying users into age categories, by life stages, and predicting their exact age. An automatic system achieves better performance than humans on these tasks. Both humans a…

Cited by 39 publications (16 citation statements)
References 13 publications
“…For example, in Nguyen et al. (2013), chronological age was a better predictor of the linguistic phenomena in question for younger speakers, while for older speakers, chronological age did not yield clear results. Such findings point to the question of whether chronological age is in fact a good correlate of socially conditioned language change, as is often implicitly assumed in language change studies (both apparent‐ and real‐time ones; see Sankoff, 2006).…”
Section: Introduction (mentioning)
confidence: 88%
“…As pointed out in the studies cited above, as well as by Eckert (1996), the vast majority of experimental work generally views a speaker's date of birth, or chronological age, as a reflection of their social age, that is, the age‐related social meanings and roles claimed by the speaker (for a discussion of the term ‘social age’, see Clark‐Kazak (2009)). Nevertheless, as shown by studies such as Dubois and Horvath (1999) and Nguyen, Gravel, Trieschnigg, and Meder (2013), chronological age does not necessarily correlate strongly with social age. Speakers may feel and act a different age than what their birth certificates would predict—and they might also not.…”
Section: Introduction (mentioning)
confidence: 94%
“…The algorithm we used to automatically determine the age and sex of authors was developed using Facebook data [14], was further evaluated on blogs and tweets and we judged this tool to be suitable to predict age and sex of the authors in our Reddit post dataset. However, it has been recognised that using textual content from online media to predict age can be insufficient [20].…”
Section: Discussion (mentioning)
confidence: 99%
“…In computer science, there has been growing interest in automatically identifying author traits from their text using machine learning techniques, predicting gender (Burger, Henderson, King, & Zarrella, 2011), age (Rao et al, 2011; Sap et al, 2014), political orientation (Pennacchiotti & Popescu, 2011), and income (Preoţiuc-Pietro, Volkova, Lampos, Bachrach, & Aletras, 2015). Other studies have predicted language associations with perceived author traits for demographic characteristics such as gender (Nguyen et al, 2014), age (Nguyen, Gravel, Trieschnigg, & Meder, 2013), and other features. In these cases, researchers employed raters to categorize author profiles according to their perceptions of the author’s traits.…”
Section: Language Use As a Representation Of Group Differences (mentioning)
confidence: 99%