2021
DOI: 10.1609/icwsm.v6i1.14320

Inferring Gender from the Content of Tweets: A Region Specific Example

Abstract: There is growing interest in using social networking sites such as Twitter to gather real-time data on the reactions and opinions of a region's population, including locations in the developing world where social media has played an important role in recent events, such as the 2011 Arab Spring. However, many interesting and important opinions and reactions may differ significantly within a given region depending on the demographics of the subpopulation, including such categories as gender and ethnicity. Unfort…

citations
Cited by 21 publications
(15 citation statements)
references
References 6 publications
“…Some scholars have documented the linguistic style of particular media, in order to better understand how users exploit and expand their communication affordances (e.g., (Hu, Talamadupula, and Kambhampati 2013)) or when and why people might change their linguistic style (Danescu-Niculescu-Mizil et al 2013; Michael and Otterbacher 2014). In contrast, others have studied the correlation between participants' linguistic patterns and their offline demographics and identities (e.g., (Pennacchiotti and Popescu 2011; Fink, Kopecky, and Morawski 2012; Park et al 2013).…”
Section: Discussion (mentioning)
confidence: 99%
“…Experiments on automatic classification of users according to latent attributes such as gender and age have been done on a wide range of resources, including telephone conversations (Garera and Yarowsky 2009), blogs (Sarawgi, Gajulapalli, and Choi 2011), forum posts (Nguyen, Smith, and Rosé 2011) and scientific articles (Bergsma, Post, and Yarowsky 2012; Sarawgi, Gajulapalli, and Choi 2011). Recently, Twitter has started to attract interest by researchers as a resource to study automatic identification of user attributes, such as ethnicity (Pennacchiotti and Popescu 2011; Rao et al 2011), gender (Fink, Kopecky, and Morawski 2012; Bamman, Eisenstein, and Schnoebelen 2012; Rao et al 2010; Burger et al 2011; Rao et al 2011), geographical location (Eisenstein et al 2010) and age (Rao et al 2010).…”
Section: Related Work (mentioning)
confidence: 99%
“…Therefore, their set of users was restricted to users having blogs and willing to link them using Twitter. Some approaches used lists of male and female names, for example obtained using Facebook (Fink, Kopecky, and Morawski 2012) or from the US social security department (Zamal, Liu, and Ruths 2012; Bamman, Eisenstein, and Schnoebelen 2012).…”
Section: Selecting and Crawling Users (mentioning)
confidence: 99%
“…A large body of work has been focused on how to infer users attributes or classify users into certain category according to their explicit or implicit characteristics. Previous work has looked at identifying user's age (Rao et al 2010), gender (Fink, Kopecky, and Morawski 2012), ethnicity (Pennacchiotti and Popescu 2011), regional origin (Huang, Weber, and Vieweg 2014; Rao et al 2010), political affiliation (Cohen and Ruths 2013; Pennacchiotti and Popescu 2011), interest in particular articles (Carreira et al 2004), roles in a conversation (Tinati et al 2012), and etc.…”
Section: Previous Work Machine Learning For User Classification (mentioning)
confidence: 99%