Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2013
DOI: 10.1145/2492517.2492632

Language independent gender classification on Twitter

Abstract: Online Social Networks (OSNs) generate a huge volume of user-originated texts. Gender classification can serve multiple purposes. For example, commercial organizations can use gender classification for advertising. Law enforcement may use gender classification as part of legal investigations. Others may use gender information for social reasons. Here we explore language independent gender classification. Our approach predicts gender using five color-based features extracted from Twitter profiles (e.g., the bac…

Citations: cited by 63 publications (29 citation statements)
References: 10 publications
“…This could be solved by adding more language independent features like "frequency statistics", "retweeting tendency", or "neighborhood size:" to the feature vector. For instance, one could use text-independent approaches like using profile image attributes [1] or by extracting user attributes from the tweeted images [18].…”
Section: Discussion (mentioning)
confidence: 99%
“…All of them were comfortable with computers and familiar with Twitter. A dataset for trial was collected using an experimental version of Chorus Tweetcatcher (TCD) that is able to collect a table of Twitter users where each of them has all the attributes listed in Table 7. The dataset obtained through Chorus TCD was composed of 50 Twitter users. The test participants had to inspect 25 gender unclassified users and 21 age unclassified users belonging to this dataset.…”
Section: Methods (mentioning)
confidence: 99%
“…For instance, the simplest profile-based method assigns gender class based on a dictionary look-up of the user's first-name, see [13,22]. An alternative approach is to infer a user's gender based on profile colour preference [1]. When it comes to age inference, profile-based features tend not to be used alone, but combined with content-based features.…”
Section: Related Work (mentioning)
confidence: 99%
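
The dictionary look-up described in the statement above can be sketched as follows. This is only an illustration under stated assumptions: the NAME_TO_GENDER table and the gender_from_first_name helper are hypothetical, and the two entries are toy data, not taken from [13,22] or from the cited paper.

```python
from typing import Optional

# Hypothetical toy dictionary (an assumption for illustration only);
# a real system would load a much larger first-name list.
NAME_TO_GENDER = {"maria": "female", "john": "male"}


def gender_from_first_name(display_name: str) -> Optional[str]:
    """Return a gender label by looking up the first token of a profile name.

    Returns None when the name is empty or the first name is not in the
    dictionary, so callers can fall back to other features.
    """
    tokens = display_name.strip().split()
    if not tokens:
        return None
    return NAME_TO_GENDER.get(tokens[0].lower())
```

Returning None for unknown names, rather than guessing, leaves room for a fallback such as the profile colour features mentioned in the same statement.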
“…Previous research shows that letter n-grams are very efficient for classifying text. They are language independent and do not require expensive text pre-processing techniques like tokenization, stemming and stop words removal, hence in the case of code-mix texts, this could yield good results (Miller et al., 2012; Alowibdi et al., 2013).…”
Section: N-grams (mentioning)
confidence: 99%
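
To make the point about letter n-grams concrete, here is a minimal sketch of character n-gram counting. It needs no tokenization, stemming, or stop-word list, which is why the same code applies to any language or to code-mixed text. The function name char_ngrams and the default n = 3 are assumptions for illustration, not details from the cited papers.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in text.

    Operates on raw characters only, so no language-specific
    pre-processing is involved.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n across the string, one character at a time.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Example: bigram counts for a short string.
counts = char_ngrams("banana", n=2)
```

The resulting counts can be used directly as a sparse feature vector for a text classifier; strings shorter than n simply yield an empty Counter.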