Author Profiling (AP) aims at predicting specific characteristics of a group of authors by analyzing their written documents. Much research has focused on determining suitable features for modeling authors' writing patterns. Reported results indicate that content-based features continue to be the most relevant and discriminative features for solving this task. Accordingly, in this paper we present a thorough analysis of the appropriateness of different distributional term representations (DTRs) for the AP task. We introduce a novel framework for supervised AP using these representations and, supported on it, we carry out a comparative analysis of representations such as DOR, TCOR, SSR, and word2vec on the AP problem. We also compare the performance of the DTRs against classic approaches, including popular topic-based methods. The obtained results indicate that DTRs are suitable for solving the AP task in social media domains, as they achieve competitive results while providing meaningful interpretability.

Author profiling is becoming an important issue for many companies and organizations. For example, from the marketing perspective, knowing the characteristics of a group of Internet users could help improve the impact of particular products; from the forensic linguistics perspective, knowing the linguistic profile of an author could serve as valuable additional evidence in criminal investigations.

Generally speaking, the author profiling (AP) task consists of analyzing written documents to extract relevant demographic information about their authors [15], such as gender, age range, personality traits, native language, and political orientation, among others. Traditionally, the AP task has been approached as a single-label classification problem, where the different categories (e.g., male vs. female, or teenager vs. young vs. old) stand for the target classes.
The common pipeline is as follows: i) extract textual features from the documents; ii) build the documents' representation using the extracted features; and iii) learn a classification model from the built representations. As one might expect, extracting relevant features is a key aspect of learning the textual patterns of the different profiles. Accordingly, previous research has evaluated the importance of thematic (content-based) features [15,33] and stylistic characteristics [7]. More recently, some works have also considered learning such representations by means of Convolutional and Recurrent Neural Networks [39,14,40].

Although many textual features have been used and proposed, a common conclusion of previous research is that content-based features are the most relevant for this task. The latter is confirmed by the results of the PAN competitions [35], where the best-performing systems employed content-based features for representing the documents regardless of their genre. This result is somewhat intuitive, since AP is not focused on distinguishing a particular author by modeling their writing style, ...
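As a concrete illustration of this three-step pipeline, the following minimal sketch uses TF-IDF-weighted word n-grams as content-based features and a linear classifier. The toy corpus, labels, and feature choices here are illustrative assumptions, not the actual systems or datasets evaluated in the paper:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Illustrative toy corpus: one document per author, each labeled
# with a hypothetical profile class (e.g., an age-range category).
docs = [
    "omg loved the new phone, camera is amazing!!",
    "the quarterly earnings report is attached for review",
    "this game is so fun lol, playing all night",
    "the committee will convene on Thursday at noon",
]
labels = ["young", "adult", "young", "adult"]

# Steps i) and ii): extract content-based features (word unigrams
# and bigrams) and build a TF-IDF representation of each document.
# Step iii): learn a classification model over those representations.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LinearSVC(),
)
model.fit(docs, labels)

# Predict the profile class of an unseen document.
prediction = model.predict(["new phone is so fun!!"])
print(prediction[0])
```

In practice, the feature extractor in step ii) is where the representations compared in this paper (e.g., DOR, TCOR, SSR, word2vec) would replace the plain TF-IDF vectorizer.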