Oscar S. Siordia scite author profile

Oscar S. Siordia

5 Publications

73 Citation Statements Received

40 Citation Statements Given

Affiliations

Consejo Nacional de Humanidades, Ciencias y Tecnologías, King Juan Carlos University, Centro Nacional de Investigación y Desarrollo Tecnológico

Publications

A case study of Spanish text transformations for twitter sentiment analysis

Téllez, Miranda-Jiménez, Graff et al. 2017

Expert Systems with Applications

Sentiment analysis is a text mining task that determines the polarity of a given text, i.e., its positiveness or negativeness. Recently, it has received a lot of attention given the interest in opinion mining in micro-blogging platforms. These new forms of textual expression present new challenges for text analysis given the use of slang and orthographic and grammatical errors, among others. Along with these challenges, a practical sentiment classifier should be able to handle large workloads efficiently. The aim of this research is to identify which text transformations (lemmatization, stemming, entity removal, among others), tokenizers (e.g., word n-grams), and token weighting schemes most affect the accuracy of a classifier (Support Vector Machine) trained on two Spanish corpora. The methodology is to exhaustively analyze all combinations of the text transformations and their respective parameters to find out which characteristics the best-performing classifiers have in common. Furthermore, among the different text transformations studied, we introduce a novel approach based on the combination of word-based n-grams and character-based q-grams. The results show that this novel combination of words and characters produces a classifier that outperforms the traditional word-based combination by 11.17% and 5.62% on the INEGI and TASS'15 datasets, respectively.
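The combination of word n-grams and character q-grams mentioned in the abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: the function names, the whitespace tokenization, the lowercasing, and the default values of n and q are all assumptions made for illustration.

```python
def word_ngrams(text, n):
    """Return word-level n-grams as space-joined strings (hypothetical helper)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def char_qgrams(text, q):
    """Return character-level q-grams, spaces included (hypothetical helper)."""
    s = " ".join(text.lower().split())  # collapse runs of whitespace
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def combined_tokens(text, ns=(1, 2), qs=(3,)):
    """Concatenate word n-grams and character q-grams into one token list.

    The sizes in `ns` and `qs` are illustrative assumptions, not values
    taken from the paper.
    """
    tokens = []
    for n in ns:
        tokens.extend(word_ngrams(text, n))
    for q in qs:
        tokens.extend(char_qgrams(text, q))
    return tokens


if __name__ == "__main__":
    print(combined_tokens("Muy buen día"))
```

The resulting token list could then be weighted (for example with TF-IDF) and passed to a classifier such as a Support Vector Machine. Character q-grams tend to tolerate the misspellings and slang that the abstract lists as challenges, since a misspelled word still shares most of its q-grams with the correct form.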

A simple approach to multilingual polarity classification in Twitter

Téllez, Miranda-Jiménez, Graff et al. 2017

Pattern Recognition Letters

Recently, sentiment analysis has received a lot of attention due to the interest in mining the opinions of social media users. Sentiment analysis consists of determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, sentiment analysis algorithms have been tailored to a specific language, given the complexity of handling the many lexical variations and errors introduced by the people generating content. In this contribution, our aim is to provide a simple-to-implement and easy-to-use multilingual framework that can serve as a baseline for sentiment analysis contests and as a starting point for building new sentiment analysis systems. We compare our approach in eight different languages, three of which have important international contests, namely SemEval (English), TASS (Spanish), and SENTIPOLC (Italian). Within the competitions, our approach reaches medium to high positions in the rankings, whereas in the remaining languages it outperforms the reported results.

The “War on Drugs” in Mexico: (Official) Database of Events between December 2006 and November 2011

Atuesta, Siordia, Lajous 2018

Journal of Conflict Resolution

The objective of this text is to describe the three categories that the Drug Policy Program at the Center for Teaching and Research in Economics (CIDE-PPD) database comprises, their limitations, and their main features. Additionally, we explain what we believe to be the source of the database we originally received and analyze its accuracy by comparing it with public records. We describe the validation and codification processes the database was subjected to, as well as the main biases and limitations the database may have. Additionally, we offer a preliminary analysis of the type of research that the CIDE-PPD Database can support. This analysis is not only relevant to those interested in studying the “war on drugs” in Mexico but also to those studying conflict in other countries involved in illegal drug production and trafficking, as well as countries experiencing conflicts related to organized crime.

Driving risk classification based on experts evaluation

Siordia, Diego, Conde et al. 2010

Accident reproduction system for the identification of human factors involved on traffic accidents

Siordia, Diego, Conde et al. 2012
