pysentimiento: A Python Toolkit for Opinion Mining and Social NLP tasks

Perez, Juan Manuel; Rajngewerc, Mariela; Giudici, Juan Carlos; Furman, Damián Ariel; Luque, Franco; Alemany, Laura Alonso; Martínez, María Vanina

doi:10.21203/rs.3.rs-3570648/v1

Cited by 10 publications

(2 citation statements)

References 41 publications


“…Pysentimiento is an open-source Python library that includes models for sentiment analysis and social NLP tasks, such as hate speech detection, irony detection, emotion analysis, named entity recognition, and part-of-speech tagging, in several languages such as English, Spanish, Portuguese, and Italian [101,102]. The English model for sentiment analysis is based on BERTweet [103], a RoBERTa model, trained on English tweets and also fine-tuned on the SemEval 2017 sentiment analysis data set [91].…”

Section: Pysentimiento (mentioning)

confidence: 99%

A Comparison of ChatGPT and Fine-Tuned Open Pre-Trained Transformers (OPT) Against Widely Used Sentiment Analysis Tools: Sentiment Analysis of COVID-19 Survey Data (Preprint)

Lossio-Ventura, Weger, Lee et al. 2023

Preprint


BACKGROUND Healthcare providers and health-related researchers face significant challenges when applying sentiment analysis tools to health-related free-text survey data. Most state-of-the-art applications were developed in domains like social media, and their performance in the healthcare context remains relatively unknown. Moreover, existing studies indicate that these tools often lack accuracy and produce inconsistent results.

OBJECTIVE This study aims to address the lack of comparative analysis on sentiment analysis tools applied to health-related free-text survey data in the context of COVID-19. The objective is to automatically predict sentence sentiment for two independent COVID-19 survey datasets from NIH and Stanford University.

METHODS Gold-standard labels were created for a subset of each dataset using a panel of human raters. We compared eight state-of-the-art sentiment analysis tools on both datasets to evaluate variability and disagreement across tools. Additionally, few-shot learning was explored by fine-tuning OPT (a large language model [LLM] with publicly available weights) using a small annotated subset and zero-shot learning using ChatGPT (an LLM without available weights).

RESULTS The comparison of sentiment analysis tools revealed high variability and disagreement across the evaluated tools when applied to health-related survey data. OPT and ChatGPT demonstrated superior performance, outperforming all other sentiment analysis tools. Moreover, ChatGPT exhibited higher accuracy, outperforming OPT by 6%, and f-score by 4% to 7%.

CONCLUSIONS The findings suggest that using LLMs is a viable method for predicting sentiment in health surveys. The comparative analysis highlights the potential of LLMs in reducing the need for human labor in dataset annotation or redeploying it toward quality control of LLM predictions. The study demonstrates the effectiveness of LLMs, particularly the few-shot learning and zero-shot learning approaches, in sentiment analysis of health-related survey data. These results have implications for saving human labor and improving efficiency in sentiment analysis tasks, contributing to advancements in the field of automated sentiment analysis.


“…-Sentiment (English [17], Italian [18], Spanish [17], German [19]) -Emotions (English [20], Italian [18], Spanish [21]) -Hate speech (English, Italian, German [22]) -Fake news (English [23], German [24]) -Irony (English [25]) -Sexism (English [26])…”

Section: Natural Language Processing and Machine Learning API (mentioning)

confidence: 99%

The “Courage Companion” – An AI-Supported Environment for Training Teenagers in Handling Social Media Critically and Responsibly

Aprin, Malzahn, Lomonaco et al. 2023

Communications in Computer and Information Science


The provision of toxic content and misinformation is a frequent phenomenon in current social media with specific impact and risks for younger users. We report on efforts taken in the project Courage to mitigate and overcome these threats through dedicated educational technology inspired by psychological and pedagogical approaches. The aim is to empower adolescents to confidently interact with and utilize social media and to increase their awareness and resilience. For this purpose, we have adopted approaches from the field of Intelligent Tutoring Systems, namely the provision of a virtual learning companion (VLC). The technical system is a browser-based environment that allows for combining a controllable social media space with a VLC as a plugin. This environment is backed by an API that bundles Machine Learning and Natural Language Processing algorithms for detecting and classifying different types of risks. The pedagogical scenarios that are supported by this technical environment and approach range from chat-based dialogues to more complex narrative scripts.


Towards New Data Spaces for the Study of Multiple Documents with Va.Si.Li-Lab: A Conceptual Analysis

Mehler, Bagci, Schrottenbacher et al. 2024

Students’, Graduates’ and Young Professionals’ Critical Use of Online Information
