2022
DOI: 10.1016/j.jbi.2022.104142

They May Not Work! An evaluation of eleven sentiment analysis tools on seven social media datasets

Cited by 20 publications (17 citation statements)
References 43 publications
“…Off-the-shelf sentiment analysis models include Amazon Web Services Comprehend sentiment analysis [53] and VADER [54-60], which is a Python lexicon and rule-based sentiment analysis tool [43]. In a recent evaluation of 11 sentiment analysis tools on 7 social media data sets, He et al [61] observed that these tools do not provide results that are accurate enough to aid in public health decision-making. Among all tools, VADER was one of the 3 top performers.…”
Section: Discussion (mentioning)
confidence: 99%
“…The datasets were selected based on our systematic review and must include human coders' annotation as ground truth labels [5]. More detailed description of the datasets can be found in He et al [4]. The evaluation datasets included four health-related social media datasets: Health Care Reform ("HCR"), Human Papillomavirus ("HPV") Vaccine, COVID-19 Masking ("Mask"), and Vitals.com Physician Reviews ("Vitals"). A non-health dataset, the IMDB Dataset ("IMDB") created from movie reviews, was used for baseline comparison.…”
Section: Evaluation Datasets (mentioning)
confidence: 99%
“…π‘…π‘’π‘π‘Žπ‘™π‘™ = π‘ π‘’π‘š 𝑐 𝑖𝑛 𝐢 π‘‡π‘Ÿπ‘’π‘’π‘ƒπ‘œπ‘ π‘–π‘‘π‘–π‘£π‘’π‘ _𝑐 /π‘ π‘’π‘š 𝑐 𝑖𝑛 𝐢 (π‘‡π‘Ÿπ‘’π‘’π‘ƒπ‘œπ‘ π‘–π‘‘π‘–π‘£π‘’π‘ _𝑐 + πΉπ‘Žπ‘™π‘ π‘’π‘π‘’π‘”π‘Žπ‘‘π‘–π‘£π‘’π‘ _𝑐) (5) After calculation precision and recall, F-measure is also calculated. F-score is the widely accepted measure to verify the accuracy of imbalanced classifications [30]. The harmonic mean of the two fractions precision and recall give the F1-score.…”
Section: π‘ƒπ‘Ÿπ‘’π‘π‘–π‘ π‘–π‘œπ‘› = π‘ π‘’π‘š 𝑐 𝑖𝑛 𝐢 π‘‡π‘Ÿπ‘’π‘’π‘ƒπ‘œπ‘ π‘–π‘‘π‘–π‘£π‘’π‘ _𝑐 /π‘ π‘’π‘š 𝑐 𝑖𝑛 𝐢 (π‘‡π‘Ÿπ‘’π‘’π‘ƒπ‘œπ‘ π‘–...mentioning
confidence: 99%
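
The recall in equation (5) of the statement above sums true positives and false negatives over all classes c in C before dividing, which is usually called micro-averaging. A minimal Python sketch of that calculation follows. The function name `micro_scores` and the toy labels are hypothetical. The precision denominator is truncated in the quoted section title, so the sketch assumes the standard form, true positives plus false positives. The F1-score is the harmonic mean of precision and recall, as the statement describes.

```python
from collections import Counter


def micro_scores(y_true, y_pred):
    """Return micro-averaged (precision, recall, F1) for single-label data.

    Assumption: precision uses the standard denominator TP + FP; the
    quoted source truncates that formula, so this is not confirmed by it.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for class `truth`
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    classes = set(y_true) | set(y_pred)
    tp_sum = sum(tp[c] for c in classes)
    fp_sum = sum(fp[c] for c in classes)
    fn_sum = sum(fn[c] for c in classes)

    # Guard against empty input or a class set with no predictions.
    precision = tp_sum / (tp_sum + fp_sum) if (tp_sum + fp_sum) else 0.0
    recall = tp_sum / (tp_sum + fn_sum) if (tp_sum + fn_sum) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical three-class sentiment labels, for illustration only.
gold = ["pos", "neg", "neu", "pos", "neg"]
predicted = ["pos", "neu", "neu", "neg", "neg"]
precision, recall, f1 = micro_scores(gold, predicted)
print(precision, recall, f1)
```

When every item has exactly one gold label and one predicted label, each error counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all equal plain accuracy. Macro-averaging, which computes the scores per class and then averages them, gives rare classes more weight on imbalanced data.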