2021
DOI: 10.48550/arxiv.2104.05861
Preprint

Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews

Abstract: Context: Mobile app reviews written by users on app stores or social media are significant resources for app developers. Analyzing app reviews has proved to be useful for many areas of software engineering (e.g., requirements engineering, testing). Automatic classification of app reviews requires extensive effort to manually curate a labeled dataset. When the classification purpose changes (e.g., identifying bugs versus usability issues or sentiment), new datasets should be labeled, which prevents the extensib…
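The zero-shot setting mentioned in the abstract can be illustrated with a short sketch. The snippet below uses an off-the-shelf NLI-based zero-shot classifier from the Hugging Face transformers library; the checkpoint and the candidate label set are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal zero-shot sketch: the label set is supplied at inference time,
# so changing the classification purpose needs no newly labeled dataset.
# Checkpoint and labels below are assumptions, not the paper's exact setup.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

review = "The app keeps freezing after the latest update."
candidate_labels = ["bug report", "feature request", "usability issue", "praise"]

result = classifier(review, candidate_labels=candidate_labels)
# result["labels"] is sorted by score; the first entry is the predicted class.
print(result["labels"][0], round(result["scores"][0], 3))
```

Swapping in a different label set (e.g., sentiment polarities) changes the task without curating a new dataset, which is the motivation the abstract points to.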

Cited by 3 publications (6 citation statements)
References 44 publications (93 reference statements)

“…Among these three, only RRGEN model is available. There are a few studies in software engineering that evaluate the capability of PTMs for sentiment analysis [7], user feedback analysis [8], and programming and natural language tasks [9]. Our work is different with these studies as we are the first to investigate the application of pre-trained language models and Transformers for app review response generation.…”
Section: Related Work (mentioning; confidence: 99%)
“…The advantages of using PTMs in software engineering are explored for sentiment classification and code-related tasks (e.g. comment generation) [7,8,9]. However, there is no study that evaluates their performance for app review response generation.…”
Section: Introduction (mentioning; confidence: 99%)
“…Similarly, Henao et al demonstrated the increase in performance in user feedback classification when using pre-trained language models over both classical models as well as other deep models [16]. Hadi and Fard proposed a study where the classification accuracy of pre-trained language models is compared against that of previously constructed classifiers from the literature as well as exploring the effect of self-supervised pre-training, binary classification, multi-class classification, and zero-shot settings on classification performance [15]. Dhinakaran et al showed that models trained on training data that was chosen randomly were found to consistently underperform more sophisticated training data selection techniques, such as active learning [10].…”
Section: Comparisons of Classification Techniques (mentioning; confidence: 99%)
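For contrast with the zero-shot sketch above, the supervised setting discussed in the statement above can be sketched as a minimal fine-tuning loop with the Hugging Face Trainer API; the checkpoint, toy data, binary label scheme, and hyperparameters are assumptions for illustration, not the configuration evaluated in the cited study.

```python
# Minimal fine-tuning sketch (assumed checkpoint, toy labels, and hyperparameters;
# not the exact setup evaluated in the cited study).
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

reviews = ["App crashes whenever I open the camera.",   # toy examples
           "Love the new dark mode, great job!"]
labels = [1, 0]  # 1 = problem report, 0 = other (hypothetical binary scheme)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)

def tokenize(batch):
    # Pad to a fixed length so the default collator can batch the examples.
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=128)

train_ds = Dataset.from_dict({"text": reviews, "label": labels}).map(tokenize, batched=True)

args = TrainingArguments(output_dir="review-clf", num_train_epochs=1,
                         per_device_train_batch_size=8, logging_steps=1)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```

In practice the same loop would run over a labeled app-review corpus with a held-out split for evaluation, which is the kind of supervised baseline the cited comparison considers.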
“…2. For further context, a zero-shot text classification model (denoted "Zero shot"), as was proposed by Hadi and Fard [15], was also evaluated on each dataset to provide a performance benchmark.…”
Section: Unseen Datasets (RQ2) (mentioning; confidence: 99%)