2019
DOI: 10.1145/3359274
Summarizing User-generated Textual Content: Motivation and Methods for Fairness in Algorithmic Summaries

Abstract: As the amount of user-generated textual content grows rapidly, text summarization algorithms are increasingly being used to provide users a quick overview of the information content. Traditionally, summarization algorithms have been evaluated only based on how well they match human-written summaries (e.g. as measured by ROUGE scores). In this work, we propose to evaluate summarization algorithms from a completely new perspective that is important when the user-generated data to be summarized comes from differe…

Cited by 26 publications (23 citation statements)
References 47 publications
“…Celis et al [5] proposed a determinantal point process (DPP) based sampling method for fair data summarization. Dash et al [11] considered the fairness issue on summarizing user-generated textual content. Although these studies adopt similar definitions of fairness constraints to ours, their proposed methods cannot be applied to the FSM problem since the objective functions of the problems they study are not submodular.…”
Section: Related Work
confidence: 99%
“…Despite the extensive studies on streaming submodular maximization, unfortunately, it seems that none of the existing methods consider the fairness issue of the subsets extracted from data streams. In fact, recent studies [5,10,11,20] reveal that data summaries automatically generated by algorithms might be biased with respect to sensitive attributes such as gender, race, or ethnicity, and the biases in summaries could be passed to data-driven decision-making processes in education, recruitment, banking, and judiciary systems. Thus, it is necessary to introduce fairness constraints into submodular maximization problems so that the selected subset can fairly represent each sensitive attribute in the dataset.…”
Section: Introduction
confidence: 99%
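The fairness constraint described in the statement above — requiring the selected subset to represent each sensitive attribute — can be sketched as greedy submodular maximization with per-group caps. This is a minimal illustration only: the function names, the toy topic-coverage objective, and the quota scheme are assumptions for exposition, not the method of the cited works.

```python
# Hypothetical sketch: greedy selection maximizing a monotone submodular
# objective (distinct topics covered) while capping how many items each
# sensitive group may contribute. Illustrative only, not from the paper.

def coverage(selected, item_topics):
    """Submodular objective: number of distinct topics covered."""
    covered = set()
    for i in selected:
        covered |= item_topics[i]
    return len(covered)

def fair_greedy(items, item_topics, groups, quota, k):
    """Pick up to k items, taking at most quota[g] items from group g."""
    selected = []
    used = {g: 0 for g in quota}
    while len(selected) < k:
        base = coverage(selected, item_topics)
        best, best_gain = None, 0
        for i in items:
            # Skip items already chosen or whose group quota is exhausted.
            if i in selected or used[groups[i]] >= quota[groups[i]]:
                continue
            gain = coverage(selected + [i], item_topics) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no item adds value or all quotas are full
            break
        selected.append(best)
        used[groups[best]] += 1
    return selected

# Toy usage: two groups 'x' and 'y', at most one item from each.
topics = {0: {'a', 'b'}, 1: {'a'}, 2: {'c'}, 3: {'d', 'e'}}
grp = {0: 'x', 1: 'x', 2: 'y', 3: 'y'}
print(fair_greedy([0, 1, 2, 3], topics, grp, {'x': 1, 'y': 1}, 2))
```

With plain greedy (no quotas) the summary could be dominated by one group; the per-group caps force representation, at the possible cost of objective value — the trade-off the cited fairness literature studies formally.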
“…We believe any slight variation to 𝜖 will not change our observations significantly. While this threshold has been used in multiple prior studies [13,18], we acknowledge that the choice remains context dependent and can change based on the application and prior established regulations.…”
Section: Exposure Bias
confidence: 99%
“…Facebook, Twitter, user-generated content constitutes a large chunk of the textual information generated on the Web today. On social media, different user groups discuss different socio-political issues, and it has been observed that they often have very different opinions on the same topic or event [3], [4]. Hence, the textual information to be summarised has gradually become heterogeneous.…”
Section: Introduction
confidence: 99%
“…Hence, the textual information to be summarised has gradually become heterogeneous. In our prior work [3], we have shown that such text often contains very different opinions from people of different ideologies, social groups, etc. In many downstream applications, algorithm-generated summaries are consumed by people and hence they often play a vital role in shaping their opinion in different socio-political issues.…”
Section: Introduction
confidence: 99%