FairyTED: A Fair Rating Predictor for TED Talk Data

Acharyya, Rupam; Das, Shouman; Chattoraj, Ankani; Tanveer, Md. Iftekhar

doi:10.1609/aaai.v34i01.5368

Cited by 9 publications

(5 citation statements)

References 17 publications

“…Suppose the users' feedback on an item pair (𝑖, 𝑗) is probabilistic, and we can observe 𝑖 > 𝑗 and 𝑖 < 𝑗 with the probabilities of 𝜂 and 1 − 𝜂, respectively. Then, 𝜂 can actually measure the hardness of the sample, when 𝜂 is closer to 1/2, then the sample is harder, since the propensity between the items is more ambiguous. Suppose 𝜖, 𝛿 ∈ (0, 1), and we use a simple voting mechanism to determine the relation between 𝑖 and 𝑗, then we need to have […] This theory suggests that we need to generate more samples (i.e., larger log(1/𝛿) / (2(1 − 2𝜂)²)) for the harder (i.e., smaller 𝜂) item pairs.…”

Section: Theoretical Insights On the Learning-based Intervention Method (mentioning)

confidence: 99%
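
The quantity named in the quoted statement above, log(1/𝛿) / (2(1 − 2𝜂)²), can be evaluated numerically to see how quickly it grows as 𝜂 approaches 1/2. The following is a minimal sketch only: the full inequality is missing from the quoted excerpt, so reading this quantity as a vote count and rounding it up to an integer are assumptions, and the function name `votes_needed` is hypothetical.

```python
import math


def votes_needed(eta: float, delta: float) -> int:
    """Evaluate ceil(log(1/delta) / (2 * (1 - 2*eta)**2)).

    Assumption: the quoted quantity is read as a number of repeated
    votes, so it is rounded up to the next integer.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if not 0.0 <= eta <= 1.0 or eta == 0.5:
        raise ValueError("eta must lie in [0, 1] and differ from 0.5")
    margin = 1.0 - 2.0 * eta  # shrinks to 0 as eta approaches 1/2
    return math.ceil(math.log(1.0 / delta) / (2.0 * margin ** 2))


if __name__ == "__main__":
    # The closer eta is to 1/2, the larger the quantity becomes.
    for eta in (0.9, 0.7, 0.6, 0.55):
        print(f"eta={eta:.2f} -> {votes_needed(eta, delta=0.05)}")
```

The expression depends on 𝜂 only through (1 − 2𝜂)², so it is symmetric around 1/2 and diverges there, which is why the sketch rejects 𝜂 = 0.5.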

“…Remark. (1) In order to consider the potential noisy information and randomness in the recommender system, the structure equation models (i.e., 𝑭 ) are defined in a stochastic manner, which helps to learn user preference in a more accurate and robust manner. (2) The exogenous variables in the recommendation problem can be explained as the conditions (e.g., system status, user habit, etc.)…”

Section: Recommender Simulator (mentioning)

confidence: 99%

“…To begin with, we reformulate the recommendation problem by a structure causal model 𝑴 = {𝑮, 𝑭 }. In specific, the causal graph 𝑮 is defined as follows (see Figure 1(b)): (1) U, R and S are the nodes representing the user, the recommendation list and the positive items 1 selected by the user. (2) U → R encodes the fact that the recommendation list is generated according to the user preference.…”

Section: Recommender Simulator (mentioning)

confidence: 99%

“…In another research line, causal inference (CI) has been recently introduced into the machine learning community to augment the training data for more comprehensive model optimization [1]. The basic idea is firstly assuming an underlying structure causal model (SCM), and then learning the model parameters based on the observed data.…”

Section: Introduction (mentioning)

confidence: 99%

Top-N Recommendation with Counterfactual User Preference Simulation

Yang, Dai, Dong, et al. 2021

Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Top-N recommendation, which aims to learn user ranking-based preference, has long been a fundamental problem in a wide range of applications. Traditional models usually motivate themselves by designing complex or tailored architectures based on different assumptions. However, the training data of recommender system can be extremely sparse and imbalanced, which poses great challenges for boosting the recommendation performance. To alleviate this problem, in this paper, we propose to reformulate the recommendation task within the causal inference framework, which enables us to counterfactually simulate user ranking-based preferences to handle the data scarce problem. The core of our model lies in the counterfactual question: "what would be the user's decision if the recommended items had been different?". To answer this question, we firstly formulate the recommendation process with a series of structural equation models (SEMs), whose parameters are optimized based on the observed data. Then, we actively indicate many recommendation lists (called intervention in the causal inference terminology) which are not recorded in the dataset, and simulate user feedback according to the learned SEMs for generating new training samples. Instead of randomly intervening on the recommendation list, we design a learning-based method to discover more informative training samples. Considering that the learned SEMs can be not perfect, we, at last, theoretically analyze the relation between the number of generated samples and the model prediction error, based on which a heuristic method is designed to control the negative effect brought by the prediction error. Extensive experiments are conducted based on both synthetic and real-world datasets to demonstrate the effectiveness of our framework.

CCS CONCEPTS • Information systems → Recommender systems.

“…To understand how the relationship between linguistic features and engagement in podcasts compares to other spoken media, we carry out the same analysis on a corpus of 2480 talks from the TED Conferences (Tanveer et al., 2018; Acharyya et al., 2020). While we don't have access to the stream rate of the lectures, the data includes the total view count and ratings.…”

Section: Podcasting Vs Public Speaking: Modeling Engagement With TED Talks (mentioning)

confidence: 99%

Modeling Language Usage and Listener Engagement in Podcasts

Reddy, Lazarova, Yu, et al. 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer

While there is an abundance of popular writing targeted to podcast creators on how to speak in ways that engage their listeners, there has been little data-driven analysis of podcasts that relates linguistic style with listener engagement. In this paper, we investigate how various factors - vocabulary diversity, distinctiveness, emotion, and syntax, among others - correlate with engagement, based on analysis of the creators' written descriptions and transcripts of the audio. We build models with different textual representations, and show that the identified features are highly predictive of engagement. Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some aspects, and adding new perspectives on others.