Proceedings of the Web Conference 2020
DOI: 10.1145/3366423.3380115

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

Abstract: Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems. When it comes to real-world scenarios such as recommendation systems and online advertising, however, it is essential to consider the resource consumption of exploration. In practice, there is typically a non-zero cost associated with executing a recommendation (arm) in the environment, and hence, the policy should be learned with a fixed exploration cost constraint. It is challenging to learn a global optimal policy d…
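To make the setting concrete, the following is a minimal sketch, not the HATCH algorithm proposed in the paper: a LinUCB-style contextual bandit in which every pull consumes part of a fixed exploration budget, and the optimistic exploration bonus is dropped once the remaining budget no longer covers an arm's cost. The class name, per-arm costs, feature dimension, and the alpha parameter are assumptions made purely for illustration.

```python
# Minimal sketch of contextual bandit exploration under a fixed cost budget.
# NOT the paper's HATCH algorithm; it only illustrates the problem setting:
# each pull of an arm consumes budget, and once the exploration budget is
# spent the policy must rely on what it has already learned.
import numpy as np

class BudgetedLinUCB:
    def __init__(self, n_arms, dim, alpha=1.0, budget=100.0):
        self.alpha = alpha                                # width of the exploration bonus
        self.budget = budget                              # total exploration budget (assumed)
        self.A = [np.eye(dim) for _ in range(n_arms)]     # per-arm Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]   # per-arm reward vectors

    def select(self, context, costs):
        """Pick an arm for the given context; costs[a] is the cost of pulling arm a."""
        scores = []
        for a in range(len(self.A)):
            A_inv = np.linalg.inv(self.A[a])
            theta = A_inv @ self.b[a]                     # ridge-regression estimate
            mean = theta @ context
            bonus = self.alpha * np.sqrt(context @ A_inv @ context)
            # Add the optimistic bonus only while the budget still covers this arm.
            scores.append(mean + bonus if self.budget > costs[a] else mean)
        return int(np.argmax(scores))

    def update(self, arm, context, reward, cost):
        self.budget -= cost                               # exploration consumes the budget
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```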

Cited by 12 publications (7 citation statements)
References 16 publications
“…They adopt a generative process based on a topic model to explicitly formulate arm dependencies as clusters on arms, where dependent arms are assumed to be generated from the same cluster. Yang et al. [197] consider situations where there are exploration overheads, i.e., non-zero costs associated with executing a recommendation (arm) in the environment, and hence, the policy should be learned with a fixed exploration cost constraint. They propose a hierarchical learning structure to address the problem.…”
Section: Recommendation via MAB-based Methods
Citation type: mentioning (confidence: 99%)
“…Linear UCB considering item features [92]; considering diversity of recommendation [137,103,40]; cascading bandits providing reliable negative samples [84,230]; combining offline data and online bandit signals [145]; considering pseudo-rewards for arms without feedback [30]; considering dependency among arms [180]; considering exploration overheads [197]…”
Section: MAB in IRSs
Citation type: mentioning (confidence: 99%)
“…The multi-armed bandit (MAB) problem is a typical sequential decision-making process that is also treated as an online decision-making problem [32]. A wide range of real-world applications can be modeled as MAB problems, such as online recommendation systems [16], online advertising [27], and information retrieval [15].…”
Section: Multi-armed Bandit Methods
Citation type: mentioning (confidence: 99%)
“…For example, ConUCB [40] introduces conversations between the agent and users, occasionally asking whether the user is interested in a certain topic. HATCH [39] considers the resource consumption of exploration and proposes a strategy to conduct bandit exploration under a budget limitation. S-MAB [6] considers two aspects: one is to maximize the cumulative reward, and the other is to decide how many arms to pull so as to reduce the exploration cost.…”
Section: Contextual Bandit Algorithms
Citation type: mentioning (confidence: 99%)
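As a rough illustration of the budget-limited exploration idea discussed in the statement above (a generic sketch with assumed names and parameters, not the actual allocation strategies of HATCH or S-MAB), exploration can be gated on whether the remaining budget, spread over the remaining rounds, still covers the per-pull cost:

```python
# Generic budget-aware exploration sketch (an assumption for illustration,
# not the allocation strategy of HATCH or S-MAB): explore only while the
# remaining budget per remaining round covers the per-pull cost.
import random

def choose_arm(estimates, budget, rounds_left, pull_cost=1.0, eps=0.1):
    """estimates: list of mean-reward estimates per arm; returns an arm index."""
    can_afford_exploration = budget / max(rounds_left, 1) >= pull_cost
    if can_afford_exploration and random.random() < eps:
        return random.randrange(len(estimates))                        # explore
    return max(range(len(estimates)), key=lambda a: estimates[a])      # exploit
```

The paper itself instead proposes a hierarchical learning structure to allocate the exploration budget adaptively, as the citation statements above note; the sketch only conveys the constraint being respected.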