Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018
DOI: 10.1145/3219819.3220007

Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts

Abstract: Neural-based multi-task learning has been successfully used in many real-world large-scale applications such as recommendation systems. For example, in movie recommendations, beyond providing users movies which they tend to purchase and watch, the system might also optimize for users liking the movies afterwards. With multi-task learning, we aim to build a single model that learns these multiple goals and tasks simultaneously. However, the prediction quality of commonly used multi-task models is often sensitive to the relationships between tasks. …
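For readers unfamiliar with the architecture named in the title, the following is a minimal sketch of a Multi-gate Mixture-of-Experts (MMoE) layer, assuming PyTorch; the expert, gate, and tower sizes and names are illustrative choices, not the paper's exact configuration.

```python
# Minimal MMoE sketch: shared experts, one softmax gate per task,
# one output tower per task. Sizes/names are illustrative assumptions.
import torch
import torch.nn as nn


class MMoE(nn.Module):
    def __init__(self, input_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        # Shared experts: each is a small feed-forward network.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(input_dim, expert_dim), nn.ReLU())
             for _ in range(num_experts)]
        )
        # One gate per task: softmax weights over the shared experts.
        self.gates = nn.ModuleList(
            [nn.Linear(input_dim, num_experts) for _ in range(num_tasks)]
        )
        # One tower (output head) per task.
        self.towers = nn.ModuleList(
            [nn.Linear(expert_dim, 1) for _ in range(num_tasks)]
        )

    def forward(self, x):
        # expert_out: (batch, num_experts, expert_dim)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            # Task-specific mixture weights over the shared experts.
            w = torch.softmax(gate(x), dim=-1)               # (batch, num_experts)
            mixed = (w.unsqueeze(-1) * expert_out).sum(1)    # (batch, expert_dim)
            outputs.append(tower(mixed))
        return outputs  # one logit per task


# Example: two tasks (e.g. watch vs. like predictions) sharing four experts.
model = MMoE(input_dim=32, expert_dim=16, num_experts=4, num_tasks=2)
logits = model(torch.randn(8, 32))
```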

Cited by 760 publications (467 citation statements)
References 15 publications
“…Here we evaluate the ranking model's performance. The ranking model is a multi-layer neural network trained in a pointwise fashion with multiple heads to predict the probability of a click, y, and a set of user engagement signals after the click, which we refer to in the aggregate by z; this is a similar setup to [25,36]. This model is continuously trained on a dataset of interactions with previous recommendations.…”
Section: Methods
confidence: 99%
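The setup the quoted passage describes, a shared trunk trained pointwise with separate heads for the click probability y and the post-click engagement signals z, might look roughly like the sketch below; the class name, layer sizes, and feature dimensions are assumptions, not details from the citing paper.

```python
# Hedged sketch of a multi-head, pointwise ranking model: a shared trunk with
# a click head (y) and an engagement head (z). All names/sizes are assumed.
import torch
import torch.nn as nn


class MultiHeadRanker(nn.Module):
    def __init__(self, feature_dim, hidden_dim, num_engagement_signals):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.click_head = nn.Linear(hidden_dim, 1)                         # y
        self.engagement_head = nn.Linear(hidden_dim, num_engagement_signals)  # z

    def forward(self, features):
        h = self.trunk(features)
        return torch.sigmoid(self.click_head(h)), self.engagement_head(h)


# Example usage with assumed dimensions.
model = MultiHeadRanker(feature_dim=64, hidden_dim=128, num_engagement_signals=3)
p_click, z_pred = model(torch.randn(4, 64))
```

Pointwise training here would mean a per-example loss, e.g. binary cross-entropy on the click label plus regression or cross-entropy terms on the engagement signals, rather than a pairwise or listwise ranking loss.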
“…These systems often follow cascading patterns, with sequences of models being used [47,24]. More recently, there has been strong growth in using state-of-the-art neural network techniques to improve recommender accuracy [25,16,36].…”
Section: Related Work
confidence: 99%
“…Gating Network: Borrowing the idea from the Mixture-of-Experts model [23,35], we build a gating network to aggregate our encoders' results. The gate is also helpful to better understand our… (Footnote 1: this is unnecessary if d_in is the same as d_enc.)…”
Section: Three Different Range Encoders
confidence: 99%
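As a rough illustration of the quoted gating idea (and of the footnote about d_in versus d_enc), here is a hedged sketch of a softmax gate that aggregates the outputs of several encoders; the residual projection, and all names and shapes, are assumptions rather than the citing paper's actual design.

```python
# Rough sketch of a gating network that aggregates several encoders' outputs
# with softmax weights computed from the input. Names/shapes are assumptions.
import torch
import torch.nn as nn


class GatedAggregator(nn.Module):
    def __init__(self, d_in, d_enc, num_encoders):
        super().__init__()
        # Gate: softmax weights over the encoders, computed from the raw input.
        self.gate = nn.Linear(d_in, num_encoders)
        # One plausible reading of the quoted footnote: project the input to the
        # encoder dimension before combining it with the gated mixture; this
        # projection is unnecessary when d_in == d_enc.
        self.proj = nn.Identity() if d_in == d_enc else nn.Linear(d_in, d_enc)

    def forward(self, x, encoder_outputs):
        # x: (batch, d_in); encoder_outputs: (batch, num_encoders, d_enc)
        w = torch.softmax(self.gate(x), dim=-1)              # (batch, num_encoders)
        mixed = (w.unsqueeze(-1) * encoder_outputs).sum(1)   # (batch, d_enc)
        return mixed + self.proj(x)                          # gated aggregation


# Example: three encoders, with input and encoder dimensions differing.
agg = GatedAggregator(d_in=32, d_enc=64, num_encoders=3)
out = agg(torch.randn(8, 32), torch.randn(8, 3, 64))
```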
“…A major limitation of most meta-RL methods (discussed thoroughly in § 2) is that they do not explicitly and adequately model the individuality and the commonness of tasks, which has been shown to play an important role in the multi-task learning literature [Ruder, 2017; Ma et al., 2018] and should likewise be applicable to meta-RL. Take the case of locomotion tasks, where an agent needs to move to different target locations for different tasks.…”
Section: Introduction
confidence: 99%