Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020
DOI: 10.1145/3340531.3411972

TOMATO: A Topic-Wise Multi-Task Sparsity Model

Abstract: Multi-Task Learning (MTL) leverages the interrelationship across tasks and is useful for applications with limited data. Existing works articulate different task relationship assumptions, whose validity is vital to successful multi-task training. We observe that, in many scenarios, the interrelationship across tasks varies across different groups of data (i.e., topics), which we call the within-topic task relationship hypothesis. In this case, current MTL models with a homogeneous task relationship assumption can…

Cited by 1 publication (2 citation statements)
References 21 publications
“…Hard-sharing approaches [7,26] use a single network, which forces all tasks to own the same hidden space and allows for modeling task-specificity only in the top layer. To remove redundant parameters across tasks, [25] combines the ℓ1-norm and the ℓ2,1-norm, while [8] combines Group Lasso and Adaptive Group Lasso; [39] introduces a novel topic-task-element penalty to promote topic-level sparsity; [26] learns a sparse sharing structure by extracting sub-nets from the base network. On the other hand, soft-sharing approaches [18] use a separate network for each task and task relationship is captured by jointly regularizing the weights of these networks, which is usually represented as a tensor by concatenating layer-wise weight matrices from multiple tasks.…”
Section: Related Work | Citation type: mentioning (confidence: 99%)
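To make the hard-sharing setup and the combined ℓ1/ℓ2,1 penalties mentioned in the quote above concrete, here is a minimal PyTorch-style sketch. It is an illustration under simple assumptions (one shared trunk, one linear head per task, the penalty applied only to the stacked head weights), not the implementation of [25] or of TOMATO itself.

```python
import torch
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one trunk for all tasks, one head per task."""

    def __init__(self, in_dim, hidden_dim, num_tasks):
        super().__init__()
        # Hard sharing: all tasks are forced into the same hidden space.
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        # Task specificity lives only in the top layer (one linear head per task).
        self.heads = nn.ModuleList(nn.Linear(hidden_dim, 1) for _ in range(num_tasks))

    def forward(self, x):
        h = self.trunk(x)
        return [head(h) for head in self.heads]


def sparsity_penalty(heads, lam_l1=1e-3, lam_l21=1e-3):
    # Stack per-task head weights into a (num_tasks, hidden_dim) matrix W.
    W = torch.stack([head.weight.squeeze(0) for head in heads])
    l1 = W.abs().sum()         # element-wise sparsity across all tasks (l1-norm)
    l21 = W.norm(dim=0).sum()  # per-feature group sparsity shared by all tasks (l2,1-norm)
    return lam_l1 * l1 + lam_l21 * l21
```

In training, the penalty would simply be added to the sum of per-task losses, e.g. loss = sum(task_losses) + sparsity_penalty(model.heads).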
“…Different from the aforementioned shallow MTL methods, deep MTL uses neural networks to model the complex nonlinear structure of real data. A number of methods with sparsity-inducing regularizations [25,26,39] have been proposed to capture complex nonlinear relations among tasks in a compact way. For current structured sparse MTL methods, there are three main problems: 1) Each model works under a specific assumption on the structured sparsity of parameters, which limits its generalization ability to tackle various real applications.…”
Section: Introduction | Citation type: mentioning (confidence: 99%)
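For contrast with the hard-sharing sketch above, the following is a minimal, hypothetical soft-sharing sketch: each task owns a separate network, the layer-wise weight matrices are stacked into a per-layer tensor, and the tasks are coupled only through a joint regularizer (here, shrinkage toward the task-mean weights). It is an illustrative stand-in for the soft-sharing idea described in the quoted related-work text, not the method of [18] or of any other cited work.

```python
import torch
import torch.nn as nn

class SoftSharingMTL(nn.Module):
    """Soft parameter sharing: one full network per task, no shared parameters."""

    def __init__(self, in_dim, hidden_dim, num_tasks):
        super().__init__()
        self.nets = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                          nn.Linear(hidden_dim, 1))
            for _ in range(num_tasks)
        )

    def forward(self, x):
        return [net(x) for net in self.nets]

    def sharing_penalty(self, lam=1e-2):
        # Concatenate the layer-wise weight matrices of all tasks into one tensor
        # per layer and regularize them jointly; here each task's weights are
        # shrunk toward the mean weights across tasks.
        penalty = 0.0
        for layer_idx in (0, 2):  # indices of the two Linear layers in each per-task net
            W = torch.stack([net[layer_idx].weight for net in self.nets])  # (tasks, out, in)
            penalty = penalty + ((W - W.mean(dim=0, keepdim=True)) ** 2).sum()
        return lam * penalty
```

The total training objective would combine the per-task losses with model.sharing_penalty(), so information flows across tasks only through this joint regularization of the stacked weights.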