2020
DOI: 10.1007/s00138-020-01090-5

Boosting binary masks for multi-domain learning through affine transformations

Abstract: In this work, we present a new algorithm for multi-domain learning. Given a pretrained architecture and a set of visual domains received sequentially, the goal of multi-domain learning is to produce a single model performing a task in all the domains together. Recent works showed how we can address this problem by masking the internal weights of a given original conv-net through learned binary variables. In this work, we provide a general formulation of binary mask-based models for multi-domain learning by af…
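To make the masking idea in the abstract concrete, below is a minimal PyTorch sketch of a mask-based multi-domain conv layer, assuming a simplified form of the affine transformation in which a frozen pretrained weight W is modulated per domain as W · (k0 + k1 · M), with M a binarized learned mask. The class name, the straight-through estimator, and the scalar (k0, k1) parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AffineMaskedConv2d(nn.Module):
    """Sketch of a mask-based multi-domain conv layer.

    The pretrained weight is frozen; each domain learns real-valued mask
    logits binarized in the forward pass, plus two scalars (k0, k1)
    defining an affine transform of the mask. The hypothetical
    domain-specific weight is W * (k0 + k1 * M).
    """

    def __init__(self, conv: nn.Conv2d, num_domains: int):
        super().__init__()
        self.conv = conv
        for p in self.conv.parameters():   # freeze the pretrained backbone
            p.requires_grad_(False)
        shape = conv.weight.shape
        self.mask_logits = nn.Parameter(torch.zeros(num_domains, *shape))
        self.k0 = nn.Parameter(torch.zeros(num_domains))
        self.k1 = nn.Parameter(torch.ones(num_domains))

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        logits = self.mask_logits[domain]
        # straight-through binarization: hard 0/1 forward, soft gradient
        soft = logits.sigmoid()
        mask = (logits > 0).float() + soft - soft.detach()
        w = self.conv.weight * (self.k0[domain] + self.k1[domain] * mask)
        return nn.functional.conv2d(
            x, w, self.conv.bias, self.conv.stride,
            self.conv.padding, self.conv.dilation, self.conv.groups)
```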

Cited by 4 publications (3 citation statements) | References 60 publications (156 reference statements)
“…At test time, the learned binary mask is multiplied by the weights of the convolutional layer. Expanding on this idea, Mancini et al. [7,8] also make use of masks; however, they learn an affine transformation of the weights through the use of the mask and some extra parameters. Focusing on increasing accuracy with masks, Chattopadhyay et al. [2] propose a soft-overlap loss to encourage the masks to be domain-specific by minimizing the overlap between them.…”
Section: Intersection Between Masks (mentioning)
confidence: 99%
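As a rough illustration of the soft-overlap idea mentioned in this statement, the sketch below penalizes pairwise overlap between per-domain mask probabilities, encouraging domain-specific masks. The function name and the exact penalty are assumptions, one plausible reading rather than the actual loss of Chattopadhyay et al. [2].

```python
import torch

def soft_overlap_loss(mask_probs: torch.Tensor) -> torch.Tensor:
    """Hypothetical soft-overlap penalty.

    mask_probs: (num_domains, num_weights) tensor of per-domain mask
    probabilities in [0, 1]. Penalizes pairs of domains that keep the
    same weights by averaging their soft intersections.
    """
    d = mask_probs.shape[0]
    if d < 2:
        return mask_probs.new_zeros(())
    overlap = mask_probs @ mask_probs.t()            # pairwise soft intersections
    off_diag = overlap - torch.diag(torch.diag(overlap))
    return off_diag.sum() / (d * (d - 1))            # mean over domain pairs
```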
“…Since our model can identify data clusters through the previously described procedure, we can design a way to specialize the function $f_\theta$ to each domain. Inspired by multi-domain learning [36,39,37,27,29], we can achieve this with domain-specific components. For simplicity, let us consider the parameters $\theta$ to be split into two sets, i.e.…”
Section: Cluster-specific Models (mentioning)
confidence: 99%
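A minimal sketch of the parameter split described in this quote, assuming $\theta_a$ is a shared (domain-agnostic) layer and $\theta_s$ a set of per-domain heads; all names and shapes here are illustrative.

```python
import torch.nn as nn

class DomainSpecificModel(nn.Module):
    """Split parameters into shared theta_a and per-domain theta_s."""

    def __init__(self, feat_dim: int, num_classes: int, num_domains: int):
        super().__init__()
        self.theta_a = nn.Linear(feat_dim, feat_dim)   # domain-agnostic
        self.theta_s = nn.ModuleList(                  # one head per domain
            nn.Linear(feat_dim, num_classes) for _ in range(num_domains))

    def forward(self, x, domain: int):
        return self.theta_s[domain](self.theta_a(x).relu())
```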
“…Note that $\theta_s$ is actually a set $\theta_s = \{\theta_s^d\}_{d=1}^{D}$, where $\theta_s^d$ are the parameters specific to the d-th domain. To tailor the model to a specific domain, we can consider multiple ways to include $\theta_s$, such as direct influence on the agnostic parameters $\theta_a$ [39,27,29] or residual activations [36,37]. Here we follow the latter strategy, since the former relies on the robustness of $\theta_a$, which is harder to guarantee in FL.…”
Section: Cluster-specific Models (mentioning)
confidence: 99%
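The residual-activation strategy this quote opts for can be sketched as a small per-domain module whose output is added to a shared layer's activation, h = f_a(x) + r_d(f_a(x)). This is an illustrative reading of the residual adapters in refs. [36,37], not their exact modules.

```python
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Per-domain residual added to a shared layer's activation."""

    def __init__(self, shared: nn.Module, dim: int, num_domains: int):
        super().__init__()
        self.shared = shared                           # shared theta_a layer
        self.residuals = nn.ModuleList(
            nn.Linear(dim, dim) for _ in range(num_domains))
        for r in self.residuals:                       # start as identity map
            nn.init.zeros_(r.weight)
            nn.init.zeros_(r.bias)

    def forward(self, x, domain: int):
        h = self.shared(x)
        return h + self.residuals[domain](h)
```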