Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413518

DeVLBert: Learning Deconfounded Visio-Linguistic Representations

Abstract: In this paper, we propose to investigate the problem of out-of-domain visio-linguistic pretraining, where the pretraining data distribution differs from that of the downstream data on which the pretrained model will be fine-tuned. Existing methods for this problem are purely likelihood-based, leading to spurious correlations that hurt generalization when the model is transferred to out-of-domain downstream tasks. By spurious correlation, we mean that the conditional probability of one token (object or word) given…
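The abstract's point about purely likelihood-based training can be made concrete with a toy co-occurrence count. The sketch below is an illustration only, not the paper's method: the mini-corpus, token names, and helper are all hypothetical, chosen to show how a high conditional probability between two tokens can reflect dataset bias rather than a causal relation.

```python
# Toy illustration: purely likelihood-based co-occurrence statistics can
# encode spurious correlations. In this biased mini-corpus every "boat"
# caption also contains "water", so P(water | boat) = 1 even though the
# two tokens are not causally tied; a model fit on likelihood alone may
# carry that bias into out-of-domain data where boats appear without water.
from collections import Counter
from itertools import combinations

# Hypothetical captions, each reduced to a set of tokens.
captions = [
    {"boat", "water", "sky"},
    {"boat", "water", "dock"},
    {"boat", "water"},
    {"car", "road"},
]

pair_counts = Counter()
token_counts = Counter()
for tokens in captions:
    token_counts.update(tokens)
    pair_counts.update(frozenset(p) for p in combinations(tokens, 2))

def cond_prob(a: str, b: str) -> float:
    """Estimate P(a | b) from raw co-occurrence counts."""
    return pair_counts[frozenset((a, b))] / token_counts[b]

print(cond_prob("water", "boat"))  # 1.0: a spurious, dataset-induced correlation
print(cond_prob("sky", "boat"))    # ~0.33
```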


Cited by 52 publications (2 citation statements)
References 27 publications (28 reference statements)
“…On the other hand, our model simply concatenates two pre-trained models without additional cross-modal pre-training, and is immediately trained on the target task. For example, ViLBERT and DeVLBERT (Zhang et al., 2020) are pre-trained on the Conceptual Captions dataset (Sharma et al., 2018); our model is at the disadvantage of never having seen those 3.3M image-caption pairs, yet comes fairly close to those pre-trained models. Fig.…”
Section: Results
confidence: 89%
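The statement above describes a late-fusion design: two independently pre-trained unimodal encoders are concatenated and only a task head is trained, with no cross-modal pre-training stage. The sketch below is a minimal, assumed version of that idea; the class name, pooled-output shapes, and encoder interfaces are hypothetical, not taken from the cited model.

```python
# Minimal late-fusion sketch (assumptions: each encoder returns a pooled
# feature vector per example). No cross-modal pre-training is performed;
# only the task head on top of the concatenated features is new.
import torch
import torch.nn as nn

class LateFusionModel(nn.Module):
    def __init__(self, text_encoder: nn.Module, image_encoder: nn.Module,
                 text_dim: int, image_dim: int, n_classes: int):
        super().__init__()
        self.text_encoder = text_encoder    # e.g., a pre-trained language model
        self.image_encoder = image_encoder  # e.g., a pre-trained vision model
        self.head = nn.Linear(text_dim + image_dim, n_classes)

    def forward(self, text_inputs, image_inputs):
        t = self.text_encoder(text_inputs)    # (batch, text_dim)
        v = self.image_encoder(image_inputs)  # (batch, image_dim)
        return self.head(torch.cat([t, v], dim=-1))
```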
“…degrees of freedom, and also creates possibilities for advanced game systems such as the metaverse [43,44] [45].

(3) Cloud-centric collaborative computing. Federated learning is a distributed machine learning approach that lets clients retain their raw data and send only local model parameters to a model manager, after which model aggregation and further training proceed without the raw data ever being shared [46]. Under a device-cloud collaborative architecture, federated learning is regarded as key infrastructure for future network services, protecting user data privacy while fully exploiting the application potential of edge data [46]. Despite these broad prospects, it faces challenges in heterogeneity, attack defense, and personalization.

In federated learning, heterogeneity is a widely recognized challenge and can be divided into three categories. ① Statistical heterogeneity: clients may hold non-IID or imbalanced data, and this data heterogeneity causes client model drift, which in turn degrades model performance [46]. ② Model heterogeneity: each client may have distinct tasks and requirements and may wish to design its local model independently, which creates barriers to knowledge transfer among heterogeneous participants and prevents the use of a common model for aggregation or gradient operations [47]. ③ Device heterogeneity: participants' devices may differ in storage and computing capacity, causing some participating nodes to fail or drop out; methods for addressing such problems at different stages already exist [48].

Federated learning systems may also face a variety of security and privacy threats. Following the training and prediction phases of machine learning, attacks fall into two classes, poisoning attacks and inference attacks: in the former, an attacker may supply faulty data or maliciously modify model parameters to disrupt the learning process [49]; in the latter, an attacker may infer other participants' sensitive information from model updates and exploit it illegitimately [50]. Correspondingly, a range of defense mechanisms is already available [51], and designing federated learning systems that are more secure and stable against diverse attacks is a direction for future work [52].

Personalization addresses a limitation of traditional federated learning, which typically provides one shared model for all clients and cannot account for individual differences. When clients' local data are unevenly distributed, nearest-distance personalized federated learning can train models that respect those differences [53,54]. Alternatively, only part of the model may be trained, with the shared parameters and the local parameters updated on-device simultaneously or alternately [55]. Federated personalization not only improves the global model … large-scale pre-training algorithms [64] and multi-scenario generalization algorithms based on confounder disentanglement [65] can resolve heterogeneity across device-side tasks and device-side scenarios … most user groups [45,46,67], leaving most users unable to obtain …”
unclassified
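The passage above describes the core federated learning loop: clients train on private data, upload only parameters, and a coordinator aggregates them. The sketch below is a minimal FedAvg-style illustration of that loop on a toy linear-regression task; the function names, the non-IID data setup, and the size-weighted average are assumptions for illustration, not any cited system.

```python
# Minimal FedAvg-style sketch: raw data never leaves a client; only model
# parameters are sent to the coordinator and averaged.
import numpy as np

def local_step(weights, X, y, lr=0.1):
    """One gradient step of linear regression on a client's private data."""
    grad = X.T @ (X @ weights - y) / len(y)  # MSE gradient
    return weights - lr * grad

def fed_avg(client_weights, client_sizes):
    """Coordinator: data-size-weighted average of client parameters."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Non-IID clients: each draws features from a shifted distribution
# (the "statistical heterogeneity" discussed above).
clients = []
for shift in (-1.0, 0.0, 1.0):
    X = rng.normal(shift, 1.0, size=(50, 2))
    y = X @ true_w + rng.normal(0.0, 0.1, size=50)
    clients.append((X, y))

weights = np.zeros(2)
for _ in range(100):
    # Each client updates a copy of the global model on its own data ...
    local = [local_step(weights.copy(), X, y) for X, y in clients]
    # ... and only the parameters are shared and aggregated.
    weights = fed_avg(local, [len(y) for _, y in clients])

print(weights)  # approaches true_w without any raw data leaving a client
```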