2021
DOI: 10.48550/arxiv.2103.13511
Preprint

Addressing catastrophic forgetting for medical domain expansion

Abstract: Model brittleness is a key concern when deploying deep learning models in real-world medical settings. A model that has high performance at one institution may suffer a significant decline in performance when tested at other institutions. While pooling datasets from multiple institutions and re-training may provide a straightforward solution, it is often infeasible and may compromise patient privacy. An alternative approach is to fine-tune the model on subsequent institutions after training on the original ins…

Cited by 7 publications (11 citation statements)
Citation types: 0 supporting, 11 mentioning, 0 contrasting
References 35 publications

“…A schematic description of FedAVG [43] and CWT [7] is illustrated in Figure 1. At its core, FL presents a challenge of data heterogeneity in the distributions of training data across clients, which causes non-guaranteed convergence and model weight divergence for parallel FL methods [20,33,62], and a severe catastrophic forgetting problem for serial FL methods [7,16,53]. Recent developments to the classic parallel FedAVG algorithm [43] include using server momentum (FedAVGM) to mitigate per-client distribution shift and imbalance [21], globally sharing small subsets of data among all users (FedAVG-Share) [62], adding a proximal term to the local objective (FedProx) to reduce potential weight divergence across severely heterogeneous devices [33], and constructing a shared global model in a layer-wise manner by matching and averaging hidden elements (FedMA) [59].…”
Section: Related Work
confidence: 99%
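
To make the methods named in this statement concrete, here is a minimal sketch of the FedAVG server step and a FedProx-style local update in PyTorch. The helper names (fedavg_aggregate, fedprox_local_update) and the hyperparameter values are illustrative assumptions, not code from any of the cited papers.

```python
import copy
import torch

def fedavg_aggregate(client_states, client_sizes):
    """FedAVG server step: average client weights, weighted by local data size.
    Assumes floating-point parameter tensors in each client state_dict."""
    total = sum(client_sizes)
    return {k: sum(s[k] * (n / total) for s, n in zip(client_states, client_sizes))
            for k in client_states[0]}

def fedprox_local_update(global_model, local_loader, loss_fn,
                         mu=0.01, lr=0.01, epochs=1):
    """One client's local update with a FedProx proximal term.

    The extra (mu / 2) * ||w - w_global||^2 penalty discourages the local
    weights from drifting far from the global model when client data are
    heterogeneous (non-IID)."""
    model = copy.deepcopy(global_model)  # start from the current global weights
    global_params = [p.detach().clone() for p in global_model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in local_loader:
            opt.zero_grad()
            task_loss = loss_fn(model(x), y)
            prox = sum(((p - g) ** 2).sum()
                       for p, g in zip(model.parameters(), global_params))
            (task_loss + 0.5 * mu * prox).backward()
            opt.step()
    return model.state_dict()
```

Setting mu=0 recovers plain FedAVG local training; larger mu trades local fit for reduced weight divergence across clients.
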
“…This problem is particularly prevalent in FL over healthcare data, since input images captured by different institutions may vary significantly in local patterns (differences in intensity, contrast, etc.) due to different medical imaging protocols [16,50], as well as in natural data splits across users due to user idiosyncrasies in speaking [30], typing [17], and writing [27]. On the other hand, ViTs have been shown to be significantly less biased towards local patterns than CNNs, instead using self-attention to learn global interactions [48], which may contribute to their surprising robustness to distribution shifts and adversarial perturbations [3,44].…”
Section: Transformers Generalize Better in the Non-IID Setting
confidence: 99%
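
As a brief illustration of the "global interactions" this statement attributes to ViTs, the sketch below shows single-head scaled dot-product self-attention over patch embeddings. It is a simplified illustration, not the ViT implementation evaluated in the cited work.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Minimal single-head scaled dot-product self-attention.

    x: (num_patches, d) patch embeddings. Each output row is a weighted
    mixture of ALL patches, so interactions are global, unlike the local
    receptive field of a convolution."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    weights = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)  # (N, N)
    return weights @ v
```
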
“…Subsequently, several papers have appeared that further refine [6] and develop [7], [8], [9], [10] the methodology proposed in [5]. There are also examples of applying the Elastic Weight Consolidation method to applied tasks [11], [12], [13], as well as comparative evaluations of it across different neural network architectures [12].…”
Section: Introduction
confidence: 99%
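
For reference, Elastic Weight Consolidation regularizes fine-tuning by anchoring parameters that were important for the previous task, as estimated by the diagonal Fisher information: L(θ) = L_new(θ) + (λ/2) Σ_i F_i (θ_i − θ*_i)². The sketch below implements that standard penalty term; the variable names are illustrative, not taken from the cited implementations.

```python
import torch

def ewc_penalty(model, old_params, fisher, lam=1.0):
    """EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.

    old_params: parameter values saved after training on the previous task.
    fisher: diagonal Fisher information estimates (same shapes as old_params),
    acting as per-parameter importance weights for the old task."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# During fine-tuning on a new institution or task:
# total_loss = task_loss + ewc_penalty(model, old_params, fisher, lam)
```
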