2023
DOI: 10.48550/arxiv.2301.13155
Preprint

Advancing Radiograph Representation Learning with Masked Record Modeling

Abstract: Modern studies in radiograph representation learning (R²L) rely on either self-supervision to encode invariant semantics or associated radiology reports to incorporate medical expertise, while the complementarity between them is barely noticed. To explore this, we formulate the self- and report-completion as two complementary objectives and present a unified framework based on masked record modeling (MRM). In practice, MRM reconstructs masked image patches and masked report tokens following a multi-task scheme…
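To make the two-objective scheme concrete, here is a minimal PyTorch-style sketch of what such a multi-task masked-record loss could look like. This is an illustration only, not the authors' released code: `image_encoder`, `image_decoder`, `text_head`, and the `lambda_text` task weight are all hypothetical placeholders.

```python
import torch
import torch.nn as nn

class MaskedRecordModel(nn.Module):
    """Sketch of a hybrid loss combining masked image-patch
    reconstruction (self-completion) with masked report-token
    prediction (report-completion). Submodules are placeholders."""

    def __init__(self, image_encoder, image_decoder, text_head, lambda_text=1.0):
        super().__init__()
        self.image_encoder = image_encoder  # e.g., a ViT over visible patches
        self.image_decoder = image_decoder  # reconstructs the masked patches
        self.text_head = text_head          # predicts masked report tokens
        self.lambda_text = lambda_text      # hypothetical task-balancing weight
        self.mlm_loss = nn.CrossEntropyLoss(ignore_index=-100)

    def forward(self, patches, patch_mask, token_ids, token_labels):
        # Encode the radiograph given the patch mask (MAE-style).
        latent = self.image_encoder(patches, patch_mask)
        pred_patches = self.image_decoder(latent, patch_mask)
        # Pixel regression, evaluated on the masked patches only.
        loss_img = ((pred_patches - patches) ** 2)[patch_mask].mean()
        # Report-completion: predict the masked tokens; conditioning
        # on the image latent is an assumption about the fusion step.
        token_logits = self.text_head(token_ids, latent)
        loss_txt = self.mlm_loss(
            token_logits.flatten(0, 1), token_labels.flatten())
        return loss_img + self.lambda_text * loss_txt
```

The key design point the abstract describes is that both reconstruction targets are optimized jointly, so the image encoder receives gradients from medical-report semantics as well as from low-level pixel structure.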

Cited by 3 publications (4 citation statements)
References 27 publications
“…In recent years, contrastive learning, which trains the models to distinguish between positive and negative pairs, has achieved state-of-the-art performances in visual representation learning (Chen et al., 2020a; He et al., 2020; Chen et al., 2020b) and vision-language representation learning (Radford et al., 2021). Inspired by the great success of contrastive learning, several works (Huang et al., 2021; Zhang et al., 2020a; Boecking et al., 2022; Zhou et al., 2023; Wang et al., 2022b) have been proposed to learn robust and accurate medical vision-language representations, which can be used to achieve promising results on various downstream tasks. However, most existing works focus on contrasting images with entire reports, ignoring the potential phenotypes in each sentence within the reports.…”
Section: Contrastive Learning
confidence: 99%
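For context, the image-report contrastive objective these citing works build on (CLIP-style training) is typically a symmetric InfoNCE loss over a batch of matched pairs. A minimal sketch, assuming batch image/text embeddings and a hypothetical temperature value:

```python
import torch
import torch.nn.functional as F

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE: the matched image-report pair in each row is
    the positive; every other pairing in the batch is a negative."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(img.size(0), device=img.device)
    # Contrast in both directions: image-to-text and text-to-image.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```

The quoted criticism is that the text side of this loss encodes the whole report as one embedding, so sentence-level phenotype information is never contrasted explicitly.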
“…As we can see, although contrastive learning has been well explored for medical representation learning in existing literature (Huang et al., 2021; Zhang et al., 2020a; Boecking et al., 2022; Zhou et al., 2023; Wang et al., 2022b), they encode the entire report y_i for training.…”
Section: PhenotypeCLIP
confidence: 99%
“…However, collecting labelled data for a rare disease is expensive and time-consuming. To this end, as shown in Figure 1, we propose the Unsupervised Learning from Unlabelled Medical Images and Text (ULUMIT) framework for radiograph representation learning [26,27,28,29], which deals with the situation where the labelled data are scarce, to shorten this time-frame, allowing us to respond quickly in future to rare diseases.…”
Section: Introduction
confidence: 99%
“…benchmark datasets for common disease classification tasks. We follow previous works [64,58,59,72,66] to pre-process the datasets and perform the evaluation. As we can see from Table 5, with limited labels (1%), our method can achieve competitive results with previous fully-supervised methods trained on full labels across all evaluation datasets.…”
confidence: 99%