2018
DOI: 10.1038/sdata.2018.251

A dataset of clinically generated visual questions and answers about radiology images

Abstract: Radiology images are an essential part of clinical decision making and population screening, e.g., for cancer. Automated systems could help clinicians cope with large amounts of images by answering questions about the image contents. An emerging area of artificial intelligence, Visual Question Answering (VQA) in the medical domain explores approaches to this form of clinical decision support. Success of such machine learning tools hinges on availability and design of collections composed of medical images augm…

Cited by 174 publications (88 citation statements: 1 supporting, 87 mentioning, 0 contrasting), with citing publications spanning 2019 to 2024; the paper itself references 21 publications.

Citation statements (ordered by relevance):
“…[table fragment: VQA accuracy (%), open-ended and close-ended, for VGG-16 (finetuning) [10] and the proposed methods] For CDAE, by first pretraining as described in Section 4.2 and then finetuning, the finetuning significantly improves the performance over training from scratch using only VQA-RAD. In addition, the results also show that our pretraining and finetuning of MAML and CDAE give better performance than the finetuning of VGG-16, which is pretrained on the ImageNet dataset.…”
Section: Reference Methods (mentioning)
confidence: 99%
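The statement above contrasts two ways of obtaining the visual encoder: pretraining MAML/CDAE on medical images before finetuning on VQA-RAD, versus finetuning an ImageNet-pretrained VGG-16. Below is a minimal PyTorch sketch of that pretrain-then-finetune workflow for the CDAE branch only; it is not the authors' code, and the layer sizes, noise level, data loaders, and the `visual_encoder` attribute on the VQA model are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class CDAE(nn.Module):
    """Convolutional denoising autoencoder; sizes are illustrative."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def pretrain_cdae(cdae, unlabeled_loader, epochs=10):
    # Stage 1: denoising reconstruction on unlabeled radiology images.
    opt = torch.optim.Adam(cdae.parameters(), lr=1e-3)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for images in unlabeled_loader:          # images: (B, 1, H, W)
            noisy = images + 0.1 * torch.randn_like(images)
            loss = mse(cdae(noisy), images)
            opt.zero_grad()
            loss.backward()
            opt.step()

def finetune_vqa(cdae, vqa_model, vqa_loader, epochs=10):
    # Stage 2: copy the pretrained encoder weights into the VQA model's
    # (assumed) visual_encoder, then finetune end to end on VQA-RAD.
    vqa_model.visual_encoder.load_state_dict(cdae.encoder.state_dict())
    opt = torch.optim.Adam(vqa_model.parameters(), lr=1e-4)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, questions, answers in vqa_loader:
            logits = vqa_model(images, questions)
            loss = ce(logits, answers)
            opt.zero_grad()
            loss.backward()
            opt.step()
```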
“…Image feature f_v and question embedding f_q are fed into an attention mechanism (BAN [9] or SAN [19]) to produce a joint representation f_a. This feature f_a is used as input for a multi-class classifier (over the set of predefined answer classes [10]). To train the proposed model, we introduce a multi-task loss function to incorporate the effectiveness of the CDAE into VQA.…”
Section: 3 (mentioning)
confidence: 99%
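The second statement describes the fusion-and-classify pipeline: image feature f_v and question embedding f_q are combined by an attention module into a joint representation f_a, which feeds a classifier over predefined answer classes, trained with a multi-task loss that adds the CDAE reconstruction term. The sketch below illustrates that shape of model under assumptions: a simple additive attention stands in for BAN/SAN, and the dimensions, answer-class count, and loss weight alpha are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """Additive attention stand-in for BAN/SAN; dimensions are illustrative."""
    def __init__(self, v_dim=64, q_dim=128, hid=256):
        super().__init__()
        self.proj_v = nn.Linear(v_dim, hid)
        self.proj_q = nn.Linear(q_dim, hid)
        self.score = nn.Linear(hid, 1)

    def forward(self, f_v, f_q):
        # f_v: (B, R, v_dim) region features; f_q: (B, q_dim) question embedding
        h = torch.tanh(self.proj_v(f_v) + self.proj_q(f_q).unsqueeze(1))
        attn = torch.softmax(self.score(h), dim=1)         # (B, R, 1)
        pooled_v = (attn * self.proj_v(f_v)).sum(dim=1)    # (B, hid)
        return pooled_v + self.proj_q(f_q)                 # joint feature f_a

class MedVQAClassifier(nn.Module):
    def __init__(self, num_answers=500, hid=256):  # answer count is a placeholder
        super().__init__()
        self.fusion = AttentionFusion(hid=hid)
        self.classifier = nn.Linear(hid, num_answers)

    def forward(self, f_v, f_q):
        f_a = self.fusion(f_v, f_q)        # joint representation
        return self.classifier(f_a)        # logits over predefined answer classes

def multi_task_loss(logits, answers, reconstruction, images, alpha=0.5):
    # VQA classification loss plus a CDAE reconstruction term,
    # weighted by a hypothetical coefficient alpha.
    return F.cross_entropy(logits, answers) + alpha * F.mse_loss(reconstruction, images)
```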