Automated Generation of Accurate & Fluent Medical X-ray Reports

Nguyen, Hoang Dai Nghia; Nie, Dong; Badamdorj, Taivanbat; Liu, Yujie; Zhu, Yingying; Truong, Jason; Li, Cheng

doi:10.18653/v1/2021.emnlp-main.288

Cited by 25 publications

(8 citation statements)

References 31 publications


“…The Transformer‐based attempts to attempted to explore the automated generation of accurate and fluent X‐ray reports included 217–222 . As for the X‐ray medical report generation, You et al 219 build an AlignTransformer to reach the correspondence between the visual regions with the disease tags.…”

Section: Report Generation (mentioning)

confidence: 99%

“…The Transformer-based attempts to attempted to explore the automated generation of accurate and fluent X-ray reports included. [217][218][219][220][221][222] As for the X-ray medical report generation, You et al 219 build an AlignTransformer to reach the correspondence between the visual regions with the disease tags. The authors divide the task into two parts: the first part is the prediction of disease tags and the feature extraction of the relation between the images and the disease tags while the second part is to produce the medical report based on the extracted information.…”

Section: X-ray Report (mentioning)

confidence: 99%


Recent advances of Transformers in medical image analysis: A comprehensive review

Xia, Wang. 2023. MedComm – Future Medicine


Recent works have shown that Transformer's excellent performances on natural language processing tasks can be maintained on natural image analysis tasks. However, the complicated clinical settings in medical image analysis and varied disease properties bring new challenges for the use of Transformer. The computer vision and medical engineering communities have devoted significant effort to medical image analysis research based on Transformer with especial focus on scenario-specific architectural variations. In this paper, we comprehensively review this rapidly developing area by covering the latest advances of Transformer-based methods in medical image analysis of different settings. We first give introduction of basic mechanisms of Transformer including implementations of self-attention and typical architectures. The important research problems in various medical image data modalities, clinical visual tasks, organs and diseases are then reviewed systemically. We carefully collect 276 very recent works and 76 public medical image analysis datasets in an organized structure. Finally, discussions on open problems and future research directions are also provided. We expect this review to be an up-to-date roadmap and serve as a reference source in pursuit of boosting the development of medical image analysis field.


“…This model was unsupervised because it may be trained using diverse sets of photos and reports. Nguyen et al [18] proposed Classification of Clinical history and Chest X-ray to generate embedding of diseases along with a Transformer decoder sub-modules in a fully differentiable paradigm to generate complete diagnostic reports. To ensure consistency with disease related topics, a weighted embedding representation was fed to the interpreter.…”

Section: Medical Report Generation (mentioning)

confidence: 99%

Vision Transformer and Language Model Based Radiology Report Generation

et al. 2023


Recent advancements in transformers exploited computer vision problems which results in state-of-the-art models. Transformer-based models in various sequence prediction tasks such as language translation, sentiment classification, and caption generation have shown remarkable performance. Auto report generation scenarios in medical imaging through caption generation models is one of the applied scenarios for language models and have strong social impact. In these models, convolution neural networks have been used as encoder to gain spatial information and recurrent neural networks are used as decoder to generate caption or medical report. However, using transformer architecture as encoder and decoder in caption or report writing task is still unexplored. In this research, we explored the effect of losing spatial biasness information in encoder by using pre-trained vanilla image transformer architecture and combine it with different pre-trained language transformers as decoder. In order to evaluate the proposed methodology, the Indiana University Chest X-Rays dataset is used where ablation study is also conducted with respect to different evaluations. The comparative analysis shows that the proposed methodology has represented remarkable performance when compared with existing techniques in terms of different performance parameters.


“…A very close framework was described by Liu et al [184], in which the authors proposed an unsupervised model knowledge graph auto-encoder which accepts independent sets of images and reports during training, and consists of three modules: the pre-constructed knowledge graph, that works as the shared latent space and aims to bridge the visual and textual domains; the knowledge-driven encoder which projects medical images and reports to the corresponding coordinates in that latent space; and the knowledge-driven decoder that generates a medical report given a coordinate in that latent space. This modular structure is also employed by Nguyen et al [185], which added three complementary modules: a CNN-based classification module that produces an internal checklist of disease-related topics (i.e., the enriched disease embedding); a Transformer-based generator that generates the medical reports from the enriched disease embedding and produces a weighted embedding representation; and an interpreter that uses the weighted embedding representation to ensure consistency concerning disease-related topics. Similarly, You et al [186] proposed a framework, which includes two different attention-based modules: the align hierarchical attention module that first predicts the disease tags from the input image and then learns the multi-grained visual features by hierarchically aligning the visual regions and disease tags; and the multi-grained Transformer module that uses the multi-grained features to generate the medical reports.…”

Section: Medical Report Understanding 1) Medical Report Generation (mentioning)

confidence: 99%

A Survey on Attention Mechanisms for Medical Applications: are we Moving Toward Better Algorithms?

et al. 2022


The increasing popularity of attention mechanisms in deep learning algorithms for computer vision and natural language processing made these models attractive to other research domains. In healthcare, there is a strong need for tools that may improve the routines of the clinicians and the patients. Naturally, the use of attention-based algorithms for medical applications occurred smoothly. However, being healthcare a domain that depends on high-stake decisions, the scientific community must ponder if these high-performing algorithms fit the needs of medical applications. With this motto, this paper extensively reviews the use of attention mechanisms in machine learning methods (including Transformers) for several medical applications based on the types of tasks that may integrate several works pipelines of the medical domain. This work distinguishes itself from its predecessors by proposing a critical analysis of the claims and potentialities of attention mechanisms presented in the literature through an experimental case study on medical image classification with three different use cases. These experiments focus on the integrating process of attention mechanisms into established deep learning architectures, the analysis of their predictive power, and a visual assessment of their saliency maps generated by post-hoc explanation methods. This paper concludes with a critical analysis of the claims and potentialities presented in the literature about attention mechanisms and proposes future research lines in medical applications that may benefit from these frameworks.

