TransSounder: A Hybrid TransUNet-TransFuse Architectural Framework for Semantic Segmentation of Radar Sounder Data

Ghosh, Raktim; Bovolo, Francesca

doi:10.36227/techrxiv.16870633

Cited by 1 publication

(1 citation statement)

References 7 publications

“…Recently, 13 proposed the Vision Transformer (ViT) for large-scale image recognition, where they applied Transformers directly to images by splitting images into patches with standard linear embedding and feeding them as a sequence of input tokens to Transformers. A significant amount of research has been carried out by incorporating the ViT as an encoder for miscellaneous applications such as image recognition, 14 semantic segmentation, 15 video understanding, 16 image processing, 17 etc. Further, in the domain of medical imaging, researchers have explored Transformer-based architectures for medical image segmentation (TransUNet, 11 TransFuse, 12 SSFormer, 18 etc).…”

Section: Vision Transformers (mentioning)

confidence: 99%

An FFT-based CNN-Transformer Encoder for Semantic Segmentation of Radar Sounder Signal

Ghosh; Bovolo

2022

Image and Signal Processing for Remote Sensing XXVIII

Radar Sounders (RSs) are nadir-looking sensors operating in the HF or VHF bands that transmit modulated electromagnetic (EM) pulses and receive the backscattered response from different subsurface targets. Recently, convolutional neural network (CNN) architectures were established for characterizing RS signals under the semantic segmentation framework. In this paper, we design a Fast Fourier Transform (FFT) based CNN-Transformer encoder to effectively capture the long-range contexts in the radargram. In our hybrid architecture, the CNN models the high-dimensional local spatial contexts, and the Transformer establishes the global spatial contexts between the local ones. To avoid the Transformer's complex self-attention layers and reduce the number of learnable parameters, we replace the self-attention mechanism of the Transformer with unparameterized FFT modules, as in the FNet architecture for Natural Language Processing (NLP). The experimental results on the MCoRDS dataset indicate that the CNN-Transformer encoder with the unparameterized FFT modules can characterize the radargram at a limited cost in accuracy while reducing time consumption. A comparative analysis is carried out with the state-of-the-art Transformer-based architecture.
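The abstract above describes replacing self-attention with unparameterized FFT modules in the style of FNet. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the function names `fourier_mixing` and `fnet_style_block`, the tensor shapes, and the simple residual-plus-normalization step are all assumptions made for illustration.

```python
import numpy as np


def fourier_mixing(tokens: np.ndarray) -> np.ndarray:
    """FNet-style token mixing with no learnable parameters.

    Applies a 2D discrete Fourier transform over the sequence and hidden
    axes of a (batch, sequence, hidden) array and keeps only the real part.
    """
    return np.fft.fft2(tokens, axes=(-2, -1)).real


def fnet_style_block(tokens: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Hypothetical mixing sublayer: Fourier mixing, residual, normalization.

    The normalization here is a plain per-token standardization over the
    hidden axis (an assumption; a trained model would use a learned LayerNorm).
    """
    residual = tokens + fourier_mixing(tokens)
    mean = residual.mean(axis=-1, keepdims=True)
    std = residual.std(axis=-1, keepdims=True)
    return (residual - mean) / (std + eps)


# Illustrative input: 2 samples, 16 tokens (e.g. CNN feature patches), 8 channels.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((2, 16, 8))
out = fnet_style_block(tokens)
print(out.shape)  # (2, 16, 8)
```

Because the mixing step is a fixed transform, it adds no trainable weights, which is the parameter saving the abstract refers to; any accuracy trade-off relative to self-attention would have to be measured on the actual data.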

TransSounder: A Hybrid TransUNet-TransFuse Architectural Framework for Semantic Segmentation of Radar Sounder Data

Abstract: The radar sounder data (radargrams) are used in this research work for subsurface target characterization. We develop a hybrid Transformer-based deep learning framework for semantic segmentation of radar sounder data.
