Fully Convolutional CaptionNet: Siamese Difference Captioning Attention Model

Oluwasanmi, Ariyo; Frimpong, Enoch; Aftab, Muhammad Umar; Ullah, Kifayat

doi:10.1109/access.2019.2957513

Cited by 26 publications

(20 citation statements)

References 41 publications

“…To validate the generalization ability of the proposed method, we conduct experiments on the recently published Spot-the-Diff dataset, where the image pairs are mostly well aligned and there is no viewpoint change. We compare with eight SOTA methods, most of which cannot handle viewpoint changes: DDLA (Jhamtani and Berg-Kirkpatrick, 2018), DDUA (Park et al, 2019), SDCM (Oluwasanmi et al, 2019a), FCC (Oluwasanmi et al, 2019b), static rel-att / dynamic rel-att (Tan et al, 2019), and M-VAM / M-VAM+RAF (Shi et al, 2020).…”

Section: Results On Spot-the-Diff (mentioning)

confidence: 99%

Semantic Relation-aware Difference Representation Learning for Change Captioning

Tu, Yao, Li et al. 2021

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Change captioning aims to describe the difference between a pair of images with a natural language sentence. In this task, distractors such as illumination or viewpoint changes make learning the difference representation highly challenging. In this paper, we propose a semantic relation-aware difference representation learning network to explicitly learn the difference representation in the presence of distractors. Specifically, we introduce a self-semantic relation embedding block to explore the underlying changed objects and design a cross-semantic relation measuring block to localize the real change and learn the discriminative difference representation. In addition, relying on the part of speech (POS) of words, we devise an attention-based visual switch to dynamically use visual information for caption generation. Extensive experiments show that our method achieves state-of-the-art performance on the CLEVR-Change and Spot-the-Diff datasets.

“…Furthermore, the Siamese Difference Captioning Model (SDCM) also combined techniques from deep Siamese convolutional neural network, soft attention mechanism, word embedding, and bidirectional long short-term memory [167]. The features in each image input are computed using the Siamese network, and their differences are obtained using a weighted L1 distance function.…”

Section: Unsupervised or Semisupervised Captioning (mentioning)

confidence: 99%

Features to Text: A Comprehensive Survey of Deep Learning on Semantic Segmentation and Image Captioning

et al. 2021

Self Cite

With the emergence of deep learning, computer vision has witnessed extensive advancement and has seen immense applications in multiple domains. Specifically, image captioning has become an attractive focal direction for most machine learning experts, which includes the prerequisite of object identification, location, and semantic understanding. In this paper, semantic segmentation and image captioning are comprehensively investigated based on traditional and state-of-the-art methodologies. In this survey, we deliberate on the use of deep learning techniques on the segmentation analysis of both 2D and 3D images using a fully convolutional network and other high-level hierarchical feature extraction methods. First, each domain’s preliminaries and concept are described, and then semantic segmentation is discussed alongside its relevant features, available datasets, and evaluation criteria. Also, the semantic information capturing of objects and their attributes is presented in relation to their annotation generation. Finally, analysis of the existing methods, their contributions, and relevance are highlighted, informing the importance of these methods and illuminating a possible research continuation for the application of semantic image segmentation and image captioning approaches.
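The citation statement above says that SDCM compares the two Siamese feature sets with a weighted L1 distance. A minimal sketch of that distance follows, assuming flat feature vectors and fixed weights; in the cited model the weights would be learned, and the function name and sample values here are hypothetical illustrations, not the authors' implementation.

```python
def weighted_l1_distance(feat_a, feat_b, weights):
    """Return sum_i w_i * |a_i - b_i| for equal-length feature vectors."""
    if not (len(feat_a) == len(feat_b) == len(weights)):
        raise ValueError("feature vectors and weights must have the same length")
    return sum(w * abs(a - b) for a, b, w in zip(feat_a, feat_b, weights))


# Hypothetical features from the two branches of a Siamese encoder.
before = [0.5, 1.0, 2.0]
after = [0.5, 3.0, 1.0]
# Hypothetical per-dimension weights (assumed fixed here, learned in practice).
weights = [1.0, 0.5, 2.0]

distance = weighted_l1_distance(before, after, weights)
```

Dimensions where the two images agree contribute nothing, so the weights only scale the dimensions that actually changed.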

“…LOF achieves this by introducing a MinDist(k) parameter representing neighboring data in a particular region of consideration. In several other clustering algorithms such as K-means and fuzzy C-means, different techniques such as Euclidean distance or squared Euclidean distances [14] are established to compute the distance between data points [15].…”

Section: Related Work (mentioning)

confidence: 99%

Attention Autoencoder for Generative Latent Representational Learning in Anomaly Detection

Oluwasanmi, Aftab, Baagyere et al. 2021

Sensors

Self Cite

Today, accurate and automated abnormality diagnosis and identification have become of paramount importance as they are involved in many critical and life-saving scenarios. To accomplish such frontiers, we propose three artificial intelligence models through the application of deep learning algorithms to analyze and detect anomalies in human heartbeat signals. The three proposed models include an attention autoencoder that maps input data to a lower-dimensional latent representation with maximum feature retention, and a reconstruction decoder with minimum remodeling loss. The autoencoder has an embedded attention module at the bottleneck to learn the salient activations of the encoded distribution. Additionally, a variational autoencoder (VAE) and a long short-term memory (LSTM) network are designed to learn the Gaussian distribution of the generative reconstruction and time-series sequential data analysis. The three proposed models displayed outstanding ability to detect anomalies on the evaluated five thousand electrocardiogram (ECG5000) signals, with 99% accuracy and a 99.3% precision score in distinguishing healthy heartbeats from those of patients with severe congestive heart failure.
