Super-Resolution by Image Enhancement Using Texture Transfer

Ople, Jose Jaena Mari; Tan, Daniel Stanley; Azcarraga, Arnulfo P.; Yang, Chao-Lung; Hua, Kai-Lung

doi:10.1109/icip40778.2020.9190844

Cited by 6 publications

(1 citation statement)

References 10 publications

“…In the past few years, the advent of deep learning allowed automatic feature extraction. Advancements like Convolutional Neural Networks (CNNs) are now state-of-the-art in many video/image related tasks [8]-[10]. The shift away from handcrafted features allowed fire detection approaches to be more robust and adaptive to real-world settings.…”

Section: Introduction (mentioning)

confidence: 99%

Spatio-Temporal Self-Attention Network for Fire Detection and Segmentation in Video Surveillance

Shahid, Virtusio, et al. 2022

IEEE Access

Self Cite

Convolutional Neural Network (CNN) based approaches are popular for various image/video related tasks due to their state-of-the-art performance. However, for problems like object detection and segmentation, CNNs still struggle with objects of arbitrary shapes or sizes, occlusions, and varying viewpoints. This limitation makes them mostly unsuitable for fire detection and segmentation, since flames can have an unpredictable scale and shape. In this paper, we propose a method that detects and segments fire regions with special consideration of their arbitrary sizes and shapes. Specifically, our approach uses a self-attention mechanism to augment spatial characteristics with temporal features, allowing the network to reduce its reliance on spatial factors like shape or size and take advantage of robust spatial-temporal dependencies. As a whole, our pipeline has two stages: in the first stage, we extract region proposals using Spatial-Temporal features, and in the second stage, we classify whether each region proposal is flame or not. Due to the scarcity of large fire datasets, we adopt a transfer learning strategy to pre-train our classifier with the ImageNet dataset. Additionally, our Spatial-Temporal Network only requires semi-supervision, where it only needs one ground-truth segmentation mask per frame-sequence input. In our experiments, the proposed method significantly outperforms state-of-the-art fire detection, with a 2-4% relative improvement in F1-score for large-scale fires and a nearly 60% relative improvement for small fires at a very early stage.
