Recently, many methods based on convolutional neural networks (CNNs) have been proposed for salient object detection in optical remote sensing images (ORSI-SOD). However, most of them ignore the large number of parameters and the high computational cost that CNNs bring, and only a few pay attention to portability and mobility. To facilitate practical applications, in this paper we propose a novel lightweight network for ORSI-SOD based on semantic matching and edge alignment, termed SeaNet. Specifically, SeaNet includes a lightweight MobileNet-V2 for feature extraction, a dynamic semantic matching module (DSMM) for high-level features, an edge self-alignment module (ESAM) for low-level features, and a portable decoder for inference. First, the high-level features are compressed into semantic kernels. Then, in DSMM, the semantic kernels are used to activate salient object locations in two groups of high-level features through dynamic convolution operations. Meanwhile, in ESAM, cross-scale edge information extracted from two groups of low-level features is self-aligned through an L2 loss and used for detail enhancement. Finally, starting from the highest-level features, the decoder infers salient objects based on the accurate locations and fine details contained in the outputs of the two modules. Extensive experiments on two public datasets demonstrate that our lightweight SeaNet not only outperforms most state-of-the-art lightweight methods but also yields accuracy comparable to that of state-of-the-art conventional methods, while having only 2.76M parameters and running with 1.7G FLOPs for 288×288 inputs. Our code and results are available at https://github.com/MathLee/SeaNet.
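The two ideas named in the abstract above, semantic kernels applied through dynamic convolution and an L2 alignment of edge maps, can be illustrated with a minimal NumPy sketch. It is not the authors' DSMM or ESAM: the abstract does not say how features are compressed into kernels, so global average pooling is assumed here, the dynamic convolution is reduced to a single 1×1 kernel, and all function names are hypothetical.

```python
import numpy as np


def semantic_kernel(feat):
    """Compress a (C, H, W) feature map into a (C,) kernel.

    Assumption: global average pooling stands in for the compression step.
    """
    return feat.mean(axis=(1, 2))


def dynamic_conv1x1(feat, kernel):
    """Apply an input-dependent 1x1 convolution to a (C, H, W) feature map.

    A 1x1 convolution with one output channel is a per-pixel dot product
    between the channel vector and the kernel; a sigmoid maps the response
    to a location map with values in (0, 1).
    """
    response = np.einsum("chw,c->hw", feat, kernel)
    return 1.0 / (1.0 + np.exp(-response))


def l2_alignment_loss(edge_a, edge_b):
    """Mean squared difference between two edge maps of the same shape."""
    return float(np.mean((edge_a - edge_b) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high_level = rng.normal(size=(8, 6, 6))   # toy high-level features
    kernel = semantic_kernel(high_level)      # kernel derived from the input itself
    location_map = dynamic_conv1x1(high_level, kernel)
    print(location_map.shape)

    edge_fine = rng.random((12, 12))          # toy edge maps from two scales,
    edge_coarse = rng.random((12, 12))        # assumed already resized to match
    print(l2_alignment_loss(edge_fine, edge_coarse))
```

The kernel is computed from the input at inference time rather than stored as a fixed weight, which is what makes the convolution dynamic. The alignment loss is zero only when the two edge maps agree pixel by pixel.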
Recent studies have demonstrated that neural networks perform well in information hiding and image domain transfer. Given the tremendous progress that deep learning has made in image recognition, we explore whether neural networks can recognize images that are imperceptible in the transferred domain. Our goal is to transfer natural images into images that belong to a different domain while keeping the attributes of the natural images directly recognizable on the domain-transferred images. To this end, we propose domain transferred image recognition, which performs image recognition directly on the transferred images without the original images. In the proposed system, a generator performs the domain transfer and a recognizer is responsible for image recognition. To allow the natural image to be restored when needed, we also incorporate an additional generator. In addition, a discriminator plays an indispensable role in the image domain transfer. Finally, we demonstrate that our method can successfully recognize natural images from their transferred versions without access to the original images.
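One way to picture how the four components described above could interact during training is a single generator objective that sums one term per component. The sketch below is an assumption, not the authors' formulation: the abstract gives no loss functions, so a standard non-saturating adversarial term, a cross-entropy recognition term, and an L1 restoration term are swapped in, and the names `generator_objective`, `w_rec`, and `w_res` are hypothetical.

```python
import numpy as np


def cross_entropy(class_probs, label):
    """Negative log-probability that the recognizer assigns to the true label."""
    return float(-np.log(class_probs[label]))


def generator_objective(disc_score_fake, class_probs, label,
                        restored, original, w_rec=1.0, w_res=1.0):
    """Combine three hypothetical loss terms for the transfer generator.

    disc_score_fake: discriminator's probability that the transferred image
                     belongs to the target domain (adversarial term).
    class_probs:     recognizer output on the transferred image.
    restored:        output of the additional (restoration) generator.
    original:        the natural image, used only during training.
    """
    adversarial = float(-np.log(disc_score_fake))
    recognition = cross_entropy(class_probs, label)
    restoration = float(np.mean(np.abs(restored - original)))
    return adversarial + w_rec * recognition + w_res * restoration


if __name__ == "__main__":
    image = np.full((4, 4), 0.5)
    loss = generator_objective(
        disc_score_fake=0.5,
        class_probs=np.array([0.1, 0.9]),
        label=1,
        restored=image + 0.1,
        original=image,
    )
    print(loss)
```

At test time only the generator and the recognizer would be needed, which matches the claim that recognition works without access to the original images.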
Semantic segmentation is important for scene understanding. To handle scenes in which natural images suffer from adverse illumination conditions, thermal infrared (TIR) images are introduced. Most existing RGB-T semantic segmentation methods follow one of three cross-modal fusion paradigms, i.e., encoder fusion, decoder fusion, and feature fusion. Unfortunately, some methods ignore the properties of RGB and TIR features or the properties of features at different levels. In this paper, we propose a novel feature fusion-based network for RGB-T semantic segmentation, named LASNet, which follows three steps: location, activation, and sharpening. The highlight of LASNet is that we fully consider the characteristics of cross-modal features at different levels, and accordingly propose three specific modules for better segmentation. Concretely, we propose a Collaborative Location Module (CLM) for high-level semantic features, aiming to locate all potential objects. We propose a Complementary Activation Module for middle-level features, aiming to activate exact regions of different objects. We propose an Edge Sharpening Module (ESM) for low-level texture features, aiming to sharpen the edges of objects. Furthermore, in the training phase, we attach a location supervision and an edge supervision after CLM and ESM, respectively, and impose two semantic supervisions in the decoder part to facilitate network convergence. Experimental results on two public datasets demonstrate the superiority of our LASNet over relevant state-of-the-art methods. The code and results of our method are available at https://github.com/MathLee/LASNet.
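The location and edge supervisions mentioned above need their own training targets. A minimal sketch of how such targets could be derived from an ordinary semantic label map follows. The abstract does not state how the authors build them, so this is an assumption: class 0 is treated as background, an edge pixel is any pixel whose label differs from a 4-connected neighbor, and both function names are hypothetical.

```python
import numpy as np


def location_target(labels):
    """Binary map marking every pixel that belongs to any object.

    Assumption: class index 0 is background.
    """
    return (labels > 0).astype(np.uint8)


def edge_target(labels):
    """Binary map marking pixels whose label differs from a 4-connected neighbor."""
    edges = np.zeros(labels.shape, dtype=bool)

    # Compare each pixel with the one below it; mark both sides of a change.
    diff_v = labels[1:, :] != labels[:-1, :]
    edges[1:, :] |= diff_v
    edges[:-1, :] |= diff_v

    # Compare each pixel with the one to its right; mark both sides of a change.
    diff_h = labels[:, 1:] != labels[:, :-1]
    edges[:, 1:] |= diff_h
    edges[:, :-1] |= diff_h

    return edges.astype(np.uint8)


if __name__ == "__main__":
    labels = np.zeros((4, 4), dtype=np.int64)
    labels[1:3, 1:3] = 3                      # one 2x2 object of class 3
    print(location_target(labels))
    print(edge_target(labels))
```

The location target discards class identity and keeps only where objects are, which suits a module meant to locate all potential objects. The edge target keeps only class boundaries, which suits a module meant to sharpen them.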