Sharing secret information through image carriers has attracted considerable research attention in recent years as images increasingly dominate the Internet and mobile applications. The technique of embedding secret information in images without being detected is called image steganography. With the rise of convolutional neural networks (CNNs), tasks automated by neural networks have become more deeply embedded in our daily lives. However, wrong labels or bad captions produced for images that carry embedded information arouse suspicion and ultimately expose the hidden content, much like a self-confession. To improve the security of image steganography and minimize distortion of task results, models must keep the feature maps generated by task-specific networks independent of any hidden information embedded in the carrier. This paper introduces a binary attention mechanism into image steganography to help alleviate this security issue and, at the same time, increase embedding payload capacity. The experimental results show that our method offers high payload capacity with little feature map distortion while still resisting detection by state-of-the-art image steganalysis algorithms.
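As a rough illustration of how a binary mask can steer where a payload is hidden, the sketch below embeds bits only at pixel positions the mask marks as allowed. It is a minimal sketch under stated assumptions, not the method of the abstract above: plain least-significant-bit (LSB) substitution stands in for the embedding step, the mask is supplied by hand rather than produced by an attention network, and the names `embed_bits` and `extract_bits` are hypothetical.

```python
def embed_bits(pixels, mask, bits):
    """Write payload bits into the LSB of pixels whose mask entry is 1.

    Assumption: LSB substitution is a stand-in embedding rule, and the
    binary mask is given; in a learned system it would be predicted.
    """
    stego = list(pixels)
    payload = iter(bits)
    for index, allowed in enumerate(mask):
        if not allowed:
            continue  # masked-out pixels are left untouched
        bit = next(payload, None)
        if bit is None:
            break  # payload exhausted
        stego[index] = (stego[index] & ~1) | bit
    return stego


def extract_bits(pixels, mask, count):
    """Read back the first `count` LSBs from the mask-selected pixels."""
    selected = [pixel & 1 for pixel, allowed in zip(pixels, mask) if allowed]
    return selected[:count]


cover = [200, 13, 54, 255, 0, 97]   # toy 8-bit grayscale values
mask = [1, 0, 1, 1, 0, 1]           # 1 = position may carry a bit
secret = [1, 0, 0, 1]

stego = embed_bits(cover, mask, secret)
recovered = extract_bits(stego, mask, len(secret))
```

In this toy setting the payload capacity equals the number of ones in the mask, and each carrier pixel changes by at most one intensity level.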
The same action can take different amounts of time in different cases, and this difference affects the accuracy of action recognition to a certain extent. We propose an end-to-end deep neural network called “Multi-Term Attention Networks” (MTANs), which addresses this problem by extracting temporal features at different time scales. The network consists of a Multi-Term Attention Recurrent Neural Network (MTA-RNN) and a Spatio-Temporal Convolutional Neural Network (ST-CNN). In MTA-RNN, a method for fusing multi-term temporal features is proposed to extract temporal dependencies at different time scales, and the weighted, fused temporal feature is recalibrated by an attention mechanism. An ablation study shows that this network has powerful spatio-temporal dynamic modeling capabilities for actions with different time scales. We perform extensive experiments on four challenging benchmark datasets: the NTU RGB+D, UT-Kinect, Northwestern-UCLA, and UWA3DII datasets. Our method achieves better results than state-of-the-art methods, which demonstrates the effectiveness of MTANs.
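The idea of fusing temporal features from several time scales and then recalibrating the result can be sketched in a few lines. This is a hedged illustration only: the abstract does not give the MTA-RNN equations, so the window averaging, the softmax fusion weights, and the sigmoid gate below are all assumptions, and the function names are hypothetical.

```python
import math


def temporal_pool(sequence, window):
    """Average the last `window` frames of a sequence of feature vectors."""
    recent = sequence[-window:]
    dim = len(recent[0])
    return [sum(frame[d] for frame in recent) / len(recent) for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_multi_term(sequence, windows, scale_scores):
    """Weighted fusion of per-scale features, followed by a sigmoid gate.

    Assumptions: each time scale is a trailing window average, the fusion
    weights come from a softmax over given scores, and the attention
    recalibration is a per-channel sigmoid gate on the fused feature.
    """
    weights = softmax(scale_scores)
    pooled = [temporal_pool(sequence, w) for w in windows]
    dim = len(pooled[0])
    fused = [sum(wt * feat[d] for wt, feat in zip(weights, pooled))
             for d in range(dim)]
    gates = [1.0 / (1.0 + math.exp(-value)) for value in fused]
    return [gate * value for gate, value in zip(gates, fused)]


# Four frames, two feature channels; short, medium and long windows.
frames = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]
feature = fuse_multi_term(frames, windows=[1, 2, 4], scale_scores=[0.0, 0.0, 0.0])
```

With equal scores, each time scale contributes one third of the fused feature; in a trained network the scores and the gate would be learned parameters rather than fixed inputs.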
Image matting methods based on deep learning have achieved tremendous success. However, this success typically relies on a massive amount of pixel-level labeled data, which is time-consuming and costly to obtain. To reduce the labeling cost, this paper first proposes a semi-supervised deep learning matting algorithm based on the semantic consistency of trimaps (Tri-SSL), which uses trimaps to provide weakly supervised signals for unlabeled data. Tri-SSL is a single-stage semi-supervised algorithm that consists of a supervised branch and a weakly supervised branch, which share the same network within each training iteration. The supervised branch is consistent with standard supervised matting methods. In the weakly supervised branch, trimaps of different granularities are used as weakly supervised signals for unlabeled images, so the two trimaps act as naturally perturbed samples. Orientation consistency constraints are imposed on the prediction results for trimaps of different granularity and on the intermediate features of the network. Experimental results show that Tri-SSL improves model performance by effectively utilizing unlabeled data.
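To make the notion of trimaps of different granularity more concrete, the sketch below derives a coarser trimap by widening the unknown band and measures how far two alpha predictions disagree. It is an assumption-laden sketch, not Tri-SSL itself: the abstract does not define its orientation consistency constraint, so a plain mean absolute difference is used in its place, and `coarsen_trimap` and `consistency_loss` are hypothetical names.

```python
FG, BG, UNKNOWN = 1.0, 0.0, 0.5  # common trimap encoding, assumed here


def coarsen_trimap(trimap, radius=1):
    """Widen the unknown band: pixels within `radius` of an unknown pixel
    (square neighbourhood) become unknown as well."""
    height, width = len(trimap), len(trimap[0])
    coarse = [row[:] for row in trimap]
    for y in range(height):
        for x in range(width):
            if trimap[y][x] != UNKNOWN:
                continue
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    coarse[ny][nx] = UNKNOWN
    return coarse


def consistency_loss(alpha_a, alpha_b):
    """Mean absolute difference between two alpha mattes.

    Assumption: a generic consistency term standing in for the
    orientation consistency constraint, which the abstract does not define.
    """
    diffs = [abs(a - b)
             for row_a, row_b in zip(alpha_a, alpha_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


fine = [[BG, BG, UNKNOWN, FG, FG]]
coarse = coarsen_trimap(fine, radius=1)
loss = consistency_loss([[0.0, 1.0]], [[0.0, 0.5]])
```

In a semi-supervised setup of this kind, the same network would predict an alpha matte for each trimap of an unlabeled image, and a term like `consistency_loss` would penalize disagreement between the two predictions.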