The attention mechanism has extended deep learning to a broader range of applications, but the contribution of the attention module itself remains contested. Research on modern Hopfield networks indicates that attention can also be used in shallow networks, where its automatic sample filtering facilitates instance extraction in Multiple Instance Learning (MIL) tasks. Since attention has a clear contribution and interpretable behavior in shallow networks, this paper further investigates how to optimize it with recurrent neural networks. Through comprehensive comparison, we find that the Synergetic Neural Network offers more accurate and controllable convergence as well as reversible convergence steps. We therefore design the Syn layer based on the Synergetic Neural Network and propose a novel invertible activation function that serves as the forward and backward update rule for concentrating or distracting attention weights. Experimental results show that our method outperforms competing methods on all MIL benchmark datasets: concentration improves the robustness of the results, while distraction expands the instance observation space and yields further gains. Code is available at https://github.com/wzh134/Syn.
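The abstract does not spell out the Syn layer's update formula, but the Synergetic Neural Network it builds on is Haken's order-parameter model, whose winner-take-all dynamics drive a weight vector toward a one-hot. The sketch below is a minimal, hypothetical illustration of that concentration behavior (function name, parameters, and the Euler discretization are assumptions, not the paper's implementation); running the same step in reverse would re-spread the weights, loosely mirroring the paper's "distraction".

```python
import numpy as np

def synergetic_update(xi, gamma=0.1, lam=1.0, B=1.0, C=1.0, steps=200):
    """Iterate Haken's order-parameter dynamics (illustrative, not the
    paper's Syn layer): the largest component survives and the vector
    converges toward a one-hot."""
    xi = xi.astype(float).copy()
    for _ in range(steps):
        total = np.sum(xi ** 2)
        # dxi_k = xi_k * (lam - B * sum_{j != k} xi_j^2 - C * sum_j xi_j^2)
        dxi = xi * (lam - B * (total - xi ** 2) - C * total)
        xi += gamma * dxi  # a negative gamma would undo concentration
    return xi

attn = np.array([0.30, 0.45, 0.25])   # attention weights over 3 instances
print(synergetic_update(attn))         # -> approx [0, 1, 0]: concentrated
```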
Feature extraction is an important step in the automatic recognition of synthetic aperture radar (SAR) targets, but as recognition networks grow more complex, the learned features become implicit in the network parameters and performance is difficult to attribute. We propose the modern synergetic neural network (MSNN), which transforms feature extraction into a prototype self-learning process through the deep fusion of an autoencoder (AE) and a synergetic neural network. We prove that nonlinear AEs (e.g., stacked and convolutional AEs) with ReLU activation functions reach the global minimum when their weights can be divided into tuples of Moore-Penrose (M-P) inverses. MSNN can therefore use AE training as a novel and effective nonlinear prototype self-learning module. In addition, MSNN improves learning efficiency and performance stability by letting the codes converge spontaneously to one-hot vectors under the dynamics of Synergetics, rather than through loss-function manipulation. Experiments on the MSTAR dataset show that MSNN achieves state-of-the-art recognition accuracy. Feature visualization shows that MSNN's excellent performance stems from prototype learning that captures features not covered in the dataset; these representative prototypes ensure accurate recognition of new samples.
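The paper's theorem concerns nonlinear ReLU AEs, but the M-P inverse condition it invokes is easiest to see in the linear case: if the decoder is the Moore-Penrose inverse of the encoder, reconstruction loss vanishes for data in the encoder's row space. The following is a minimal linear sketch under that assumption (shapes and names are illustrative, not MSNN code).

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.standard_normal((16, 256))   # encoder weights: 256-dim input, 16-dim code
D = np.linalg.pinv(E)                # decoder = Moore-Penrose inverse of E

# Data lying in the row space of the encoder -- the case the condition covers.
X = rng.standard_normal((64, 16)) @ E

codes = X @ E.T                      # encode each row
recon = codes @ D.T                  # decode; equals X @ (pinv(E) @ E), a projection
print(np.linalg.norm(X - recon))     # ~0: zero reconstruction loss at this weight pair
```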