The problem of catastrophic forgetting became apparent in connectionist neural network models, which have been actively studied since the second half of the 20th century. Numerous attempts were made and various solutions were proposed, but until very recently no substantial success had been achieved. In 2016 a significant breakthrough occurred: a group of scientists from DeepMind proposed the elastic weight consolidation (EWC) method, which successfully counters catastrophic forgetting. Unfortunately, although we are aware of cases where this method has been used in real tasks, it has not yet been widely adopted. In this paper we propose alternative approaches to overcoming catastrophic forgetting, based on the total absolute signal that has passed through a connection in the neural network. These approaches demonstrate efficiency similar to that of EWC while having a substantially lower computational cost. They are also simpler to implement and seem to us to be inherently closer to the processes by which animal brains preserve previously learned skills during subsequent learning. We hope that the ease of implementation of these methods will lead to their wider use.
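To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the "total absolute signal" of a connection (i, j) in a fully connected layer is the sum of |x_i * w_ij| over the samples of the earlier task, which is only one plausible reading of the abstract. It then plugs that importance into the EWC-style quadratic penalty, where the original EWC method uses the diagonal of the Fisher information matrix instead. The names `signal_importance`, `quadratic_penalty`, and `lam` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def signal_importance(weights: Matrix, inputs: Sequence[Sequence[float]]) -> Matrix:
    """Accumulate |x_i * w_ij| over all samples for each connection (i, j).

    Assumption: this is one plausible reading of "total absolute signal
    passed through the connection"; the paper may define it differently.
    """
    n_in, n_out = len(weights), len(weights[0])
    importance = [[0.0] * n_out for _ in range(n_in)]
    for x in inputs:
        for i in range(n_in):
            for j in range(n_out):
                importance[i][j] += abs(x[i] * weights[i][j])
    return importance


def quadratic_penalty(weights: Matrix, anchor: Matrix,
                      importance: Matrix, lam: float) -> float:
    """EWC-style penalty: (lam / 2) * sum_ij importance_ij * (w_ij - anchor_ij)^2."""
    total = 0.0
    for w_row, a_row, imp_row in zip(weights, anchor, importance):
        for w, a, imp in zip(w_row, a_row, imp_row):
            total += imp * (w - a) ** 2
    return 0.5 * lam * total


# Weights learned on an earlier task and the inputs seen during that task.
old_weights = [[0.5, -1.0], [2.0, 0.0]]
task_a_inputs = [[1.0, 2.0], [-1.0, 0.5]]
importance = signal_importance(old_weights, task_a_inputs)

# Candidate weights while training on a later task.
new_weights = [[0.5, 0.0], [1.0, 3.0]]
penalty = quadratic_penalty(new_weights, old_weights, importance, lam=2.0)
print(importance, penalty)
```

In this toy example the connection whose old weight was zero carried no signal, so moving it from 0.0 to 3.0 adds nothing to the penalty, while changes to connections that carried more signal are penalized more heavily. Computing this importance needs only forward-pass quantities, which is consistent with the lower computational cost claimed above.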
This paper is devoted to the practical application of the Elastic Weight Consolidation method. We compare more rigorously the known methodologies for calculating weight importance when applied to networks with fully connected and convolutional layers. We also point out the problems that arise when applying the Elastic Weight Consolidation method to multilayer neural networks with convolutional and self-attention layers, and propose a method to overcome these problems. In addition, we note an interesting fact about the use of various types of weight importance in the neural network pruning task.
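As a rough illustration of how a weight importance score can drive pruning, the sketch below zeroes out the least important fraction of a flat weight list. It is a generic importance-based pruning routine under stated assumptions, not the procedure or the result reported in the paper. The function name `prune_by_importance` and the second set of scores are hypothetical.

```python
from typing import List


def prune_by_importance(weights: List[float], importance: List[float],
                        fraction: float) -> List[float]:
    """Return a copy of `weights` with the least important `fraction` set to zero."""
    if len(weights) != len(importance):
        raise ValueError("weights and importance must have the same length")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_prune = int(len(weights) * fraction)
    # Indices sorted from least to most important.
    order = sorted(range(len(weights)), key=lambda k: importance[k])
    pruned = list(weights)
    for k in order[:n_prune]:
        pruned[k] = 0.0
    return pruned


weights = [0.9, -0.1, 0.4, -2.0]

# Baseline: weight magnitude as the importance score.
magnitude = [abs(w) for w in weights]
by_magnitude = prune_by_importance(weights, magnitude, 0.5)

# Hypothetical alternative scores, e.g. from a Fisher- or signal-based estimate.
other_scores = [0.2, 3.0, 0.1, 1.5]
by_other = prune_by_importance(weights, other_scores, 0.5)
print(by_magnitude, by_other)
```

The two score types remove different weights from the same layer, which is why the choice of importance measure matters for pruning.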
In this paper we present the results of an empirical verification of several issues concerning methods for overcoming catastrophic forgetting in neural networks. First, in the introduction, we describe in detail the problem of catastrophic forgetting and the methods for overcoming it, for readers who are not yet familiar with the topic. Then we discuss the essence and limitations of the WVA method, which we presented in previous papers. Finally, we address applying the WVA method to gradients or to optimization steps of the weights, choosing the optimal attenuation function for the method, and choosing its optimal hyperparameters depending on the number of tasks in sequential training of neural networks.