This paper proposes a new actor-critic algorithm that uses online support vector regression (SVR), which learns incrementally and automatically tracks environments with time-varying characteristics. Online SVR gives the value function approximation good generalization properties and helps the critic converge quickly. In addition, the sample vectors in the online SVR's data set serve as the center positions of the actor's basis functions, and the actor updates its policy parameters over those functions using a policy gradient algorithm. Simulations demonstrate the feasibility and usefulness of the proposed method in comparison with other methods.
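The actor update described above can be illustrated with a minimal sketch. The abstract does not give the basis-function form, the policy distribution, or the step sizes, so everything here is an assumption: Gaussian radial basis functions, a Gaussian policy whose mean is linear in those features, a temporal-difference error supplied by the critic, and the hypothetical names `rbf_features`, `actor_update`, and `centers`. The online SVR critic itself is not implemented.

```python
import math

def rbf_features(state, centers, width=1.0):
    """Gaussian basis functions centered on stored sample vectors (assumed form)."""
    feats = []
    for c in centers:
        sq_dist = sum((s - ci) ** 2 for s, ci in zip(state, c))
        feats.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    return feats

def policy_mean(theta, feats):
    """Mean action: a linear combination of the basis-function outputs."""
    return sum(t * f for t, f in zip(theta, feats))

def actor_update(theta, state, action, td_error, centers,
                 lr=0.1, action_std=1.0, width=1.0):
    """One policy-gradient step for a Gaussian policy with mean theta . phi(s)."""
    feats = rbf_features(state, centers, width)
    mu = policy_mean(theta, feats)
    # d/d theta_i of log N(a; mu, std^2) is (a - mu) / std^2 * phi_i(s);
    # the critic's TD error weights the step (an assumed choice of signal).
    scale = lr * td_error * (action - mu) / action_std ** 2
    return [t + scale * f for t, f in zip(theta, feats)]

# Hypothetical sample vectors standing in for the critic's data set.
centers = [(0.0, 0.0), (1.0, 0.0)]
theta = [0.0, 0.0]
new_theta = actor_update(theta, (0.0, 0.0), action=1.0, td_error=0.5, centers=centers)
```

In this sketch the parameter tied to the center nearest the visited state moves the most, which is one way the actor could reuse the critic's sample vectors without choosing basis centers by hand.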
Despite many advances, determining the proper size of a neural network remains an important problem, especially for its practical implications for learning and generalization. Unfortunately, the best size is rarely obvious: a network that is too small cannot learn the data, while one that is just big enough may learn very slowly and be very sensitive to initial conditions and learning parameters. There are two approaches to determining network size: pruning and growing. Pruning trains a network that is larger than necessary and then removes unnecessary weights/nodes. Here, a new pruning method is developed, based on the penalty-term method. It improves the network's generalization and reduces the retraining time needed after weights/nodes are pruned.
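As a rough illustration of penalty-term pruning in general, and not of the specific method the abstract proposes, the sketch below trains a single linear unit with an added L1 penalty and then zeroes the weights that the penalty has driven close to zero. The L1 form of the penalty, the threshold, the toy data, and the names `train_with_penalty` and `prune` are all assumptions made for the example.

```python
from itertools import product

def sign(v):
    return (v > 0) - (v < 0)

def train_with_penalty(xs, ys, weights, lam=0.01, lr=0.1, steps=200):
    """Full-batch gradient descent on mean squared error plus lam * sum(|w|)."""
    n = len(xs)
    w = list(weights)
    for _ in range(steps):
        errors = [sum(wi * xi for wi, xi in zip(w, x)) - y
                  for x, y in zip(xs, ys)]
        for i in range(len(w)):
            grad = 2.0 / n * sum(e * x[i] for e, x in zip(errors, xs))
            # The penalty term pushes every weight toward zero.
            w[i] -= lr * (grad + lam * sign(w[i]))
    return w

def prune(weights, threshold=0.05):
    """Remove (zero out) weights whose magnitude fell below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]

# Toy data: the target depends on the first two inputs only.
xs = list(product([-1.0, 1.0], repeat=3))
ys = [2.0 * x[0] - 3.0 * x[1] for x in xs]

trained = train_with_penalty(xs, ys, [0.5, 0.5, 0.5])
pruned = prune(trained)
```

Because the penalty has already shrunk the unneeded weight toward zero during training, removing it changes the output very little, which is the intuition behind needing less retraining after pruning.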