Abstract: A continuous-time formulation of an adaptive critic design (ACD) is investigated. Connections are made to the discrete case, where backpropagation through time (BPTT) and real-time recurrent learning (RTRL) are prevalent. The practical benefits are that this framework fits well with plant descriptions given by differential equations, and that any standard integration routine with adaptive step size performs adaptive sampling for free. A second-order actor adaptation using Newton's method is established for fast actor convergence with a general plant and critic. In addition, a fast critic update for concurrent actor-critic training is introduced: it immediately applies the adjustments of critic parameters made necessary by actor updates, keeping the Bellman optimality condition correct to first-order approximation after actor changes. Thus, critic and actor updates may be performed at the same time until a substantial error builds up in the Bellman optimality or temporal difference equation; at that point a traditional critic training needs to be performed, after which another interval of concurrent actor-critic training may resume.
Index Terms: Actor-critic adaptation, adaptive critic design (ACD), approximate dynamic programming, backpropagation through time (BPTT), continuous adaptive critic designs, real-time recurrent learning (RTRL), reinforcement learning, second-order actor adaptation.
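The remark that an adaptive step-size integrator performs adaptive sampling for free can be illustrated with a minimal sketch. The abstract does not name an integrator, so the embedded Heun-Euler pair, the scalar test plant dx/dt = -x, the function name `integrate_adaptive`, and the tolerance below are all assumptions made for illustration only.

```python
def integrate_adaptive(f, t0, x0, t_end, tol=1e-4, h=0.1):
    """Integrate the scalar ODE dx/dt = f(t, x) with an embedded Heun-Euler pair.

    Hypothetical helper, not taken from the paper. The accepted time points
    form a non-uniform sampling grid chosen by the error controller.
    """
    t, x = t0, x0
    ts, xs = [t], [x]
    while t_end - t > 1e-12:
        h = min(h, t_end - t)              # do not step past the end time
        k1 = f(t, x)
        k2 = f(t + h, x + h * k1)
        x_euler = x + h * k1               # first-order estimate
        x_heun = x + 0.5 * h * (k1 + k2)   # second-order estimate
        err = abs(x_heun - x_euler)        # local error estimate
        if err <= tol:                     # accept the step
            t += h
            x = x_heun
            ts.append(t)
            xs.append(x)
        # Grow or shrink the step, limited to a factor in [0.2, 2.0].
        if err > 0.0:
            h *= min(2.0, max(0.2, 0.9 * (tol / err) ** 0.5))
        else:
            h *= 2.0
    return ts, xs


# Assumed test plant: dx/dt = -x with x(0) = 1, integrated up to t = 5.
ts, xs = integrate_adaptive(lambda t, x: -x, 0.0, 1.0, 5.0)
first_step = ts[1] - ts[0]
late_step = ts[-2] - ts[-3]   # last step that is not clipped to t_end
```

In this sketch the steps lengthen as the state decays and the dynamics slow down, so the sample points are dense where the trajectory changes quickly and sparse elsewhere, without any separate sampling logic.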
Wireless sensor networks (WSNs) are well suited for environment monitoring. However, some highly specialized sensors (e.g., hydrological sensors) have high power demand and, without due care, can exhaust the battery supply quickly. Taking measurements with this kind of sensor can also overwhelm the communication resources. One way to reduce the power drawn by these high-demand sensors is adaptive sampling, i.e., skipping a sample when the resulting data loss is estimated to be low. Here, we present an adaptive sampling algorithm based on the Box-Jenkins approach in time series analysis. To measure algorithm performance, we use the ratio of the reduction factor to the root mean square error (RMSE). The rationale for this metric is that the best algorithm is the one that gives the largest reduction in the amount of sampling while keeping the RMSE smallest. For the datasets used in our simulations, our algorithm reduces the amount of sampling by 24% to 49%. For seven out of eight datasets, our algorithm outperforms the best reported in the literature so far in terms of the reduction/RMSE ratio.
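The reduction/RMSE ratio described above can be sketched as follows. The abstract does not define the reduction factor precisely, so this sketch assumes it is the fraction of samples skipped; the function names and the toy data (a skipped reading held at the previous value) are likewise hypothetical.

```python
import math


def rmse(actual, reconstructed):
    """Root mean square error between the full series and its reconstruction."""
    if not actual or len(actual) != len(reconstructed):
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((a - r) ** 2 for a, r in zip(actual, reconstructed))
    return math.sqrt(squared / len(actual))


def reduction_factor(n_total, n_sampled):
    """Assumed definition: fraction of the samples that were skipped."""
    return 1.0 - n_sampled / n_total


def reduction_rmse_ratio(actual, reconstructed, n_sampled):
    """Higher is better: more samples skipped per unit of reconstruction error."""
    err = rmse(actual, reconstructed)
    if err == 0.0:
        return math.inf
    return reduction_factor(len(actual), n_sampled) / err


# Toy data: the third reading is skipped and held at the previous value.
actual = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.0, 2.0, 2.0, 4.0]
score = reduction_rmse_ratio(actual, reconstructed, n_sampled=3)
```

Under these assumptions, an algorithm that skips more samples at the same RMSE, or achieves a lower RMSE at the same reduction, receives a higher score.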
Optimal signal detection for false track discrimination is determined by simulation using the Integrated Probabilistic Data Association (IPDA) algorithm. The IPDA algorithm is an efficient probabilistic data association algorithm with estimates of target existence probabilities that can be used to distinguish true and false tracks. The rate of confirmed false tracks is held constant, and the optimal signal detection probability is determined by the maximum number of confirmed true tracks for a given signal-to-noise ratio. This allows the determination of the optimal detection probability, which is reported for a range of signal-to-noise ratios.