Most high-performing off-policy reinforcement learning algorithms use techniques that control aggregated overestimation bias. However, most of them rely on pre-defined bias correction policies that are either not flexible enough or require environment-specific hyperparameter tuning. In this work, we present a data-driven approach for automatic bias control. We demonstrate its effectiveness on three algorithms: Truncated Quantile Critics, Weighted Delayed DDPG, and Maxmin Q-learning. Our approach eliminates the need for an extensive hyperparameter search. We show that it significantly reduces the actual number of interactions while, in most cases, matching the performance of a resource-demanding grid search. While reducing the bias improves performance on average, eliminating the aggregated bias does not always lead to the best performance. To the best of our knowledge, this is the first time this has been demonstrated on complex environments, which highlights important pitfalls of overestimation control.
Most high-performing methods for off-policy reinforcement learning use bias correction techniques. However, these techniques rely on a pre-defined bias correction policy that is either not flexible enough or requires environment-specific tuning of hyperparameters. In this work, we present a simple data-driven approach for guiding bias correction. We demonstrate its effectiveness on Truncated Quantile Critics, a state-of-the-art continuous control algorithm. The proposed technique can adjust the bias correction across environments automatically. As a result, it eliminates the need for an extensive hyperparameter search, significantly reducing the actual number of interactions and the amount of computation.
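The abstracts above do not spell out how the bias is measured or corrected, so the following is only a rough sketch of what "aggregated overestimation bias" can mean in practice: the average gap between a critic's value estimates and the discounted returns actually observed along an episode. The function names, the scalar `param`, the step size, and the clipping range are all hypothetical and are not taken from the papers.

```python
from typing import Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> list[float]:
    """Monte Carlo return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def aggregated_bias(q_estimates: Sequence[float],
                    rewards: Sequence[float],
                    gamma: float) -> float:
    """Mean of (critic estimate - observed discounted return) over one episode.

    A positive value suggests overestimation, a negative one underestimation.
    Assumes the episode ran to termination; a truncated episode would need
    a bootstrap term that this sketch leaves out.
    """
    if len(q_estimates) != len(rewards):
        raise ValueError("need one critic estimate per reward")
    if not rewards:
        raise ValueError("episode must contain at least one step")
    returns = discounted_returns(rewards, gamma)
    return sum(q - g for q, g in zip(q_estimates, returns)) / len(returns)


def update_bias_control(param: float, bias: float, step: float = 0.1,
                        low: float = 0.0, high: float = 5.0) -> float:
    """Hypothetical rule: correct more when overestimating, less when underestimating."""
    if bias > 0:
        param += step
    elif bias < 0:
        param -= step
    # Keep the control parameter inside an assumed valid range.
    return min(high, max(low, param))


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    q_values = [2.75, 2.5, 2.0]  # made-up critic outputs for illustration
    bias = aggregated_bias(q_values, rewards, gamma=0.5)
    print(bias, update_bias_control(1.0, bias))
```

In a real agent, `param` would stand for whatever knob the underlying algorithm exposes for pessimism, and the bias estimate would be averaged over many recent episodes rather than one.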
The aim of this study was to analyze the possibility of producing chemi-thermomechanical pulp (CTMP) from larch wood and to compare its characteristics with those of CTMP obtained from spruce wood by the same laboratory method. The work is highly relevant because larch is one of the most widespread tree species in Russia, accounting for about 40% of the total forest stand area (Akim et al., 2012). Larch differs significantly from other conifers in structure and composition. It is a typical heartwood species: heartwood makes up 70–90% of the trunk, and tracheids account for about 90% or more of the wood substance (Levin et al., 1978; Babkin et al., 2004). It was therefore necessary to carry out a series of experiments to evaluate the effect of Na2SO3 charge, impregnation solution temperature, and impregnation time on the properties of the CTMP and on the specific energy consumption of refining. In addition, a characteristic feature of larch is the water-soluble polysaccharide arabinogalactan, which it contains in an amount of about 14% (ranging from 5 to 30%), together with flavonoids in the heartwood, represented mainly by quercetin and dihydroquercetin (Babkin et al., 2004; Azarov et al., 2010). Consequently, it was also necessary to analyze the effect of preliminary water extraction on the properties of the resulting pulp.