The objective of this study is to develop a framework that optimizes the control policies of a waste crane at a waste incineration plant autonomously through trial and error. Since a waste crane is a massive mechanical system that moves slowly and takes several minutes to execute a task, obtaining data samples by executing tasks is very costly. Moreover, no sensors are available that can observe the state of the grasped flammable waste, which is composed of various materials with different degrees of hardness and wetness. As a result, the inhomogeneity of the waste causes unpredictable fluctuations in the crane's task performance. To cope with these problems, we propose a framework for optimizing the parameters of a parameterized control policy with Multi-Task Robust Bayesian Optimization (MTRBO). Our framework has two key characteristics: (1) robustness to outliers caused by garbage inhomogeneity and (2) reuse of samples from previously solved tasks to improve sample efficiency. To investigate the effectiveness of our framework, we conducted experiments on garbage-scattering tasks with (i) a robot waste crane with pseudo-garbage and (ii) an actual waste crane at a waste incineration plant. Experimental results demonstrate that our framework robustly optimized the control policies of the waste cranes, even with a substantially reduced amount of data, under the influence of garbage inhomogeneity.
Waste incineration plants are complex dynamical systems that rely on expert human operators to maintain steady combustion by observing real-time in-chamber video feeds. Real-time plant forecasting provides vital operational support for decision making, and applying machine learning to automatically learn dynamics forecast models from video feeds is an attractive means to realise this. However, learning complex dynamics in systems that require cost-efficiency remains an open research problem. Specifically, modelling plant dynamics in real time is challenging because of uncertainties caused by inhomogeneous waste inputs, which call for complex learning that impedes real-time modelling. To address this, this paper presents a real-time data-driven framework for generating video forecasts by incorporating task-relevant domain knowledge during learning. Specifically, this method combines dynamics modelling and forecasting using dynamic mode decomposition with Fourier transformations informed by expert operator heuristic knowledge, which encode task-relevant frequency information inside the learning process. Experiments in this paper demonstrate that the proposed framework captures intuitive physical aspects of the underlying physicochemical process, with a greatly reduced computational runtime in comparison to standard approaches, allowing for application in real-time domains. Forecasted video predictions are accurate over short time horizons and capture important system characteristics over longer time periods.
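As a rough illustration of the dynamic mode decomposition step mentioned above, the following is a minimal sketch of standard (exact) DMD in NumPy. It is not the paper's implementation: it omits the Fourier-based encoding of operator knowledge entirely, and the function names `fit_dmd` and `forecast`, the `rank` parameter, and the layout of `snapshots` (one flattened frame per column) are assumptions made for this example.

```python
import numpy as np


def fit_dmd(snapshots, rank):
    """Fit exact DMD to a (n_features, n_times) snapshot matrix.

    Assumes each column is one flattened frame and columns are equally
    spaced in time. Returns the DMD eigenvalues and modes.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]  # pairs (x_k, x_{k+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]  # truncate to `rank`
    # Reduced linear operator: A_tilde = U^H Y V S^{-1}
    A_tilde = U.conj().T @ Y @ Vh.conj().T / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes: Y V S^{-1} W
    modes = (Y @ Vh.conj().T / s) @ W
    return eigvals, modes


def forecast(eigvals, modes, x0, steps):
    """Roll the fitted linear model forward from state x0.

    Returns an (n_features, steps) array whose first column approximates x0.
    """
    # Mode amplitudes that best reconstruct the initial state
    b = np.linalg.lstsq(modes, x0.astype(complex), rcond=None)[0]
    # eigvals ** k for k = 0 .. steps-1, shape (steps, rank)
    powers = eigvals[None, :] ** np.arange(steps)[:, None]
    return (modes @ (powers * b).T).real
```

In this sketch a forecast costs one small eigendecomposition plus matrix products, which is the kind of property that makes DMD-style models attractive when runtime matters; the choice of `rank` trades accuracy against cost and would need tuning on real video data.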
The environments of large industrial machines such as waste cranes in waste incineration plants are often weakly observable: the observations contain little information about the environmental state because of technical difficulty or maintenance cost (e.g., no sensors for observing the state of the garbage to be handled). Based on the finding that skilled operators in such environments choose predetermined control strategies (e.g., grasping and scattering) and their durations based on sensor values, we propose a novel nonparametric policy search algorithm: Gaussian process self-triggered policy search (GPSTPS). GPSTPS has two types of control policies: action and duration. A gating mechanism either maintains the action selected by the action policy for the duration specified by the duration policy or updates the action and duration by passing new observations to the policies; the method is therefore categorized as self-triggered. GPSTPS learns both policies simultaneously by trial and error, using sparse GP priors and variational learning to maximize the return. To verify the performance of our proposed method, we conducted experiments on a garbage-grasping-scattering task for a waste crane with weak observations, using a simulation and a robotic waste crane system. In these experiments, the proposed method acquired suitable policies for determining the action and duration based on the garbage's characteristics.
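The gating mechanism described above can be sketched as a simple control loop. This is only an illustration of the self-triggered idea under stated assumptions: the two policies are passed in as plain callables rather than learned with sparse GPs and variational inference, and every name here (`run_self_triggered`, `observe`, `env_step`, `horizon`) is hypothetical rather than taken from the paper.

```python
def run_self_triggered(observe, env_step, action_policy, duration_policy, horizon):
    """Run a self-triggered control loop for `horizon` time steps.

    observe()            -> current observation (hypothetical interface)
    env_step(action)     -> applies `action` for one time step
    action_policy(obs)   -> action to hold
    duration_policy(obs) -> number of steps to hold that action
    Returns the list of actions applied at each time step.
    """
    applied = []
    t = 0
    while t < horizon:
        # Gate opens: pass a fresh observation to both policies.
        obs = observe()
        action = action_policy(obs)
        duration = max(1, int(duration_policy(obs)))
        # Gate closed: hold the action without re-observing.
        for _ in range(min(duration, horizon - t)):
            env_step(action)
            applied.append(action)
            t += 1
    return applied
```

Holding the action for the chosen duration means the policies are queried only when the hold expires, which fits a setting where observations are sparse or uninformative between decision points.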