We study combinatorial problems with real-world applications such as machine scheduling, routing, and assignment. We propose a method that combines Reinforcement Learning (RL) and planning. The method applies equally to the offline variant of the combinatorial problem and to the online variant, in which the problem components (e.g., jobs in scheduling problems) are not known in advance but arrive during the decision-making process. Our solution is generic, scalable, and leverages distributional knowledge of the problem parameters.
We frame the solution process as an MDP, and take a Deep Q-Learning approach wherein states are represented as graphs, thereby allowing our trained policies to deal with arbitrary changes in a principled manner.
Though learned policies work well in expectation, small deviations can have substantial negative effects in combinatorial settings. We mitigate these drawbacks by employing our graph-convolutional policies as non-optimal heuristics in a compatible search algorithm, Monte Carlo Tree Search, to significantly improve overall performance. We demonstrate our method on two problems: Machine Scheduling and Capacitated Vehicle Routing. We show that our method outperforms custom-tailored mathematical solvers, state-of-the-art learning-based algorithms, and common heuristics, both in computation time and performance.
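The idea of using a non-optimal policy as a heuristic inside Monte Carlo Tree Search can be illustrated with a minimal sketch. This is not the authors' implementation: the toy single-machine instance `JOBS`, the shortest-processing-time rule standing in for the trained graph-convolutional policy, the cost function (total weighted completion time), and every function name below are assumptions made for illustration only.

```python
import math
import random

# Hypothetical toy instance: (processing_time, weight) for each job.
JOBS = [(3, 1), (1, 4), (2, 2), (4, 3)]

def cost(order):
    """Total weighted completion time of a complete job order."""
    t = total = 0
    for j in order:
        p, w = JOBS[j]
        t += p
        total += w * t
    return total

def heuristic(remaining):
    """Stand-in for a learned policy: shortest processing time first."""
    return min(sorted(remaining), key=lambda j: JOBS[j][0])

def rollout(order, remaining, rng, eps=0.2):
    """Complete a partial schedule, following the heuristic with prob. 1 - eps."""
    order, remaining = list(order), set(remaining)
    while remaining:
        if rng.random() < eps:
            j = rng.choice(sorted(remaining))   # occasional random deviation
        else:
            j = heuristic(remaining)
        order.append(j)
        remaining.remove(j)
    return order

class Node:
    def __init__(self, order, remaining):
        self.order = order          # jobs scheduled so far
        self.remaining = remaining  # frozenset of unscheduled jobs
        self.children = {}          # action (job index) -> Node
        self.visits = 0
        self.total = 0.0            # sum of rewards (negative costs)

def uct_child(node, c):
    """Pick the child with the highest UCT score."""
    def score(ch):
        return ch.total / ch.visits + c * math.sqrt(math.log(node.visits) / ch.visits)
    return max(node.children.values(), key=score)

def mcts(n_iter=300, c=20.0, seed=0):
    rng = random.Random(seed)
    root = Node([], frozenset(range(len(JOBS))))
    # Start from the pure heuristic schedule, so search is never worse than it.
    best = rollout([], root.remaining, rng, eps=0.0)
    for _ in range(n_iter):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded and non-terminal.
        while node.remaining and len(node.children) == len(node.remaining):
            node = uct_child(node, c)
            path.append(node)
        # Expansion: add the untried action the heuristic prefers.
        if node.remaining:
            untried = [j for j in node.remaining if j not in node.children]
            a = heuristic(untried)
            node.children[a] = Node(node.order + [a], node.remaining - {a})
            node = node.children[a]
            path.append(node)
        # Simulation: heuristic-guided rollout to a complete schedule.
        full = rollout(node.order, node.remaining, rng)
        if cost(full) < cost(best):
            best = full
        # Backpropagation: reward is the negative cost.
        for n in path:
            n.visits += 1
            n.total -= cost(full)
    return best, cost(best)

if __name__ == "__main__":
    order, total = mcts()
    print("schedule:", order, "cost:", total)
```

In this sketch the heuristic alone is deliberately suboptimal (it ignores job weights), so the search has room to find a better schedule than a greedy pass would; a trained policy would replace `heuristic` in both the expansion and rollout steps.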
We consider the task of learning control policies for a robotic mechanism striking a puck in an air hockey game. The control signal is a direct command to the robot's motors. We employ a model-free deep reinforcement learning framework to learn the motor skills of striking the puck accurately in order to score. We propose certain improvements to the standard learning scheme which make the deep Q-learning algorithm feasible when it might otherwise fail. Our improvements include integrating prior knowledge into the learning scheme, and accounting for the changing distribution of samples in the experience replay buffer. Finally, we present our simulation results for aimed striking, which demonstrate the successful learning of this task and the improvement in algorithm stability due to the proposed modifications.
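For readers unfamiliar with the standard scheme that these improvements build on, the following is a minimal sketch of a uniform experience replay buffer and the one-step Q-learning target. It shows only the textbook baseline, not the authors' modifications; the class name `ReplayBuffer`, the function `td_target`, and the default discount factor are assumptions for illustration.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity buffer of transitions with uniform sampling."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen drops the oldest transition once full, so the
        # stored sample distribution shifts as the policy changes.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_values)

if __name__ == "__main__":
    buf = ReplayBuffer(capacity=3)
    for step in range(5):
        buf.push(step, 0, 1.0, step + 1, False)
    print(len(buf), buf.sample(2))
```

Because old transitions are evicted as new ones arrive, the data the learner sees is non-stationary, which is the effect the abstract refers to as the changing distribution of samples.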
Objective: To introduce and examine a single session of spatial skill training as an efficient means of improving surgical suturing performance in robot-assisted surgery. Design: A randomized, controlled trial. Setting: A tertiary university medical center in Israel. Participants: A purposive sample composed of 41 residents with no robotic suturing skills. Interventions: A computer-based simulator training of spatial skills. Measurements and Main Results: Participants were randomly assigned to training (n = 21: mean age of 34 years [standard deviation (SD) = 1.92]) and control (n = 20: mean age of 32 years [SD = 3.17]) conditions. The training group underwent a session of spatial skills training, whereas the control group engaged in a neutral activity. After 1 participant was lost to the follow-up of the posttraining performance test, data of 40 participants were analyzed. Robotic suturing task performance with the da Vinci Skills Simulator (Intuitive Surgical, Sunnyvale, CA) was evaluated using the simulator's built-in measure of "excess tissue piercing" and an expert rating of "tissue tearing." The mean number of excess tissue piercings after training (but not after the neutral activity) was significantly lower than before training (3.25 [SD = 1.996] vs 6.75 [SD = 3.68], respectively; p < .001), reflecting an improvement of 52% (a mean decrease of 3.5 excess piercings per suture). After the interventions, the extent of tissue tearing was rated lower in the training group (p = .01), and there was no change in the control group (p = .14).
Conclusion: We showed the efficiency of a training approach that focuses on spatial skills critical in robot-assisted surgery. Surgeons who received a single session of spatial skill training with a cognitive spatial skill trainer immediately improved their performance of a robotic suturing task compared with surgeons who did not receive such training.
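The reported 52% improvement follows directly from the pre- and post-training means of excess tissue piercing given in the results (6.75 and 3.25); the symbols below are introduced here only for this check:

```latex
\[
\Delta = \bar{x}_{\text{pre}} - \bar{x}_{\text{post}} = 6.75 - 3.25 = 3.5,
\qquad
\frac{\Delta}{\bar{x}_{\text{pre}}} = \frac{3.5}{6.75} \approx 0.519 \approx 52\%.
\]
```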