This paper focuses on efficient computational optimization algorithms for generating micro electrical discharge machining (µEDM) tool shapes. In a previous paper, the authors presented a reliable reverse modeling approach to perform such tasks, based on a crater-by-crater simulation model and an outer optimization loop. Two-dimensional results were obtained, but 3D tool shapes proved difficult to generate owing to the high numerical cost of the simulation strategy. In this paper, a new reduced modeling optimization framework is proposed, whereby the computational optimizer is replaced by an inexpensive surrogate trained by example. More precisely, an artificial neural network (ANN) is trained using a small number of full reverse simulations and is subsequently used to generate optimal tool shapes directly, given the geometry of the desired workpiece cavity. To train the ANN efficiently, a data augmentation method is developed whereby multiple features from fully simulated EDM cavities are used as separate training instances. The performance of two ANNs is evaluated: one trained with fixed process parameters (gap size and crater shape) and one trained over a range of process parameter values. It is shown that in both cases the ANN can produce unseen tool shape geometries that deviate by less than 6% from those obtained by the full computational optimization process, at virtually no cost. Our results demonstrate that optimized tool shapes can be generated almost instantaneously, opening the door to the rapid virtual design and manufacturability assessment of µEDM die-sinking operations.
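The abstract gives no implementation details, but the described pipeline can be sketched. The snippet below is a minimal illustration, not the authors' code: it assumes the cavity and tool shapes are discretized as fixed-length depth profiles, and the class name `ToolShapeSurrogate`, the layer sizes, the profile length, and the two process parameters (gap size, crater shape) are all illustrative assumptions.

```python
# Minimal sketch of an ANN surrogate mapping a desired cavity profile (plus
# process parameters) to a tool shape profile. All names and sizes are
# assumptions, not taken from the paper.
import torch
import torch.nn as nn

N_POINTS = 64  # assumed number of samples along the cavity/tool profile

class ToolShapeSurrogate(nn.Module):
    def __init__(self, n_points: int = N_POINTS, n_params: int = 2):
        super().__init__()
        # Input: cavity depth profile + process parameters (gap size, crater shape)
        self.net = nn.Sequential(
            nn.Linear(n_points + n_params, 128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, n_points),  # output: tool shape profile
        )

    def forward(self, cavity: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([cavity, params], dim=-1))

# Training on (cavity, params, tool) triples produced by the full reverse
# simulation; the paper's data augmentation would slice multiple features of
# each simulated cavity into separate instances. Placeholder tensors here.
model = ToolShapeSurrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

cavity = torch.randn(32, N_POINTS)   # placeholder cavity profiles
params = torch.randn(32, 2)          # placeholder gap size / crater shape
target = torch.randn(32, N_POINTS)   # placeholder optimal tool profiles

for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(model(cavity, params), target)
    loss.backward()
    opt.step()
```

Once trained, a forward pass of such a network replaces the full crater-by-crater optimization loop, which is what makes tool shape generation essentially instantaneous.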
Optimizing the execution time of a tensor program, e.g., a convolution, involves finding its optimal configuration. Searching the configuration space exhaustively is typically infeasible in practice. In line with recent research using TVM, we propose to learn a surrogate model to overcome this issue. The model is trained on an acyclic graph, the abstract syntax tree of the program, and uses a graph convolutional network to exploit structure in the graph. We claim that learnable graph-based data processing is a strong competitor to heuristic-based feature extraction. We present a new dataset of graphs corresponding to configurations and their execution times for various tensor programs, and we provide baselines for a runtime prediction task.

INTRODUCTION

Current deep learning frameworks, such as TensorFlow and PyTorch, make it possible to optimize a computational graph representation using, e.g., automatic differentiation and memory management (Abadi et al., 2016; Paszke et al., 2017). However, they do not tackle hardware-specific operator-level transformations, relying instead on manually tuned, vendor-specific operator libraries. Thus, there is room to further improve a computational graph by optimizing transformations for specific hardware.

Recently, this gap has been filled by TVM, a compiler framework that allows both graph- and operator-level optimization in an end-to-end manner (Chen et al., 2018a). TVM specifies a configuration for an operator, e.g., a specific way of performing a convolution, and compiles the resulting tensor program for a target hardware. As a consequence, for each new workload/operator, optimization over a new configuration space must be carried out. This is a hard optimization problem; e.g., for an Nvidia GPU, the search space of a single operator consists of more than 10^6 configurations.

Recent efforts overcome this issue by learning how to optimize tensor programs from data (Chen et al., 2018b). Instead of running an exhaustive search over an impractically large search space, a surrogate model is trained to predict the runtime of a given configuration. This model is in turn used to select the configuration that minimizes the runtime. Chen et al. (2018b) use XGBoost (Chen & Guestrin, 2016) and TreeGRU (Tai et al., 2015) as surrogate models.

Contribution. Similar to Chen et al. (2018b), we represent a configuration of a tensor operator as an abstract syntax tree (AST) (Allamanis et al., 2017) and extract node features using TVM. We then train a graph neural network (GraphNN) on the resulting graph as the surrogate model. We claim that GraphNNs are a good fit because, crucially, they preserve the graph structure of the AST and allow information to propagate among nodes. The contribution of the paper is threefold:
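As a rough illustration of the surrogate described above (not the paper's implementation), the sketch below regresses a scalar runtime from an AST given node features and an adjacency matrix, using a hand-rolled graph convolution. The feature dimension, layer widths, and the toy chain-shaped AST are assumptions for the sake of a self-contained example.

```python
# Minimal sketch of a GCN runtime predictor over an AST. Node features would
# come from TVM's feature extraction; here they are random placeholders.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One propagation step: H' = ReLU(W · D^-1 (A + I) H)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a_hat.sum(dim=1, keepdim=True)          # node degrees
        return torch.relu(self.lin(a_hat @ h / deg))  # mean-aggregate neighbors

class RuntimeGCN(nn.Module):
    def __init__(self, n_feat: int = 16, hidden: int = 64):
        super().__init__()
        self.g1 = GCNLayer(n_feat, hidden)
        self.g2 = GCNLayer(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        h = self.g2(self.g1(x, adj), adj)
        return self.out(h.mean(dim=0))  # pool node embeddings -> scalar runtime

# Toy AST: 5 nodes with random features, connected in a chain.
x = torch.randn(5, 16)
adj = torch.zeros(5, 5)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
pred_runtime = RuntimeGCN()(x, adj)
```

The design point the abstract argues for is visible here: message passing over the AST's edges preserves its structure, whereas heuristic feature extraction would flatten the tree into a fixed vector before any learning takes place.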