This paper experiments with two well-known pruning approaches, iterative and one-shot, and presents a new approach to lottery ticket pruning for tabular neural networks that builds on iterative pruning. Our contribution is a standard model for comparing speed and performance on tabular datasets, which often receive little optimization attention in research. We show leading results on several tabular datasets that can compete with ensemble approaches. We tested on a wide range of datasets and observed a general improvement over the original (already leading) model on 6 of the 8 datasets tested in terms of F1/RMSE. This includes a total reduction of over 85% of nodes, with the additional ability to prune over 98% of nodes with minimal effect on accuracy. The new iterative approach first optimizes for lottery ticket quality by selecting an optimal architecture size and weights, then applies the iterative pruning strategy. It shows minimal degradation in accuracy compared to the original iterative approach, but it can prune models to a much smaller size because of the optimal weight pre-selection. Training and inference time improved by over 50% and 10%, respectively, and by up to 90% and 35%, respectively, for large datasets.
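The difference between the one-shot and iterative schedules mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: `toy_train` is a hypothetical stand-in for real training, the absolute weight value is an assumed importance score, and each list entry loosely stands for one node. The iterative variant follows the usual lottery ticket recipe of restarting from the original initialization after each pruning round.

```python
import random

def prune_lowest(scores, mask, fraction):
    """Return a new mask with the lowest-scoring `fraction` of surviving nodes removed."""
    alive = [i for i, keep in enumerate(mask) if keep]
    n_drop = round(len(alive) * fraction)
    drop = set(sorted(alive, key=lambda i: scores[i])[:n_drop])
    return [keep and i not in drop for i, keep in enumerate(mask)]

def toy_train(init_weights, mask, seed=0):
    """Hypothetical stand-in for training: perturbs the surviving initial weights."""
    rng = random.Random(seed)
    return [w + rng.gauss(0, 0.1) if keep else 0.0
            for w, keep in zip(init_weights, mask)]

def one_shot_prune(init_weights, target_sparsity, train=toy_train):
    """Train once, then remove the whole target fraction in a single step."""
    mask = [True] * len(init_weights)
    trained = train(init_weights, mask)
    scores = [abs(w) for w in trained]  # assumed importance score
    return prune_lowest(scores, mask, target_sparsity)

def iterative_prune(init_weights, per_round, rounds, train=toy_train):
    """Alternate training and pruning, removing a fraction of survivors each round."""
    mask = [True] * len(init_weights)
    for _ in range(rounds):
        # Each round retrains from the original initialization under the current mask.
        trained = train(init_weights, mask)
        scores = [abs(w) for w in trained]
        mask = prune_lowest(scores, mask, per_round)
    return mask

rng = random.Random(42)
init = [rng.gauss(0, 1) for _ in range(100)]
one_shot_mask = one_shot_prune(init, 0.85)
iter_mask = iterative_prune(init, 0.5, 3)
print(sum(one_shot_mask), sum(iter_mask))  # surviving nodes under each schedule
```

In this sketch the iterative schedule reaches a comparable sparsity in several smaller steps, which gives the retraining stage a chance to re-rank the surviving nodes before the next cut.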
In our recent paper [1] we presented leading results on tabular datasets by applying the lottery ticket hypothesis to tabular neural networks. However, we were required to train the original large model to find these lottery tickets. In this paper we eliminate the need to train the original model and discover lottery tickets using networks a fraction of the original model’s size. Moreover, we show that we can remove up to 95% of the training dataset when discovering lottery tickets while still maintaining similar accuracy. The approach uses a genetic algorithm (GA) to train candidate pruned models by encoding the nodes of the original model for selection, with candidates measured by performance and weight metrics. We found that the search process does not require a large portion of the training data; once the final pruned model is selected, it can be retrained on the full dataset, although this is often not required. We propose a lottery sample hypothesis, analogous to the lottery ticket hypothesis: a subsample of the training set (the lottery samples) can train a model with performance equivalent to training on the original dataset. We show that the combination of finding lottery samples alongside lottery tickets can allow for faster searches and greater accuracy.
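As a rough illustration of the search described above, the following sketch runs a small genetic algorithm over binary node masks and evaluates candidates on a 5% subsample of a synthetic dataset. Everything here is an assumption made for illustration: the toy data, the "model" (the sign of the sum of the kept nodes), the fitness function (accuracy minus a size penalty), and all names and hyperparameters. None of it is taken from the paper.

```python
import random

rng = random.Random(0)
N_NODES, N_USEFUL = 20, 4

def make_row():
    x = [rng.gauss(0, 1) for _ in range(N_NODES)]
    y = sum(x[:N_USEFUL]) > 0  # only the first few "nodes" carry signal
    return x, y

full_data = [make_row() for _ in range(2000)]
search_data = rng.sample(full_data, 100)  # 5% subsample used during the search

def accuracy(mask, data):
    """Toy model: predict True when the sum over the kept nodes is positive."""
    hits = sum((sum(xi for xi, keep in zip(x, mask) if keep) > 0) == y
               for x, y in data)
    return hits / len(data)

def fitness(mask, data, size_penalty=0.2):
    """Reward accuracy and penalize the fraction of nodes kept (assumed metric)."""
    return accuracy(mask, data) - size_penalty * sum(mask) / len(mask)

def evolve(data, pop_size=30, generations=40, mutation_rate=0.05):
    pop = [[rng.random() < 0.5 for _ in range(N_NODES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, data), reverse=True)
        parents = pop[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_NODES)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, data))

best = evolve(search_data)
print(sum(best), accuracy(best, full_data))  # nodes kept, accuracy on the full data
```

The mask found on the subsample is scored on the full dataset only at the end, mirroring the idea that the search itself does not need most of the training data.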