This article describes a new rule-enhanced penalized regression procedure for the generalized regression problem of predicting scalar responses from observation vectors in the absence of a preferred functional form. It enhances standard L1-penalized regression by adding dynamically generated rules, that is, new 0-1 covariates, corresponding to multidimensional "box" sets. In contrast to prior approaches to this class of problems, we draw heavily on standard (but non-polynomial-time) mathematical programming techniques, enhanced by parallel computing. We identify and incorporate new rules using a form of classical column generation. The resulting pricing subproblem, which we call rectangular maximum agreement, is NP-hard; we solve it either exactly by a specialized parallel branch-and-bound method or approximately by a greedy heuristic based on Kadane's algorithm. The resulting rule-enhanced regression method can be computationally intensive when we solve the subproblems exactly, but our computational tests suggest that it outperforms prior methods at making accurate and stable predictions from relatively small data samples. Through selective use of our greedy heuristic, we can make our method's run time generally competitive with some established methods, without sacrificing prediction performance. History: This paper has been accepted for the INFORMS Journal on Optimization Special Issue on Machine Learning and Optimization.
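To give a feel for the building block behind the greedy heuristic, the following is a minimal sketch of Kadane's algorithm and of how it could score a one-dimensional "box" (an interval on a single covariate). It is not the authors' implementation: the function names `kadane` and `best_interval_1d`, the use of signed per-observation weights, and the objective (the absolute value of the total weight covered by the interval) are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def kadane(weights: Sequence[float]) -> Tuple[float, int, int]:
    """Kadane's algorithm: maximum-sum non-empty contiguous run.

    Returns (best_sum, start, end) with `end` inclusive.
    """
    best_sum = cur_sum = weights[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(weights)):
        if cur_sum < 0:
            # A negative running sum can only hurt; restart the run at i.
            cur_sum = weights[i]
            cur_start = i
        else:
            cur_sum += weights[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end


def best_interval_1d(
    x: Sequence[float], w: Sequence[float]
) -> Tuple[float, float, float]:
    """Hypothetical 1-D illustration: find the interval [lo, hi] of x-values
    maximizing |sum of signed weights of the observations it covers|.

    Returns (score, lo, hi).
    """
    # Observations sharing an x-value cannot be separated by an interval,
    # so aggregate their weights first.
    totals: Dict[float, float] = {}
    for xi, wi in zip(x, w):
        totals[xi] = totals.get(xi, 0.0) + wi
    values: List[float] = sorted(totals)
    agg = [totals[v] for v in values]

    # Maximizing |sum| means trying both the weights and their negation.
    pos = kadane(agg)
    neg = kadane([-v for v in agg])
    score, start, end = pos if pos[0] >= neg[0] else neg
    return score, values[start], values[end]


if __name__ == "__main__":
    x = [3, 1, 2, 5, 4]
    w = [2.0, -1.0, 3.0, -4.0, 1.0]
    print(best_interval_1d(x, w))
```

A greedy multidimensional heuristic could, under the same assumptions, apply such a one-dimensional search to one covariate at a time while holding the box's other sides fixed; the exact scheme used in the paper is not specified in this abstract.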
Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what is variously termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game Go, and self-driving cars, these new technologies also have profound implications for computational and experimental science and engineering, as well as for the exascale computing systems that the Department of Energy (DOE) is developing to support those disciplines. Not only do these learning technologies open up exciting opportunities for scientific discovery on exascale systems, but they also appear poised to have important implications for the design and use of exascale computers themselves, including high-performance computing (HPC) for ML and ML for HPC. The overarching goal of the ExaLearn co-design project is to provide exascale ML software for use by Exascale Computing Project (ECP) applications, other ECP co-design centers, and DOE experimental facilities and leadership-class computing facilities.