We demonstrate that the previously introduced Widening framework is applicable to state-of-the-art Machine Learning algorithms. Using Krimp, an itemset mining algorithm, we show that parallelizing the search finds better solutions in nearly the same time as the original, sequential, greedy algorithm. We also introduce Reverse Standard Candidate Order (RSCO) as a candidate ordering heuristic for Krimp.

1 Introduction

Research into parallelism in Machine Learning has primarily focused on reducing the execution time of existing algorithms, e.g., parallelized k-Means [23,17,14,26] and DBSCAN [11,4,7]. There have been some exceptions, such as metalearning and ensemble methods [9], which have employed heterogeneous algorithms in parallel, and [3], which describes the application of parallelism to simple examples. Recent work [2,15] describes Widening, a framework for employing parallel resources to increase accuracy. With Widening, measures of diversity are used to guarantee that the parallel search paths explore disparate regions of the solution space, thereby stepping around the common tendency of greedy algorithms to settle in local optima. Thus far, work has concentrated on a proof of concept and demonstrative applications to algorithms for solving the Set Cover Problem and for the creation of Decision Trees.

This document describes the same approach applied to a state-of-the-art algorithm, Krimp [24]. Krimp finds "interesting" itemsets in a transactional database via the Minimum Description Length (MDL) principle [21]. The authors summarize the method as "the best set of patterns [being] the set of patterns that describes the data best," where the best set of itemsets is the one that provides the highest compression under MDL. The algorithm not only provides a solution to the problem of pattern explosion, thereby greatly reducing the set of itemsets used to generate association rules, but also delivers exceptional performance in other applications such as classification [24].
This paper demonstrates that it is possible to apply Widening to find even more interesting sets of itemsets than those found by the standard Krimp algorithm.
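The MDL idea behind Krimp can be made concrete with a small sketch. The function below is an illustrative simplification, not the authors' implementation: it charges each transaction the Shannon-optimal code length, -log2(usage/total usage), for every code-table itemset used in a greedy cover. The function name `encoded_db_length` and the dict-based code table are assumptions for this example; Krimp's full score also includes the encoded size of the code table itself, which is omitted here.

```python
import math

def encoded_db_length(database, code_table):
    """Sketch of Krimp-style MDL scoring: encode each transaction with a
    greedy cover drawn from the code table (longer itemsets first), charging
    Shannon-optimal -log2(usage / total_usage) bits per code used."""
    total = sum(code_table.values())
    # code length for each itemset, derived from how often it is used
    bits = {iset: -math.log2(use / total)
            for iset, use in code_table.items() if use > 0}
    ordered = sorted(bits, key=lambda s: (-len(s), s))  # longer itemsets first
    length = 0.0
    for transaction in database:
        remaining = set(transaction)
        for iset in ordered:
            if remaining >= set(iset):  # itemset still fits the uncovered part
                length += bits[iset]
                remaining -= set(iset)
    return length
```

Under this scoring, a better set of itemsets is simply one that yields a smaller total encoded length; Widening's parallel search paths each maintain their own candidate code table and compare these lengths.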
Abstract. We demonstrate the application of Widening to learning performant Bayesian networks for use as classifiers. Widening is a framework for utilizing parallel resources and diversity to find models in a hypothesis space that are potentially better than those of a standard greedy algorithm. This work demonstrates that widened learning of Bayesian networks, using the Frobenius Norm of the networks' graph Laplacian matrices as a distance measure, can create Bayesian networks that are better classifiers than those generated by popular Bayesian network algorithms.
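The distance measure named in the abstract can be sketched in a few lines. This is a minimal illustration under the assumption that each network structure is given as an adjacency matrix over the same node set and that the standard combinatorial Laplacian L = D - A is used; the paper may treat edge direction differently, and the function name `laplacian_distance` is made up for this example.

```python
import numpy as np

def laplacian_distance(adj_a, adj_b):
    """Distance between two network structures: the Frobenius norm of the
    difference of their graph Laplacians L = D - A, where D is the diagonal
    degree matrix. Assumes both adjacency matrices cover the same nodes."""
    lap_a = np.diag(adj_a.sum(axis=1)) - adj_a
    lap_b = np.diag(adj_b.sum(axis=1)) - adj_b
    return float(np.linalg.norm(lap_a - lap_b, ord="fro"))
```

A Widening-style search can then use this distance to keep parallel candidate networks structurally far apart from one another.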
Widening is a method where parallel resources are used to find better solutions from greedy algorithms instead of merely trying to find the same solutions more quickly. To date, every example of Widening has used some form of communication between the parallel workers to maintain their distances from one another in the model space. For the first time, we present a communication-free, widened extension to a standard machine learning algorithm. By using Locality Sensitive Hashing on the Bayesian networks' Fiedler vectors, we demonstrate the ability to learn classifiers superior to those of standard implementations and to those generated with a greedy heuristic alone.
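The communication-free mechanism described above can be illustrated with a short sketch: each worker hashes its network's Fiedler vector (the eigenvector of the graph Laplacian belonging to the second-smallest eigenvalue) with random-hyperplane Locality Sensitive Hashing, so similar networks land in the same bucket without any coordination. This is an assumed minimal rendering, not the paper's implementation; an undirected Laplacian is used, and the helper names `fiedler_vector` and `lsh_bucket` are invented for the example.

```python
import numpy as np

def fiedler_vector(adj):
    """Fiedler vector of a graph given by an adjacency matrix: the eigenvector
    of the Laplacian L = D - A for its second-smallest eigenvalue.
    np.linalg.eigh returns eigenvalues in ascending order, so column 1 is it."""
    lap = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(lap)
    return vecs[:, 1]

def lsh_bucket(vec, planes):
    """Random-hyperplane LSH: the sign pattern of the projections onto a fixed
    set of random hyperplanes is the bucket key, so nearby vectors tend to
    share a bucket -- and workers need no communication to agree on buckets."""
    return tuple(int(np.dot(p, vec) >= 0) for p in planes)
```

Because every worker draws the same fixed hyperplanes (e.g., from a shared random seed), two workers holding similar networks will tend to compute the same bucket key independently; diversity is then enforced by keeping at most one candidate per bucket.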
Index investing has an advantage over active investment strategies because less frequent trading results in lower expenses, yielding higher long-term returns. Index tracking is a popular investment strategy that attempts to find a portfolio replicating the performance of a collection of investment vehicles. This paper considers index tracking from the perspective of solution space exploration. Three search space heuristics, in combination with three portfolio tracking error methods, are compared in order to select a tracking portfolio whose returns mimic a benchmark index. Experimental results on real-world datasets show that Widening, a metaheuristic using diverse parallel search paths, finds solutions superior to those found by the reference heuristics. Presented here are the first results using Widening on time-series data.
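The objective in index tracking can be made concrete with one common tracking-error definition: the standard deviation of the difference between portfolio returns and benchmark returns. This is an illustrative choice, not necessarily one of the three methods compared in the paper, and the function name `tracking_error` is an assumption for this sketch.

```python
import numpy as np

def tracking_error(portfolio_returns, index_returns):
    """One common tracking-error definition: the sample standard deviation
    (ddof=1) of the per-period difference between portfolio and benchmark
    returns. A perfect tracking portfolio scores exactly zero."""
    diff = np.asarray(portfolio_returns) - np.asarray(index_returns)
    return float(np.std(diff, ddof=1))
```

A search heuristic, widened or not, then scores each candidate portfolio by such a measure and keeps the candidates with the smallest tracking error.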