2017
DOI: 10.1016/j.eswa.2017.04.003

An up-to-date comparison of state-of-the-art classification algorithms

Abstract: Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly availa…

Cited by 362 publications (202 citation statements)
References 65 publications
“…We refrain from using the fitness function (or similar optimisation criteria) to measure the manifold "quality" so as not to introduce bias towards any specific manifold learning method. The scikit-learn [14] implementation of the Random Forest (RF) classification algorithm (with 100 trees) is used as it is a widely used algorithm with high classification accuracy, is stable across a range of datasets, and has reasonably low computational cost [21]. While other algorithms could also be compared, we found the results to be generally consistent across algorithms, and so do not include these for brevity.…”
Section: Experiments Design
confidence: 99%
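The quoted setup maps directly onto scikit-learn. Below is a minimal sketch of that configuration (RandomForestClassifier with 100 trees); the dataset, train/test split, and random seed are illustrative assumptions, not details taken from the citing paper:

```python
# Hedged sketch of the quoted Random Forest setup: scikit-learn's
# RandomForestClassifier with 100 trees. The digits dataset, the
# 70/30 split, and the seed are illustrative assumptions only.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

rf = RandomForestClassifier(n_estimators=100, random_state=0)  # 100 trees, as quoted
rf.fit(X_train, y_train)
print("RF accuracy:", accuracy_score(y_test, rf.predict(X_test)))
```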
“…kNN is used as an example of a very simple, distance-based classification algorithm. RF, in contrast, is much more sophisticated (using an ensemble decision-based approach) and is widely used for its high classification accuracy and applicability to a wide range of datasets [44]. We use the standard default implementations of these classifiers in the scikit-learn package [34], with k = 3 for kNN, and 100 base estimators for RF.…”
Section: Classification Accuracy as a Proxy for Manifold Quality
confidence: 99%
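The two baselines in this quote can be reproduced in a few lines of scikit-learn. The sketch below uses the stated settings (k = 3 for kNN, 100 base estimators for RF, defaults otherwise); the toy dataset and cross-validation protocol are assumptions for illustration, not taken from the citing paper:

```python
# Sketch of the two quoted baselines: kNN with k = 3 and a Random
# Forest with 100 base estimators, scikit-learn defaults otherwise.
# The iris dataset and 5-fold CV are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

classifiers = {
    "kNN (k=3)": KNeighborsClassifier(n_neighbors=3),          # simple, distance-based
    "RF (100 trees)": RandomForestClassifier(n_estimators=100),  # ensemble of trees
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```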
“…[51] can be used as guidance. The recent study [52] collected an extensive comparison of several machine learning algorithms. While we cannot contribute much to that topic, we show that for the application discussed throughout this paper it is indeed the case: the DNN technique by far outperforms the more classical approaches.…”
Section: B. Alternative ML Techniques
confidence: 99%