Decision Tree as an Accelerator for Support Vector Machines

Chang, Fu; Liu, Chan-Cheng

doi:10.5772/52227

Cited by 12 publications

(5 citation statements)

References 17 publications


“…Latent-lSVM in Do and Poulet (2019) partitions the training data set with latent Dirichlet allocation (Blei et al., 2003). DTSVM (Chang et al., 2010; Chang and Liu, 2012) and tSVM (Do and Poulet, 2017) use the decision tree algorithm (Breiman et al., 1984; Quinlan, 1993) to split the full data set into disjoint regions (tree leaves) and then the algorithm builds the local SVMs for classifying the individuals in tree leaves. These algorithms aim at speeding up the learning time.…”

Section: Discussion On Related Work (mentioning)

confidence: 99%

Incremental and parallel proximal SVM algorithm tailored on the Jetson Nano for the ImageNet challenge

2022

IJWIS


Purpose: This paper aims to propose the new incremental and parallel training algorithm of proximal support vector machines (Inc-Par-PSVM) tailored on the edge device (i.e. the Jetson Nano) to handle the large-scale ImageNet challenging problem. Design/methodology/approach: The Inc-Par-PSVM trains in the incremental and parallel manner ensemble binary PSVM classifiers used for the One-Versus-All multiclass strategy on the Jetson Nano. The binary PSVM model is the average in bagged binary PSVM models built in undersampling training data block. Findings: The empirical test results on the ImageNet data set show that the Inc-Par-PSVM algorithm with the Jetson Nano (Quad-core ARM A57 @ 1.43 GHz, 128-core NVIDIA Maxwell architecture-based graphics processing unit, 4 GB RAM) is faster and more accurate than the state-of-the-art linear SVM algorithm run on a PC [Intel(R) Core i7-4790 CPU, 3.6 GHz, 4 cores, 32 GB RAM]. Originality/value: The new incremental and parallel PSVM algorithm tailored on the Jetson Nano is able to efficiently handle the large-scale ImageNet challenge with 1.2 million images and 1,000 classes.
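The averaging of bagged binary PSVM models inside a One-Versus-All scheme can be illustrated with a small NumPy sketch. This is not the Inc-Par-PSVM implementation: it is sequential and non-incremental, it uses the standard linear proximal SVM closed form, and it assumes that "undersampling" means drawing a random subset of the rest-class as large as the positive class. The function names (`fit_psvm`, `fit_bagged_psvm`, `fit_one_vs_all`, `predict`) and the toy data are hypothetical.

```python
import numpy as np


def fit_psvm(A, d, nu=1.0):
    """Linear proximal SVM: solve (I/nu + E^T E) [w; gamma] = E^T d, with E = [A, -1]."""
    E = np.hstack([A, -np.ones((A.shape[0], 1))])
    lhs = np.eye(E.shape[1]) / nu + E.T @ E
    z = np.linalg.solve(lhs, E.T @ d)
    return z[:-1], z[-1]  # weight vector w and offset gamma


def fit_bagged_psvm(A, d, n_bags=5, nu=1.0, rng=None):
    """Average PSVM models, each trained on all positives plus an undersampled negative set."""
    rng = np.random.default_rng(rng)
    pos = np.flatnonzero(d > 0)
    neg = np.flatnonzero(d < 0)
    k = min(len(pos), len(neg))  # assumption: balance the two classes in every bag
    ws, gammas = [], []
    for _ in range(n_bags):
        idx = np.concatenate([pos, rng.choice(neg, size=k, replace=False)])
        w, gamma = fit_psvm(A[idx], d[idx], nu)
        ws.append(w)
        gammas.append(gamma)
    return np.mean(ws, axis=0), float(np.mean(gammas))


def fit_one_vs_all(A, y, **kwargs):
    """Train one bagged binary PSVM per class (class c against the rest)."""
    classes = np.unique(y)
    W, G = [], []
    for c in classes:
        w, gamma = fit_bagged_psvm(A, np.where(y == c, 1.0, -1.0), **kwargs)
        W.append(w)
        G.append(gamma)
    return classes, np.array(W), np.array(G)


def predict(classes, W, G, A):
    """Assign each row of A to the class with the largest score x.w - gamma."""
    return classes[np.argmax(A @ W.T - G, axis=1)]


# Toy data (illustrative only): three well-separated Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
centers = [[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]]
X = np.vstack([rng.normal(c, 0.5, size=(100, 2)) for c in centers])
y = np.repeat([0, 1, 2], 100)

classes, W, G = fit_one_vs_all(X, y, n_bags=5, nu=1.0, rng=0)
accuracy = float(np.mean(predict(classes, W, G, X) == y))
print(f"training accuracy: {accuracy:.3f}")
```

Because each binary model is a single linear system solve, the per-class and per-bag fits are independent, which is what makes a parallel variant natural; that parallel and incremental machinery is left out of this sketch.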


“…krSVM (Do & Poulet, 2015) is to learn the random ensemble of kSVM models. DTSVM (Chang, Guo, Lin, & Lu, 2010; Chang & Liu, 2012) and tSVM (Do & Poulet, 2016b) use decision tree algorithms (Breiman, Friedman, Olshen, & Stone, 1984; Quinlan, 1993) to split the full training dataset into t terminal-nodes (tree leaves); follow which the tSVM algorithm builds local SVM models for classifying impurity terminal-nodes (with a mixture of labels) while DTSVM learns local SVM models from all tree leaves. These algorithms are shown to reduce the computational cost for dealing with large datasets while maintaining the prediction correctness.…”

Section: Discussion On Related Work (mentioning)

confidence: 99%

Decision trees using local support vector regression models for large datasets

Tran-Nguyen

Bui

2019

Journal of Information and Telecommunication


Our proposed decision trees using local support vector regression models (tSVR, rtSVR) aim to efficiently handle the regression task for large datasets. The learning algorithm tSVR of regression models is done by two main steps. The first one is to construct a decision tree regressor for partitioning the full training dataset into k terminal-nodes (subsets), followed which the second one is to learn the SVR model from each terminal-node to predict the data locally in a parallel way on multi-core computers. The algorithm rtSVR learns the random forest of decision trees with local SVR models for improving the prediction correctness against the tSVR model alone. The performance analysis shows that our algorithms tSVR, rtSVR are efficient in terms of the algorithmic complexity and the generalization ability compared to the classical SVR. The experimental results on five large datasets from UCI repository showed that proposed tSVR and rtSVR algorithms are faster than the standard SVR in training the non-linear regression model from large datasets while achieving the high correctness in the prediction. Typically, the average training time of tSVR and rtSVR are 1282.66 and 482.29 times faster than the standard SVR; furthermore, tSVR and rtSVR improve 59.43%, 63.70% of the relative prediction correctness compared to the standard SVR.
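The two steps described in this abstract (partition with a decision tree, then fit one local SVR per terminal node) can be sketched with scikit-learn. This is a minimal illustration, not the authors' tSVR code: the class name `TreeLocalSVR`, its parameters, and the synthetic data are assumptions, and the local models are trained sequentially rather than in parallel.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


class TreeLocalSVR:
    """Hypothetical helper: a tree partitions the data, one SVR is fit per leaf."""

    def __init__(self, max_leaf_nodes=4, min_samples_leaf=20, **svr_params):
        self.tree = DecisionTreeRegressor(
            max_leaf_nodes=max_leaf_nodes,
            min_samples_leaf=min_samples_leaf,
            random_state=0,
        )
        self.svr_params = svr_params
        self.local_models = {}

    def fit(self, X, y):
        # Step 1: grow the tree; apply() returns the leaf index of each sample.
        self.tree.fit(X, y)
        leaf_ids = self.tree.apply(X)
        # Step 2: fit an independent SVR on the samples that fall in each leaf.
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            self.local_models[leaf] = SVR(**self.svr_params).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        # Route each sample to its leaf, then let that leaf's SVR predict it.
        leaf_ids = self.tree.apply(X)
        y_pred = np.empty(X.shape[0])
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            y_pred[mask] = self.local_models[leaf].predict(X[mask])
        return y_pred


# Synthetic 1-D regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(400, 1))
y = np.sin(2.0 * X[:, 0]) + rng.normal(0.0, 0.1, size=400)

model = TreeLocalSVR(max_leaf_nodes=4, C=10.0, epsilon=0.05).fit(X, y)
y_pred = model.predict(X)
mse = float(np.mean((y_pred - y) ** 2))  # training error, not a held-out estimate
print(f"training MSE: {mse:.4f}")
```

Each local SVR sees only the samples in its leaf, so the training cost of the kernel method depends on the leaf sizes rather than on the full dataset size. The same routing pattern applies to the classification variants (DTSVM, tSVM) if the regressors are swapped for classifiers.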


“…More recent kSVM, krSVM (random ensemble of kSVM), and tSVM propose to parallely train the local nonlinear SVMs instead of weighting linear ones of CSVM. DTSVM uses the decision tree algorithm to split the full dataset into disjoint regions (tree leaves), and then the algorithm builds the local SVMs for classifying the individuals in tree leaves. These algorithms aim at speeding up the learning time.…”

Section: Discussion On Related Work (mentioning)

confidence: 99%

Latent‐lSVM classification of very high‐dimensional and large‐scale multi‐class datasets

Nghi

Poulet

2017

Concurrency and Computation


We propose a new parallel learning algorithm of latent local support vector machines (SVM), called latent-lSVM for effectively classifying very high-dimensional and large-scale multi-class datasets. The common framework of texts/images classification tasks using the Bag-Of-(visual)-Words model for the data representation leads to hard classification problem with thousands of dimensions and hundreds of classes. Our latent-lSVM algorithm performs these complex tasks into two main steps. The first one is to use latent Dirichlet allocation for assigning the datapoint (text/image) to some topics (clusters) with the corresponding probabilities. This aims at reducing the number of classes and the number of datapoints in the cluster compared to the full dataset, followed by the second one: to learn in a parallel way nonlinear SVM models to classify data clusters locally. The numerical test results on nine real datasets show that the latent-lSVM algorithm achieves very high accuracy compared to state-of-the-art algorithms. An example of its effectiveness is given with an accuracy of 70.14% obtained in the classification of Book dataset having 100 000 individuals in 89 821 dimensional input space and 661 classes in 11.2 minutes using a PC Intel(R) Core i7-4790 CPU, 3.6 GHz, 4 cores.

KEYWORDS: Latent Dirichlet allocation (LDA), high-dimensional and large-scale multi-class data classification, parallel learning on multi-core computers, support vector machines (SVMs)

INTRODUCTION: There are more and more multimedia data stored electronically, with increasing number of internet users and mobile internet access sharing videos, songs, or photos. There are more than 1 billion daily active users-nearly one-third of all people on the Internet (around 46% of the world population)-on Youtube and Facebook (Amazon and Yahoo! have even more), 600 000 hours (68 years) of videos are uploaded on Youtube every day, and 46 000 years are viewed at the same time. Almost all mobile phones can take photos: 2 trillions photos will be shared this year. There are 310 millions Twitter users and more than 600 millions Weibo users (the "Chinese Twitter"; Asia is the first Internet region with more than 50% of Internet users in 2016). The number of data is always increasing and their sizes too: 4K or 3D videos, sound in Dolby 5.1, higher and higher photo resolution, text messages replaced by voice messages. This leads to very huge amount of data; there is a need for high performance classification algorithms in order to help us find what we are looking for. We present a new fast and accurate parallel local support vector machine (SVM) algorithm for the classification of very large scale and high-dimensional multi-class datasets. The experimental results are performed on two different kinds of datasets: image and text classification. The classification of texts/images is one of the important research topics in text mining, computer vision, and machine learning. The purpose is to ask a computer to assign the predefined class label to the text/image. The popu...
