2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)
DOI: 10.1109/fskd.2011.6020072
A distributed SVM for scalable image annotation

Abstract: Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. This paper presents MRSVM, a distributed SVM algorithm for large-scale image annotation which partitions the training data set into smaller subsets and trains SVMs in parallel…
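To make the partition-and-train idea concrete, below is a minimal sketch in Python, using scikit-learn's SVC and local multiprocessing as stand-ins for the paper's Hadoop MapReduce setup; the function names and the retrain-on-pooled-support-vectors merge step are illustrative assumptions, not the authors' MRSVM implementation.

```python
# Minimal sketch of data-partitioned SVM training (illustrative only; MRSVM itself
# runs on Hadoop MapReduce, which this local multiprocessing stand-in does not replicate).
from multiprocessing import Pool

import numpy as np
from sklearn.svm import SVC


def train_partition(args):
    """Map-style step: train an SVM on one partition and return its support vectors."""
    X_part, y_part = args
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(X_part, y_part)
    # Returning only the support vectors keeps the data shipped back to the merger small.
    return clf.support_vectors_, y_part[clf.support_]


def train_distributed_svm(X, y, n_partitions=4):
    """Partition the training set, train sub-SVMs in parallel, then retrain on the
    pooled support vectors (a common merge heuristic, not necessarily MRSVM's)."""
    idx = np.array_split(np.random.permutation(len(y)), n_partitions)
    parts = [(X[i], y[i]) for i in idx]
    with Pool(n_partitions) as pool:
        results = pool.map(train_partition, parts)
    X_sv = np.vstack([sv for sv, _ in results])
    y_sv = np.concatenate([labels for _, labels in results])
    final_clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    final_clf.fit(X_sv, y_sv)  # reduce-style step: one SVM over all support vectors
    return final_clf


# Usage (guard needed on platforms that spawn worker processes):
# if __name__ == "__main__":
#     model = train_distributed_svm(X_train, y_train, n_partitions=4)
```

Shipping only support vectors between the parallel trainers and the final merge mirrors the general goal of keeping inter-node data movement small, although the exact combination strategy used by MRSVM is not described in this excerpt.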

Cited by 15 publications (8 citation statements)
References 14 publications
“…Considering that the total processing power of the cluster is $P = \sum_{i=1}^{n} p_i$, where $n$ represents the number of processors employed in the cluster and $p_i$ represents the processing speed of the $i$th processor, for a Hadoop cluster with a total computing capacity $P$ the level of heterogeneity of the Hadoop cluster can be defined using Eq. (5).…”
Section: Load Balancing
confidence: 99%
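For illustration only, the quantities mentioned in this excerpt can be sketched as follows; the total capacity $P = \sum_{i=1}^{n} p_i$ follows the quoted definition, while the heterogeneity function is merely a common stand-in (coefficient of variation of node speeds), since the cited Eq. (5) is not reproduced in the excerpt.

```python
# Sketch of the quantities in the quoted passage. The total capacity follows the
# citation; the heterogeneity measure below is an assumed stand-in, NOT the cited Eq. (5).
import numpy as np


def cluster_capacity(speeds):
    """Total processing power P of the cluster: the sum of per-node speeds p_i."""
    return float(np.sum(speeds))


def heterogeneity(speeds):
    """Illustrative heterogeneity: relative spread of node speeds (0 for a homogeneous cluster)."""
    speeds = np.asarray(speeds, dtype=float)
    return float(np.std(speeds) / np.mean(speeds))


# Example: a 4-node Hadoop cluster with unequal node speeds.
p = [2.0, 2.0, 1.0, 0.5]
print(cluster_capacity(p), heterogeneity(p))
```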
“…However, with the rapid development of a variety of computer systems in the electric power grid, it has become a challenging issue to ensure that the large amount of CIM data is correct and consistent at all times. MapReduce has become a major computing model in support of data-intensive applications [4]. MapReduce facilitates a number of important functions such as partitioning the input data, scheduling MapReduce jobs across a cluster of participating nodes, handling node failures, and managing the required network communications [5]. We have implemented a MapReduce-based parallel K-means clustering for scalable information retrieval [6].…”
confidence: 99%
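As a rough illustration of the map and reduce steps behind such a MapReduce-based parallel K-means, the sketch below runs one iteration in plain Python; the function names and the local "map task" split are assumptions, not the cited implementation.

```python
# Minimal sketch of the map/reduce steps of a parallel K-means iteration,
# in plain Python rather than on Hadoop (illustrative only).
import numpy as np


def kmeans_map(points, centroids):
    """Map: assign each point to its nearest centroid, emitting (cluster_id, (point, 1))."""
    for point in points:
        cid = int(np.argmin(np.linalg.norm(centroids - point, axis=1)))
        yield cid, (point, 1)


def kmeans_reduce(pairs, k, dim):
    """Reduce: sum points and counts per cluster, then recompute the centroids."""
    sums = np.zeros((k, dim))
    counts = np.zeros(k)
    for cid, (point, count) in pairs:
        sums[cid] += point
        counts[cid] += count
    counts[counts == 0] = 1  # keep empty clusters from dividing by zero
    return sums / counts[:, None]


# One iteration over a toy dataset split into two "map tasks".
data = np.random.rand(100, 2)
centroids = data[np.random.choice(len(data), 3, replace=False)]
pairs = list(kmeans_map(data[:50], centroids)) + list(kmeans_map(data[50:], centroids))
centroids = kmeans_reduce(pairs, k=3, dim=2)
```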
“…Catanzaro et al. proposed a parallel SMO algorithm based on MapReduce, but this algorithm is somewhat inefficient because of the iterative nature of SMO. Alham discussed an efficient and scalable approach based on a single MapReduce phase. They reported that this method involves minimal data movement between nodes and also minimizes communication overheads.…”
Section: Multimedia Applications in MapReduce
confidence: 99%
“…SMO speeds up the training phase only, with no control over the number of support vectors or the testing time. To achieve additional acceleration, many parallel implementations of SMO (Zeng et al. 2008; Peng, Ma, and Hong 2009; Catanzaro et al. 2008; Alham et al. 2010; Cao et al. 2006) were developed on various parallel programming platforms, including graphics processing units (GPU) (Catanzaro et al. 2008), Hadoop MapReduce (Alham et al. 2010), and the message passing interface (MPI) (Cao et al. 2006).…”
Section: Improving SVM Computational Requirements
confidence: 99%