A survey on data stream clustering and classification

Nguyen, Hung D.; Woon, Yew-Kwong; Ng, Wee-Keong

doi:10.1007/s10115-014-0808-1

Cited by 275 publications

(137 citation statements)

References 93 publications

Supporting

Mentioning

119

Contrasting

Unclassified

“…For example, the clustering algorithm PCA, k‐means attempts to find k representative groups according to the relative distance of points in R^m. The main characteristic of this second type of learning is that algorithms do not have access to labels, therefore the problem is no longer to find a map but instead analyze how points are organized in the input space. Application of intelligent computing can be studied Vasant et al, Panda et al, Abu Zaher et al and integration of SA and clustering algorithm is elaborated in Seifollahi …”

Section: Mathematical Formulation and The Methodology (mentioning)

confidence: 99%

Efficiency analysis for stochastic dynamic facility layout problem using meta‐heuristic, data envelopment analysis and machine learning

Tayal, Köse, Solanki, et al. (2019)

Computational Intelligence

The facility layout problem (FLP) is a combinatorial optimization problem. The performance of the layout design is significantly impacted by diverse, multiple factors. The use of algorithmic or procedural design methodology in ranking and identification of efficient layout is ineffective. In this context, this study proposes a three‐stage methodology where data envelopment analysis (DEA) is augmented with unsupervised and supervised machine learning (ML). In stage 1, unsupervised ML is used for the clustering of the criteria in which the layouts need to be evaluated using homogeneity. Layouts are generated using simulated annealing, chaotic simulated annealing, and hybrid firefly algorithm/chaotic simulated annealing meta‐heuristics. In stage 2, the nonparametric DEA approach is used to identify efficient and inefficient layouts. Finally, supervised ML utilizes the performance frontiers from DEA (efficiency scores) to generate a trained model for getting the unique rankings and predicted efficiency scores of layouts. The proposed methodology overcomes the limitations associated with large datasets that contain many inputs / outputs from the conventional DEA and improves the prediction accuracy of layouts. A Gaussian distribution product demand dataset for time period T = 5 and facility size N = 12 is used to prove the effectiveness of the methodology.

“…Most of the conventional learning techniques assume that there is a static dataset generated by an unknown yet stationary probability distribution, which can be stored and analyzed in multiple steps. Nevertheless, none of the latter assumptions are verifiable in several streaming scenarios and the development of new learners must account for several constraints [1,2,10,21,22,30,33]:…”

Section: Learning From Data Streams (mentioning)

confidence: 99%

On Dynamic Feature Weighting for Feature Drifting Data Streams

Barddal, Gomes, Enembreck, et al. (2016)

Machine Learning and Knowledge Discovery in Databases

“…Over the last few years, many real-world applications that generate continuous streams of data have emerged (Nguyen, Woon, & Ng, 2015). For efficient interpretation of these streams of data, a timely and meaningful classification process is required.…”

Section: Introduction (mentioning)

confidence: 99%

“…A classification process involves using a set of training data to learn a computational model (classifier) and then employing the developed model to classify a previously unseen stream of data (Dongre & Malik, 2014). Classical learning methods perform classification tasks off-line using a classifier trained on streams of data gathered in the past (Nguyen et al, 2015). However, several applications require on-line classification.…”

Section: Introduction (mentioning)

confidence: 99%

Adaptive SVM for Data Stream Classification

Abdulkarim (2017)

SACJ

In this paper, we address the problem of learning an adaptive classifier for the classification of continuous streams of data. We present a solution based on incremental extensions of the Support Vector Machine (SVM) learning paradigm that updates an existing SVM whenever new training data are acquired. To ensure that the SVM effectiveness is guaranteed while exploiting the newly gathered data, we introduce an on-line model selection approach in the incremental learning process. We evaluated the proposed method on real world applications including on-line spam email filtering and human action classification from videos. Experimental results show the effectiveness and the potential of the proposed approach.
