Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

Xu, Lixiong; Huang, Yuan; Shen, Xiaodong; Liu, Yang

doi:10.1155/2017/5081526

Cited by 7 publications

(7 citation statements)

References 16 publications


“…As it has been observed by Xu et al 15 in big data research, GEP encounters low efficiency issue due to its long time mining processes. To improve the efficiency of GEP, their paper proposes a parallelized GEP algorithm using MapReduce computing model.…”

Section: Related Work (mentioning)

confidence: 91%


Implementing Gene Expression Programming in the Parallel Environment for Big Datasets’ Classification

Jędrzejowicz

Jędrzejowicz

Wierzbowska

2019

Vietnam J. Comp. Sci.


The paper investigates a Gene Expression Programming (GEP)-based ensemble classifier constructed using the stacked generalization concept. The classifier has been implemented with a view to enabling parallel processing with the use of Spark and SWIM, an open source genetic programming library. The classifier has been validated in computational experiments carried out on benchmark datasets. The paper also investigates how the results are influenced by some settings. It is an extension of a previous paper by the authors.



“…In Ref. 15, 60 instances from the original Wine dataset have been used as the testing set, the rest was multiplied to 1,024 MB and was used as the training data. A parallel GEP-based algorithms classified data in about 5,000 s. The experiment was run on a cluster with five nodes.…”

Section: Sampling Vs Processing All Data (mentioning)

confidence: 99%

Implementing Gene Expression Programming in the Parallel Environment for Big Datasets’ Classification

Jędrzejowicz

Jędrzejowicz

Wierzbowska

2019

Vietnam J. Comp. Sci.


“…These previous works motivate two types of MapReduce‐based distributed GEP algorithms presented in this paper. The first distributed GEP algorithm based on our previous work specially focuses on processing the large‐scale classification. However, the first algorithm cannot directly output the mined equation.…”

Section: Related Work (mentioning)

confidence: 99%

“…This point significantly leads to the accuracy loss of the mined equation in each mapper. In order to complement the accuracy for the further parallelized classification, the ensemble techniques with bootstrapping and majority voting are adopted …”

Section: MapReduce‐based Parallel GEP (mentioning)

confidence: 99%

“…In order to complement the accuracy for the further parallelized classification, the ensemble techniques with bootstrapping and majority voting are adopted. 16 Bootstrapping is a kind of sampling algorithm. The basic concept of bootstrapping is to control the appearance time of the training instances in the sampled samples.…”

Section: MapReduce-based Parallel GEP In Enabling Large-scale Class (mentioning)

confidence: 99%
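
The bootstrapping and majority voting mentioned in the statement above can be illustrated with a minimal Python sketch. It is not the cited authors' implementation: the function names, the toy one-dimensional data, and `train_stub` (a hypothetical threshold learner standing in for a GEP-evolved classifier) are all assumptions made for illustration. Each model is trained on a sample drawn with replacement, so a given training instance may appear several times or not at all, and the ensemble label is the most frequent prediction.

```python
import random
from collections import Counter


def bootstrap_sample(instances, rng):
    """Draw len(instances) items with replacement from the training set."""
    return [rng.choice(instances) for _ in instances]


def majority_vote(predictions):
    """Return the most frequent label among the individual predictions."""
    return Counter(predictions).most_common(1)[0][0]


def train_stub(sample):
    """Hypothetical stand-in for a GEP-evolved classifier (an assumption).

    Learns a threshold halfway between the two class means and assumes
    that class 1 has the larger values, as in the toy data below.
    """
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:
        # Degenerate sample with a single class: always predict that class.
        only = 0 if zeros else 1
        return lambda x: only
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x > threshold)


def train_ensemble(instances, n_models, seed=0):
    """Train one stub model per bootstrap sample."""
    rng = random.Random(seed)
    return [train_stub(bootstrap_sample(instances, rng)) for _ in range(n_models)]


def ensemble_predict(models, x):
    """Classify x by majority vote over all models."""
    return majority_vote([model(x) for model in models])


if __name__ == "__main__":
    # Toy data: class 0 has values 0..4, class 1 has values 10..14.
    data = [(float(v), 0) for v in range(5)] + [(float(v), 1) for v in range(10, 15)]
    models = train_ensemble(data, n_models=5)
    print(ensemble_predict(models, 1.0), ensemble_predict(models, 13.0))
```

In a MapReduce setting, each mapper would plausibly train on its own sample and the vote would be taken when the per-mapper predictions are combined; that mapping is likewise an assumption here, not a detail taken from the quoted text.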


MapReduce‐based parallel GEP algorithm for efficient function mining in big data applications

Liu

et al. 2017

Concurrency and Computation

Self Cite


Summary: The gene expression programming (GEP) algorithm is one of the most effective function mining algorithms for fitting a mathematical equation to an input dataset. However, GEP is inefficient in big data processing because its evolution incurs a large overhead on large‐scale data. To address this issue, the paper presents two parallelized GEP algorithms using MapReduce. Based on data separation, the first algorithm aims at speeding up large‐scale classification, but it lacks the ability to output the mined equation explicitly. The second algorithm, built on further improvements to the first, aims at mining the equation efficiently and outputs it explicitly and directly. The experimental results show that both algorithms are effective for processing large volumes of data.


Parallel GEP Ensemble for Classifying Big Datasets

Jędrzejowicz

Jędrzejowicz

Wierzbowska

2018

Computational Collective Intelligence

