2017 IEEE International Conference on Image Processing (ICIP)
DOI: 10.1109/icip.2017.8297018

Hyper-parameter optimization for convolutional neural network committees based on evolutionary algorithms

Abstract: In a broad range of computer vision tasks, convolutional neural networks (CNNs) are one of the most prominent techniques due to their outstanding performance. Yet it is not trivial to find the best performing network structure for a specific application because it is often unclear how the network structure relates to the network accuracy. We propose an evolutionary algorithm-based framework to automatically optimize the CNN structure by means of hyper-parameters. Further, we extend our framework towards a join…
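The hyper-parameter optimization described in the abstract can be sketched as a simple genetic loop. The hyper-parameter names, value ranges, and surrogate fitness below are illustrative assumptions, not the paper's actual encoding; in the real framework, fitness would be the validation accuracy of the CNN trained from each configuration.

```python
import random

# Illustrative hyper-parameter space for a CNN; the names and ranges
# are assumptions, not the paper's exact encoding.
SPACE = {
    "num_conv_layers": [1, 2, 3, 4],
    "filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def random_individual(rng):
    """Sample one hyper-parameter configuration uniformly from SPACE."""
    return {k: rng.choice(v) for k, v in SPACE.items()}

def mutate(ind, rng, rate=0.3):
    """Resample each gene independently with probability `rate`."""
    return {k: (rng.choice(SPACE[k]) if rng.random() < rate else v)
            for k, v in ind.items()}

def fitness(ind):
    """Surrogate for validation accuracy. In the real framework this
    would train the CNN encoded by `ind` and return its accuracy."""
    return ((ind["num_conv_layers"] * ind["filters"]) / (4 * 128)
            - abs(ind["kernel_size"] - 3) * 0.05)

def evolve(generations=20, pop_size=10, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection (elitist)
        pop = parents + [mutate(rng.choice(parents), rng) for _ in parents]
    return max(pop, key=fitness)

best = evolve()
```

Because the top half of each generation is carried over unchanged, the best fitness never decreases across generations.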


Cited by 111 publications (71 citation statements). References 10 publications.
“…Baldominos et al [63] presented a work in 2018 where the topology of the network is evolved using grammatical evolution, attaining a test error rate of 0.37% without data augmentation and this result was later improved by means of the neuroevolution of committees of CNNs [64] down to 0.28%. Similar approaches of evolving a committee of CNNs were presented by Bochinski et al [65], achieving a very competitive test error rate of 0.24%; and by Baldominos et al [66], where the models comprising the committee were evolved using a genetic algorithm, reporting a test error rate of 0.25%.…”
Section: State of the Art
confidence: 87%
“…
| Technique | Test error rate |
|---|---|
| Batch-normalized maxout network-in-network [29] | 0.24% |
| Committees of evolved CNNs (CEA-CNN) [65] | 0.24% |
| Genetically evolved committee of CNNs [66] | 0.25% |
| Committees of 7 neuroevolved CNNs [64] | 0.28% |
| CNN with gated pooling function [30] | 0.29% |
| Inception-Recurrent CNN + LSUV + EVE [60] | 0.29% |
| Recurrent CNN [31] | 0.31% |
| CNN with norm. layers and piecewise linear activation units [32] | 0.31% |
| CNN (5 conv, 3 dense) with full training [45] | 0.32% |

Table 2.…”
Section: Technique / Test Error Rate
confidence: 99%
“…A limitation to this approach is that it can only be used on new models trained from scratch. In contrast to post-processing techniques, architecture generation algorithms such as [33][34][35][36][37] have demonstrated that architectures can be automatically generated by exploring different architecture choices and hyper-parameter settings. Ref.…”
Section: Related Work
confidence: 99%
“…In the works proposed by Suganuma et al [26] and by Davison [27], genetic programming is used instead for evolving the architecture of the CNN. Meanwhile, Bochinski et al [28] proposed IEA-CNN, an approach based on an evolutionary strategy whose key innovation is sorting the evolved layers by descending complexity, effectively reducing the search space factorially in the number of layers. Additionally, they extend their contribution by building ensembles out of the evolved models, using a fitness function that takes into account the global classification error of the population, and name this variant CEA-CNN.…”
Section: Complexity
confidence: 99%
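The complexity-ordering idea attributed to IEA-CNN above can be illustrated with a toy encoding: if genomes are always kept sorted by descending layer complexity, every permutation of the same layer multiset collapses to one canonical representative. The layer encoding and the complexity measure (per-layer parameter count) below are assumptions for illustration only.

```python
from itertools import permutations

# Toy layer encoding: (num_filters, kernel_size). "Complexity" is
# approximated here by per-layer parameter count; the paper's exact
# measure may differ.
def complexity(layer):
    filters, kernel = layer
    return filters * kernel * kernel

def canonical(genome):
    """Sort layers by descending complexity so that all orderings of
    the same layer multiset map to a single representative genome."""
    return tuple(sorted(genome, key=complexity, reverse=True))

layers = [(32, 3), (64, 5), (16, 3)]
# All 3! = 6 orderings collapse to one canonical representative,
# shrinking the search space by a factor of n! for n distinct layers.
reps = {canonical(list(p)) for p in permutations(layers)}
```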
“…In recent years, this idea has been applied to a variety of fields, such as facial expression analysis [36], astrophysics [37], pose estimation [38], or medical imaging [39]. However, the idea of building an ensemble out of a population of neuroevolved CNN topologies is less common and, to the best of our knowledge, has previously been explored only by Real et al [23] and by Bochinski et al [28] in 2017. In the former work, the ensemble is built by choosing the top-2 models of the evolved population based on validation accuracy.…”
Section: Complexity
confidence: 99%
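The top-2 ensemble construction attributed to Real et al. above can be sketched as follows. The population, accuracy values, and the majority-vote tie-break are hypothetical, introduced only to make the selection step concrete.

```python
from collections import Counter

# Hypothetical evolved population: (validation_accuracy, predictions on
# a shared validation set). All values are made up for illustration.
population = [
    (0.91, [0, 1, 1, 0, 1]),
    (0.89, [0, 1, 0, 0, 1]),
    (0.85, [1, 1, 1, 0, 0]),
    (0.80, [0, 0, 1, 1, 1]),
]

def top_k_ensemble(pop, k=2):
    """Select the k models with the highest validation accuracy (top-2
    in Real et al.) and combine their outputs by majority vote; on a
    tie, Counter's insertion order favours the more accurate model."""
    chosen = sorted(pop, key=lambda m: m[0], reverse=True)[:k]
    columns = zip(*(preds for _, preds in chosen))
    return [Counter(col).most_common(1)[0][0] for col in columns]

committee = top_k_ensemble(population)
```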