In this study, we review meta-heuristic methods such as Genetic Algorithms, Particle Swarm Optimization, Differential Evolution, and Bayesian Optimization that have been used extensively to optimize hyper-parameters in Convolutional Neural Networks (CNNs). We highlight the hyper-parameters selected for optimization in those studies, along with the value domains of those parameters. These studies reveal that the number of layers, the number and size of kernels at each layer, the learning rate, and the batch size are among the hyper-parameters that affect the performance of CNNs the most.

Figure A. Structure of convolutional neural networks.

Purpose: In this study, meta-heuristic methods that have been used to optimize convolutional neural networks are investigated, and a performance comparison of these methods on different image datasets is presented. The advantages and disadvantages of the optimization approaches are discussed with the aim of highlighting the points a user should consider during the hyper-parameter selection process.

Results: The definition of "the best" set of hyper-parameters for a convolutional neural network depends on the problem, or in this case, on the dataset. Nonetheless, the studies make it clear that the choice of certain parameters directly affects network performance: the number of layers, the number and size of filters in each layer, the regularization method, the learning rate, and the batch size are among the most important. Genetic Algorithms (GA) are the most widely studied technique for hyper-parameter optimization, largely because they yield successful results in most of the studies. When selecting an optimization method, one should consider the size of the problem, the available computational budget and time, and the expected accuracy.
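To illustrate how a GA can search such a hyper-parameter space, the following is a minimal sketch. The search-space values and the fitness function are hypothetical stand-ins chosen for this example; in a real study, fitness would be the validation accuracy of a CNN trained with the candidate configuration.

```python
import random

random.seed(0)  # for reproducibility of this toy run

# Hypothetical discrete search space over common CNN hyper-parameters.
SPACE = {
    "n_layers":      [2, 3, 4, 5],
    "n_filters":     [16, 32, 64, 128],
    "kernel_size":   [3, 5, 7],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size":    [32, 64, 128],
}

def random_individual():
    """Sample one candidate configuration uniformly from the space."""
    return {k: random.choice(v) for k, v in SPACE.items()}

def fitness(ind):
    """Toy stand-in for validation accuracy; a real study trains a CNN here.
    This score simply prefers one mid-sized configuration."""
    return (
        -abs(ind["n_layers"] - 4)
        - abs(ind["n_filters"] - 64) / 32
        - abs(ind["kernel_size"] - 3)
        - abs(ind["batch_size"] - 64) / 64
        - (0 if ind["learning_rate"] == 1e-3 else 1)
    )

def crossover(a, b):
    """Uniform crossover: each gene comes from one of the two parents."""
    return {k: random.choice([a[k], b[k]]) for k in SPACE}

def mutate(ind, rate=0.2):
    """Re-sample each gene from the space with probability `rate`."""
    return {k: (random.choice(SPACE[k]) if random.random() < rate else v)
            for k, v in ind.items()}

def genetic_search(pop_size=20, generations=15):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = genetic_search()
```

The same loop structure carries over to real experiments; only `fitness` changes, which is also where virtually all of the computational cost lies, since each evaluation requires training a network.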
For problems with a small hyper-parameter search space, methods like Grid Search are sufficient, but for problems with a large search space, meta-heuristic methods are more suitable.

Conclusion: In this study, the effect of hyper-parameter optimization methods on classification performance is investigated. GA and Particle Swarm Optimization (PSO) are the two most widely used meta-heuristics for hyper-parameter optimization, and their computational burden can be justified by the accuracy improvements they achieve. If computational resources are limited and good results are needed in a reasonable amount of time, methods such as Tree-structured Parzen Estimators (TPE) and Sequential Model-based Algorithm Configuration (SMAC) are good choices.
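The small-search-space case mentioned above can be sketched as an exhaustive grid search. The space and scoring function below are illustrative stand-ins, not values from the reviewed studies; in practice, `validation_score` would train a CNN and return its validation accuracy.

```python
from itertools import product

# Hypothetical small search space: 2 * 2 * 2 = 8 configurations in total.
space = {
    "learning_rate": [1e-3, 1e-2],
    "batch_size":    [32, 64],
    "kernel_size":   [3, 5],
}

def validation_score(cfg):
    """Toy stand-in for training a CNN and measuring validation accuracy."""
    return (
        -abs(cfg["kernel_size"] - 3)
        - abs(cfg["batch_size"] - 64) / 64
        - (0 if cfg["learning_rate"] == 1e-3 else 1)
    )

# Evaluate every point on the grid and keep the best-scoring configuration.
best_cfg = max(
    (dict(zip(space, values)) for values in product(*space.values())),
    key=validation_score,
)
# best_cfg == {"learning_rate": 1e-3, "batch_size": 64, "kernel_size": 3}
```

The cost grows multiplicatively with each added hyper-parameter, which is exactly why exhaustive search stops being viable for large spaces and meta-heuristics become the more convenient choice.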