Convolutional Neural Networks (CNNs) have had a major impact on attaining state-of-the-art results in image classification tasks. Weight initialization is one of the fundamental steps in formulating a CNN model, and it can determine whether the model succeeds or fails. In this paper, we study the mathematical background of different weight initialization strategies to determine which one performs better. For smooth training, we expect the activations of each layer of the CNN model to follow a standard normal distribution with mean 0 and standard deviation 1; this prevents gradients from vanishing and leads to smoother training. However, we found that even with an appropriate weight initialization technique, a regular Rectified Linear Unit (ReLU) activation function increases the mean of the activations. In this paper, we address this issue by proposing the weight initialization based (WIB)-ReLU activation function. The proposed method results in smoother training. Moreover, our experiments show that WIB-ReLU outperforms the ReLU, Leaky ReLU, parametric ReLU, and exponential linear unit activation functions, yielding up to a 20% decrease in loss value and a 5% increase in accuracy score on both the Fashion-MNIST and CIFAR-10 datasets.
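The mean-shift behaviour described above can be reproduced with a minimal numerical sketch. The snippet below is an illustration of the underlying claim only, not the paper's WIB-ReLU method: it propagates random inputs through a stack of fully connected layers with He (Kaiming) initialization and shows that, while pre-activations stay roughly zero-mean, the post-ReLU activation mean is clearly positive at every layer. The network width and depth are arbitrary values chosen for the demonstration.

```python
# Minimal sketch (not the paper's WIB-ReLU): even with He/Kaiming weight
# initialization, a plain ReLU pushes the activation mean above zero.
import numpy as np

rng = np.random.default_rng(0)
n, width, depth = 10_000, 512, 10          # illustrative sizes, not from the paper

x = rng.standard_normal((n, width))        # inputs drawn from N(0, 1)
for layer in range(depth):
    # He initialization: W ~ N(0, 2 / fan_in), intended to keep variance stable under ReLU
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
    pre = x @ W                            # pre-activations, mean is approximately 0
    x = np.maximum(pre, 0.0)               # plain ReLU
    print(f"layer {layer + 1}: pre-act mean {pre.mean():+.3f}, "
          f"post-ReLU mean {x.mean():+.3f}, post-ReLU std {x.std():.3f}")
# The post-ReLU mean stays well above 0 at every layer; this positive shift is
# the issue the WIB-ReLU activation proposed in the paper is designed to correct.
```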