Interest in environmental sound classification (ESC) has grown over the last decade, driven by the complexity and rich information content of ambient sounds. State-of-the-art ESC methods are based on transfer learning and often reuse representations learned on common image-classification problems. This paper assesses the effectiveness of pre-trained convolutional neural networks (CNNs) for audio classification and the feasibility of retraining them. The study investigated hyperparameters (learning rate and number of epochs) and optimizers (Adam, Adamax, and RMSprop) for several pre-trained models, such as Inception, VGG, and ResNet. First, the raw sound signals were converted into an image format (log-Mel spectrograms). Then, the selected pre-trained models were applied to the resulting spectrograms. In addition, the effect of essential retraining factors on classification accuracy and processing time was investigated during CNN training. The proposed method was evaluated on the publicly accessible UrbanSound8K sound dataset, where it achieves 97.25% and 95.5% accuracy using the pre-trained DenseNet201 and ResNet50V2 CNN models, respectively.
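The conversion from a raw signal to a log-Mel spectrogram image can be illustrated with a minimal NumPy sketch. It is not the pipeline used in this study: the sampling rate, FFT size, hop length, number of Mel bands, the HTK-style Mel formula, and the replication of the spectrogram into three channels are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel scale (an assumption; the source does not name a variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=64):
    # All default parameter values are illustrative assumptions.
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T   # (n_mels, n_frames)
    return 10.0 * np.log10(np.maximum(mel, 1e-10))      # decibels, floored


def to_rgb_image(log_mel):
    # Scale to [0, 1] and copy into three channels, since image-classification
    # CNNs typically expect RGB input (the channel handling is an assumption).
    lo, hi = log_mel.min(), log_mel.max()
    scaled = (log_mel - lo) / (hi - lo) if hi > lo else np.zeros_like(log_mel)
    return np.repeat(scaled[..., np.newaxis], 3, axis=-1)


if __name__ == "__main__":
    sr = 22050
    tone = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)  # 1 s synthetic test tone
    image = to_rgb_image(log_mel_spectrogram(tone, sr=sr))
    print(image.shape)  # (64, 42, 3)
```

An array of this form would still need to be resized to the input resolution expected by a given pre-trained network; the resizing and normalization actually used in the study are not stated in this abstract.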