Global steel demand continues to increase, with steel used across industries including construction, automotive, national defense, and machinery. However, steel production is a delicate process that can leave a variety of defects on the steel surface, degrading the quality of the final products. Recognizing surface defects is therefore critical in the metal production industry. Manual inspection remains the standard method, but it is time-consuming, labor-intensive, and subjective, which leads to low accuracy and unreliable results. Automated defect detection using computer vision can replace or supplement manual inspection. In recent years, machine learning algorithms, particularly Convolutional Neural Networks (CNNs), have shown great promise in achieving high accuracy on this task. In addition, image classification algorithms can contribute to lean metal production by identifying defects or anomalies in the manufacturing process, information that can be used to reduce waste and increase efficiency. However, the performance and cost of different CNN architectures vary widely, making it challenging for decision-makers to select the most suitable model. This paper analyzes several CNN-based image classification algorithms, namely MobileNet, ShuffleNet, DenseNet, RegNet, and NasNet, for classifying steel surface defects in the NEU-CLS-64 dataset. We evaluate the models using accuracy, precision, sensitivity, specificity, F1 score, and G-mean, and benchmark them against each other. Our findings show that RegNet achieved the highest accuracy, precision, sensitivity, specificity, F1 score, and G-mean, but at a higher cost than the other models, while MobileNet had the lowest performance. The results give decision-makers valuable insights for selecting the most suitable CNN model for steel surface defect detection based on model performance.
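
The six evaluation metrics named above can all be derived from a multi-class confusion matrix. The following is a minimal sketch, not the evaluation code used in this study. It assumes one-vs-rest per-class definitions, macro averaging across classes, and G-mean defined as the geometric mean of sensitivity and specificity. The function names and the toy confusion matrix are hypothetical and are not results from the experiments.

```python
import math


def per_class_metrics(cm):
    """Compute one-vs-rest metrics for each class.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted other
        fp = sum(cm[i][k] for i in range(n)) - tp  # true other, predicted k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        g_mean = math.sqrt(sensitivity * specificity)
        results.append({
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
            "g_mean": g_mean,
        })
    return results


def macro_average(results):
    """Unweighted mean of each metric over all classes (an assumed choice)."""
    keys = results[0].keys()
    return {key: sum(r[key] for r in results) / len(results) for key in keys}


def overall_accuracy(cm):
    """Fraction of all samples that lie on the confusion-matrix diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix, for illustration only.
    toy_cm = [
        [50, 3, 2],
        [4, 45, 6],
        [1, 5, 49],
    ]
    print("accuracy:", overall_accuracy(toy_cm))
    for name, value in macro_average(per_class_metrics(toy_cm)).items():
        print(name + ":", value)
```

Macro averaging weights every defect class equally, which is a reasonable default when classes are roughly balanced. A weighted or micro average would be an alternative if class frequencies differ substantially.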